Life would be much simpler if applications running inside containers always behaved correctly. Unfortunately, as every sysadmin and developer knows, that is never the case. When things inevitably start going wrong you need diagnostic information to figure out how and why. Being able to gather useful information from your Docker container logs can mean the difference between a minor issue and a critical outage.
But there are numerous ways to record logs in Docker, most of which don’t exist for traditional application logging. Getting logging in Docker right means choosing the right option for collecting logs and analyzing them holistically. If you’re managing large numbers of containers with Docker’s swarm mode, aggregating logs into a central location is essential; it’s the only practical way to harness the power of logging in Docker at scale.
With other systems, recording application log messages was done explicitly by writing those messages to the system logger. For the syslog daemon—the de facto system logger for many scenarios—doing so involved using the syslog() library call from your application.
Things are different in Docker. In Docker, everything written to stdout and stderr is implicitly sent to a logging driver. Logging drivers provide a mechanism to record text strings. The default driver is json-file, which formats all messages as JSON and writes them to a per-container text file.
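To see what the json-file driver actually stores, here’s a minimal sketch of parsing one of its entries: each line in the per-container file (under /var/lib/docker/containers/ by default) is a single JSON object. The log line below is a made-up example, and a real setup would use jq or a log shipper rather than sed.

```shell
# Hypothetical json-file entry, in the format the default driver writes
# (one JSON object per line, with the raw message in the "log" field).
line='{"log":"hello from the app\n","stream":"stdout","time":"2019-02-22T10:03:28.967Z"}'

# Extract the raw message with sed, stripping the trailing escaped newline.
msg=$(printf '%s' "$line" | sed -n 's/.*"log":"\(.*\)\\n".*/\1/p')
echo "$msg"
```

The stream field records whether the message came from stdout or stderr, which is how `docker logs` can show them separately.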
Additionally, Docker ships with a number of other logging drivers. For example, if you’re collecting logs in Amazon CloudWatch Logs, you can use the awslogs logging driver to write log messages from your container directly to your Amazon CloudWatch Logs account.
Logging drivers can be configured per node at the daemon level by using the log-driver option in the daemon.json file, or on a per-container basis with the --log-driver flag.
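As a sketch of both levels (the container name, image, and rotation settings here are illustrative, not from the original):

```shell
# Hypothetical node-wide default in /etc/docker/daemon.json — restart the
# Docker daemon afterwards for the change to take effect:
#
#   {
#     "log-driver": "json-file",
#     "log-opts": { "max-size": "10m", "max-file": "3" }
#   }

# Per-container override: this one container logs to the local syslog
# daemon instead of using the node-wide default.
docker run --log-driver=syslog --name web01 -d nginx
```

The per-container flag always wins over the daemon.json default, so you can keep json-file as the baseline and override it only where needed.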
However, there’s one major caveat with most of the logging drivers that ship with Docker: you cannot use the docker logs command to inspect the logs. That command only works with the json-file and journald drivers. For all other drivers, you need to inspect the logs on the destination side.
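For instance (container names here are hypothetical), inspecting logs locally only works when the driver keeps them on the node:

```shell
# Works: the container uses the json-file (default) or journald driver.
docker logs --tail 50 web01

# Fails: if web02 was started with, say, --log-driver=awslogs, Docker
# reports that reading logs is not supported for that driver, and you
# must inspect the logs in CloudWatch instead.
docker logs web02
```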
For most users, the json-file logging driver or one of the other built-in drivers work well enough. But if you need total control over where your logs are sent, you need a way to route your logs—and logspout is just the tool for the job.
It’s possible to build more complex log management setups with logspout, a log router for Docker containers. Logspout is essentially a log routing appliance: it directs log messages from your containers to a destination such as a syslog server, but it doesn’t provide features like searching through log history or log management—that’s the job of log management tools. Logspout collects logs from other containers that use the json-file or journald logging drivers and routes them according to the configuration options set on the Docker command line.
Logspout is a powerful tool that can individually route stdout or stderr from any selection of containers to one or more logging destinations, such as SolarWinds® Loggly® and SolarWinds Papertrail™. All of these tools combine to form a logging design which starts with the logging driver in the Docker container, optionally includes a logspout container for routing in the middle, and finishes with a log management endpoint.
Here’s a simple example that sends all logs to a remote syslog server using the TLS-encrypted transport option:
$ docker run --name="logspout" \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout \
    syslog+tls://logs.papertrailapp.com:55555
With all of those options covered, we can move on to picking the best choice for managing your container logging. The simplest (and the one that’s right for most users) is either the json-file or journald logging driver. journald offers some additional features over json-file, such as the ability to retrieve specific logs based on their fields, much like querying a database. And of course, with these two drivers you can use the docker logs command to inspect container logs, and docker service logs to read Docker swarm logs.
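For example, the journald driver records each message with container metadata as structured journal fields, so you can filter with journalctl; the container name and ID below are hypothetical:

```shell
# Show only messages from the container named web01, newest first.
journalctl CONTAINER_NAME=web01 --no-pager -r

# Or filter by container ID and restrict the query to the last hour.
journalctl CONTAINER_ID=1f2d3c4b5a69 --since "1 hour ago"
```

This field-based filtering is exactly the database-like querying that json-file can’t offer.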
If you’re running multiple containers, it probably makes more sense to store the logs remotely instead of on the local disk. Not only will this practice prevent them from being deleted if you delete your containers with docker rm, it also makes it easier to review the logs by keeping them in a central place. This is where log management tools really shine. All of them allow you to sift through reams of log data to diagnose and monitor your applications. To use these tools, you can use the Docker built-in logging drivers, such as gelf (which works with Graylog and Fluentd), or a plugin logging driver.
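As a sketch, shipping a container’s logs straight to a Graylog (or Fluentd GELF) endpoint looks like this; the endpoint address, container name, and image are assumptions for illustration:

```shell
# Send this container's stdout/stderr to a GELF endpoint over UDP.
docker run --log-driver=gelf \
    --log-opt gelf-address=udp://graylog.example.com:12201 \
    --name web01 -d nginx
```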
And finally, if you just need complete control over which containers need to send their logs to one or more destinations, a tool such as logspout provides the features you’ll need. Through its environment variables you can configure logspout to do things such as not collect logs for specific containers (EXCLUDE_LABEL), specify the desired syslog format (SYSLOG_FORMAT), and enable multiline logging (MULTILINE_ENABLE_DEFAULT).
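Putting those variables together, a logspout container might be started like this (the destination address and exclude label are illustrative, not prescribed):

```shell
docker run --name logspout -d \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    -e EXCLUDE_LABEL=logspout.exclude \
    -e SYSLOG_FORMAT=rfc5424 \
    -e MULTILINE_ENABLE_DEFAULT=true \
    gliderlabs/logspout \
    syslog+tls://logs.example.com:55555
```

Any container started with the label named in EXCLUDE_LABEL (here, logspout.exclude) would then be skipped by the router.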
Whichever mechanism you pick, getting all your logs to a central location is only the start. Once they’re aggregated in one place, you need to actually use them to monitor your containers and holistically analyze any issues.
In the old days, you could SSH into your server and analyze the log files using nothing more than grep and awk. While this option is still available, the rise of microservices and Docker makes managing a large number of logs commonplace for most sysadmins and developers, who frequently have to deal with Docker swarm logging. Each container produces valuable log information which needs to be stored. Pooling all of these logs together and storing the Docker container logs in one location allows them to be easily analyzed for insights and diagnostic clues.
And it’s not just the container logs that need to be stored. Modern DevOps teams share responsibility for both the infrastructure and the application, so storing infrastructure management logs (such as from Docker Engine, UCP and DTR system logs, and containerized infrastructure services) alongside application logs enables DevOps teams to work collaboratively.
These aggregation and management tools turn your logs into a treasure trove of data via a range of features such as full-text search, log tailing and real-time monitoring, and alerting. There are a few things you can fine-tune in your Docker logging configuration to make the most of these features.
Once you’re aggregating logs for lots of containers, the volume of information can be overwhelming—especially if you are combing through the logs looking for something. Fortunately, Docker supports tagging logs with the first 12 characters of the container ID. Depending on the scale of your container deployment, though, a simple tag may not be descriptive enough. For added flexibility, you can customize the tag and even include container attributes to make it easier to search.
To customize the log driver output, use the --log-opt tag option:
$ docker run --log-driver=syslog \
    --log-opt syslog-address=udp://127.0.0.1:5514 \
    --log-opt syslog-facility=daemon \
    --log-opt tag=app01 \
    --name logging-02 -ti -d your_image
(If you’ve previously used the --log-opt syslog-tag option, you can replace it with the generic tag option, which was added in Docker 1.9.0 and works across all logging drivers.)
If you’re using multiple attributes in the tag, using a separator such as ‘-’ or ‘/’ can make it easier to see which part of the log message is the tag and which is the message.
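Docker’s tag option accepts Go template syntax with container attributes such as {{.Name}}, {{.ID}}, and {{.ImageName}}, so a more descriptive tag with a separator might look like this (the syslog address, container name, and image are illustrative):

```shell
# Tag each message with the container name and short ID, separated by '/'.
docker run --log-driver=syslog \
    --log-opt syslog-address=udp://127.0.0.1:5514 \
    --log-opt tag="{{.Name}}/{{.ID}}" \
    --name web01 -d nginx
```

With this tag, a message from the container would arrive prefixed with something like web01/1f2d3c4b5a69, making per-container searches straightforward.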
Containers configured to send their logs to a remote destination are vulnerable to network issues, which can wreak havoc by making log tailing and real-time monitoring unusable. If you’re sending logs to a remote syslog daemon, you need to be aware of how network interruptions can affect your containers; the fallout ranges from dropped messages to hanging the container.
If network issues occur while the container is up and running, messages can be dropped, creating error messages such as:
Feb 22 11:03:28 app01 docker: time="2019-02-22T11:03:28.967438497+01:00" level=error msg="Failed to log msg \"log01\" for logger syslog: dial tcp 192.xxx.xxx.xxx:5140: getsockopt: connection refused"
For containers using the syslog logging driver with TCP, if Docker is unable to connect to the remote daemon at startup, the container will fail to start; if the connection drops later, the container can hang. It’s possible to work around this problem by swapping TCP for UDP in your server URI, since UDP doesn’t make the same delivery guarantees and can gracefully handle packet loss.
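The swap is just a change to the address scheme (the server address and image here are illustrative):

```shell
# TCP: delivery is connection-oriented, but an unreachable syslog server
# can prevent the container from starting.
docker run --log-driver=syslog \
    --log-opt syslog-address=tcp://192.168.1.3:514 -d nginx

# UDP: fire-and-forget, so the container starts even if the server is
# down — at the cost of possibly dropped messages.
docker run --log-driver=syslog \
    --log-opt syslog-address=udp://192.168.1.3:514 -d nginx
```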
Logspout offers its own mechanism for dealing with network issues—it has the INACTIVITY_TIMEOUT environment variable, which restarts the log stream when no activity has occurred for the specified period.
You can also minimize interference from network issues by setting up a syslog server on the host, or a dedicated syslog container to forward the logs to a remote server — this is how logspout works. With the dedicated logging container approach, any network issues only affect that one container and all others will function normally. (As an added bonus, dedicated logging containers allow you to scale up, by adding more containers when necessary.)
The ability to perform full-text searches on your logs allows you to take advantage of all that data, but this ability does come with a security risk. Application logs can contain sensitive data, and sending those messages unencrypted to remote destinations can potentially result in a data leak. You can encrypt your syslog connection using the --log-opt syslog-address=tcp+tls option (for example, --log-opt syslog-address=tcp+tls://192.168.1.3:514).
Thankfully, although it’s possible to use syslog without TLS encryption, it’s rarely used that way, and most tutorials and documentation recommend enabling TLS. But TLS does require TCP (it doesn’t work over any other network protocol) and, as discussed above, network issues with TCP connections can cause your containers to hang. The dedicated logging container approach, coupled with TLS encryption, provides a secure, robust way to deliver encrypted logs.
Alerts are an invaluable service for busy sysadmins, helping to minimize service disruption by automatically notifying the right teams when incidents occur. For example, one of Papertrail’s key features is search alerts, which can detect patterns in logs (and even alert when an expected pattern doesn’t occur) and notify external services such as Slack, SolarWinds Librato, PagerDuty, Campfire, and your own custom HTTP webhooks.
Papertrail can send notifications every minute, every hour, or every day, allowing it to be used for everything from by-the-minute monitoring to daily reporting.
The docker logs command provides a --follow option that allows you to continuously watch container log files as they’re written. This feature, known as live tail, is extremely useful for troubleshooting containers in production because you can monitor logs in real time and see issues as they happen. Developers often find this visibility into their apps invaluable, and it’s something many log management tools also support.
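For example (the container name is hypothetical):

```shell
# Stream new log lines as they arrive, starting with the last 100.
docker logs --follow --tail 100 web01

# Or limit the stream to recent history with --since.
docker logs --follow --since 15m web01
```

Combining --follow with --tail or --since keeps the initial output manageable on long-running containers.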
Papertrail has a live tail feature which also lets you search and filter tailed logs to drill down into the parts you care about most. And if you need to tail logs from multiple sources, the Papertrail event viewer brings them all together into a single web-based interface.
There are many ways to build a log aggregation design with Docker, and making those choices isn’t always straightforward. The way you collect, aggregate, and analyze your container logs directly affects your team’s ability to maintain service availability and minimize disruptions. With the tips provided in this post, you’ll be able to fine-tune your logging setup and get the most value out of logs.