The way Kubernetes handles container logging is called cluster-level logging, which means that the logging system is completely independent of the lifecycle of containers, Pods, and nodes. This design ensures that application logs remain available even if a container crashes, a Pod is deleted, or an entire node goes down.
For a container, when the application writes logs to stdout and stderr, the container runtime by default persists them to a JSON file on the host. This is why you can view container logs with the kubectl logs command.
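A minimal sketch of this default behavior (the Pod name is illustrative): the container below writes one line per second to stdout.

```yaml
# A Pod whose container writes a counter to stdout once per second.
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
```

With Docker's default json-file log driver, these lines land on the host under /var/lib/docker/containers/ in a *-json.log file (the exact path depends on the container runtime), and `kubectl logs counter` reads them back.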
The first option is to deploy a logging agent on the node that forwards the log files to backend storage for safekeeping. The architecture of this option is shown below.
The core of this option is the logging agent, which usually runs on the node as a DaemonSet, mounts the container log directory on the host, and then forwards the logs to the backend.
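A minimal sketch of such a node-level agent, assuming fluentd as the logging agent; the image, namespace, and mounted paths are illustrative, and a real deployment would also need fluentd configuration and RBAC.

```yaml
# fluentd runs on every node as a DaemonSet and tails the host's container logs.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        volumeMounts:
        # Mount the host's log directories so the agent can read container logs.
        - name: varlog
          mountPath: /var/log
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers
```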
The biggest advantage of deploying the logging agent on the node is that only one agent is needed per node, and it is not intrusive to the application or the Pod. This is therefore the most commonly used solution in the community.
The shortcoming of the first option, however, is that it requires the application to write its logs directly to the container's stdout and stderr. The second Kubernetes container logging option deals with this special case: when a container's logs can only be written to certain files, we can use a sidecar container to re-emit those log files to the sidecar's stdout and stderr, and then fall back on the first option.
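A minimal sketch of this pattern (the Pod, container, and file names are illustrative): the application container appends to /var/log/1.log in a shared emptyDir volume, and the sidecar tails that file to its own stdout.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  # The application only writes to a file, not to stdout.
  - name: count
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)" >> /var/log/1.log; i=$((i+1)); sleep 1; done']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  # The sidecar re-emits the file to its stdout, so option one applies again.
  - name: count-log
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/1.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}
```

With this in place, `kubectl logs counter -c count-log` streams the contents of /var/log/1.log, and the node-level agent from option one picks the log up as usual.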
Since the sidecar shares the volume with the main container, the extra performance cost of the sidecar approach is low: just a little more CPU and memory.
However, the same logs now exist twice on the host: once in the file written by the application itself, and once in the JSON file corresponding to the sidecar's stdout and stderr. This is a significant waste of disk space. So unless you have no other choice, or the application container cannot be modified at all, it is recommended that you use either option one or the third option below.
The third option is to use a sidecar container to send the application's log files directly to remote storage. In effect, the logging agent from option one is moved into the application Pod. The architecture of this option is shown below:
In this option, your application can still write logs to a fixed file instead of stdout, the logging agent can still be fluentd, and the backend store can still be Elasticsearch; the only difference is that fluentd's input source becomes the application's log files. Generally speaking, the fluentd input configuration is kept in a ConfigMap.
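A minimal sketch of this option, assuming fluentd as the sidecar agent and Elasticsearch as the backend; the images, the Elasticsearch address, and the file names are illustrative.

```yaml
# fluentd's input configuration lives in a ConfigMap and tails the app's log file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluentd.conf: |
    <source>
      type tail
      format none
      path /var/log/1.log
      pos_file /var/log/1.log.pos
      tag count.log
    </source>
    <match **>
      type elasticsearch
      host elasticsearch.logging
      port 9200
      logstash_format true
    </match>
---
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  # The application writes its logs to a file in the shared volume.
  - name: count
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)" >> /var/log/1.log; i=$((i+1)); sleep 1; done']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  # The logging agent runs as a sidecar and ships the file to Elasticsearch.
  - name: count-agent
    image: k8s.gcr.io/fluentd-gcp:1.30
    env:
    - name: FLUENTD_ARGS
      value: -c /etc/fluentd-config/fluentd.conf
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: config-volume
      mountPath: /etc/fluentd-config
  volumes:
  - name: varlog
    emptyDir: {}
  - name: config-volume
    configMap:
      name: fluentd-config
```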
Although this option is easy to deploy and friendly to the host machine, the sidecar container is likely to consume more resources and may even drag down the application container. Also, since the logs are still not written to stdout, you cannot see any log output via kubectl logs.