Introduction
Log management in infrastructure has changed tremendously in recent years, and we now have wonderful products in the market to manage, parse, and analyze log files.
These tools are used extensively for incident management, alerting, security analysis, application performance monitoring, and more.
In the open source market, Elasticsearch and the ELK Stack are leading, making it possible for small and medium companies to afford a wonderful log management solution, while Splunk leads the enterprise market.
If you are not already aware, the ELK stack is a combination of three open source projects, Elasticsearch, Logstash and Kibana. They come together to give you a combined log management experience, and each one of them has a job to do:
- Elasticsearch - a search and data analytics engine, similar to Apache Solr/Lucene
- Logstash - a data processing and ingestion tool that sends (stashes) data to Elasticsearch
- Kibana - a visualization tool for the data stored in Elasticsearch, used to create graphs, alerts, etc.
Hope this gives you a quick intro to ELK. But we are going to use EFK. What is EFK?
ELK vs EFK
You already know what ELK is, so what is this EFK? It is an acronym for Elasticsearch, FluentD, Kibana.
If you have not heard or read about FluentD yet, this is your lucky day.
FluentD is a wonderful log collector tool, just like Logstash (in fact far better), and it serves as a unified logging platform while staying simple.
Based on my personal experience, I found Logstash really hard to implement with GROK patterns, and FluentD saved me (I mean it).
There is also a lightweight log shipping product from Elastic named Beats as an alternative to Logstash, but it does not match the efficiency and simplicity of FluentD.
Without further ado, let us talk about our objective of implementing a Kubernetes FluentD sidecar container.
Our Objective
We have a Tomcat-based web application container to be deployed into Kubernetes.
We would like to collect and ship the Tomcat logs and application logs from Kubernetes to a remote Elasticsearch server.
We are going to use the Elasticsearch FluentD Kibana (EFK) stack with the Kubernetes sidecar container strategy.
Why Sidecar logging and not K8s Cluster level logging
Not all the containers we deploy to K8s write their logs to stdout. Though it is recommended, it does not suit all requirements.
Let's say I containerize a Java web app deployed on Tomcat that writes its own log files, such as an audit log and an application log.
I obviously cannot redirect all of them to the same stream.
Additionally, what if I want these log files to go to different indexes in my Elasticsearch for monitoring and auditing?
Well, that's where the sidecar comes to help.
FluentD needs access to the log files written by Tomcat, and this is achieved through Kubernetes volumes and volume mounts.
FluentD then ships the logs to the remote Elasticsearch server using its IP and port along with the credentials.
Steps to deploy fluentD as a Sidecar Container
To deploy FluentD as a sidecar container in a Kubernetes POD, we need to create a few configuration elements such as a ConfigMap, volumes, and a Deployment.
We will see all of them in detail one by one.
ConfigMap - to store fluentd config file
Our first task is to create a Kubernetes ConfigMap object to store the FluentD configuration file. You can store any non-confidential key-value data in a ConfigMap object, including files.
Here we are creating a ConfigMap named fluentdconf with a key equal to the resulting filename, fluent.conf.
This ConfigMap with its key:value data set would later be mounted into the POD, where the key becomes the file name and the value becomes the file content.
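If you prefer to generate the ConfigMap manifest from a local copy of the configuration rather than writing it by hand, kubectl can do it for you. This is just an alternative sketch, assuming the fluent.conf content shown below is saved as a local file named fluent.conf:
# Generate the ConfigMap manifest from a local fluent.conf file;
# --dry-run=client -o yaml only prints the manifest so you can review it before applying
kubectl create configmap fluentdconf \
  --from-file=fluent.conf \
  --namespace development \
  --dry-run=client -o yaml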
We have defined three major tags (or root elements) in the fluent.conf file:
- source - to define the file details to monitor/tail and the format to look out for
- filter - to customize the collected events by overwriting or adding fields
- match - to define what to do with the matching data/log events and where to stash them
You can write your own format using a Ruby regular expression, or use built-in formats like apache2, for event parsing.
Under the match root element, we have to define our Elasticsearch server details and credentials along with the index name to which we are sending the logs.
apiVersion: v1
kind: ConfigMap
metadata:
name: fluentdconf
namespace: development
data:
fluent.conf: |
<source>
@type tail
time_key time
format multiline
format_firstline /^\[[[A-Z]|\s]+\]/
format1 /^\[(?<LogLevel>[[A-Z]|\s]+)\]\[(?<time>[0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}\:[0-9]{2}\:[0-9]{2}\,[0-9]{3})\]\s-\s(?<message>.*+)$/
path /opt/tomcat/webapps/ROOT/WEB-INF/log/myapp.log
pos_file /tmp/myapp.log.pos
tag dev.myapp.applogs
</source>
<filter dev.myapp.applogs>
@type record_transformer
<record>
hostname ${hostname}
</record>
</filter>
<match dev.myapp.applogs>
@type elasticsearch
host 172.99.1.218
port 59200
user elasticusr
password Pa$$w0rd
index_name fluentd.${tag}
logstash_format true
logstash_prefix dev.myapp.applogs
flush_interval 1s
</match>
<source>
@type tail
path /opt/tomcat/logs/localhost_access_log.%Y-%m-%d.txt
pos_file /tmp/tomcataccesslog.pos
tag dev.myapp.tomcatlogs
format apache2
</source>
<filter dev.myapp.tomcatlogs>
@type record_transformer
<record>
hostname ${hostname}
</record>
</filter>
<match dev.myapp.tomcatlogs>
@type elasticsearch
host 172.99.1.218
port 59200
user elasticusr
password Pa$$w0rd
logstash_format true
logstash_prefix dev.myapp.tomcatlogs
</match>
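If you keep this ConfigMap in its own file (fluentd-configmap.yml is a hypothetical filename here), you can apply and verify it on its own before wiring it into the POD:
# Apply the ConfigMap and confirm the fluent.conf key landed in it
kubectl apply -f fluentd-configmap.yml
kubectl describe configmap fluentdconf -n development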
Creating our Docker Container Images
As discussed earlier, we are going to have two containers in our POD: tomcat as the primary container with our application pre-deployed, and fluentd as the sidecar container.
Tomcat Dockerfile
This is the Dockerfile content for our Tomcat image. It has been discussed in detail in our previous article here.
FROM centos
RUN mkdir /opt/tomcat/
WORKDIR /opt/tomcat
RUN curl -O https://downloads.apache.org/tomcat/tomcat-8/v8.5.76/bin/apache-tomcat-8.5.76.tar.gz
RUN tar xvfz apache*.tar.gz
RUN mv apache-tomcat-8.5.76/* /opt/tomcat/.
RUN sed -i -e "s|mirrorlist=|#mirrorlist=|g" /etc/yum.repos.d/CentOS-*; sed -i -e "s|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g" /etc/yum.repos.d/CentOS-*
RUN yum -y install java
RUN java -version
WORKDIR /opt/tomcat/webapps
RUN curl -O -L https://github.com/AKSarav/SampleWebApp/raw/master/dist/SampleWebApp.war
EXPOSE 8080
CMD ["/opt/tomcat/bin/catalina.sh", "run"]
You can make changes to the image and build it with your desired name like this:
docker build -t tomcat8 .
If you do not want to make any customizations, you can simply pull my globally available image from Docker Hub:
docker pull saravak/tomcat8
Note: the URL we have used in the Dockerfile to download Tomcat is often updated by the Tomcat project, and different download URLs may be available for different regions of the world.
If you run into any issues with the link, please visit https://tomcat.apache.org/download-80.cgi to get the latest download URL and update it in your Dockerfile.
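One optional way to soften this problem is to keep the Tomcat version in a build argument, so only one line needs to change when the download URL rotates. This is a sketch, not part of the original image; TOMCAT_VERSION is a hypothetical build argument, and note that older releases may move to archive.apache.org:
# Parameterized download section (would replace the hard-coded curl/tar/mv lines above)
ARG TOMCAT_VERSION=8.5.76
RUN curl -O https://downloads.apache.org/tomcat/tomcat-8/v${TOMCAT_VERSION}/bin/apache-tomcat-${TOMCAT_VERSION}.tar.gz
RUN tar xvfz apache-tomcat-${TOMCAT_VERSION}.tar.gz
RUN mv apache-tomcat-${TOMCAT_VERSION}/* /opt/tomcat/.
You would then build with docker build --build-arg TOMCAT_VERSION=8.5.76 -t tomcat8 . and bump the value whenever the link changes.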
FluentD Dockerfile
The content of the FluentD container's Dockerfile is given below.
FROM ubuntu:16.04
RUN apt-get update
RUN ulimit -n 65536
RUN apt-get install -y curl
RUN curl https://packages.treasuredata.com/GPG-KEY-td-agent | apt-key add -
RUN echo "deb http://packages.treasuredata.com/3/ubuntu/xenial/ xenial contrib" > /etc/apt/sources.list.d/treasure-data. list
RUN apt-get update && apt-get install -y -q curl make g++ && apt-get clean && apt-get install -y td-agent && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN sed -i -e "s/USER=td-agent/USER=root/" -e "s/GROUP=td-agent/GROUP=root/" /etc/init.d/td-agent
RUN /usr/sbin/td-agent-gem install fluent-plugin-aws-elasticsearch-service -v 1.0.0
CMD /usr/sbin/td-agent $FLUENTD_ARGS
As said earlier, you can build your own Docker image from this file, or simply use my globally available image saravak/fluentd.
Before proceeding further, you need to have the Docker images ready: either your custom-built images available locally, or my globally available images from Docker Hub.
I am going with the global images for this article.
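If you do decide to build the FluentD image yourself, it is worth smoke-testing the configuration locally before pushing it into the cluster. A minimal sketch, assuming the fluent.conf content from the ConfigMap above is saved next to the Dockerfile and the image is tagged my-fluentd (a hypothetical name); the dry run mainly checks that the configuration parses and the plugins load:
# Build the sidecar image and validate the config with fluentd's --dry-run check
docker build -t my-fluentd .
docker run --rm \
  -v $(pwd)/fluent.conf:/fluentd/etc/fluent.conf \
  my-fluentd \
  /usr/sbin/td-agent --dry-run -c /fluentd/etc/fluent.conf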
Creating a Kubernetes Deployment Manifest YAML file
Our container images are ready, and we can now use them in our Kubernetes Deployment manifest.
Rather than a plain POD manifest, the Deployment kind has various advantages, including the creation of a ReplicaSet to manage replicas, rolling application updates, and automatic pod replacement.
A Deployment is the ideal way to deploy microservices to production Kubernetes.
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-dpl
namespace: development
labels:
app: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
environment: DEV
managedby: K8s
app: myapp
spec:
containers:
# Primary Container
- name: tomcat
image: saravak/tomcat8
volumeMounts:
- name: applog
mountPath: /opt/tomcat/webapps/ROOT/WEB-INF/log/
- name: tomcatlog
mountPath: /opt/tomcat/logs
ports:
- containerPort: 8080
resources:
limits:
memory: 1028Mi
# This is for FluentD Logging Container
- name: fluentd
env:
- name: FLUENT_UID
value: root
- name: FLUENT_CONF
value: fluent.conf
- name: FLUENTD_ARGS
value: -c /fluentd/etc/fluent.conf
image: saravak/fluentd:elastic
volumeMounts:
- name: applog
mountPath: /opt/tomcat/webapps/ROOT/WEB-INF/log/
- name: fdconf
mountPath: /fluentd/etc/
volumes:
- name: applog
emptyDir: {}
- name: tomcatlog
emptyDir: {}
- name: fdconf
configMap:
name: fluentdconf
In the preceding manifest file, we are creating an application named myapp with two containers under the development namespace.
The container names are tomcat and fluentd, and the latter is our primary focus here.
For fluentd to function properly, we need to pass a few values during container startup:
- FLUENT_UID - defines the user fluentd should start as
- FLUENT_CONF - the name of the configuration file
- FLUENTD_ARGS - the fully qualified path of the fluent.conf file, passed to fluentd with the -c flag
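A quick way to confirm these environment variables actually reached the sidecar is to print them from inside the running container; the pod name below is a placeholder:
# Print the fluentd startup variables from inside the sidecar container
kubectl exec -n development <pod-name> -c fluentd -- printenv FLUENT_UID FLUENT_CONF FLUENTD_ARGS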
Volumes and Volume Mounts
Another important item in our Deployment manifest YAML file is volumes and volumeMounts.
You can notice that we are creating three volumes, named applog, tomcatlog and fdconf:
- applog - an emptyDir volume to share the application log directory between Tomcat and FluentD
- tomcatlog - an emptyDir volume to share the Tomcat logs directory between Tomcat and FluentD
- fdconf - a ConfigMap volume to inject configuration data into the POD; in our case it is the fluent.conf file content stored in the fluentdconf ConfigMap
These volumes are mounted into the Tomcat and FluentD containers with the help of volumeMounts.
When a volume is mounted, it can be mounted to any path inside the container, similar to Unix mounts.
Just to keep things clean I am using the same path in both containers, but that is not strictly necessary.
What matters is that the applog and tomcatlog volumes are mounted at the locations specified in the fluent.conf file; the paths must match.
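Once the POD is up, a simple way to see the shared emptyDir in action is to list the same log directory from both containers; both should show identical files (the pod name below is a placeholder):
# The application log directory as Tomcat sees it
kubectl exec -n development <pod-name> -c tomcat -- ls -l /opt/tomcat/webapps/ROOT/WEB-INF/log/
# The same directory as the FluentD sidecar sees it, via the shared applog volume
kubectl exec -n development <pod-name> -c fluentd -- ls -l /opt/tomcat/webapps/ROOT/WEB-INF/log/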
Deploying our YAML for Kubernetes FluentD Sidecar setup
So far we have seen how the Kubernetes FluentD sidecar container is set up, along with the necessary elements like the ConfigMap and volumes.
Here I have collected all of this and compiled it into a single file, with one addition.
Just to make my application available externally, I have added a Service to the final YAML file, so I can deploy it right away to the Kubernetes cluster.
This is the complete file with ConfigMap + Service + Deployment:
---
apiVersion: v1
kind: ConfigMap
metadata:
name: fluentdconf
namespace: development
data:
fluent.conf: |
<source>
@type tail
time_key time
format multiline
format_firstline /^\[[[A-Z]|\s]+\]/
format1 /^\[(?<LogLevel>[[A-Z]|\s]+)\]\[(?<time>[0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}\:[0-9]{2}\:[0-9]{2}\,[0-9]{3})\]\s-\s(?<message>.*+)$/
path /opt/tomcat/webapps/ROOT/WEB-INF/log/myapp.log
pos_file /tmp/myapp.log.pos
tag dev.myapp.applogs
</source>
<filter dev.myapp.applogs>
@type record_transformer
<record>
hostname ${hostname}
</record>
</filter>
<match dev.myapp.applogs>
@type elasticsearch
host 172.99.1.218
port 59200
user elasticusr
password Pa$$w0rd
index_name fluentd.${tag}
logstash_format true
logstash_prefix dev.myapp.applogs
flush_interval 1s
</match>
<source>
@type tail
path /opt/tomcat/logs/localhost_access_log.%Y-%m-%d.txt
pos_file /tmp/tomcataccesslog.pos
tag dev.myapp.tomcatlogs
format apache2
</source>
<filter dev.myapp.tomcatlogs>
@type record_transformer
<record>
hostname ${hostname}
</record>
</filter>
<match dev.myapp.tomcatlogs>
@type elasticsearch
host 172.99.1.218
port 59200
user elasticusr
password Pa$$w0rd
logstash_format true
logstash_prefix dev.myapp.tomcatlogs
</match>
---
kind: Service
apiVersion: v1
metadata:
name: myapp-svc
namespace: development
labels:
app: myapp
spec:
selector:
app: myapp
type: NodePort
ports:
- port: 80
targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-dpl
namespace: development
labels:
app: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
environment: DEV
managedby: K8s
app: myapp
spec:
containers:
# Primary Container
- name: tomcat
image: saravak/tomcat8
volumeMounts:
- name: applog
mountPath: /opt/tomcat/webapps/ROOT/WEB-INF/log/
- name: tomcatlog
mountPath: /opt/tomcat/logs
ports:
- containerPort: 8080
resources:
limits:
memory: 1028Mi
# This is for FluentD Logging Container
- name: fluentd
env:
- name: FLUENT_UID
value: root
- name: FLUENT_CONF
value: fluent.conf
- name: FLUENTD_ARGS
value: -c /fluentd/etc/fluent.conf
image: saravak/fluentd:elastic
volumeMounts:
- name: applog
mountPath: /opt/tomcat/webapps/ROOT/WEB-INF/log/
- name: tomcatlog
mountPath: /opt/tomcat/logs
- name: fdconf
mountPath: /fluentd/etc/
# Log rotation sidecar container
- name: logrotate
image: saravak/logrotatek8s
volumeMounts:
- name: tomcatlog
mountPath: /logs/
volumes:
- name: applog
emptyDir: {}
- name: tomcatlog
emptyDir: {}
- name: fdconf
configMap:
name: fluentdconf
I save this entire configuration file as myapp-deployment.yml and create the deployment using kubectl:
kubectl create -f myapp-deployment.yml
You can also use:
kubectl apply -f myapp-deployment.yml
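Note that every object in this file targets the development namespace; if it does not exist in your cluster yet, the apply will fail, so create it first:
# Create the target namespace before applying the manifest (skip if it already exists)
kubectl create namespace development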
Validating Kubernetes FluentD and Tomcat Containers in POD
After creating the deployment and seeing the "deployment created" message in your terminal, give it a few seconds for the POD to be created and then check whether the pod is ready.
Check if the pod is created and running with 2 containers:
# kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
myapp-dpl-5f5bf998c7-m4p79   2/2     Running   0          128d
You can see the status is Running and both the fluentd and tomcat containers are ready.
If you see anything other than 2/2, it means there is an issue with container startup; one of the containers might have failed.
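If one of the containers fails to come up, the recent events for the namespace usually point to the reason (image pull errors, a missing ConfigMap, and so on):
# List recent events in chronological order to troubleshoot a failing container
kubectl get events -n development --sort-by=.metadata.creationTimestamp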
Container-wise status
If you have jq installed, you can parse the kubectl output and display a container-wise status, along with the container names, like this:
✗ kubectl get pods myapp-dpl-5f5bf998c7-m4p79 -o json|jq '.status.containerStatuses[] |{Name: .name, Status: .state} '
{
"Name": "fluentd",
"Status": {
"running": {
"startedAt": "2020-11-16T18:44:40Z"
}
}
}
{
"Name": "tomcat",
"Status": {
"running": {
"startedAt": "2020-11-16T18:44:40Z"
}
}
}
Describe pod to know more information
If you need more information about your POD, like the volume names and the volume mount names, you can use the following command:
kubectl describe pod myapp-dpl-5f5bf998c7-m4p79
Check the logs of the Kubernetes FluentD container for any errors
To make sure there are no connectivity issues and FluentD is able to connect to the remote Elasticsearch server, we need to check the logs of the fluentd container running inside the pod:
kubectl logs myapp-dpl-5f5bf998c7-m4p79 fluentd
Since there are two containers in the POD, you have to explicitly mention the container name after the POD name.
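While testing, it can also be handy to stream the sidecar logs live instead of dumping them once; the -c flag is the equivalent way of naming the container:
# Follow the fluentd container logs, starting from the last 50 lines
kubectl logs -f --tail=50 myapp-dpl-5f5bf998c7-m4p79 -c fluentd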
Validating the arrival of the logs at Elastic
Now you can log in to the Kibana dashboard and validate that the logs are coming in.
The hostname field on each event should match your POD name.
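If you have direct access to the Elasticsearch endpoint used in fluent.conf, you can also check from the command line that the indices are being created, assuming plain HTTP and that the host is reachable from your machine:
# List the indices created for the two logstash_prefix values from fluent.conf
curl -u elasticusr:'Pa$$w0rd' "http://172.99.1.218:59200/_cat/indices/dev.myapp.*?v"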
Conclusion
I hope this article helps you understand concepts like the sidecar container, and that our primary objective of collecting and shipping logs to Elasticsearch has been achieved.
Kubernetes FluentD (EFK) logging is really efficient and microservices-ready, and this approach would be helpful in various other microservice setups as well.
If you have any questions or tips to make this article better, please feel free to let me know in the comments.
Cheers
Sarav AK