How I configured ELK (Elasticsearch, Logstash, Kibana) central log management on Amazon Linux (EC2)

Well, my previous approach to deploying ELK was, for some reason, not working on Amazon Linux!

Here is what I did to make it work. Note that the versions used here matter: mixing in any other version may leave the components unable to work together!

First, download the packages from the official elasticsearch.org site:

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.1.noarch.rpm
wget https://download.elasticsearch.org/logstash/logstash/packages/centos/logstash-1.4.2-1_2c0f5a1.noarch.rpm
wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.0-beta3.tar.gz

Then install them, with a little cleanup afterwards:

sudo yum install elasticsearch-1.4.1.noarch.rpm -y
sudo yum install logstash-1.4.2-1_2c0f5a1.noarch.rpm -y
tar xvf kibana-4.0.0-beta3.tar.gz
sudo mkdir /opt/kibana/
sudo mv kibana-4.0.0-beta3/* /opt/kibana/
rm -rf kibana-4.0.0-beta3
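
To double-check that the right versions landed, you can query the RPM database:

rpm -q elasticsearch logstash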

Next, we need to configure each component:

Let's start with Elasticsearch (ES), the engine that stores and retrieves our data. We specify a cluster name and the host on which ES accepts traffic, and we whitelist the origin that is allowed to query it cross-origin (this is what lets Kibana reach ES from the browser).

sudo vi /etc/elasticsearch/elasticsearch.yml

Assumption: the local IP address is 192.168.1.2

cluster.name: myCompany
network.host: localhost
http.cors.enabled: true
http.cors.allow-origin: http://192.168.1.2
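
Later, once Elasticsearch is started, a quick sanity check against its default HTTP port (9200) should return a small JSON blob that includes the cluster name we just set:

curl http://localhost:9200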

The bulk of the configuration belongs to Logstash, since we need to specify how, what and where the logs are handled and processed. In this case we enable syslog logs to be received on port 5000. Additionally, I need to pull some S3 logs (load balancer logs).

A Logstash configuration has three main parts: input, filter and output. In our case the input comes from S3 and rsyslog. The filter section needs a pattern or module to analyze the received data (grok is the best fit here, though a ready-made module may exist by the time you read this post). The last part, output, points to ES, where the results are stored. Note that the file uses Logstash's own config syntax rather than YAML, so I give it a .conf extension.

sudo vi /etc/logstash/conf.d/logstash.conf
    input {
        s3 {
            type => "loadbalancer"
            bucket => "myS3bucket"
            credentials => ["AKIAJAAAAAARL3SPBLFA", "Qx9gaaaa/59CMmPsCAAAAI7Hs8di7Eaaaar9SZo1"]
            region_endpoint => "ap-southeast-1"
        }
        syslog {
            type => "syslog"
            port => 5000
        }
    }
    filter {
        if [severity_label] == "Informational" { drop {} }
        if [facility_label] == "user-level" { drop {} }
        if [program] == "anacron" and [severity_label] == "Notice" { drop {} }
        if [type] == "loadbalancer" {
            grok {
                match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} %{IP:backend_ip}:%{NUMBER:backend_port:int} %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} %{NUMBER:elb_status_code:int} %{NUMBER:backend_status_code:int} %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} %{QS:request}" ]
            }
            date {
                match => [ "timestamp", "ISO8601" ]
            }
        }
        if [type] == "syslog" {
            grok {
                match => [ "message", "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{PROG:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
                add_field => [ "received_at", "%{@timestamp}" ]
                add_field => [ "received_from", "%{host}" ]
            }
            syslog_pri { }

            date {
                match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
            }

            if "_grokparsefailure" not in [tags] {
                mutate {
                    replace => [ "host", "%{syslog_hostname}" ]
                    replace => [ "message", "%{syslog_message}" ]
                }
            }
            mutate {
                remove_field => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
            }
        }
    }
    output {
        elasticsearch {
            host => "localhost"
            protocol => "http"
        }
    }
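
Before starting Logstash it is worth validating the syntax. Assuming the 1.4 RPM layout (binary under /opt/logstash), something like:

sudo /opt/logstash/bin/logstash agent --configtest -f /etc/logstash/conf.d/logstash.conf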

The last component to configure is Kibana. We define the host and port it should listen on:

sudo vi /opt/kibana/config/kibana.yml
port: 8080
host: "192.168.1.2"

And finally, running everything. Kibana 4 beta ships as a plain tarball with no init script, so I start it inside a detached screen session:

sudo service elasticsearch start
sudo service logstash start
screen -d -m /opt/kibana/bin/kibana
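
To make Elasticsearch and Logstash come back after a reboot, register their init scripts with chkconfig (Kibana has none in this beta, which is why it runs under screen):

sudo chkconfig elasticsearch on
sudo chkconfig logstash on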

—————

Please make sure rsyslog on your other servers is configured to forward their logs to Logstash, which is as simple as the following:

sudo vi /etc/rsyslog.conf

Add the following line (the double @@ forwards over TCP; a single @ would use UDP):

*.* @@logs.internal.albaloo.com:5000

And restart the syslog service:

sudo service rsyslog restart
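
To verify the whole pipeline end to end, emit a test message from one of those servers and search for it in Kibana. Note that our filter drops informational and user-level messages, so pick a facility/severity that passes, e.g.:

logger -p daemon.warning -t elktest "hello from $(hostname)"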