Visualising logs in Kibana

In previous posts (here and here) I discussed the installation of ELK (Elasticsearch, Logstash, Kibana).

I have been using this system happily, because I no longer have to dig through the raw logs on different servers. I can browse, search and sort log records from all the servers in one place.

The problem arises when the number of log records hits the millions. In my case I had to deal with more than 30 million log records each week, which starts to become a disaster after a couple of weeks.

The solution is to work out exactly what you need to find in the logs and use Kibana's features to get it. In this post I cover two scenarios. One is to find crawlers! The other is to chart daily traffic (daily visitor counts).

So here we go. First you need a basic working Kibana setup, meaning at least the Discover tab should be showing data.

To create the daily-traffic visualisation, go to the Visualise tab, create a visualisation from a new search, and select the vertical bar chart.

Now you need to add the metrics. For the Y axis choose the “Unique Count” aggregation with “client_ip” as the field.
Then add an X axis with the “Date Histogram” aggregation, choose the timestamp field and select an interval (e.g. Daily). Click Apply and save it if the result looks right.

[Figure: visitor count]
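
Under the hood this chart is roughly an Elasticsearch aggregation, so you can sanity-check the numbers from the command line. A minimal sketch, assuming the default logstash-* index pattern and the @timestamp and client_ip fields (adjust to your own mapping, e.g. a not_analyzed client_ip.raw sub-field if you have one):

# daily unique visitors: one date_histogram bucket per day, cardinality of client_ip inside it
curl -s 'http://localhost:9200/logstash-*/_search?search_type=count&pretty' -d '{
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "@timestamp", "interval": "day" },
      "aggs": { "unique_visitors": { "cardinality": { "field": "client_ip" } } }
    }
  }
}'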

To create the crawler-finding visualisation, go to the Visualise tab, create a visualisation from a new search, and this time select the data table.

Select the “Count” aggregation as the metric. Then add a “Split Rows” bucket: choose “Terms” as the aggregation, “client_ip” as the field, and set the number of results (Size) and the sorting option (Order). IP addresses with an abnormally high request count are usually crawlers or other bots.

[Figure: crawlers]
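
The equivalent query, if you prefer to check it outside Kibana, is a terms aggregation ordered by document count. Again a sketch assuming the logstash-* index pattern and the client_ip field:

# top talkers: request count per client_ip, highest first
curl -s 'http://localhost:9200/logstash-*/_search?search_type=count&pretty' -d '{
  "aggs": {
    "top_clients": {
      "terms": { "field": "client_ip", "size": 20, "order": { "_count": "desc" } }
    }
  }
}'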


How I configured ELK (Elasticsearch, Logstash, Kibana) central log management on Amazon Linux (EC2)

Well, my previous ELK deployment was not working on Amazon Linux for some reason!

Here is what I did to make it work. Note that the versions used here matter: using any other version might result in the components not working together!

First, download these packages from the official elasticsearch.org site:

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.1.noarch.rpm
wget https://download.elasticsearch.org/logstash/logstash/packages/centos/logstash-1.4.2-1_2c0f5a1.noarch.rpm
wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.0-beta3.tar.gz

Then install them and do a bit of cleaning up:

sudo yum install elasticsearch-1.4.1.noarch.rpm -y
sudo yum install logstash-1.4.2-1_2c0f5a1.noarch.rpm -y
tar xvf kibana-4.0.0-beta3.tar.gz
sudo mkdir /opt/kibana/
sudo mv kibana-4.0.0-beta3/* /opt/kibana/
rm -rf kibana-4.0.0-beta3

Next, we need to configure each component:

Let's start with Elasticsearch (ES), the engine that stores and retrieves our data. We specify a name for the cluster and the host on which ES will accept traffic. We also define the allowed CORS origins.

sudo vi /etc/elasticsearch/elasticsearch.yml

Assumption: the local IP address is 192.168.1.2

cluster.name: myCompany
network.host: localhost
http.cors.enabled: true
http.cors.allow-origin: http://192.168.1.2

The main configuration effort goes into Logstash, since we need to specify how, what and where the logs are handled and processed. In this case we enable syslog logs to be received on port 5000. Additionally, I need to pull in some S3 logs (load balancer logs).

Basically a Logstash configuration has three main parts: input, filter and output. In this case the input is S3 and rsyslog. The filter section requires a pattern or plugin to analyse the received data (grok is the best fit for our case, although there might be a ready-made plugin by the time you read this post). The last part is the output, which is set to ES (the results are sent to ES to be stored).

sudo vi /etc/logstash/conf.d/logstash.yml

    input {
        s3 {
            type => "loadbalancer"
            bucket => "myS3bucket"
            credentials => ["AKIAJAAAAAARL3SPBLFA", "Qx9gaaaa/59CMmPsCAAAAI7Hs8di7Eaaaar9SZo1"]
            region_endpoint => "ap-southeast-1"
        }
        syslog {
            type => syslog
            port => 5000
        }
    }
    filter {
        if [severity_label] == "Informational" {drop {}}
        if [facility_label] == "user-level" {drop {}}
        if [program] == "anacron" and [severity_label] == "notice" {drop {}}
        if [type] == "loadbalancer" {
            grok {
                match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} %{IP:backend_ip}:%{NUMBER:backend_port:int} %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} %{NUMBER:elb_status_code:int} %{NUMBER:backend_status_code:int} %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} %{QS:request}" ]
            }
            date {
                match => [ "timestamp", "ISO8601" ]
            }
        }
        if [type] == "syslog" {
            # parse the raw syslog line; in the 1.4.x event format the original
            # source and text live in the "host" and "message" fields
            grok {
                match => [ "message", "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{PROG:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
                add_field => [ "received_at", "%{@timestamp}" ]
                add_field => [ "received_from", "%{host}" ]
            }
            syslog_pri { }

            date {
                match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
            }

            # only rewrite host/message when grok actually matched
            if "_grokparsefailure" not in [tags] {
                mutate {
                    replace => [ "host", "%{syslog_hostname}" ]
                    replace => [ "message", "%{syslog_message}" ]
                }
            }
            mutate {
                remove_field => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
            }
        }
    }
    output {
        elasticsearch {
            host => "localhost"
            protocol => "http"
        }
    }
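
Before starting the service it may be worth validating this file; a quick check, assuming the RPM put the Logstash binary under /opt/logstash (its usual install location):

# parse the config without actually starting any pipeline
sudo /opt/logstash/bin/logstash agent --configtest -f /etc/logstash/conf.d/logstash.yml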

And finally Kibana. We define the address and port number we want Kibana to listen on:

sudo vi /opt/kibana/config/kibana.yml

port: 8080
host: "192.168.1.2"

Then start everything:

sudo service elasticsearch start
sudo service logstash start
screen -d -m /opt/kibana/bin/kibana
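
A quick sanity check that the stack is actually up, for example:

# Elasticsearch should answer on 9200, and Logstash should start creating daily logstash-* indices
curl -s http://localhost:9200
curl -s 'http://localhost:9200/_cat/indices?v'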

—————

Please make sure rsyslog on the other servers is configured to send their logs to Logstash, which is simply done as follows:

sudo vi /etc/rsyslog.conf

Add the following:

*.* @@logs.internal.albaloo.com:5000

And restart the syslog service:

sudo service rsyslog restart
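
To confirm the forwarding works end to end, you can emit a test message on one of the clients and then look for it in Kibana's Discover tab, for example:

# send a test line through the local rsyslog -> Logstash -> Elasticsearch chain
logger "ELK test message from $(hostname)"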

Deploying and Configuring ELK (Elasticsearch, Logstash, Kibana)

It gave me a headache to make the combination of Elasticsearch, Logstash, Kibana and logstash-forwarder work together properly. The main problems I faced were compiling the Go code for logstash-forwarder and creating the x509v3 self-signed certificate for Logstash.

You will need to get the following files:

elasticsearch-1.4.2.tar.gz
logstash-1.5.0.beta1.tar.gz
kibana-4.0.0-beta3.tar.gz

Use “tar xvf file.tar.gz” to extract them.

Elasticsearch and Kibana ship with config files, so we just need to edit them; for Logstash, create a config/logstash.yml file inside its directory.

mkdir logstash-1.5.0.beta1/config/
touch logstash-1.5.0.beta1/config/logstash.yml

Edit all these files according to the contents you will find in Appendices 1, 2 and 3.

vi logstash-1.5.0.beta1/config/logstash.yml
vi elasticsearch-1.4.2/config/elasticsearch.yml
vi kibana-4.0.0-beta3/config/kibana.yml

Then we need to create a certificate and private key for logstash:

mkdir cert
cd cert/
touch ssl.conf    # fill ssl.conf with the content of Appendix 5
openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout server.key -out server.crt -config ssl.conf -days 1825
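
Since the x509v3 extensions are exactly what logstash-forwarder is picky about, it is worth inspecting the generated certificate before distributing it, e.g.:

# confirm the subjectAltName entries and the validity period made it into the certificate
openssl x509 -in server.crt -noout -text | grep -A 1 'Subject Alternative Name'
openssl x509 -in server.crt -noout -dates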

And finally, run them:

elasticsearch-1.4.2/bin/elasticsearch
kibana-4.0.0-beta3/bin/kibana
logstash-1.5.0.beta1/bin/logstash agent -f ~/logstash-1.5.0.beta1/config/logstash.yml

Deploying logstash-forwarder:

To deploy logstash-forwarder we need to install Go and the fpm gem, since we are basically building an RPM (or DEB) installer package.

yum install golang ruby ruby-devel rubygems
git clone https://github.com/elasticsearch/logstash-forwarder.git
cd logstash-forwarder
go build
gem install fpm
make rpm
sudo rpm -ivh logstash-forwarder-*.x86_64.rpm
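
If you want to double-check what the package installed and where the binary ended up, something like this should do:

# list the installed files; the binary lands under /opt/logstash-forwarder/bin/
rpm -ql logstash-forwarder | grep bin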

Once it is installed we need to deal with keys:

sudo cp server.key /usr/local/etc/logstash-forwarder/server.key
sudo cp server.crt /usr/local/etc/logstash-forwarder/server.crt
sudo openssl x509 -in server.crt -text >> /etc/pki/tls/certs/ca-bundle.crt

To configure logstash-forwarder, create a file and copy in the contents of Appendix 4. We used the file /etc/logstash_forwarder.yml.

And then run it:

/opt/logstash-forwarder/bin/logstash-forwarder -config /etc/logstash_forwarder.yml

Appendix 1: Elasticsearch configuration to be added

script.disable_dynamic: true
network.host: localhost
http.cors.allow-origin: "/.*/"
http.cors.enabled: true

Appendix 2: Logstash configuration

input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "~/cert/server.crt"
    ssl_key => "~/cert/server.key"
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}

Appendix 3: Kibana configuration to be edited

port: 8080
host: "10.0.20.7"
elasticsearch: "http://localhost:9200"

Appendix 4: Logstash-forwarder configuration

{
    "network": {
        "servers": [ "10.0.0.1:5000" ],
        "ssl certificate": "/usr/local/etc/logstash-forwarder/server.crt",
        "ssl key": "/usr/local/etc/logstash-forwarder/server.key",
        "timeout": 15
    },
    
    "files": [
        {
            "paths": [
                "/var/log/syslog",
                "/var/log/auth.log"
            ],
            "fields": { "type": "syslog" }
        },
        {
            "paths": [
                "/var/log/httpd/*.log"
            ],
            "fields": { "type": "apache" }
        }
    ]
    
}

Appendix 5: OpenSSL configuration file for creating the certificate:

[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no

[req_distinguished_name]
C = TG
ST = Togo
L =  Lome
O = Private company
CN = *

[v3_req]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer
basicConstraints = CA:TRUE
subjectAltName = @alt_names

[alt_names]
DNS.1 = *
DNS.2 = *.*
DNS.3 = *.*.*
DNS.4 = *.*.*.*
DNS.5 = *.*.*.*.*
DNS.6 = *.*.*.*.*.*
DNS.7 = *.*.*.*.*.*.*
IP.1 = 10.0.0.1
IP.2 = 10.0.0.2
IP.3 = 127.0.0.1

Appendix 6: Alternatives for /etc/pki/tls/certs/ca-bundle.crt

/etc/ssl/certs/ca-certificates.crt
/etc/pki/tls/certs/ca-bundle.crt
/etc/ssl/ca-bundle.pem
/etc/ssl/cert.pem
/usr/local/share/certs/ca-root-nss.crt
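
If you are not sure which of these bundles your distribution actually uses, a small sketch like this will show which ones exist on the machine:

# print whichever of the known CA bundle locations are present on this system
for f in /etc/pki/tls/certs/ca-bundle.crt \
         /etc/ssl/certs/ca-certificates.crt \
         /etc/ssl/ca-bundle.pem \
         /etc/ssl/cert.pem \
         /usr/local/share/certs/ca-root-nss.crt; do
    [ -f "$f" ] && echo "found: $f"
done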