Jurgens du Toit

jrgns

Everything's Connected:
Hacking the interwebs
with Logstash

Data Generators

So many things!

Too many things?

Applications

  • Email
  • RSS
  • Twitter
  • IRC
  • Chat

Servers

  • Apache / Nginx
  • HAProxy
  • MySQL
  • Postgres
  • MongoDB
  • Elasticsearch

Monitoring

  • CollectD
  • Monit
  • Nagios
  • Windows Events

Messages

  • RabbitMQ
  • ZeroMQ
  • XMPP
  • SQS

And then...

  • Syslog
  • log4j
  • Accounting Packets
  • Application Logs
  • Error Logs
  • OS Logs
  • Server Metrics
  • Network Metrics
  • Build Logs
  • Deployment Logs
  • Commit History
  • Web Analytics
  • Account History
  • Support History
  • Usage History
  • Financials

The Solution!

Um.... No

More than a drop in the bucket

Stash what?

  • Elastic.co
  • Elasticsearch
  • Logstash
  • Kibana

The ELK Stack

Two Main Use Cases

  1. Full Document Search
  2. Time Series Data

Logstash

  • Great Standalone
  • Great Connector

Think of it as Server Side IFTTT


If

This


Then

That


input {
  stdin {}
}

output {
  stdout {}
}
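
A slightly fuller sketch of the same pipeline, showing where the "if this" logic would sit between input and output. The `mutate` filter and the `source` field are illustrative assumptions, not part of the original slide.

```
input {
  stdin {}
}

filter {
  # Hypothetical field, added only to show where filters go
  mutate {
    add_field => { "source" => "keyboard" }
  }
}

output {
  # rubydebug prints the whole structured event, not just the raw line
  stdout { codec => rubydebug }
}
```

Saved as, say, `ifttt.conf` (a made-up filename), this should run with `bin/logstash -f ifttt.conf`; every line typed on stdin comes back as a structured event.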
          

In


collectd       heroku         rackspace  tcp
drupal_dblog   imap           redis      twitter
elasticsearch  invalid_input  relp       udp
eventlog       irc            s3         unix
exec           jmx            snmptrap   varnishlog
file           log4j          sqlite     websocket
ganglia        lumberjack     sqs        wmi
gelf           pipe           stdin      xmpp
gemfire        puppet_facter  stomp      zenoss
generator      rabbitmq       syslog     zeromq
graphite
        

Out


boundary         gemfire               mongodb      sns
circonus         google_bigquery       nagios       solr_http
cloudwatch       google_cloud_storage  nagios_nsca  sqs
csv              graphite              null         statsd
datadog          graphtastic           opentsdb     stdout
datadog_metrics  hipchat               pagerduty    stomp
elasticsearch    http                  pipe         syslog
els_http         irc                   rabbitmq     tcp
els_river        jira                  rackspace    udp
email            juggernaut            redis        websocket
exec             librato               redmine      xmpp
file             loggly                riak         zabbix
ganglia          lumberjack            riemann      zeromq
gelf             metriccatcher         s3
        

Combinations

  • IMAP → http (Twilio)
  • Twitter → PagerDuty
  • IRC → Redmine
  • eventlog → null
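
As a rough sketch, the Twitter → PagerDuty combination could look something like this. All credentials and the keyword are placeholders, and the option names should be checked against the plugin docs for your Logstash version.

```
input {
  twitter {
    # Placeholder credentials for a Twitter API application
    consumer_key       => "CONSUMER_KEY"
    consumer_secret    => "CONSUMER_SECRET"
    oauth_token        => "OAUTH_TOKEN"
    oauth_token_secret => "OAUTH_TOKEN_SECRET"
    # Hypothetical search phrase to watch for
    keywords           => [ "mysite down" ]
  }
}

output {
  pagerduty {
    # Placeholder integration key for the PagerDuty service
    service_key => "PAGERDUTY_SERVICE_KEY"
    description => "Tweet: %{message}"
  }
}
```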

Rewrite FNB InContact
IMAP to HTTP


input {
  imap {
    host => "imap.gmail.com"
    user => "myemail@gmail.com"
    password => "supersecret"
    folder => "InContact"
  }
}
          

filter {
  grok {
    # "account" is an assumed name for the six-digit account number capture
    match => [
      "subject",
      "FNB :-?\) R%{NUMBER:amount:float} %{DATA:action} (@ %{DATA:reference} )?from %{WORD:from_type} a/c..(?<account>[0-9]{6}) ((using|@) )?%{DATA:method}\. [0-9]{2}[A-Z][a-z]{2}"
    ]
  }

  if "_grokparsefailure" not in [tags] {
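    mutate {
      add_field => {
        "To"   => "+27821234567"
        "From" => "+12051234567"
        "Body" => "You spent R %{amount} @ %{reference} :("
      }
      remove_field => [ "@timestamp", "@version" ]
    }
    # Keep only the fields Twilio needs; prune drops everything not whitelisted
    prune {
      whitelist_names => [ "^(To|From|Body)$" ]
    }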
  }
}
          

output {
  http {
    url => "https://api.twilio.com/2010-04-01/Accounts/ACabafjrSf4FdzsdRg5Sf3deasd3Dadqw0/Messages.json"
    http_method => "post"
    headers => { "Authorization" => "Basic SomeAUTHToken==" }
    format => "form"
  }
}
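
One possible variation, sketched under assumptions: as written, the output would POST every email, including ones whose subject did not match the grok pattern. Guarding the output with the same tag check, and using the http output's `mapping` option to pick the form fields, might look like this. `<ACCOUNT_SID>` and `<BASE64_CREDENTIALS>` are placeholders.

```
output {
  # Only send an SMS for events whose subject matched the grok pattern
  if "_grokparsefailure" not in [tags] {
    http {
      url => "https://api.twilio.com/2010-04-01/Accounts/<ACCOUNT_SID>/Messages.json"
      http_method => "post"
      headers => { "Authorization" => "Basic <BASE64_CREDENTIALS>" }
      format => "form"
      # Send just these three form fields, whatever else is on the event
      mapping => {
        "To"   => "%{To}"
        "From" => "%{From}"
        "Body" => "%{Body}"
      }
    }
  }
}
```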
          

Demo Time...

How about solving them...

Use Cases

  • Modernize old APIs
    HTTP in → RabbitMQ out
  • Fix stupid design decisions
    IMAP in → Elasticsearch out
  • Communication backup
    IRC / XMPP / IMAP in → S3 out
  • Lead Generation TODO
    HTTP → HTTP (Trello)
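
To make one of these concrete, here is a minimal sketch of the "IMAP in → Elasticsearch out" case. The mail host, account, Elasticsearch address, and index name are all assumptions; older Logstash releases use `host` rather than `hosts` on the elasticsearch output.

```
input {
  imap {
    # Placeholder mailbox to pull messages from
    host     => "imap.example.com"
    user     => "archive@example.com"
    password => "supersecret"
  }
}

output {
  elasticsearch {
    # Assumed local Elasticsearch node
    hosts => [ "localhost:9200" ]
    # Hypothetical daily index, so mail becomes searchable by date
    index => "mail-%{+YYYY.MM.dd}"
  }
}
```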

Questions?

Thanx!

Check Out: