Logstash and WSO2 Carbon Logs, dealing with Java Stack Traces

So, now that I’ve got my WSO2 cluster set up, I get to diagnose issues. The biggest problem is that tracking down a cause means looking at half a dozen log files spread across half a dozen machines. Centralized logging is the solution, of course.

I prefer Logstash and Kibana because they’re free and very configurable. They are surprisingly easy to set up and, once you find a few tools (such as the grok debugger), easy to configure. The catch is that Logstash treats each log line as a separate event. That works for most logs, which are only a single line per entry, but WSO2 has the nasty habit of dumping Java stack traces into its log all the time. Luckily, Logstash has the multiline filter to help with that. Configuring multiline is a bit of a pain, so here is the config I’m using.

input {
 syslog {
  port => 514
  type => "syslog"
 }
}

To make life easier, I just use rsyslog for everything. One thing I didn’t realize at first is that the syslog input automatically applies its own syslog grok and truncates the message.

filter {
    if "_grokparsefailure" in [tags] {
        grok {
            type => "syslog"
            match => ["message", "%{SYSLOG5424PRI}%{TIMESTAMP_ISO8601} +(?:%{HOSTNAME:syslog5424_host}|-) %{SYSLOGPROG}%{GREEDYDATA:messagebodysyslog}"]
            match => ["message", "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP} +(?:%{HOSTNAME:syslog5424_host}|-) %{SYSLOGPROG}%{GREEDYDATA:messagebodysyslog}"]
            remove_tag => ["_grokparsefailure"]
        }
        if "_grokparsefailure" not in [tags] {
            mutate {
                replace => ["message","%{messagebodysyslog}"]
                remove_field => ["messagebodysyslog"]
            }
        }
    }
}

This section just parses the syslog portion out of anything not caught by the syslog input’s own grok. Notice the two matches: for some reason, some of my rsyslog messages come in with one time format and others with a different one, sometimes even from the same machine.
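
To see why two match lines are needed, here is a rough Python sketch of the same "try one timestamp format, then the other" idea. The regexes are simplified stand-ins for the grok patterns (the real SYSLOG5424PRI, TIMESTAMP_ISO8601, SYSLOGTIMESTAMP and SYSLOGPROG definitions are broader), and the host name, program name and sample lines are made up for illustration.

```python
import re

# Simplified stand-ins for the grok patterns used in the filter.
PRI = r"<(?P<pri>\d{1,3})>"
ISO_TS = r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
BSD_TS = r"[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}"
REST = (
    r" +(?:(?P<host>[0-9A-Za-z][\w.-]*)|-) "
    r"(?P<program>[\w./-]+)(?:\[(?P<pid>\d+)\])?"
    r"(?P<body>.*)"
)

# Tried in order, like the two match lines in the grok filter.
PATTERNS = [re.compile(PRI + ISO_TS + REST), re.compile(PRI + BSD_TS + REST)]


def parse_syslog(line):
    """Return the captured fields from the first pattern that matches, else None."""
    for pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            return match.groupdict()
    return None


# Hypothetical sample lines, one per timestamp format.
iso_line = "<13>2014-03-05T10:15:30+00:00 esb01 wso2carbon[4242]: started"
bsd_line = "<13>Mar  5 10:15:30 esb01 wso2carbon[4242]: started"

iso_fields = parse_syslog(iso_line)
bsd_fields = parse_syslog(bsd_line)
```

Both lines end up with the same host, program and body fields, which is all the later filters care about.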

filter {
    if "wso" in [program] and "multiline" not in [tags] {
        grok {
            match => [ "message", "TID\: \[%{INT}\] \[%{WORD:product}\] \[%{TIMESTAMP_ISO8601:logdate}\] +%{LOGLEVEL:level} \{%{DATA:classname}\} - %{GREEDYDATA:messagebody22}"]
            remove_tag => ["_grokparsefailure"]
        }
        if "_grokparsefailure" not in [tags] {
            mutate {
                replace => ["message","%{messagebody22}"]
                remove_field => ["messagebody22"]
            }
        }
    }
}

Here is the WSO2 parser. It catches almost every variety of WSO2 message, except stack traces of course.
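
For anyone without the grok debugger handy, the following Python sketch approximates what that grok pattern pulls out of a Carbon log line. The regex is a hand-simplified translation, not the exact grok definitions, and the sample line and class name are invented.

```python
import re

# Rough equivalent of the grok match above; each named group mirrors a grok field.
WSO2_LINE = re.compile(
    r"TID: \[[+-]?\d+\] "                                             # %{INT}
    r"\[(?P<product>\w+)\] "                                          # %{WORD:product}
    r"\[(?P<logdate>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] +"   # timestamp
    r"(?P<level>[A-Za-z]+) "                                          # %{LOGLEVEL:level}
    r"\{(?P<classname>.*?)\} - "                                      # %{DATA:classname}
    r"(?P<messagebody>.*)"                                            # %{GREEDYDATA}
)

# Hypothetical Carbon log line and a stack trace line.
log_line = (
    "TID: [0] [ESB] [2014-03-05 10:15:30,123] ERROR "
    "{org.example.SampleMediator} -  Unexpected error"
)
trace_line = "\tat org.example.SampleMediator.mediate(SampleMediator.java:42)"

log_match = WSO2_LINE.search(log_line)
fields = log_match.groupdict() if log_match else None
trace_match = WSO2_LINE.search(trace_line)
```

The stack trace line has no TID prefix, so it does not match, which is exactly why those lines need separate handling.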

filter {
    if "wso2" in [program] {
        multiline {
            pattern => "(Uncaught exception.+)|(([^\s]+)Exception.+)|(([\s]+)at.+\))|(.+\.Exception)"
            stream_identity => "%{logsource}.%{@type}"
            what    =>"previous"
        }
    }
}

This handles the stack traces. It only applies to wso2 messages, and any line that matches the pattern regex is merged into the previous event.
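
As a quick sanity check on the regex, here is a small Python sketch that mimics the merge-into-previous behavior. It uses the same pattern string, but the merging loop is only an approximation of what the multiline filter does (it ignores stream_identity entirely), and the sample lines are invented.

```python
import re

# Same regex as the multiline filter's pattern option.
PATTERN = re.compile(
    r"(Uncaught exception.+)|(([^\s]+)Exception.+)|(([\s]+)at.+\))|(.+\.Exception)"
)


def merge_previous(lines):
    """Fold every matching line into the event before it, like what => "previous"."""
    events = []
    for line in lines:
        if events and PATTERN.search(line):
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Hypothetical log lines: one error with a stack trace, then a normal message.
sample = [
    "Error while invoking the mediator",
    "java.lang.NullPointerException: something was null",
    "\tat org.example.Foo.bar(Foo.java:42)",
    "\tat org.example.Main.main(Main.java:7)",
    "Server started",
]

events = merge_previous(sample)
```

The five input lines collapse into two events. One thing to watch for: any ordinary message that happens to contain "Exception" followed by more text will also be glued onto the previous event.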

filter {
    if "wso2" in [program] {
        mutate {
            replace => ["type","wso2_carbon"]
            add_field => ["logsource","%{syslog5424_host}"]
            remove_field => ["@originalmessage"]
            remove_tag   => ["_grokparsefailure"]
        }
    }
}

Clean up the messages: set the type, copy the syslog host into logsource, and drop the leftover fields and tags.

output {
 elasticsearch_http {
  host => "10.21.3.48"
 }
}

Output to Elasticsearch.
