Writing An Effective GROK Pattern - DEV Community
Grok is one of the most popular Logstash filters. It parses unstructured log data into a structured, meaningful format.
Logstash ships with about 120 built-in patterns. You can find them here: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
Some of the patterns are also listed at https://github.com/hpcugent/logstash-patterns/blob/master/files/grok-patterns. I personally prefer this second link when constructing a grok pattern.
Now, there may be cases where these built-in patterns won't fit. For those, grok can be combined with regular expressions written for Oniguruma, the regular expression library grok sits on top of, to create powerful patterns.
Grok Syntax
%{SYNTAX:SEMANTIC}
- SYNTAX is the name of the grok pattern that matches the text
- SEMANTIC is the key under which the matched text is stored
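To make the syntax concrete, here is a minimal Python sketch of what happens to a %{SYNTAX:SEMANTIC} expression: each one is swapped for a named capture group. This is not how Logstash itself is implemented. The `PATTERNS` table and the `expand` helper are my own illustrative names, the three definitions are paraphrased from the built-in pattern files, and Python's `re` module stands in for Oniguruma (so named groups are spelled `(?P<name>...)`).

```python
import re

# Paraphrased stand-ins for a few built-in grok patterns (illustration only).
PATTERNS = {
    "WORD": r"\b\w+\b",
    "POSINT": r"\b[1-9][0-9]*\b",
    "DATA": r".*?",
}


def expand(grok_expression: str) -> str:
    """Replace each %{SYNTAX:SEMANTIC} with a Python named capture group."""
    def to_group(match: re.Match) -> str:
        syntax, semantic = match.group(1), match.group(2)
        return f"(?P<{semantic}>{PATTERNS[syntax]})"

    return re.sub(r"%\{(\w+):(\w+)\}", to_group, grok_expression)


regex = expand("value=%{POSINT:value}")
result = re.search(regex, "name=notifications.received, value=2")

print(regex)               # value=(?P<value>\b[1-9][0-9]*\b)
print(result.groupdict())  # {'value': '2'}
```

SYNTAX picks the regex and SEMANTIC becomes the key in the result, which is all the syntax really says.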
Oniguruma Syntax
(?<field_name>regex pattern)
- field_name is the key
- regex pattern is the placeholder to add your regex
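A named group works the same way in most regex engines, so you can try the idea outside Logstash. The sketch below uses Python's `re` module rather than Oniguruma, which means the group is written `(?P<field_name>...)` instead of `(?<field_name>...)`. The regex `[a-z.]+` is just an example I picked for the name field, not something taken from the built-in patterns.

```python
import re

# Oniguruma: name=(?<name>[a-z.]+)
# Python's re module needs the extra "P": name=(?P<name>[a-z.]+)
pattern = re.compile(r"name=(?P<name>[a-z.]+)")

match = pattern.search("type=GAUGE, name=notifications.received, value=2")
print(match.groupdict())  # {'name': 'notifications.received'}
```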
How to use?
Let's try to create a pattern to parse unstructured log data.
Sample Log Data
09:33:45,416 (metrics-logger-reporter-1-thread-1) type=GAUGE, name=notifications.received, value=2
Required fields from log data
| Field | Field Value |
|---|---|
| timestamp | 09:33:45,416 |
| logthread | metrics-logger-reporter-1-thread-1 |
| type | GAUGE |
| name | notifications.received |
| value | 2 |
Grok Pattern
We will use Grok Debugger to test our pattern to match the log data.
Let's break the log data down and pick a pattern that matches each field:
| Field | Pattern |
|---|---|
| timestamp | %{TIME} |
| type | %{DATA} |
| name | %{DATA} |
| value | %{POSINT} |
The field logthread is a combination of alphanumeric characters and hyphens.
So, we use Oniguruma to match the field logthread. Following the Oniguruma syntax, we need to create a regex pattern that matches the value of the field logthread.
Constructing Regex Pattern
We now use a regex checker to construct and test the regex pattern for the value of the field logthread.

In the non-capturing group (?:[()a-zA-Z\d-]+), the character class [()a-zA-Z\d-] matches a single character from the list below, and + repeats it:
- + is a greedy quantifier, i.e. it matches the previous token between one and unlimited times, as many times as possible
- () matches either of the characters ( and )
- a-z matches a single character in the range between a and z
- A-Z matches a single character in the range between A and Z
- \d matches a digit
- - matches the character -
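If you don't have a regex checker at hand, a few lines of Python can test the same character class. This is only a sketch: Python's `re` module is not Oniguruma, so the named group is written `(?P<logthread>...)`, and the `LOGTHREAD` variable name is mine. It also shows why the literal parentheses around the group matter: the class accepts digits, so on its own it stops at the first thing that fits, which is the hour of the timestamp.

```python
import re

# Python spelling of the Oniguruma group (?<logthread>(?:[()a-zA-Z\d-]+)).
LOGTHREAD = re.compile(r"(?P<logthread>(?:[()a-zA-Z\d-]+))")

line = "09:33:45,416 (metrics-logger-reporter-1-thread-1) type=GAUGE"

# Unanchored, the class matches the first run of allowed characters.
unanchored = LOGTHREAD.search(line)
print(unanchored.group("logthread"))  # 09

# Anchored between literal parentheses, it captures the thread name.
match = re.search(r"\(" + LOGTHREAD.pattern + r"\)", line)
print(match.group("logthread"))  # metrics-logger-reporter-1-thread-1
```

Because + is greedy and the class also contains ), the engine first grabs the closing parenthesis and then backtracks one character so that the literal \) can match.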
Oniguruma
The final Oniguruma pattern for the field logthread:
(?<logthread>(?:[()a-zA-Z\d-]+))
Grok Pattern + Oniguruma (Final Pattern)
The final pattern that will match the log data:
%{TIME:timestamp} \((?<logthread>(?:[()a-zA-Z\d-]+))\) type=%{DATA:type}, name=%{DATA:name}, value=%{POSINT:value}
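For a quick sanity check without a grok debugger, the final pattern can be approximated in Python. Treat this as a rough sketch under a few assumptions: the `PATTERNS` definitions are simplified versions of TIME, DATA and POSINT that I wrote for this example, `to_python_regex` is a hypothetical helper, and Python's `re` replaces Oniguruma. Unlike a real grok run, it does not emit the nested HOUR, MINUTE and SECOND captures.

```python
import re

# Simplified stand-ins for the built-in patterns used by the final pattern.
PATTERNS = {
    "TIME": r"(?:2[0-3]|[01]?[0-9]):[0-5][0-9]:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?",
    "DATA": r".*?",
    "POSINT": r"\b[1-9][0-9]*\b",
}

GROK = (
    r"%{TIME:timestamp} \((?<logthread>(?:[()a-zA-Z\d-]+))\) "
    r"type=%{DATA:type}, name=%{DATA:name}, value=%{POSINT:value}"
)


def to_python_regex(grok: str) -> str:
    # Expand every %{SYNTAX:SEMANTIC} into a named capture group.
    expanded = re.sub(
        r"%\{(\w+):(\w+)\}",
        lambda m: f"(?P<{m.group(2)}>{PATTERNS[m.group(1)]})",
        grok,
    )
    # Oniguruma's (?<name>...) is spelled (?P<name>...) in Python.
    return re.sub(r"\(\?<(\w+)>", r"(?P<\1>", expanded)


LOG_LINE = (
    "09:33:45,416 (metrics-logger-reporter-1-thread-1) "
    "type=GAUGE, name=notifications.received, value=2"
)

fields = re.search(to_python_regex(GROK), LOG_LINE).groupdict()
for key, value in fields.items():
    print(f"{key}: {value}")
```

The printed keys are timestamp, logthread, type, name and value, i.e. the five required fields from the table near the top.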
Output of the pattern
{
  "timestamp": [["09:33:45,416"]],
  "HOUR": [["09"]],
  "MINUTE": [["33"]],
  "SECOND": [["45,416"]],
  "logthread": [["metrics-logger-reporter-1-thread-1"]],
  "type": [["GAUGE"]],
  "name": [["notifications.received"]],
  "value": [["2"]]
}
Conclusion
Grok patterns and Oniguruma make a great pair. Together they can help transform complex logs into structured data. Give Grok Pattern + Oniguruma a try in Logstash!
Let me know in the comments if you know a better way of doing this or if you run into any problem with the above example.