Version: 0.11

Preprocessors

Preprocessors operate on the raw data stream and transform it. They are run before data reaches the codec and do not know or care about tremor's internal representation.

Unlike codecs, preprocessors can be chained to perform multiple operations in succession.

Supported Preprocessors

lines

Splits the input into lines, using character 10 \n (line feed) as the line separator.

Buffers any line fragment that may be present after the last line separator until more data arrives. This makes it ideal for use with streaming onramps like tcp, where it breaks incoming data down into distinct events.

Any empty lines present are forwarded as is. To remove them, chain the remove-empty preprocessor after this preprocessor. An example:

preprocessors:
- lines
- remove-empty
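
The buffering behaviour described above can be illustrated with a minimal Python sketch. It is not tremor's implementation: the LineSplitter class and the remove_empty function are hypothetical names, introduced only to show how a trailing fragment might be held back until the rest of the line arrives and how empty lines could then be dropped by a second stage.

```python
class LineSplitter:
    """Hypothetical stand-in for a buffering line preprocessor."""

    def __init__(self, separator: bytes = b"\n") -> None:
        self.separator = separator
        self.fragment = b""  # bytes seen after the last separator so far

    def feed(self, chunk: bytes) -> list[bytes]:
        # Prepend the fragment left over from the previous chunk.
        parts = (self.fragment + chunk).split(self.separator)
        # Whatever follows the last separator is incomplete: hold it back.
        self.fragment = parts.pop()
        return parts


def remove_empty(messages: list[bytes]) -> list[bytes]:
    """Hypothetical stand-in for a stage that drops zero-length messages."""
    return [m for m in messages if m]


splitter = LineSplitter()
first = splitter.feed(b"one\n\ntw")     # "tw" is held back as a fragment
second = splitter.feed(b"o\nthree\n")   # "tw" + "o" completes the line "two"

print(first)                # [b'one', b'']
print(remove_empty(first))  # [b'one']
print(second)               # [b'two', b'three']
```

Note that the empty line survives the splitting step and only disappears once the second stage runs, which mirrors the chaining shown in the configuration above.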

Note: The proliferation of various lines preprocessors here will go away once preprocessors support configuration.

lines-null

Variant of the lines preprocessor that uses the null byte \0 as the line separator.

lines-pipe

Variant of the lines preprocessor that uses the pipe character | as the line separator.

lines-no-buffer

Variant of the lines preprocessor that does not buffer any data that may be present after the last line separator -- the fragment is forwarded as is (i.e. treated as a full event).
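
As a rough illustration of the difference from the buffered variant, the following Python sketch splits each chunk on its own and keeps no state between chunks. The function name split_no_buffer is hypothetical, and dropping the empty piece after a trailing separator is an assumption of this sketch rather than documented behaviour.

```python
def split_no_buffer(chunk: bytes, separator: bytes = b"\n") -> list[bytes]:
    """Hypothetical sketch: split one chunk, keeping no state between calls."""
    parts = chunk.split(separator)
    # Assumption: a chunk that ends with the separator has no trailing
    # fragment, so the empty piece produced by split() is dropped.
    if parts and parts[-1] == b"":
        parts.pop()
    return parts


# The line "two" arrives split across two chunks. Without buffering, each
# half is forwarded as an event of its own.
print(split_no_buffer(b"one\ntw"))     # [b'one', b'tw']
print(split_no_buffer(b"o\nthree\n"))  # [b'o', b'three']
```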

lines-cr-no-buffer

Variant of the lines-no-buffer preprocessor that uses character 13 \r (carriage return) as the line separator.

base64

Decodes base64-encoded data into raw bytes.
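
For illustration, the equivalent transformation in Python uses the standard library base64 module. This only shows the effect on a single message and says nothing about how tremor implements the step.

```python
import base64

encoded = b"aGVsbG8="                # the bytes as they arrive on the wire
decoded = base64.b64decode(encoded)  # the raw bytes handed on to the next stage

print(decoded)  # b'hello'
```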

decompress

Decompresses a data stream. It is assumed that each message reaching the decompressor is a complete compressed entity.

The compression algorithm is detected automatically from the supported formats. If no format can be detected, the data is assumed to be uncompressed and is passed on unchanged. Any resulting errors can then be handled transparently in the codec.

Supported formats:

  • gzip
  • zlib
  • xz
  • snappy
  • lz4
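
One common way to detect these formats is to inspect the leading "magic" bytes of each message. The Python sketch below assumes that approach; whether tremor detects formats in exactly this way is an assumption, and detect_format and decompress are hypothetical names. Only the formats covered by the Python standard library are actually decompressed here.

```python
import gzip
import lzma
import zlib
from typing import Optional


def detect_format(data: bytes) -> Optional[str]:
    """Guess the compression format from the leading bytes (heuristic)."""
    if data.startswith(b"\x1f\x8b"):
        return "gzip"
    if data.startswith(b"\xfd7zXZ\x00"):
        return "xz"
    if data.startswith(b"\xff\x06\x00\x00sNaPpY"):  # framed snappy stream identifier
        return "snappy"
    if data.startswith(b"\x04\x22\x4d\x18"):        # LZ4 frame magic number
        return "lz4"
    # zlib has no fixed magic number. Its two header bytes must name the
    # deflate method (low nibble 8) and, read as a big-endian 16-bit value,
    # be a multiple of 31.
    if len(data) >= 2 and data[0] & 0x0F == 8 and ((data[0] << 8) | data[1]) % 31 == 0:
        return "zlib"
    return None


def decompress(data: bytes) -> bytes:
    fmt = detect_format(data)
    if fmt == "gzip":
        return gzip.decompress(data)
    if fmt == "xz":
        return lzma.decompress(data)
    if fmt == "zlib":
        try:
            return zlib.decompress(data)
        except zlib.error:
            return data  # the weak header check matched by accident
    if fmt in ("snappy", "lz4"):
        # Not in the standard library; left out of this sketch.
        raise NotImplementedError(fmt)
    # Nothing detected: assume the data is uncompressed and pass it on.
    return data


print(decompress(gzip.compress(b"hello")))  # b'hello'
print(decompress(b'{"a": 1}'))              # b'{"a": 1}'
```

The fallback branch mirrors the behaviour described above: data in an unrecognised format is forwarded unchanged, leaving it to the codec to report anything that is not valid input.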

gzip

Decompresses a GZ compressed payload.

zlib

Decompresses a Zlib (deflate) compressed payload.

xz

Decompresses an Xz2 (7z) compressed payload.

snappy

Decompresses a framed snappy compressed payload (raw snappy is not supported).

lz4

Decompresses an Lz4 compressed payload.

gelf-chunking

Reassembles messages that were split apart using the GELF chunking protocol.

If the GELF messages were sent compressed, you can decompress them by chaining the decompress preprocessor. An example is documented here -- you may need to apply decompress before and/or after the reassembly here, depending on how your GELF client(s) behave.
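
To make the reassembly step concrete, here is a minimal Python sketch based on the chunk layout in the GELF specification: two magic bytes 0x1e 0x0f, an 8-byte message ID, a 1-byte sequence number and a 1-byte sequence count, followed by the chunk payload. The GelfReassembler class and make_chunk helper are hypothetical names. The sketch leaves out the expiry of incomplete messages, which a real implementation would need.

```python
from typing import Optional

GELF_MAGIC = b"\x1e\x0f"
HEADER_LEN = 12  # 2 magic bytes + 8 byte message ID + sequence number + sequence count


class GelfReassembler:
    """Hypothetical sketch of GELF chunk reassembly (no timeouts)."""

    def __init__(self) -> None:
        # message ID -> {sequence number -> chunk payload}
        self.pending: dict[bytes, dict[int, bytes]] = {}

    def feed(self, datagram: bytes) -> Optional[bytes]:
        if not datagram.startswith(GELF_MAGIC):
            return datagram  # not chunked: forward as is
        if len(datagram) < HEADER_LEN:
            return None      # malformed chunk: dropped in this sketch
        message_id = datagram[2:10]
        seq_number, seq_count = datagram[10], datagram[11]
        if seq_number >= seq_count:
            return None      # malformed chunk: dropped in this sketch
        chunks = self.pending.setdefault(message_id, {})
        chunks[seq_number] = datagram[HEADER_LEN:]
        if len(chunks) < seq_count:
            return None      # still waiting for more chunks
        del self.pending[message_id]
        # Chunks may arrive out of order, so join them by sequence number.
        return b"".join(chunks[i] for i in range(seq_count))


def make_chunk(message_id: bytes, seq: int, count: int, payload: bytes) -> bytes:
    """Build a chunk for demonstration purposes."""
    return GELF_MAGIC + message_id + bytes([seq, count]) + payload


reassembler = GelfReassembler()
msg_id = b"\x00" * 7 + b"\x01"
print(reassembler.feed(make_chunk(msg_id, 1, 2, b"world")))   # None
print(reassembler.feed(make_chunk(msg_id, 0, 2, b"hello ")))  # b'hello world'
```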

remove-empty

Removes empty messages (i.e. messages of zero length).

length-prefixed

Separates a continuous stream of data into messages based on length prefixing. The length of each message in the stream is given by its first 64 bits, decoded as an unsigned big-endian integer.
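
A minimal Python sketch of this framing follows. The LengthPrefixedSplitter class is a hypothetical name, and the sketch assumes that the length counts only the payload bytes that follow the 8-byte prefix.

```python
import struct

PREFIX_LEN = 8  # 64 bits


class LengthPrefixedSplitter:
    """Hypothetical sketch: split a byte stream into length-prefixed messages."""

    def __init__(self) -> None:
        self.buffer = b""

    def feed(self, chunk: bytes) -> list[bytes]:
        self.buffer += chunk
        messages = []
        while len(self.buffer) >= PREFIX_LEN:
            # ">Q" decodes an unsigned 64-bit big-endian integer.
            (length,) = struct.unpack(">Q", self.buffer[:PREFIX_LEN])
            end = PREFIX_LEN + length
            if len(self.buffer) < end:
                break  # message incomplete: wait for more data
            messages.append(self.buffer[PREFIX_LEN:end])
            self.buffer = self.buffer[end:]
        return messages


stream = struct.pack(">Q", 5) + b"hello" + struct.pack(">Q", 3) + b"abc"
splitter = LengthPrefixedSplitter()
print(splitter.feed(stream[:10]))  # [] (only part of the first message so far)
print(splitter.feed(stream[10:]))  # [b'hello', b'abc']
```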

textual-length-prefix

Extracts the message based on a prefixed message length, given in ASCII digits and followed by a space, as used in RFC 5425 for the TLS/TCP transport of syslog.
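
The framing can be sketched in Python as follows. The TextualLengthPrefixSplitter class is a hypothetical name, and raising an error on a non-numeric prefix is an assumption of this sketch, not a statement about how tremor reacts to malformed input.

```python
class TextualLengthPrefixSplitter:
    """Hypothetical sketch: split "<length> <message>" frames from a byte stream."""

    def __init__(self) -> None:
        self.buffer = b""

    def feed(self, chunk: bytes) -> list[bytes]:
        self.buffer += chunk
        messages = []
        while True:
            space = self.buffer.find(b" ")
            if space == -1:
                break  # length prefix not complete yet
            prefix = self.buffer[:space]
            if not prefix.isdigit():
                # Assumption: malformed prefixes are treated as an error.
                raise ValueError(f"invalid length prefix: {prefix!r}")
            start = space + 1
            end = start + int(prefix)
            if len(self.buffer) < end:
                break  # message incomplete: wait for more data
            messages.append(self.buffer[start:end])
            self.buffer = self.buffer[end:]
        return messages


splitter = TextualLengthPrefixSplitter()
print(splitter.feed(b"5 hello11 hel"))  # [b'hello']
print(splitter.feed(b"lo world"))       # [b'hello world']
```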