Skip to main content

2 posts tagged with "collaboration"

View All Tags

· 4 min read

Welcome, Tremor Enthusiasts! This is the first post in a series intended inform and entertain Tremor Technologists with recent changes in the tremor project. We'll mostly focus these posts on Pull Requests and other notable developments. With these posts, you can stay informed and learn more about the project without having to read pull requests, or wait for release notes.

This posts' theme is open and improved communication.

Release Candidates

Where's the next release of Tremor?!

Tremor's release schedule is becoming a bit more regular. We're releasing about twice a year, every 6 to eight months. The next release of Tremor should include an exciting new way to write plugins with the plugin development kit, along with other goodies for faster, well understood tremor scripts and plenty of other performance and bug fixes.

Most notable as an update for us all is that Tremor has a release candidate live. If we're pleased with it, then a new version of tremor is soon to be released!

In this article, Let's dive into three topics: CI, PDK, and performance!

CI

One of my favorite parts of working with tremor, or any project, is using automation. Running a bunch of stuff on my machine can be quicker, but less reliable than a well established automation platform. Tremor makes an effort to improve Continuous integration regularly; and this month is no exception.

One of our active members of the Tremor Community, Pimmy has taken it upon himself to dramatically improve the CI and relase process for tremor-runtime.

The new process will automatically publish a release when we're ready. Using this automatino, we can bump versions appropriately with specially named branches and some fancy GitHub Actions. This lets the Tremor team take full advantage of the seamless GitHub experience from merge -> release, complete with logs and links directly from the Pull Request. We can even extract release notes and bump versions automatically. Once again, truly great work from our community here!

PDK (Plugin Development Kit)

I won't spend too much time on this point, since the lovely marioortizmanero has already taken the time to write out a full blog post or two on the topic. We'll give a full breakdown another time. For now: know that Tremor is currently in the works of creating an easier way for developers to plug their own binaries into the system to run connectors and pipelines.

Performance

Tremor cares a lot about performance. As you may already know from TremorCon 2021's fabulous talks; it was originally created and adopted for performance gains in Wayfair's event processing infrastructure. There are plenty of performance gains to make in a project as large as Tremor, both inside the project and through dependencies.

One such dependency we depend on to parse gigabytes of JSON per second is simd_json. simd_json is a port of the simdjson c++ library into rust. It can not only parse from JSON into Rust, but aid the serde library that can easily serialize and deserialize Rust data structures. That's pretty handy for an event processing project that will have to marshal plenty of events from disparate sources of all data forms! As you might imagine, a project like simd_json comes with a lot of configuration options, and optimizations behind those options.

Further than just the configuration that you might give serde, or simd_json, is the options you may give to the rust compiler. The compiler, by default, compiles for the largest compatability. For best performance on a specific plaform, we make use of the target-feature flag. This will allow us to target specific features available on different platforms and CPUs.

It should be no surprise that we continue this effort within the Tremor team. Know that we also depend on our community, such as a pull request from scarabsha. Sasha idenfitifed additional target-features for the target x86_64-unknown-linux-gnu that speed up the performance of simd-json. There's not a benchmark to show exactly how much faster this made json processing, but we appreciate the fact that it will undoubtedly increase performance for some clients of Tremor.

Thank You

The Tremor project as strong as the community around it. Reading this article, making contributions, and generally being involved in the project makes us more successful. Thank you for reading and contributing! See you next time.

  • Gary, and the Tremor team.

· 4 min read

Like all good projects in the open source community, great collaborations start with ad hoc interactions.

Redpanda Rocket

A recent set of discussions between the Tremor maintainers and our community brought Redpanda to our attention.

Some of our community would like to replace their Kafka deployments with Redpanda, a Kafka API-compatible streaming data platform, to realize gains in performance and reduce their total cost of operations. . As we already have Kafka connectivity, enabling this shouldn’t be too complex, right? Let’s find out.

Context

We are always looking out for interesting technology, and recently we read a very interesting article on the Redpanda blog about using WASM as server-side filters for subscription. Being as excitable as we are, we obviously had to give Redpanda a shot.

To our joy this turned out to be painless, Redpanda reuses the Kafka API so our existing connectors for Kafka work out of the box — and as a bonus we could get rid of a whole bunch of docker-compose YAML dancing that we needed to set up Zookeeper.

Alex Gallego, founder of Redpanda, reached out to us and we started experimenting with Tremor and Redpanda in this repo: github.com/tremor-rs/tremor-redpanda

Setting up Tremor and Redpanda

So, let’s get our hands dirty and actually connect Redpanda and tremor in a real-world project.

We have prepared a fully equipped workshop for this occasion. Give it a shot here if you are impatient.

Tremor can flexibly act as a Redpanda/Kafka consumer or producer, make use of auto-commit for offset management or manually commit when events are completely handled by the tremor pipeline. Here we are configuring Tremor only committing offsets when events have been successfully handled.

onramp:
- id: redpanda-in
type: kafka
codec: json
config:
brokers:
- redpanda:9092
topics:
- tremor
group_id: redpanda_es_correlation
retry_failed_events: false
rdkafka_options:
enable.auto.commit: false

This tremor application is reporting success or failure of ingesting the received events into elasticsearch via another Redpanda topic. Configuring this is also straightforward, here we have a Redpanda consumer ready for copy-pasting:

offramp:
- id: redpanda-out
type: kafka
codec: json
config:
group_id: tremor-in
brokers:
- redpanda:9092
topic: tremor

Here you go. A fully working setup for orchestrating document ingestion with Redpanda delivering the documents and receiving acknowledgements. For more details check out this Redpanda recipe on our website.

Tremor is designed to be robust when faced with high volumetric data streams. It comes with batteries included for traffic shaping, QoS and data distribution. With those tremor can guarantee at-least-once message delivery. We try to reduce CPU and memory footprint for a given workload and at the same time provide a “just works” experience for operators. And we think we found a soulmate project in Redpanda.

And most importantly, it is working like a charm. In fact we just dropped in Redpanda and expected some hours of troubleshooting, but this hope was cut short by a smooth transition:

104_redpanda_elastic_correlation-tremor_out-1     | 2021-12-10T15:17:01.828694200+00:00 INFO tremor_runtime::system - Binding onramp tremor://localhost/onramp/redpanda-in/01/out
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.830056300+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Starting kafka onramp
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.831848100+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscribing to: ["tremor"]
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.832735600+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscription initiated...
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.833342+00:00 INFO tremor_runtime::onramp - Onramp tremor://localhost/onramp/redpanda-in/01/out started.
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.828694200+00:00 INFO tremor_runtime::system - Binding onramp tremor://localhost/onramp/redpanda-in/01/out
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.830056300+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Starting kafka onramp
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.831848100+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscribing to: ["tremor"]
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.832735600+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscription initiated...
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.833342+00:00 INFO tremor_runtime::onramp - Onramp tremor://localhost/onramp/redpanda-in/01/out started.
...

Both in terms of operator friendliness and performance we root for Redpanda.

We started using it in our integration test suite, so users can be 100% sure Redpanda connectivity just works.