Skip to main content

4 posts tagged with "collaboration"

View All Tags

· 2 min read

The best aspect of open source software is collaboration, not only of systems but of the people involved. It is wonderful to see them come together to take multiple parts to build something bigger out of them.

In this spirit, we had the chance to collaborate on a little project with the crew from TDengine. For those that don’t know them yet, TDengine is an exciting new time-series database focusing on scalability, performance, and using SQL as a way to interact. Better yet using the adapter allows using a number of protocols including the Influx Line Protocol.

This does compliment tremor extremely well, as what we build can be described as an event processing engine focusing on performance and usability, using SQL as a way to manipulate data. I’m pretty sure you can already see how those two pieces fit together.

But, we don’t want to go much into the technical details here the post over at the TDengine blog does a wonderful job at that.

Working with Shuduo on this was a lot of fun, it was a fascinating forth and back of ideas and possible improvements. The integration was super painless with both systems supporting the same protocols, and it was really great to learn some tricks from the experts, like how queries work best in TDengine.

The most interesting part of this is however that both systems not only create the proverbial, larger sum than their parts are but also grow in the process.

And the speed the folks at TDengine do that is amazing, during the time of the collaboration, they not only released an official grafana data source, released a new version of TDengine itself, but also created and published a grafana dashboard for system monitoring that made it extremely easy to add this.

For tremor we have discovered that we’re really starting to miss sliding windows, they have been on the roadmap for a while, but never with any direct requirement other than “it would be nice to have them”. While talking about real-time alerting, sliding windows have significant advantages over tumbling windows, so this collaboration revived that discussion - which means we’ll hopefully add them sooner rather than later.

· 4 min read

Welcome, Tremor Enthusiasts! This post is another in a series intended to inform and entertain Tremor Technologists with recent changes in the tremor project. We'll mostly focus these posts on Pull Requests and other notable developments. With these posts, you can stay informed and learn more about the project without having to read pull requests, or wait for release notes.

Have you ever cleaned the house for company, to make it seem like nobody lives there? That’s basically what we’ve been up to getting ready for 0.12.0. It’s a big deal. We're trying to make the codebase spotless with plenty of sweeping up. We also thank our contributors for all the hard work put into automation and new stuff this month!

New Stuff

Most exciting of all: We made the 0.12.0 release! We put a lot of cool stuff in right before releasing, so finish the article before you jump in!

Being able to make a new tremor project has been a pain point from some of our users. We wanted getting started to be as easy as typing a command. Now it is! You can simply call our fancy new command to make a new tremor template. Quick and easy!

Our elasticsearch integrations got some upgrades with support for raw elastic payloads through our connector, and native support for auth between elastic <-> tremor. No more need for custom headers on connectors! Check out the links provided for more details, and we'll update the docs soon.

CI

The CI has been changed heavily in the last month. We have PrimalPimmy on GitHub to thank for much of the contributions. Thank you!

We've tweaked our CI when we create a release that should prevent creating crates unexpectedly. We also trigger many workflows across our projects from a release flow in tremor-runtime. We also decided that publishing in many steps would help to prevent single points of failure.

We can also better provide descriptive options when we create a release. We can also provide better options for publishing.

Lastly, we tweaked how much we use sed when we cut a release. This one is great for a quick example:

cd tremor-script
sed -e "s/^tremor-common = { version = \"${old}\"/tremor-common = { version = \"${new}\"/" -i.release "Cargo.toml"
sed -e "s/^tremor-influx = { version = \"${old}\"/tremor-common = { version = \"${new}\"/" -i.release "Cargo.toml"
sed -e "s/^tremor-value = { version = \"${old}\"/tremor-common = { version = \"${new}\"/" -i.release "Cargo.toml"

Where we used to manually edit each project's .toml file, we now do this as part of our GitHub Actions workflow elsewhere.

Sweeping Up

As we mentioned before, there was a lot of sweeping up we wanted to finish before the newest release.

One of the more interesting fixes we put in was a small bug for tremor run. Something that had gone unnoticed for a while was an upper limit when running tremor scripts. After 150 seconds, the script would time out. A strange and small feature limitation we didn't notice.

We had a lot of work specifically to do on connectors.

We also removed a limitation we had for code coverage. Where our contributors would have to avoid functionally any drop at all in code coverage, we now allow a .1 percent drop.

If you were a using the tremor api sub command, you should be aware that we've completely deprecated the subcommand.

Bug Fixes

Our bug fixes were pretty few over ht elast month. We had some publishing, connecting, and regular expressions patches. We updated our new code coverage tool, codecov, to more accurately track our coverage. We also found that piping information to tremor would sometimes cause issues and fixed that.

Thank You

The Tremor project as strong as the community around it. Reading this article, making contributions, and generally being involved in the project makes us more successful. Thank you for reading and contributing! See you next time.

  • Gary, and the Tremor team.

· 4 min read

Welcome, Tremor Enthusiasts! This is the first post in a series intended inform and entertain Tremor Technologists with recent changes in the tremor project. We'll mostly focus these posts on Pull Requests and other notable developments. With these posts, you can stay informed and learn more about the project without having to read pull requests, or wait for release notes.

This posts' theme is open and improved communication.

Release Candidates

Where's the next release of Tremor?!

Tremor's release schedule is becoming a bit more regular. We're releasing about twice a year, every 6 to eight months. The next release of Tremor should include an exciting new way to write plugins with the plugin development kit, along with other goodies for faster, well understood tremor scripts and plenty of other performance and bug fixes.

Most notable as an update for us all is that Tremor has a release candidate live. If we're pleased with it, then a new version of tremor is soon to be released!

In this article, Let's dive into three topics: CI, PDK, and performance!

CI

One of my favorite parts of working with tremor, or any project, is using automation. Running a bunch of stuff on my machine can be quicker, but less reliable than a well established automation platform. Tremor makes an effort to improve Continuous integration regularly; and this month is no exception.

One of our active members of the Tremor Community, Pimmy has taken it upon himself to dramatically improve the CI and relase process for tremor-runtime.

The new process will automatically publish a release when we're ready. Using this automatino, we can bump versions appropriately with specially named branches and some fancy GitHub Actions. This lets the Tremor team take full advantage of the seamless GitHub experience from merge -> release, complete with logs and links directly from the Pull Request. We can even extract release notes and bump versions automatically. Once again, truly great work from our community here!

PDK (Plugin Development Kit)

I won't spend too much time on this point, since the lovely marioortizmanero has already taken the time to write out a full blog post or two on the topic. We'll give a full breakdown another time. For now: know that Tremor is currently in the works of creating an easier way for developers to plug their own binaries into the system to run connectors and pipelines.

Performance

Tremor cares a lot about performance. As you may already know from TremorCon 2021's fabulous talks; it was originally created and adopted for performance gains in Wayfair's event processing infrastructure. There are plenty of performance gains to make in a project as large as Tremor, both inside the project and through dependencies.

One such dependency we depend on to parse gigabytes of JSON per second is simd_json. simd_json is a port of the simdjson c++ library into rust. It can not only parse from JSON into Rust, but aid the serde library that can easily serialize and deserialize Rust data structures. That's pretty handy for an event processing project that will have to marshal plenty of events from disparate sources of all data forms! As you might imagine, a project like simd_json comes with a lot of configuration options, and optimizations behind those options.

Further than just the configuration that you might give serde, or simd_json, is the options you may give to the rust compiler. The compiler, by default, compiles for the largest compatability. For best performance on a specific plaform, we make use of the target-feature flag. This will allow us to target specific features available on different platforms and CPUs.

It should be no surprise that we continue this effort within the Tremor team. Know that we also depend on our community, such as a pull request from scarabsha. Sasha idenfitifed additional target-features for the target x86_64-unknown-linux-gnu that speed up the performance of simd-json. There's not a benchmark to show exactly how much faster this made json processing, but we appreciate the fact that it will undoubtedly increase performance for some clients of Tremor.

Thank You

The Tremor project as strong as the community around it. Reading this article, making contributions, and generally being involved in the project makes us more successful. Thank you for reading and contributing! See you next time.

  • Gary, and the Tremor team.

· 4 min read

Like all good projects in the open source community, great collaborations start with ad hoc interactions.

Redpanda Rocket

A recent set of discussions between the Tremor maintainers and our community brought Redpanda to our attention.

Some of our community would like to replace their Kafka deployments with Redpanda, a Kafka API-compatible streaming data platform, to realize gains in performance and reduce their total cost of operations. . As we already have Kafka connectivity, enabling this shouldn’t be too complex, right? Let’s find out.

Context

We are always looking out for interesting technology, and recently we read a very interesting article on the Redpanda blog about using WASM as server-side filters for subscription. Being as excitable as we are, we obviously had to give Redpanda a shot.

To our joy this turned out to be painless, Redpanda reuses the Kafka API so our existing connectors for Kafka work out of the box — and as a bonus we could get rid of a whole bunch of docker-compose YAML dancing that we needed to set up Zookeeper.

Alex Gallego, founder of Redpanda, reached out to us and we started experimenting with Tremor and Redpanda in this repo: github.com/tremor-rs/tremor-redpanda

Setting up Tremor and Redpanda

So, let’s get our hands dirty and actually connect Redpanda and tremor in a real-world project.

We have prepared a fully equipped workshop for this occasion. Give it a shot here if you are impatient.

Tremor can flexibly act as a Redpanda/Kafka consumer or producer, make use of auto-commit for offset management or manually commit when events are completely handled by the tremor pipeline. Here we are configuring Tremor only committing offsets when events have been successfully handled.

onramp:
- id: redpanda-in
type: kafka
codec: json
config:
brokers:
- redpanda:9092
topics:
- tremor
group_id: redpanda_es_correlation
retry_failed_events: false
rdkafka_options:
enable.auto.commit: false

This tremor application is reporting success or failure of ingesting the received events into elasticsearch via another Redpanda topic. Configuring this is also straightforward, here we have a Redpanda consumer ready for copy-pasting:

offramp:
- id: redpanda-out
type: kafka
codec: json
config:
group_id: tremor-in
brokers:
- redpanda:9092
topic: tremor

Here you go. A fully working setup for orchestrating document ingestion with Redpanda delivering the documents and receiving acknowledgements. For more details check out this Redpanda recipe on our website.

Tremor is designed to be robust when faced with high volumetric data streams. It comes with batteries included for traffic shaping, QoS and data distribution. With those tremor can guarantee at-least-once message delivery. We try to reduce CPU and memory footprint for a given workload and at the same time provide a “just works” experience for operators. And we think we found a soulmate project in Redpanda.

And most importantly, it is working like a charm. In fact we just dropped in Redpanda and expected some hours of troubleshooting, but this hope was cut short by a smooth transition:

104_redpanda_elastic_correlation-tremor_out-1     | 2021-12-10T15:17:01.828694200+00:00 INFO tremor_runtime::system - Binding onramp tremor://localhost/onramp/redpanda-in/01/out
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.830056300+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Starting kafka onramp
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.831848100+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscribing to: ["tremor"]
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.832735600+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscription initiated...
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.833342+00:00 INFO tremor_runtime::onramp - Onramp tremor://localhost/onramp/redpanda-in/01/out started.
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.828694200+00:00 INFO tremor_runtime::system - Binding onramp tremor://localhost/onramp/redpanda-in/01/out
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.830056300+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Starting kafka onramp
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.831848100+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscribing to: ["tremor"]
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.832735600+00:00 INFO tremor_runtime::source::kafka - [Source::tremor://localhost/onramp/redpanda-in/01/out] Subscription initiated...
104_redpanda_elastic_correlation-tremor_out-1 | 2021-12-10T15:17:01.833342+00:00 INFO tremor_runtime::onramp - Onramp tremor://localhost/onramp/redpanda-in/01/out started.
...

Both in terms of operator friendliness and performance we root for Redpanda.

We started using it in our integration test suite, so users can be 100% sure Redpanda connectivity just works.