- Feature Name:
- Start Date: 2020-01-21
- Tremor Issue: tremor-rs/tremor-runtime#0015
- RFC PR: tremor-rs/tremor-rfcs#0008
Pull data from Postgres tables.
We see use cases a function of features supported by PostgreSQL out of the box, as well as additional features introduced as extensions or otherwise packaged as completely separate products. Primary examples that we are looking at are TimescaleDB and PipelineDB.
The ultimate goal is to be able to reliably pipe in data from PostgreSQL core to Tremor engine.
PostgreSQL is a standard Tremor onramp. It is configured by passing in standard PostgreSQL connection string arguments, such as
In addition to connection string arguments, we require:
intervalin milliseconds that specifies the time we wait before executing the next query.
consume_fromspecifies the timestamp from which to backfill data.
queryin form of standard
SELECTquery which retrieves rows from a table or view.
Unlike other onramps implemented so far, PostgreSQL onramp will persist
consume_until, allowing recovery from error conditions or other failures.
query: "SELECT id, name from events WHERE produced_at <= $1 AND produced_at >
Cache is a memory mapped region, either file-based or anonymous.
The onramp does not provide an implementation of all types Postgres supports and the entirety of query language (for example, LISTEN/NOTIFY semantics).
Rationale and Alternatives
As a starting point, or first implementation, supporting basic
SELECT statements with a time interval as additional
WHERE clause supports the major use case: ingesting rows from a table or view.
Alternative approach would utilize PostgreSQL single row mode suitable for ingestion of results returned by queries that span a large number of rows. Ensuring at-most-once ingestion of rows would be a potential problem and a time-consuming/API-breaking implementation.
Other approaches, such as trigger-based watches or binary log readers are also a possibility.
For users that utilise PostgreSQL as an event store, which seems to be a more
common use case lately, support for
LISTEN/NOTIFY would be essential to become one of modes of this onramp.
Support multiple queries in the same onramp.
More flexible means of specifying parameters.