Skip to main content

Modular Queries


Add support for modular queries to Tremor's query language, Trickle, so that distinct subgraphs could be reused and composed into higher level queries.


Subqueries would allow composition of smaller, reusable queries into higher level queries.

Guide-level explanation



DefineSubqueryDefn ::= DocComment? 'define' 'query' Id ('from' Ports)? ('into' Ports)? WithPartialParams? Subquery
define query custom_subquery
## Documentation Comment
from input_stream_1, input_stream_2
into output_stream
param1 = "foo",
param2 = 42
select event from input_stream_1 where == args.param1 into output_stream;
select event from input_stream_2 where == args.param2 into output_stream;
  • The with clause can be used to pass in parameters when required.
    • The values given to the parameters in the with clause here act as their default values.
    • Parameters are accessible through args inside the subquery.
  • The from clause can be used to define input streams.
    • If elided, in stream is attached.
  • The into clause can be used to define output streams.
    - If elided, `out` and `err` streams are attached.  
  • Note
    • Although there are currently no restrictions on sending events into the input_stream or reading from the output_stream, it is not recommended to do so.
    • The streams named inside the from and into clauses are created inside the subquery implicitly. i.e. there is no need to create stream output_stream inside the subquery.



CreateSubqueryDefn ::= 'create' 'query' Id ( 'from' ModularId )? WithParams?
# Short form
create query custom_subquery;

# Full form
create query my_custom_subq from custom_subquery
param1 = "bar"
  • The short form can be used if you don’t need to give the subquery a custom id.
    • The id from the subquery definition is used, ie custom_subquery in this case.
  • The full form allows you to give a custom id to the subquery.
  • The with clause can be used with either form to specify some or all of the parameters. Default values from the definition will be used for those left unspecified.


select event from in into my_custom_subq/input_stream_1;
select event from in into my_custom_subq/input_stream_2;
select event from my_custom_subq/output_stream into out;
  • We need to explicitly specify the ports with my_custom_subq here because the subquery is not using the default in, out or err ports.

Example 1

mod library with
mod utils with

define script mark_malformed
emit {
"event": event,
"status": "malformed"
} => "invalid";

define query select_minage
age = 18 # Parameter with default value of `18`
select event from in where in.age >= args.age into out;


define query select_valid_people
age = 21,
placeholder_name = "NA"
use utils;
create script mark_malformed;
create query select_min_age
age = args.age

select event from in into select_min_age;
select event from select_min_age where != args.placeholder_name into out;

select event from select_min_age where == args.placeholder_name into mark_malformed;
select event from mark_malformed/invalid into err;


create query valid_over_21 from library::select_valid_people
placeholder_name = "John Doe" # Overrides the default value of "NA".

select from in into valid_over_21;
select from valid_over_21 into out;
select from valid_over_21/err into err;

# Routes all events with age>=21 and name!="John Doe" to out
# Events with age<=21 are ignored
# Events with name=="John Doe" are marked as malformed and routed to err

Here, we have a subquery, valid_over_21, defined in the library module. The subquery itself is composed out of more generic components defined in the utils module and overrides their defaults with its own where appropriate.

  • Note
    • It's not possible to access the mark_malformed/invalid stream from outside the subquery unless it's connected to err (or any other into) port.

Example 2

define query custom_subquery
interval = core::datetime::with_seconds(60),
minimum_count = 0
define tumbling window interval_window
interval = args.interval
select aggr::stats::hdr(event.count)
from in[interval_window]
group by each(event.topic)
into out
having count > args.minimum_count;

create query six_per_two_minutes from custom_subquery
interval = core::datetime::with_seconds(120),
minimum_count = 6

The custom_subquery defined here contains its own scoped definition of a tumbling window, interval_window. This window is not accessible outside the subquery.

  • Note
    • While it's possible to define and create a component inside a subquery, it's not possible to pass in an externally created component as a parameter.

Reference-level explanation

The subqueries syntax builds upon the existing modularity features to enable the composition of smaller components into higher level queries. During the construction of the DAG nested subqueries are, recursively, flattened and inlined into their parent Query.

Grammar Changes

  • Stmt
    • Two new statement types are introduced in Stmt for defining and creating subqueries.
    • Stmt
Stmt ::= 
| DefineWindowDefn
| CreateStreamDefn
| DefineOperatorDefn
| CreateOperatorDefn
| DefineScriptDefn
| CreateScriptDefn
| DefineSubqueryDefn // New
| CreateSubqueryDefn // New
| SelectStmt
  • ModuleStmtInner
    • The definition of ModuleStmtInner is extended to include subquery definitions.
    • ModuleStmtInner
ModuleStmtInner ::=
| DefineWindowDefn
| DefineOperatorDefn
| DefineScriptDefn
| DefineSubqueryDefn // New

Grammar Additions

  •  CreateSubqueryDefn
    • A new keyword query is introduced.
    • CreateSubqueryDefn
CreateSubqueryDefn ::= 'create' 'query' Id ( 'from' ModularId )? WithParams?
  •  DefineSubqueryDefn
    • DefineSubqueryDefn
DefineSubqueryDefn ::= DocComment? 'define' 'query' Id ('from' Ports)? ('into' Ports)? WithPartialParams? Subquery
  • Ports
    • Ports
Ports ::= Id ',' Ports | Id
  • Subquery
    • Subquery
Subquery ::= 'query' SubqueryStmtInner 'end'
  • SubqueryStmtInner
    • Currently, the definition of SubqueryStmtInner is equivalent to that of Query.
    • SubqueryStmtInner
SubqueryStmtInner ::= ( Stmt ';' )+ | Stmt


  • Introduction of subqueries would encourage deeper nesting which might increase compile time complexity.
  • Unlike most other nodes, subqueries do not always use the default (in, out, err) ports, this can seem unfamiliar and verbose to Tremor's users.

Rationale and alternatives

  • Why is this design the best in the space of possible designs?
    • The syntax is already familiar to Tremor users as it’s similar to the current syntax for Operators and Scripts.
    • The flattening of subqueries in the DAG allows for pipeline optimizations to apply to subqueries where applicable.
  • What other designs have been considered and what is the rationale for not choosing them?
    • A function-like syntax was explored but was abandoned in early stages as it proved to be incongruous for this use case.
    • Subqueries could be implemented as Operator nodes inside the DAG but that would make them inscrutable.

Prior art

Unresolved questions

  • None

Future possibilities

  • The subquery interface for parameters could be made more robust with introduction of typed parameters.
  • It may be useful to add “mandatory parameters” to subqueries. That is, parameters that are not given a value during definition and must be defined on creation.
  • Currently it is possible to both send and receive events on a stream, in the future we could restrict the direction of flow of events for streams inside from and into.