Skip to main content
Version: edge

gcl

note

Authentication happens over the GCP autentication

gcl_writer

The GCL Writer connector integrates Google Cloud Logging from the Google Cloud Platform Operations Suite. This connector allows json logs to be published to Cloud Logging.

Configuration

optiondescription
log_nameThe default log_name for this configuration or default if not provided. The log_name can be overridden on a per-event basis in metadata
resourceA default monitored resource object that is assigned to all log entries in entries that do not specify a value for resource. A comprehensive list of resources is available and resources can be discovered via the gcloud client with gcloud logging resource-descriptors list
partial_successThis setting sets the behaviour of the connector with respect to whether valid entries should be written even if some entries in a batch set to Google Cloud Logging are invalid. Defaults to false
dry_runThis setting enables a sanity check for validating that log entries are well formed and and valid by exercising the connector to write entries where resulting entries are not persisted. Useful primarily during initial exploration and configuration or after large configuration changes as a sanity check. Defaults to false
default_severityThis setting sets a default log severity that can be overriden on a per event basis through metadata
labelsThis setting sets a default set of labels that can be overriden on a per event basis through metadata
connect_timeoutThe timeout in nanoseconds for connecting to the Google API
request_timeoutThe timeout in nanoseconds for each request to the Google API
concurrencyThe number of simultaneous in-flight requests ( defaults to 4 )
tokenThe authentication token see GCP autentication

The timeouts are in nanoseconds.

use std::time::nanos;
define connector gcl_writer from gcl_writer
with
config = {
"connect_timeout": nanos::from_seconds(1), # defaults to 1 second
"request_timeout: nanos::from_seconds(10),# defaults to 10 seconds
"token": "env", # required - The GCP token to use for authentication, see [GCP authentication](./index.md#GCP)
# Concurrency - number of simultaneous in-flight requests ( defaults to 4 )
# "concurrency" = 4,
}
end

Metadata

Metadata can optionally be provided on a per event basis for events flowing to this connector's sink.

The metadata is encapsulated in the $gcl_writer record and is may optionally specify one or many of the following fields.

fielddescription
log_nameOverrides the default configured log_name for this event only
log_severityOverrides the default log severity for this event only
resourceOverrides the default configured resource, if provided
insert_idAn optional unique identifier for the log entry. If you provide a value, then Logging considers other log entries in the same project, with the same timestamp, and with the same insert_id to be duplicates which are removed in a single query result. However, there are no guarantees of de-duplication in the export of logs
http_requestOptional information about the HTTP request associated with this log entry, if applicable
labelsAn optional map of system-defined or user-defined key-value string pairs related to the entry
operationOptional information about an operation associated with the log entry, if applicable
traceOptional. The REST resource name of the trace being written to Cloud Trace in association with this log entry. For example, if your trace data is stored in the Cloud project "my-trace-project" and if the service that is creating the log entry receives a trace header that includes the trace ID "12345", then the service should use "projects/my-tracing-project/traces/12345". The trace field provides the link between logs and traces. By using this field, you can navigate from a log entry to a trace.
span_idOptional. The ID of the Cloud Trace span associated with the current operation in which the log is being written. For example, if a span has the REST resource name of "projects/some-project/traces/some-trace/spans/some-span-id", then the spanId field is "some-span-id". A Span represents a single operation within a trace. Whereas a trace may involve multiple different microservices running on multiple different machines, a span generally corresponds to a single logical operation being performed in a single instance of a microservice on one specific machine. Spans are the nodes within the tree that is a trace. Applications that are instrumented for tracing will generally assign a new, unique span ID on each incoming request. It is also common to create and record additional spans corresponding to internal processing elements as well as issuing requests to dependencies. The span ID is expected to be a 16-character, hexadecimal encoding of an 8-byte array and should not be zero. It should be unique within the trace and should, ideally, be generated in a manner that is uniformly random.
trace_sampledThe sampling decision of the trace associated with the log entry. True means that the trace resource name in the trace field was sampled for storage in a trace backend. False means that the trace was not sampled for storage when this log entry was written, or the sampling decision was unknown at the time. A non-sampled trace value is still useful as a request correlation identifier. The default is False
source_locationOptional. Source code location information associated with the log entry, if any
timestampOptional. Overwrites the timestamp from ingest_ns to this value. the timestamp is provided in nanoseconds.

HTTP Request metadata

Optional related set of HTTP request data relevant to the log entry JSON payload.

fielddescription
request_methodThe HTTP verb for the request
request_urlThe URL, path and params for the request
request_sizeThe size in bytes of the request body
statusThe status of the response to the request
response_sizeThe size in bytes of the response body
user_agentThe user_agent header value
remote_ipThe recorded remote IP address, if available
server_ipThe server IP address, if available
refererThe referer, if available
latencyThe round trip latency in nanoseconds since epoch
cache_lookupTrue if there was a cache lookup for the request
cache_hitTrue if there was a cache lookup, and it was a hit
cache_validated_with_origin_serverTrue, if there was a validated cache lookup with the origin server
cache_fill_bytesBytes of the cache response, if there was a cache hit
protocolThe effective protocol eg: websockets, grpc

Operation metadata

Optional operation metadata field relevant to the log entry of the form:

{
"id": "a unique id for the operation",
"producer": "id of the producer of the operation",
"first": true, # is this the first of a related sequence
"last": true, # is this the last of a related sequence
}

Source location metadata

Optional source code location information if available of the form

{
"file": "path/to/file.rs",
"line": 200,
"function": "snot_badger_transformer",
}

Payload structure

The event value is transformed to JSON and transmitted as a JSON Payload with the log entry and any provided optional metadata.

Example

A worked example flow that uses a metronome source to inject log events into GCP cloud logging periodically which has the basic visual structure as below.

graph LR A[metronome] -->|every 500ms| B(main) B -->|payload to log entry| C[gcl_writer] C -->|gRPC LogEvent message| D{GCP Cloud Logging}
define flow main
flow
use std::time::nanos;
use tremor::connectors;
use tremor::system;
use integration;
use google::cloud::logging as gcl;

# We use a metronome as an event source in this
# example. We fire periodic events every 500
# milliseconds
define connector metronome from metronome
with
config = {"interval": nanos::from_millis(500)}
end;

# Our connection to the GCP cloud logging service
define connector google_cloud_logging from gcl_writer
with
config = {
# Default log_name
"log_name": "projects/my-project-id/logs/test-gcl",
# If connecting external from GCP, use a global resource
"resource": {
"type": "global",
"labels": {
"project_id": "my-project-id"
}
},

# 500ms connection timeout
"connect_timeout": nanos::from_millis(500),

# 1s request timeout
"request_timeout": nanos::from_seconds(1),

# Use `debug` log severity by default
"default_severity": gcl::severity::DEBUG,

# Indicate tremor version
"labels": {
"tremor-version": system::version()
}
}
end;

define pipeline main
pipeline
define script add_metadata_overrides
script
use std::time::nanos;
use google::cloud::logging as gcl;

# Example of setting metadata for each log event
let $gcl_writer = {
# "log_name": "projects/my-project-id/logs/test-gcl2",
"log_severity": gcl::severity::INFO,
"insert_id": "x" + gcl::gen_trace_id_string(),
"http_request": {
"request_method": "GET",
"request_url": "https://www.tremor.rs/",
"request_size": 0,
"status": 200,
"response_size": 1024,
"user_agent": "tremor",
"remote_ip": "164.90.232.184",
"server_ip": "localhost",
"referer": "https://www.tremo.rs",
"latency": nanos::from_millis(10),
},
"labels": {
"tremor-override": "crash-overrun",
},
"operation": {
"id": "snot-id-" + gcl::gen_span_id_string(),
"producer": "github.com/tremor-rs/gcl_writer/test",
"first": true,
"last": true,
},
"trace": gcl::gen_trace_id_string(),
"span_id": gcl::gen_span_id_string(),
"trace_sampled": false,
"source_location": { "file": "snot.rs", "line": 10, "function": "badger" },
};
event
end;

create script add_metadata_overrides;
select event from in into add_metadata_overrides;
select event from add_metadata_overrides into out;
end;

define pipeline exit
pipeline
select {
"exit": 0,
} from in into out;
end;


create connector file from integration::write_file;
create connector metronome;
create connector google_cloud_logging;
create pipeline main;


connect /connector/metronome to /pipeline/main;
connect /pipeline/main to /connector/google_cloud_logging;
connect /pipeline/main to /connector/file;
end;
deploy flow main;