Prometheus connector

The Prometheus connector allows reading Prometheus metrics as tables in Trino.

The connector queries Prometheus through the Prometheus HTTP API. Specifically, all queries are resolved to Prometheus instant queries of the form http://localhost:9090/api/v1/query?query=up%5B21d%5D&time=1568229904.000, where up%5B21d%5D is the URL-encoded form of up[21d]. In this case, the up metric is taken from the table name in the Trino query, and 21d is the duration of the query. The Prometheus time value corresponds to the timestamp field. Trino translates a query's use of the timestamp field into a duration and time value as needed. Trino splits are generated by dividing the query range into approximately equal chunks.
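
The shape of such an instant query can be illustrated with a short Python sketch. It is not the connector's implementation; the helper name instant_query_url and its parameters are hypothetical, and only the URL layout is taken from the example above.

```python
from urllib.parse import urlencode


def instant_query_url(base_uri: str, metric: str, duration: str, end_time: float) -> str:
    """Build a Prometheus instant-query URL for a range selector such as up[21d].

    Hypothetical helper for illustration only; not part of Trino.
    """
    # metric[duration] selects the samples in the window that ends at end_time.
    params = {"query": f"{metric}[{duration}]", "time": f"{end_time:.3f}"}
    # urlencode percent-encodes the square brackets as %5B and %5D.
    return f"{base_uri}/api/v1/query?{urlencode(params)}"


url = instant_query_url("http://localhost:9090", "up", "21d", 1568229904.0)
print(url)
```

Running the sketch prints the same URL as the example, with the table name supplying the metric and the query range supplying the duration.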

Requirements

To query Prometheus, you need:

  • Network access from the Trino coordinator and workers to the Prometheus server. The default port is 9090.
  • Prometheus version 2.15.1 or later.

Configuration

Create etc/catalog/prometheus.properties to mount the Prometheus connector as the prometheus catalog, replacing the properties as appropriate:

connector.name=prometheus
prometheus.uri=http://localhost:9090
prometheus.query.chunk.size.duration=1d
prometheus.max.query.range.duration=21d
prometheus.cache.ttl=30s
prometheus.bearer.token.file=/path/to/bearer/token/file
prometheus.read-timeout=10s

Configuration properties

The following configuration properties are available:

Property name                         Description
prometheus.uri                        Where to find Prometheus coordinator host
prometheus.query.chunk.size.duration  The duration of each query to Prometheus
prometheus.max.query.range.duration   Width of overall query to Prometheus, will be divided into query-chunk-size-duration queries
prometheus.cache.ttl                  How long values from this config file are cached
prometheus.auth.user                  Username for basic authentication
prometheus.auth.password              Password for basic authentication
prometheus.bearer.token.file          File holding bearer token if needed for access to Prometheus
prometheus.read-timeout               How much time a query to Prometheus has before timing out

Avoiding exhaustion of the Trino heap

The prometheus.query.chunk.size.duration and prometheus.max.query.range.duration properties protect Trino from receiving too much data from Prometheus. Of the two, prometheus.max.query.range.duration is of particular interest.

On a Prometheus instance that has been running for a while, and depending on its data retention settings, 21d might be far too much; 1h might be a more reasonable setting. With 1h, it might then be useful to set prometheus.query.chunk.size.duration to 10m, which divides the query window into 6 queries, each of which can be handled in a Trino split.
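
The arithmetic behind the 1h and 10m example can be sketched in Python. This is only an illustration of dividing a window into chunks; the function split_range and the sample end time are assumptions, and the connector's actual split generation may differ in detail.

```python
from datetime import datetime, timedelta, timezone


def split_range(end, max_range, chunk):
    """Divide the window [end - max_range, end] into consecutive (start, end) chunks.

    Hypothetical helper for illustration only; not the connector's code.
    """
    chunks = []
    cursor = end - max_range
    while cursor < end:
        # The last chunk is shortened if the range is not a multiple of the chunk size.
        chunk_end = min(cursor + chunk, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


# Arbitrary end time chosen for the example.
end = datetime(2019, 9, 11, 19, 25, 4, tzinfo=timezone.utc)
chunks = split_range(end, timedelta(hours=1), timedelta(minutes=10))
print(len(chunks))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), chunk_end.isoformat())
```

With a 1h range and 10m chunks the sketch yields 6 windows, matching the 6 queries described above.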

Primarily, query issuers can limit the amount of data returned by Prometheus by adding WHERE clause limits on timestamp, setting an upper bound and lower bound that define a relatively small window. For example:

SELECT * FROM prometheus.default.up WHERE timestamp > (NOW() - INTERVAL '10' second);

If the query does not include a WHERE clause limit on timestamp, these configuration settings protect against an unbounded query.

Bearer token authentication

Prometheus can be set up to require an Authorization header with every query. The prometheus.bearer.token.file property allows a bearer token to be read from the configured file. This file is optional; it is only needed if your Prometheus setup requires it.
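
As a rough illustration of what a bearer token file provides, the Python sketch below reads a token from a file and builds the corresponding Authorization header. The helper bearer_auth_header, the token value, and the stripping of surrounding whitespace are all assumptions made for the example, not a description of how the connector handles the file.

```python
import tempfile
from pathlib import Path


def bearer_auth_header(token_file):
    """Return an Authorization header built from the token stored in token_file.

    Hypothetical helper for illustration only; not part of Trino.
    """
    # Assumption: surrounding whitespace, such as a trailing newline, is not part of the token.
    token = Path(token_file).read_text(encoding="utf-8").strip()
    return {"Authorization": f"Bearer {token}"}


# Demonstrate with a throwaway file holding a placeholder token.
with tempfile.TemporaryDirectory() as tmp:
    token_path = Path(tmp) / "token"
    token_path.write_text("example-token\n", encoding="utf-8")
    headers = bearer_auth_header(token_path)

print(headers)
```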

SQL support

The connector supports the globally available statements and the read operation statements for accessing data and metadata in Prometheus.