Prometheus connector
The Prometheus connector allows reading Prometheus metrics as tables in Trino.
The connector queries Prometheus through the Prometheus HTTP API.
Specifically, all queries are resolved to Prometheus Instant queries
with a form like:
http://localhost:9090/api/v1/query?query=up%5B21d%5D&time=1568229904.000
Here up%5B21d%5D is the URL-encoded form of up[21d]. The up metric is taken
from the Trino query table name, and 21d is the duration of the query. The
Prometheus time value corresponds to the timestamp field. Trino queries are
translated from their use of the timestamp field to a duration and time
value as needed. Trino splits are generated by dividing the query range
into chunks of approximately equal size.
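As a rough illustration of the URL form above, the following Python sketch assembles an instant query from a metric name, a range duration, and an evaluation time. The helper name instant_query_url and its parameters are hypothetical and only mirror the example URL; this is not the connector's actual implementation.

```python
from urllib.parse import urlencode


def instant_query_url(base_uri: str, metric: str, range_duration: str, end_time: float) -> str:
    """Build an instant-query URL for a metric with a range selector.

    Hypothetical helper for illustration; not the connector's code.
    """
    params = {
        # Range selector such as up[21d]; urlencode percent-encodes the brackets.
        "query": f"{metric}[{range_duration}]",
        # Evaluation time as Unix seconds with millisecond precision.
        "time": f"{end_time:.3f}",
    }
    return f"{base_uri}/api/v1/query?{urlencode(params)}"


url = instant_query_url("http://localhost:9090", "up", "21d", 1568229904.0)
print(url)
```

The metric corresponds to the Trino table name, and the time parameter to the upper end of the timestamp window.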
Requirements
To query Prometheus, you need:
- Network access from the Trino coordinator and workers to the Prometheus server. The default port is 9090.
- Prometheus version 2.15.1 or later.
Configuration
Create etc/catalog/prometheus.properties to mount the Prometheus
connector as the prometheus catalog, replacing the properties as
appropriate:
connector.name=prometheus
prometheus.uri=http://localhost:9090
prometheus.query.chunk.size.duration=1d
prometheus.max.query.range.duration=21d
prometheus.cache.ttl=30s
prometheus.bearer.token.file=/path/to/bearer/token/file
prometheus.read-timeout=10s
Configuration properties
The following configuration properties are available:
Property Name | Description |
---|---|
prometheus.uri | Where to find the Prometheus coordinator host |
prometheus.query.chunk.size.duration | The duration of each query to Prometheus |
prometheus.max.query.range.duration | Width of the overall query to Prometheus; it is divided into queries of prometheus.query.chunk.size.duration each |
prometheus.cache.ttl | How long values from this config file are cached |
prometheus.auth.user | Username for basic authentication |
prometheus.auth.password | Password for basic authentication |
prometheus.bearer.token.file | File holding bearer token if needed for access to Prometheus |
prometheus.read-timeout | How much time a query to Prometheus has before timing out |
Not exhausting your Trino available heap
The prometheus.query.chunk.size.duration and
prometheus.max.query.range.duration settings protect Trino from
too much data coming back from Prometheus. The
prometheus.max.query.range.duration setting is of particular
interest.
On a Prometheus instance that has been running for a while, and depending
on data retention settings, 21d might be far too much. Perhaps 1h
might be a more reasonable setting. In the case of 1h, it might then be
useful to set prometheus.query.chunk.size.duration to 10m, dividing
the query window into 6 queries, each of which can be handled in a Trino
split.
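The arithmetic behind the 1h and 10m example can be sketched in Python as follows. The function chunk_windows is a hypothetical helper, and the way it cuts the range (oldest window first, with a shorter final window if the range does not divide evenly) is an assumption for illustration rather than the connector's actual split logic.

```python
from datetime import datetime, timedelta, timezone


def chunk_windows(end, max_range, chunk_size):
    """Return (start, end) pairs covering max_range before `end`, oldest first.

    Hypothetical helper; the real split generation may differ.
    """
    windows = []
    start = end - max_range
    while start < end:
        # The last window is shortened if the range is not a multiple of chunk_size.
        chunk_end = min(start + chunk_size, end)
        windows.append((start, chunk_end))
        start = chunk_end
    return windows


end = datetime(2019, 9, 11, 19, 25, 4, tzinfo=timezone.utc)
windows = chunk_windows(end, timedelta(hours=1), timedelta(minutes=10))
print(len(windows))  # 1h / 10m gives 6 windows
```

With the sample configuration values (21d and 1d), the same calculation yields 21 windows.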
Primarily, query issuers can limit the amount of data returned by
Prometheus by taking advantage of WHERE clause limits on timestamp,
setting an upper bound and lower bound that define a relatively small
window. For example:
SELECT * FROM prometheus.default.up WHERE timestamp > (NOW() - INTERVAL '10' second);
If the query does not include a WHERE clause limit, these config settings are meant to protect against an unlimited query.
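To make the effect of a bounded timestamp window concrete, the sketch below maps a lower and upper bound to a range duration and an evaluation time. The helper window_to_instant_params and the choice to express the duration in whole seconds are assumptions for illustration; the connector's own translation is not shown in this document.

```python
from datetime import datetime, timedelta, timezone


def window_to_instant_params(lower, upper):
    """Map a [lower, upper] timestamp window to a range duration and evaluation time.

    Hypothetical helper; assumes the duration is expressed in whole seconds.
    """
    seconds = int((upper - lower).total_seconds())
    return {"range": f"{seconds}s", "time": f"{upper.timestamp():.3f}"}


now = datetime(2019, 9, 11, 19, 25, 4, tzinfo=timezone.utc)
# Equivalent in spirit to: WHERE timestamp > (NOW() - INTERVAL '10' second)
params = window_to_instant_params(now - timedelta(seconds=10), now)
print(params)
```

A ten-second window asks Prometheus for far less data than the full prometheus.max.query.range.duration would.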
Bearer token authentication
Prometheus can be set up to require an Authorization header with every
query. The prometheus.bearer.token.file property names a file from which
the bearer token is read. The file is optional; set it only if your
Prometheus setup requires it.
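For readers unfamiliar with bearer tokens, the following Python sketch shows how a token read from a file typically ends up in an Authorization header. The helper bearer_auth_header is hypothetical, and stripping surrounding whitespace from the file contents is an assumption; the connector's handling of the file may differ.

```python
import tempfile
from pathlib import Path


def bearer_auth_header(token_file: str) -> dict:
    """Read a token file and return an Authorization header for it.

    Hypothetical helper; assumes the file holds only the token text.
    """
    token = Path(token_file).read_text(encoding="utf-8").strip()
    return {"Authorization": f"Bearer {token}"}


# Demonstrate with a throwaway file standing in for the configured token file.
with tempfile.TemporaryDirectory() as tmp:
    token_path = Path(tmp) / "token"
    token_path.write_text("example-token\n", encoding="utf-8")
    headers = bearer_auth_header(str(token_path))

print(headers)
```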
SQL support
The connector provides globally available and read operation statements to access data and metadata in Prometheus.