Monitoring and Alarms¶
Overview¶
This doc goes over the tools and infrastructure that power Firefly’s metric collection and alarming.
Metrics Ingestion¶

During execution of a request, Firefly publishes metrics as logs to the /aws/lambda/MusicFirefly-prod-graphql CloudWatch log group. This is typically done with logger.metric in the Firefly code.
A subscription filter exists on the /aws/lambda/MusicFirefly-prod-graphql log group that filters to only include logs with the log level METRIC, and sends those to our Kinesis stream.
The kinesisLogHandler lambda ingests the records from the Kinesis stream and publishes them to the relevant Timestream DB table. Currently we have 4 tables: FireFly-Clients, FireFly-Platform, FireFly-Services, and FireFly-Resolvers.
The Timestream DB is set up as a “Data source” in our AWS-Hosted Grafana environment, where we can perform queries against it.
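The decoding step in the ingestion path above can be sketched roughly as follows. CloudWatch Logs subscription filters deliver batches to Kinesis as gzip-compressed JSON, and Lambda base64-encodes each Kinesis record, so a handler along the lines of kinesisLogHandler has to undo both layers before it can read the log events. The shape of the metric log line (a JSON object with a "level" field) and all function names here are assumptions for illustration, not Firefly's actual schema or code, and the write to Timestream is left out.

```python
import base64
import gzip
import json


def decode_kinesis_record(record: dict) -> dict:
    """Undo Lambda's base64 encoding and CloudWatch Logs' gzip compression."""
    compressed = base64.b64decode(record["kinesis"]["data"])
    return json.loads(gzip.decompress(compressed))


def extract_metrics(event: dict) -> list:
    """Collect metric entries from every log event in a Kinesis batch."""
    metrics = []
    for record in event["Records"]:
        payload = decode_kinesis_record(record)
        # CloudWatch Logs also sends CONTROL_MESSAGE payloads to verify the
        # destination is reachable; they carry no application logs.
        if payload.get("messageType") != "DATA_MESSAGE":
            continue
        for log_event in payload["logEvents"]:
            try:
                entry = json.loads(log_event["message"])
            except json.JSONDecodeError:
                continue  # not a structured log line
            # Assumption: metric logs are JSON objects with level == "METRIC".
            # The subscription filter should already guarantee this; the
            # check here is only a defensive second pass.
            if isinstance(entry, dict) and entry.get("level") == "METRIC":
                metrics.append(entry)
    return metrics


def make_sample_event(messages: list) -> dict:
    """Build a Kinesis-style Lambda event for local experimentation."""
    payload = {
        "messageType": "DATA_MESSAGE",
        "logGroup": "/aws/lambda/MusicFirefly-prod-graphql",
        "logEvents": [
            {"id": str(i), "timestamp": 0, "message": message}
            for i, message in enumerate(messages)
        ],
    }
    data = base64.b64encode(gzip.compress(json.dumps(payload).encode("utf-8")))
    return {"Records": [{"kinesis": {"data": data.decode("ascii")}}]}
```

Calling extract_metrics(make_sample_event([...])) with a mix of metric and non-metric log lines returns only the metric entries, which a real handler would then route to the matching Timestream table.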
Alarms/Cutting Tickets¶

A Firefly operator creates the Grafana Alerts
Incoming metrics trigger a Grafana Alert
Grafana publishes an SNS event on the Firefly Grafana alerts topic
The Grafana alerts queue listens to this topic and ingests the event
A lambda ingests the events and leverages the Tickety API to create SIM tickets
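The last two steps above can be sketched as a queue-triggered handler. When an SQS queue subscribes to an SNS topic without raw message delivery enabled, each SQS message body is a JSON envelope whose "Message" field holds the published text. Everything specific to Firefly here is an assumption: that the alert is published as JSON, the "title" and "message" field names, and the create_ticket callable, which merely stands in for a call to the Tickety API (its real interface is not shown in this doc).

```python
import json
from typing import Callable


def parse_alert(sqs_record: dict) -> dict:
    """Unwrap the SNS envelope that SQS delivers in the record body."""
    envelope = json.loads(sqs_record["body"])
    # Assumption: the alert itself was published to SNS as a JSON string.
    return json.loads(envelope["Message"])


def handle_alert_batch(event: dict, create_ticket: Callable[[str, str], str]) -> list:
    """Create one ticket per alert in an SQS batch and return the ticket ids.

    create_ticket is a hypothetical stand-in for the Tickety API call; it
    takes a title and a description and returns a ticket id.
    """
    ticket_ids = []
    for record in event["Records"]:
        alert = parse_alert(record)
        title = alert.get("title", "Grafana alert")  # hypothetical field
        description = alert.get("message", "")  # hypothetical field
        ticket_ids.append(create_ticket(title, description))
    return ticket_ids
```

Passing the ticket-creation function in as a parameter keeps the parsing logic testable without credentials: a test can supply a stub that records its arguments, while the deployed lambda would supply the real client.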
FAQ¶
How are Grafana Alerts created?¶
Currently, all Grafana Alerts have been created by hand by a Firefly operator. In the near future, we hope to have the proper tooling in place to create alarms programmatically.
Do you have a list of all Grafana Alerts defined somewhere?¶
Yes! All our targeted alarms and their creation statuses are defined here: Firefly Grafana Alerts
