Project

AI Operations Monitor

The dashboard that tells you when your AI is slow, expensive, or quietly failing.

The problem

Most teams ship AI into production blind. They learn about latency spikes, runaway cost, and silent failures from customers rather than from a dashboard, which means they learn late.

What I built

A lightweight telemetry layer that captures chat events, workflow runs, and model calls into Postgres and surfaces them in provisioned Grafana dashboards, tracking latency, cost, schema validity, escalations, and failure trends. The whole stack comes up with one command.

My approach

I kept ingestion deliberately thin: accept telemetry, store it, visualise it. Postgres plus provisioned Grafana means dashboards are reproducible and the whole stack starts with one command, so observability is something you switch on, not a project in itself.
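To make "accept telemetry, store it, visualise it" concrete, here is a minimal sketch of that flow. It is not the project's actual code. The `ModelCall` fields, the `model_calls` table, and the `ingest` and `summarise` helpers are all hypothetical names chosen for illustration, and an in-memory SQLite database stands in for Postgres so the example runs without any services.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ModelCall:
    # Hypothetical event shape; the field names are illustrative assumptions.
    model: str
    latency_ms: float
    cost_usd: float
    schema_valid: bool
    failed: bool


def open_store() -> sqlite3.Connection:
    # Assumption: in-memory SQLite stands in for Postgres in this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE model_calls (
            model        TEXT    NOT NULL,
            latency_ms   REAL    NOT NULL,
            cost_usd     REAL    NOT NULL,
            schema_valid INTEGER NOT NULL,
            failed       INTEGER NOT NULL
        )
        """
    )
    return conn


def ingest(conn: sqlite3.Connection, event: ModelCall) -> None:
    # Thin ingestion: no enrichment or routing, just write the event as received.
    conn.execute(
        "INSERT INTO model_calls VALUES (?, ?, ?, ?, ?)",
        astuple(event),
    )
    conn.commit()


def summarise(conn: sqlite3.Connection) -> dict:
    # The kind of per-model aggregate a dashboard panel could query.
    rows = conn.execute(
        """
        SELECT model, COUNT(*), AVG(latency_ms), SUM(cost_usd), SUM(failed)
        FROM model_calls
        GROUP BY model
        ORDER BY model
        """
    ).fetchall()
    return {
        model: {
            "calls": calls,
            "avg_latency_ms": avg_latency,
            "total_cost_usd": total_cost,
            "failures": failures,
        }
        for model, calls, avg_latency, total_cost, failures in rows
    }


if __name__ == "__main__":
    store = open_store()
    ingest(store, ModelCall("example-model", 120.0, 0.002, True, False))
    ingest(store, ModelCall("example-model", 480.0, 0.004, False, True))
    print(summarise(store))
```

Keeping the write path this small means the interesting logic lives in the queries, which is where a dashboard tool would take over.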

The result

Problems surface on a dashboard before they reach the support queue, and cost and reliability stop being guesswork.

Stack

Python, FastAPI, Postgres, Grafana, Docker Compose

Connect

Building something, or thinking about it? Book a 1:1 and let's dig in.

Book a 1:1 consult, X, GitHub, Email

Gbolagade Ishola