Infrastructure · Enterprise Software

Real-Time Ops

A high-throughput monitoring suite processing 4M electrical signals per second with sub-10ms alert latency. Built on Go, Redis Streams, and Kafka — each layer decoupled so no consumer path can block another.

The Problem
01

Millions of Events Per Second

The platform needed to ingest and process electrical signal data from thousands of sensors simultaneously. A Node.js monolith was saturating CPU at 12% of the target ingest rate — a structural ceiling, not a tuning problem.

02

Sub-10ms Alert Latency

Operators required alerts within 10ms of a threshold crossing. The existing architecture delivered alerts with latency measured in seconds, which made alerting unreliable for safety-critical use cases and violated SLA requirements.

03

Stateful Fan-Out at Scale

Each incoming signal needed to be routed to potentially hundreds of downstream consumers — dashboards, alert evaluators, loggers — without any consumer blocking others or causing head-of-line delays under burst conditions.

What We Built
01

Go Ingest Service

Rewrote the ingest layer in Go with goroutine-per-connection handling and a lock-free ring buffer for batching. The new service handles 4M events/sec on 4 vCPUs with median ingest latency of 1.2ms — a 30× throughput improvement with no hardware change.
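
A lock-free ring buffer for batching can be sketched roughly as below. This is a minimal illustration, not the production code: it assumes a single producer and a single consumer per ring (for example, one ring per connection goroutine), and the Event type, Ring type, and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Event is a hypothetical signal sample; the real schema is not given in the write-up.
type Event struct {
	SensorID uint32
	Value    float64
}

// Ring is a bounded single-producer/single-consumer queue.
// It is lock-free only under that assumption: exactly one goroutine
// calls Push and exactly one goroutine calls PopBatch.
type Ring struct {
	buf  []Event
	mask uint64
	head atomic.Uint64 // next slot to read, advanced only by the consumer
	tail atomic.Uint64 // next slot to write, advanced only by the producer
}

// NewRing allocates a ring. Size must be a power of two so that index
// wrapping is a bit mask instead of a modulo.
func NewRing(size uint64) *Ring {
	if size == 0 || size&(size-1) != 0 {
		panic("ring size must be a power of two")
	}
	return &Ring{buf: make([]Event, size), mask: size - 1}
}

// Push stores e and reports false when the ring is full, leaving the
// caller to decide whether to drop, retry, or apply backpressure.
func (r *Ring) Push(e Event) bool {
	tail := r.tail.Load()
	if tail-r.head.Load() == uint64(len(r.buf)) {
		return false
	}
	r.buf[tail&r.mask] = e
	r.tail.Store(tail + 1) // publish the slot only after it is written
	return true
}

// PopBatch copies up to len(dst) pending events into dst and returns the count.
func (r *Ring) PopBatch(dst []Event) int {
	head := r.head.Load()
	avail := r.tail.Load() - head
	n := uint64(len(dst))
	if avail < n {
		n = avail
	}
	for i := uint64(0); i < n; i++ {
		dst[i] = r.buf[(head+i)&r.mask]
	}
	r.head.Store(head + n) // hand the drained slots back to the producer
	return int(n)
}

func main() {
	r := NewRing(8)
	for i := uint32(0); i < 5; i++ {
		r.Push(Event{SensorID: i, Value: float64(i) * 1.5})
	}
	batch := make([]Event, 4)
	n := r.PopBatch(batch)
	fmt.Println(n, batch[:n])
}
```

Draining in batches lets the consumer amortize downstream writes over many events. A ring shared by several producer goroutines would need a different design (for example, compare-and-swap on the tail), which this sketch does not cover.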

02

Redis Streams Fan-Out

Implemented a Redis Streams-based fan-out layer for real-time consumers. Each consumer type — dashboards, alert evaluators, audit loggers — reads from its own consumer group. A slow dashboard never blocks an alert evaluator.
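
The isolation between consumer groups comes from each group keeping its own read position in the stream. The sketch below is an in-memory model of that idea only, not a Redis client: Stream, Add, ReadGroup, and Lag are hypothetical names, and pending-entry tracking, acknowledgments, and trimming are left out. In Redis itself the corresponding commands are XGROUP CREATE, XADD, and XREADGROUP.

```go
package main

import "fmt"

// Stream models one append-only log with an independent read cursor per
// consumer group. It is a teaching model, not a replacement for Redis.
type Stream struct {
	entries []string
	cursors map[string]int // group name -> index of the next undelivered entry
}

// NewStream creates a stream with the given groups, each starting at entry 0.
func NewStream(groups ...string) *Stream {
	s := &Stream{cursors: make(map[string]int)}
	for _, g := range groups {
		s.cursors[g] = 0
	}
	return s
}

// Add appends an entry. It never waits on any reader.
func (s *Stream) Add(entry string) {
	s.entries = append(s.entries, entry)
}

// ReadGroup returns up to count new entries for the group and advances
// only that group's cursor. Unknown groups get nothing.
func (s *Stream) ReadGroup(group string, count int) []string {
	start, ok := s.cursors[group]
	if !ok || count <= 0 {
		return nil
	}
	end := start + count
	if end > len(s.entries) {
		end = len(s.entries)
	}
	s.cursors[group] = end
	// Copy so callers cannot mutate the shared log.
	return append([]string(nil), s.entries[start:end]...)
}

// Lag reports how many entries the group has not read yet.
func (s *Stream) Lag(group string) int {
	return len(s.entries) - s.cursors[group]
}

func main() {
	s := NewStream("alerts", "dashboards")
	for i := 0; i < 5; i++ {
		s.Add(fmt.Sprintf("signal-%d", i))
	}
	s.ReadGroup("alerts", 100)   // the alert evaluator keeps up
	s.ReadGroup("dashboards", 1) // the dashboard falls behind
	fmt.Println("alerts lag:", s.Lag("alerts"))
	fmt.Println("dashboards lag:", s.Lag("dashboards"))
}
```

The dashboard group accumulates lag while the alert group stays current, because neither read path touches the other's cursor and the writer never waits on either.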

03

Kafka Durable Event Log

All events are durably committed to Kafka before they are acknowledged. Downstream consumers (the data lake writer, ML feature pipeline, and audit log) replay from Kafka independently, with no coupling to the real-time path. Retention is configured at 7 days.
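
The commit-before-acknowledge ordering can be sketched as follows. DurableLog, Ingest, and memLog are hypothetical names used for illustration; the write-up does not say which Kafka client or producer settings are used. With Kafka, Append would typically wrap a producer call that returns only after the broker confirms the write (for example, with acks=all).

```go
package main

import (
	"errors"
	"fmt"
)

// DurableLog is a hypothetical abstraction over the Kafka producer.
// Append must return nil only once the write is confirmed as durable.
type DurableLog interface {
	Append(key string, payload []byte) error
}

// Ingest commits the event first and calls ack only on success.
// If the commit fails, the sender is never acknowledged and can retry.
func Ingest(log DurableLog, key string, payload []byte, ack func()) error {
	if err := log.Append(key, payload); err != nil {
		return fmt.Errorf("event %q not committed, ack withheld: %w", key, err)
	}
	ack()
	return nil
}

// memLog is an in-memory stand-in for the broker that can be told to fail.
type memLog struct {
	records [][]byte
	fail    bool
}

func (m *memLog) Append(_ string, payload []byte) error {
	if m.fail {
		return errors.New("broker unavailable")
	}
	m.records = append(m.records, payload)
	return nil
}

func main() {
	acks := 0
	ack := func() { acks++ }

	healthy := &memLog{}
	fmt.Println(Ingest(healthy, "sensor-1", []byte("230.1"), ack), "acks:", acks)

	down := &memLog{fail: true}
	fmt.Println(Ingest(down, "sensor-2", []byte("229.8"), ack), "acks:", acks)
}
```

Acknowledging only after the commit means a sender may see a retry or a duplicate, but never an acknowledged event that is missing from the log.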

04

Grafana Operations Stack

Built a Grafana-based operations stack with real-time signal dashboards, alert history, on-call routing via PagerDuty, and SLA tracking — all backed by metrics exported from the Go service via Prometheus.

Technology
Ingest
  • Go 1.22
  • Goroutines
  • Lock-free ring buffer
  • AWS EKS
Real-Time
  • Redis 7 Streams
  • Consumer groups
  • AWS ElastiCache
Durable Log
  • Apache Kafka 3.x
  • AWS MSK
  • 7-day retention
  • S3 + Parquet
Observability
  • Grafana
  • Prometheus
  • PagerDuty
  • TimescaleDB
