← Writing

Distributed Tracing Without the Pain: A Practical OpenTelemetry Guide

Why Tracing Matters

When a request touches 8 services and takes 3 seconds, logs alone won't tell you where 2 of those seconds went. A trace is a tree of operations, where each leaf shows duration and metadata.

Quick Wins

Install OpenTelemetry

npm install @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-trace-node @opentelemetry/exporter-trace-otlp-http @opentelemetry/auto

Bootstrap in index.ts

import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

const sdk = new NodeSDK({
  exporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces'
  })
});

sdk.start();
process.on('SIGTERM', () => sdk.shutdown());

Instrument HTTP

import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';

registerInstrumentations({
  instrumentations: [new HttpInstrumentation()]
});

Now every HTTP request and response is automatically traced.

Manual Spans

For custom logic:

import { trace } from '@opentelemetry/api';

const tracer = trace.getTracer('my-service');

const span = tracer.startSpan('process-payment');
try {
  // your code
  span.setStatus({ code: SpanStatusCode.OK });
} catch (err) {
  span.recordException(err);
} finally {
  span.end();
}

Correlation IDs

Every trace has a trace_id. To correlate logs with traces:

const { trace } = require('@opentelemetry/api');
const context = trace.getActiveSpan()?.spanContext();
const traceId = context?.traceId;

logger.info({ msg: 'payment processed', trace_id: traceId });

Storage

Use an open-source collector (Jaeger, Tempo) or commercial (Honeycomb, DataDog).

For local dev, Jaeger in Docker:

docker run -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one
# Visit http://localhost:16686