
Traces & Spans

What is a Trace?

A trace represents a complete unit of work — like handling a user request, processing a document, or running an agent workflow. Think of it as a timeline:
User asks: "What's the weather in Paris?"

Trace Timeline:
├─ Agent receives query (0ms)
├─ LLM plans next action (200ms)
├─ Tool: weather_api called (400ms)
├─ Tool returns result (800ms)
├─ LLM generates response (1000ms)
└─ Response sent to user (1200ms)
Each step in this timeline is a span. Together, all spans form the complete trace.
| Property | Description |
| --- | --- |
| `traceId` | Unique identifier (UUID) |
| `name` | Human-readable name (optional) |
| `startTime` / `endTime` | When the trace began and completed |
| `duration` | Total time in milliseconds |
| `status` | `completed`, `error`, or `running` |
| `spans` | Array of child spans |
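As a sketch of how these properties fit together, a trace for the weather example above might serialize to a payload like this (field names follow the table; the values themselves are illustrative, not output from a real SDK):

```python
import uuid

# Illustrative trace payload using the properties from the table above.
# The values mirror the weather example; real SDKs generate these for you.
trace = {
    "traceId": str(uuid.uuid4()),    # unique identifier (UUID)
    "name": "weather-query",         # optional human-readable name
    "startTime": "2024-01-01T12:00:00.000Z",
    "endTime": "2024-01-01T12:00:01.200Z",
    "duration": 1200,                # total time in milliseconds
    "status": "completed",           # completed | error | running
    "spans": [],                     # child spans are collected here
}
```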

What is a Span?

A span represents a single operation within a trace — like an LLM call, tool execution, or retrieval step.
| Property | Description |
| --- | --- |
| `spanId` | Unique identifier |
| `traceId` | Parent trace identifier |
| `parentSpanId` | Parent span (for nesting) |
| `spanKind` | Type of operation |
| `name` | Operation name (e.g., model name, tool name) |
| `startTime` / `endTime` | When operation began and completed |
| `durationMs` | Duration in milliseconds |
| `input` / `output` | Input and output data |
| `status` | `completed` or `error` |
| `tokens` | Token usage (for LLM spans) |
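Continuing the sketch, an individual LLM span carrying these properties might look like the following. The `parentSpanId` link is what nests it under a root agent span (all values illustrative):

```python
import uuid

trace_id = str(uuid.uuid4())
root_span_id = str(uuid.uuid4())   # id of a parent (e.g. agent) span

# Illustrative LLM span using the properties from the table above.
llm_span = {
    "spanId": str(uuid.uuid4()),
    "traceId": trace_id,             # every span points back to its trace
    "parentSpanId": root_span_id,    # nesting: this span is a child of the root
    "spanKind": "llm",               # type of operation
    "name": "gpt-4o",                # e.g. model name for LLM spans
    "startTime": "2024-01-01T12:00:00.200Z",
    "endTime": "2024-01-01T12:00:00.650Z",
    "durationMs": 450,
    "input": {"messages": [{"role": "user", "content": "What's the weather in Paris?"}]},
    "output": {"content": "It is currently 18°C and sunny in Paris."},
    "status": "completed",           # completed | error
    "tokens": {"prompt": 42, "completion": 13},  # usage for LLM spans
}
```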

Span Types

| Type | Description | Convenience Method |
| --- | --- | --- |
| `llm` | LLM API calls | `ctx.llmCall(model, fn)` |
| `tool` | Tool/function executions | `ctx.executeTools(response, toolMap)` or `ctx.tool(name, fn)` |
| `retriever` | RAG retrieval operations | `ctx.retriever(name, fn)` |
| `embedding` | Embedding generation | `ctx.embedding(model, fn)` |
| `agent` | High-level agent orchestration | `ctx.startSpan(SpanKind.AGENT, ...)` |
| `chain` | Pipeline/chain steps | `ctx.startSpan(SpanKind.CHAIN, ...)` |
| `custom` | Any other operation | `ctx.wrapInSpan(SpanKind.CUSTOM, ...)` |
See the JavaScript SDK or Python SDK for code examples of each span type.

Trace Hierarchy

Traces contain spans organized in a tree structure:
Trace: customer-support-request

├── Span: agent (root)
│   │
│   ├── Span: llm (classify-intent)
│   │   └── Model: gpt-4o, Duration: 450ms
│   │
│   ├── Span: retriever (search-knowledge-base)
│   │   └── Documents: 5 retrieved
│   │
│   └── Span: llm (generate-response)
│       └── Model: gpt-4o, Duration: 800ms

└── Total Duration: 1450ms
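Because each span carries a `parentSpanId`, this tree can be rebuilt from a flat list of spans. A minimal sketch, with span fields reduced to what the grouping needs:

```python
from collections import defaultdict

def build_tree(spans):
    """Group a flat span list into a parent-id -> children map.

    Root spans (no parentSpanId) end up under the None key.
    """
    children = defaultdict(list)
    for span in spans:
        children[span.get("parentSpanId")].append(span)
    return children

# The customer-support-request trace above, flattened.
spans = [
    {"spanId": "root", "parentSpanId": None,   "spanKind": "agent",     "name": "agent"},
    {"spanId": "s1",   "parentSpanId": "root", "spanKind": "llm",       "name": "classify-intent"},
    {"spanId": "s2",   "parentSpanId": "root", "spanKind": "retriever", "name": "search-knowledge-base"},
    {"spanId": "s3",   "parentSpanId": "root", "spanKind": "llm",       "name": "generate-response"},
]

tree = build_tree(spans)
# tree[None] holds the root agent span; tree["root"] holds its three children
```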

Best Practices

- Create a new trace for each distinct user interaction. Don’t reuse trace IDs across requests.
- Use names that describe the workflow: `process-customer-query`, `generate-report`, `analyze-document`.
- Choose the span type that best represents the operation. This enables better filtering and analytics in the dashboard.
- Record token usage on LLM spans: token counts are essential for cost tracking and optimization.
- Always call `span.end()`, even on errors. Use try/finally or convenience methods like `ctx.llmCall()` for automatic handling.
- Keep span depth reasonable (typically 3-5 levels at most). Deep nesting makes traces hard to read.
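For the practice of always ending spans, the manual try/finally pattern looks roughly like this. The `Span` class here is a minimal stand-in for whatever span object your SDK returns, not the real API:

```python
import time

class Span:
    """Minimal stand-in for an SDK span object."""
    def __init__(self, name):
        self.name = name
        self._start = time.monotonic()
        self.status = "running"
        self.duration_ms = None

    def end(self, status="completed"):
        self.duration_ms = (time.monotonic() - self._start) * 1000
        self.status = status

def flaky_tool():
    raise RuntimeError("weather_api timed out")

span = Span("tool:weather_api")
status = "completed"
try:
    flaky_tool()
except RuntimeError:
    status = "error"
finally:
    span.end(status=status)   # the span is closed whether or not the call failed
```

Convenience wrappers like `ctx.llmCall()` exist precisely so you don't have to write this boilerplate yourself.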

Next Steps