What is an agent, actually?
An agent is a program that uses a language model in a loop, where the model is allowed to call tools that change the environment, and where the next prompt depends on what the previous tool calls returned.
That's it. The model is the controller; the loop is the runtime; the tools are the actuators. Everything else — planners, critics, supervisors, vector stores, memory modules — is decoration on top of those three things.
A workflow has a flowchart you could draw before runtime. An agent has a flowchart that only exists after the run.
That distinction matters. If you can draw the flowchart in advance — "summarize, then classify, then translate" — write a workflow. You'll get faster, cheaper, more reliable software. Reach for an agent only when the path through the problem is genuinely unknown until the model sees the inputs.
The full loop, written out.
Below is a complete agent loop in Go, with no framework, using only net/http-style provider calls. Read it once, top to bottom; the rest of this site is footnotes on this one listing.
```go
package agent

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
)

type Agent struct {
	Model        LLM      // any provider, behind one tiny interface
	Tools        Registry // name -> typed handler
	SystemPrompt string
	MaxTurns     int
}

func (a *Agent) Run(ctx context.Context, goal string) (string, error) {
	history := []Message{
		{Role: "system", Content: a.SystemPrompt},
		{Role: "user", Content: goal},
	}
	for turn := 0; turn < a.MaxTurns; turn++ {
		slog.Info("agent turn", "n", turn, "msgs", len(history))
		resp, err := a.Model.Complete(ctx, CompletionInput{
			Messages: history,
			Tools:    a.Tools.Schemas(),
		})
		if err != nil {
			return "", fmt.Errorf("complete: %w", err)
		}
		// Branch 1: model returned a final answer.
		if len(resp.ToolCalls) == 0 {
			return resp.Text, nil
		}
		// Branch 2: model wants to call one or more tools — execute & observe.
		// Keep any text the model emitted alongside its tool calls.
		history = append(history, Message{
			Role: "assistant", Content: resp.Text, ToolCalls: resp.ToolCalls,
		})
		for _, call := range resp.ToolCalls {
			out, terr := a.Tools.Dispatch(ctx, call)
			history = append(history, Message{
				Role:       "tool",
				ToolCallID: call.ID,
				Content:    renderResult(out, terr),
			})
		}
	}
	return "", errors.New("agent: turn budget exceeded")
}
```
A response with no tool calls is a final answer.
An assistant turn may carry text alongside its tool calls, but every major provider (Anthropic, OpenAI, Gemini) signals completion the same way: no tool-use blocks in the response. The branch on len(resp.ToolCalls) == 0 is the loop's exit condition.
The history grows linearly with turns and tool calls.
This is where context-window pressure shows up. By turn 8, a chatty agent will have stuffed several thousand tokens of tool output into history. Compaction strategies are §5.
Tool errors are not Go errors to bubble.
If a tool fails, render the failure as a string and feed it back to the model. The model will often retry or pick a different tool. Bubbling kills the loop and wastes the entire conversation.
The turn budget is a safety belt, not a target.
Most successful runs finish in 3–6 turns. A budget of 12 is generous; a budget of 50 means the agent is thrashing. Treat hitting the budget as a failure, not a "took longer than expected."
The message protocol every provider speaks.
Strip the wrappers and every chat completion API takes the same input — a list of typed messages — and returns the same output: a message with either text content, tool-use blocks, or both. The role names differ slightly across providers; the shape doesn't.
The four roles
- system — the long-lived instructions and persona.
- user — the human turn (or the upstream service's turn).
- assistant — the model's turn. Carries text or tool calls.
- tool — a tool result. Always references the tool call ID.
Two assistant content kinds
- Text — a final answer (or a partial answer mid-stream).
- Tool use — a structured request to invoke a named tool with arguments.
A single assistant turn can carry both: a paragraph of reasoning followed by a tool call. Anthropic additionally emits an extended thinking block; it is opaque to your code, so treat it as pass-through.
```go
import "encoding/json"

// A single Go type that round-trips through every provider.
type Message struct {
	Role       string     `json:"role"` // system | user | assistant | tool
	Content    string     `json:"content,omitempty"`
	ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"` // on role=tool
	Name       string     `json:"name,omitempty"`         // on role=tool
}

type ToolCall struct {
	ID        string          `json:"id"`
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"`
}
```
Tools are typed Go functions, registered by name.
A tool registry is just a map from name to a handler. The handler accepts a typed argument struct and returns either a result you can JSON-encode or an error. Schemas are derived from the struct via reflection; no hand-written JSON schema, ever.
```go
type Tool[A any] struct {
	Name        string
	Description string
	Run         func(ctx context.Context, args A) (any, error)
}

// handler is the type-erased form the Registry stores and dispatches.
type handler struct {
	name, description string
	schema            json.RawMessage
	invoke            func(ctx context.Context, raw json.RawMessage) (any, error)
}

type Registry struct{ tools map[string]handler }

func Register[A any](r *Registry, t Tool[A]) {
	if r.tools == nil {
		r.tools = make(map[string]handler)
	}
	r.tools[t.Name] = handler{
		name:        t.Name,
		description: t.Description,
		schema:      jsonSchemaOf[A](), // derived from the struct via reflection
		invoke: func(ctx context.Context, raw json.RawMessage) (any, error) {
			var a A
			if err := json.Unmarshal(raw, &a); err != nil {
				return nil, fmt.Errorf("args for %s: %w", t.Name, err)
			}
			return t.Run(ctx, a)
		},
	}
}
```
Three rules for tool design.
- One responsibility per tool. A manage_users tool with a verb argument is a worse interface for the model than four small tools — create_user, delete_user, etc.
- Idempotent where possible. Models retry. A send_email tool that doesn't dedupe by request ID will eventually send three of the same email.
- Errors are observations. Return the error message as part of the tool result, not as a Go panic. The model will often correct on the next turn.
"Memory" is just a context-window strategy.
Agents have no built-in memory. What people call "agent memory" is one of three concrete strategies for managing the message history before it overflows the context window.
Sliding window
Keep the last N messages, drop the rest. Cheap, dumb, often enough. The right starting point.
Summarization
When history exceeds a threshold, replace older turns with a model-generated summary. Costs one extra LLM call per compaction.
Retrieval
Persist past turns to a vector store; on each turn, retrieve the K most relevant. The most powerful and most overrated approach.
For most production agents, sliding-window memory plus a single periodic summarization is enough. Vector retrieval is rarely the right answer for ephemeral conversation memory — it shines for permanent knowledge (docs, tickets, code), not for "what did the user say five turns ago."
```go
// Bounded sliding window — preserves the system prompt and the last N messages.
func TrimHistory(msgs []Message, keep int) []Message {
	if len(msgs) <= keep+1 {
		return msgs
	}
	out := make([]Message, 0, keep+1)
	out = append(out, msgs[0]) // always keep the system prompt
	out = append(out, msgs[len(msgs)-keep:]...)
	return out
}
```
An agent stops when one of five things happens.
| Stop reason | Source | What you should do |
|---|---|---|
| Final answer | Model returns a turn with no tool calls | Happy path — return the text. |
| Turn budget | Loop counter exceeds MaxTurns | Surface as failure. Investigate why; don't just bump the budget. |
| Token budget | Cumulative input+output tokens exceed limit | Stop and log; the agent is in a context spiral. |
| Cost cap | Estimated $ for this run exceeds policy | Hard stop. The cap exists precisely for runaway loops. |
| Cancellation | ctx.Done() fires | Abort cleanly, persist state if the run is resumable. |
Two error classes, two retry strategies.
Agents face two fundamentally different classes of error, and each needs its own handling strategy.
Transport errors
5xx from the provider, network blips, rate limits. Retry the same request with exponential backoff. The model never sees these.
```go
// Retry transport failures with exponential backoff plus jitter.
// Assumes Complete returns (*CompletionOutput, error); adjust to your interface.
func completeWithRetry(ctx context.Context, llm LLM, in CompletionInput) (*CompletionOutput, error) {
	backoff := 500 * time.Millisecond
	for attempt := 0; attempt < 5; attempt++ {
		resp, err := llm.Complete(ctx, in)
		if err == nil {
			return resp, nil
		}
		if !isRetryable(err) {
			return nil, err
		}
		time.Sleep(backoff + jitter())
		backoff *= 2
	}
	return nil, errors.New("llm: retries exhausted")
}
```
Tool errors
A tool returned an error — the API was down, the SQL was malformed, the file didn't exist. Render as a tool result and feed it back to the model. Do not retry from your code. Let the model decide.
```go
history = append(history, Message{
	Role:       "tool",
	ToolCallID: call.ID,
	Content:    fmt.Sprintf("ERROR: %s", err.Error()),
})
// Loop continues — the model sees the error and adapts.
```
Streaming is two things at once: UX, and an event channel.
A streaming agent doesn't just emit tokens for the UI. It emits structured events: turn_started, tool_call, tool_result, text_delta, done. In Go, the right shape for this is a buffered channel of typed events that the agent loop pushes into and the HTTP handler drains.
```go
type Event interface{ eventKind() string }

type TextDelta struct{ Text string }
type ToolCallEvt struct {
	Name string
	Args json.RawMessage
}
type ToolResultEvt struct {
	Name   string
	Output any
	Err    error
}
type Done struct{ Reason string }

func (TextDelta) eventKind() string     { return "text_delta" }
func (ToolCallEvt) eventKind() string   { return "tool_call" }
func (ToolResultEvt) eventKind() string { return "tool_result" }
func (Done) eventKind() string          { return "done" }

// Pump events to an SSE writer. The same channel feeds the UI and the trace exporter.
func (a *Agent) Stream(ctx context.Context, goal string) <-chan Event {
	out := make(chan Event, 32)
	go func() {
		defer close(out)
		// ...the loop, pushing events instead of returning a string.
	}()
	return out
}
```
A single-loop agent, in eight pieces.
A model, a tool registry, a message protocol, a loop, a memory strategy, stop conditions, retry policies, and a stream of events. Build these eight things and you have an agent. Skip the framework — you'll learn more in 200 lines than in any tutorial.