Runner & Sessions

Runner is the main entry point for session-managed execution. It wires an agent, a session service, and the invocation context together.

from orxhestra import Runner, InMemorySessionService
# EventType (used below) must be imported as well; its module path is not shown in this section.

runner = Runner(
    agent=agent,
    app_name="my-app",
    session_service=InMemorySessionService(),
)

# Streaming is always on - partial events arrive as text is generated
async for event in runner.astream(
    user_id="user-1",
    session_id="session-abc",
    new_message="Hello!",
):
    if event.type == EventType.AGENT_MESSAGE and event.partial:
        print(event.text, end="", flush=True)
    elif event.is_final_response():
        print(f"\n{event.text}")

Runner automatically:

Fetches or creates the session
Persists the user’s message as a USER_MESSAGE event
Builds a Context with the session reference
Persists every agent event to the session via append_event()
Applies EventActions.state_delta to the session state
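
The five steps above can be pictured with a small stand-alone model. This is a sketch only: ToyRunner, ToySession, ToyEvent, and echo_agent are hypothetical names invented for illustration, and the code mirrors the listed behaviour rather than the actual orxhestra implementation.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyEvent:
    type: str                      # "USER_MESSAGE" or "AGENT_MESSAGE"
    text: str
    state_delta: dict = field(default_factory=dict)


@dataclass
class ToySession:
    id: str
    events: list = field(default_factory=list)
    state: dict = field(default_factory=dict)


class ToyRunner:
    """Hypothetical stand-in for Runner; not the orxhestra implementation."""

    def __init__(self, agent):
        self.agent = agent
        self.sessions = {}

    async def astream(self, session_id, new_message):
        # 1. Fetch or create the session.
        session = self.sessions.setdefault(session_id, ToySession(id=session_id))
        # 2. Persist the user's message.
        session.events.append(ToyEvent("USER_MESSAGE", new_message))
        # 3. Build a context that carries the session reference.
        ctx = {"session": session}
        async for event in self.agent(ctx):
            # 4. Persist every agent event.
            session.events.append(event)
            # 5. Apply the event's state delta to the session state.
            session.state.update(event.state_delta)
            yield event


async def echo_agent(ctx):
    # Toy agent: echoes the last stored message and records a turn counter.
    session = ctx["session"]
    last = session.events[-1].text
    yield ToyEvent("AGENT_MESSAGE", f"echo: {last}",
                   state_delta={"turns": len(session.events)})


async def run_demo():
    runner = ToyRunner(echo_agent)
    replies = [e.text async for e in runner.astream("s1", "Hello!")]
    return replies, runner.sessions["s1"]
```

Calling asyncio.run(run_demo()) returns the streamed replies together with the session, which by then holds both the user message and the agent event.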

Multi-turn conversations work automatically: LlmAgent rebuilds the LangChain message history from session.events on each turn, so the LLM sees the full conversation context.
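
To make the idea of rebuilding history concrete, here is a minimal sketch that maps stored events to chat messages. It uses plain dictionaries instead of LangChain message classes so that it runs without dependencies; ToyEvent and build_history are hypothetical names, and the real conversion inside LlmAgent may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyEvent:
    type: str   # "USER_MESSAGE" or "AGENT_MESSAGE"
    text: str


def build_history(events):
    """Map stored events to role/content messages in their original order."""
    role_for = {"USER_MESSAGE": "user", "AGENT_MESSAGE": "assistant"}
    return [
        {"role": role_for[e.type], "content": e.text}
        for e in events
        if e.type in role_for  # ignore event types this sketch does not model
    ]


events = [
    ToyEvent("USER_MESSAGE", "Hello!"),
    ToyEvent("AGENT_MESSAGE", "Hi, how can I help?"),
    ToyEvent("USER_MESSAGE", "What did I just say?"),
]
history = build_history(events)
```

Because the history is derived from the session on every turn, nothing has to be carried over in memory between calls.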

Using sessions directly

from orxhestra import InMemorySessionService

svc = InMemorySessionService()
session = await svc.create_session(app_name="demo", user_id="user-1")

# All sessions for a user
sessions = await svc.list_sessions(app_name="demo", user_id="user-1")

# Delete
await svc.delete_session(session.id)

Implement BaseSessionService to back sessions with any database. See Architecture for an example.

Database-backed sessions

For production persistence, use DatabaseSessionService (requires pip install orxhestra[database]):

from orxhestra.sessions import DatabaseSessionService

svc = DatabaseSessionService("sqlite+aiosqlite:///sessions.db")
await svc.initialize()

session = await svc.create_session(app_name="demo", user_id="user-1")

Supports any SQLAlchemy async backend (SQLite via aiosqlite, PostgreSQL via asyncpg, etc.).

Session Compaction

Long conversations accumulate events that eventually exceed the LLM’s context window. The Runner supports automatic compaction — summarizing old events into a single condensed event while keeping recent events intact.

from orxhestra import Runner, InMemorySessionService
from orxhestra.sessions.compaction import CompactionConfig

runner = Runner(
    agent=agent,
    app_name="my-app",
    session_service=InMemorySessionService(),
    compaction_config=CompactionConfig(
        char_threshold=100_000,  # compact when content exceeds ~25k tokens
        retention_chars=20_000,  # keep ~5k tokens of recent events raw
        model=model,             # optional: LLM for summarization
    ),
)

After each invocation, the Runner estimates the total character count of non-compacted events. If it exceeds char_threshold:

The most recent events totalling retention_chars characters are kept as-is
Older events are summarized into a single compaction event
The compaction event is appended to the session — originals are preserved
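
The threshold check and the split between retained and summarized events can be sketched as follows. split_for_compaction is a hypothetical helper that works on plain strings. How the library treats an event that straddles the retention budget is not stated here, so this sketch assumes that such an event is summarized rather than kept.

```python
def split_for_compaction(texts, char_threshold, retention_chars):
    """Return (to_summarize, to_keep); to_summarize is empty below the threshold."""
    total = sum(len(t) for t in texts)
    if total <= char_threshold:
        return [], list(texts)

    kept_chars = 0
    split = len(texts)
    # Walk backwards from the newest event, keeping events while the budget allows.
    for i in range(len(texts) - 1, -1, -1):
        if kept_chars + len(texts[i]) > retention_chars:
            break  # assumption: an event that would overflow the budget is summarized
        kept_chars += len(texts[i])
        split = i
    return list(texts[:split]), list(texts[split:])


texts = ["a" * 50, "b" * 30, "c" * 10, "d" * 10]
old, recent = split_for_compaction(texts, char_threshold=60, retention_chars=25)
```

With these inputs the total of 100 characters exceeds the threshold of 60, the two newest events (20 characters together) fit into the retention budget of 25, and the two older events are handed to the summarizer.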

Compaction is non-destructive. Raw events are never deleted. Instead, LlmAgent applies apply_compaction() at the view layer to swap compacted ranges for their summaries when building LLM context. If model is provided, the summary is generated via an LLM call. Otherwise, a simple text extraction fallback is used.

Safety

Compaction never runs mid-stream — only after all agent events have been yielded
Events with unresolved tool calls are never compacted
Re-compaction is guarded by a timestamp boundary — previously compacted events are not re-summarized
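
The last two guards can be illustrated with a small filter. This is a sketch under assumptions: ToyEvent, its tool_call_ids and tool_response_ids fields, and the compactable function are hypothetical and only model the rules listed above, not the library's internal data structures.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEvent:
    timestamp: float
    text: str
    tool_call_ids: list = field(default_factory=list)      # calls issued by this event
    tool_response_ids: list = field(default_factory=list)  # calls answered by this event


def compactable(events, last_compacted_until):
    """Events eligible for summarization under the two guards described above."""
    answered = {i for e in events for i in e.tool_response_ids}
    eligible = []
    for e in events:
        if e.timestamp <= last_compacted_until:
            continue  # already covered by an earlier compaction
        if any(i not in answered for i in e.tool_call_ids):
            continue  # a tool call is still waiting for its response
        eligible.append(e)
    return eligible


events = [
    ToyEvent(1.0, "already summarized"),
    ToyEvent(2.0, "calling search", tool_call_ids=["a"]),
    ToyEvent(3.0, "search result", tool_response_ids=["a"]),
    ToyEvent(4.0, "calling fetch", tool_call_ids=["b"]),  # no response yet
]
eligible = compactable(events, last_compacted_until=1.0)
```

Here only the events at timestamps 2.0 and 3.0 qualify: the first event lies behind the compaction boundary and the last one has an unanswered tool call.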

Compaction events

Compaction events are regular AGENT_MESSAGE events with an EventActions.compaction field:

event.actions.compaction.summary        # the summary text
event.actions.compaction.event_count    # how many events were compacted
event.actions.compaction.start_timestamp
event.actions.compaction.end_timestamp

LlmAgent automatically detects compaction events and includes the summary in the LLM context via apply_compaction() from orxhestra.events.filters. This filter replaces raw events in the compacted range with the summary, while preserving all events after the compaction boundary.
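
The view-layer substitution can be sketched as below. The field names summary, start_timestamp, and end_timestamp come from the listing above; everything else (ToyEvent, Compaction, apply_compaction_sketch, and placing the summary at the start of its range) is an assumption made for illustration and is not the actual apply_compaction() code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Compaction:
    summary: str
    start_timestamp: float
    end_timestamp: float


@dataclass
class ToyEvent:
    timestamp: float
    text: str
    compaction: Optional[Compaction] = None


def apply_compaction_sketch(events):
    """Return the texts an LLM would see: summaries instead of compacted ranges."""
    ranges = [e.compaction for e in events if e.compaction is not None]
    items = []
    for e in events:
        if e.compaction is not None:
            # Assumption: the summary takes the position of the range it replaces.
            items.append((e.compaction.start_timestamp, e.compaction.summary))
        elif not any(c.start_timestamp <= e.timestamp <= c.end_timestamp for c in ranges):
            items.append((e.timestamp, e.text))  # outside every compacted range
    return [text for _, text in sorted(items, key=lambda item: item[0])]


events = [
    ToyEvent(1.0, "hi"),
    ToyEvent(2.0, "hello"),
    ToyEvent(3.0, "recent question"),
    ToyEvent(4.0, "", compaction=Compaction("summary of 2 events", 1.0, 2.0)),
]
view = apply_compaction_sketch(events)
```

The stored list still contains all four events; only the derived view is shortened, which is what makes the approach non-destructive.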

Composer YAML

Enable compaction in the runner section:

runner:
  app_name: my-app
  session_service: memory
  compaction:
    char_threshold: 100000   # ~25k tokens
    retention_chars: 20000   # ~5k tokens

The Composer uses the default model for summarization when available.
