Runner is the main entry point for session-managed execution. It wires an agent, a session service, and the invocation context together.
from orxhestra import Runner, InMemorySessionService

runner = Runner(
    agent=agent,
    app_name="my-app",
    session_service=InMemorySessionService(),
)

# Streaming is always on: partial events arrive as text is generated.
# EventType is assumed to be imported already; its import path is not shown here.
async for event in runner.astream(
    user_id="user-1",
    session_id="session-abc",
    new_message="Hello!",
):
    if event.type == EventType.AGENT_MESSAGE and event.partial:
        print(event.text, end="", flush=True)
    elif event.is_final_response():
        print(f"\n{event.text}")
Runner automatically:
  1. Fetches or creates the session
  2. Persists the user’s message as a USER_MESSAGE event
  3. Builds a Context with the session reference
  4. Persists every agent event to the session via append_event()
  5. Applies EventActions.state_delta to the session state
Multi-turn conversations need no extra wiring: LlmAgent rebuilds the LangChain message history from session.events on each turn, so the LLM sees the full conversation.
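The five steps above can be pictured with a small stand-alone model. This is only a sketch: ToyEvent, ToySession, ToyRunner, and echo_agent are hypothetical names invented for illustration, not orxhestra classes, and the real Runner is asynchronous and streams events rather than returning a list.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyEvent:
    """Hypothetical stand-in for an event (not the orxhestra class)."""
    type: str  # "USER_MESSAGE" or "AGENT_MESSAGE"
    text: str
    state_delta: dict[str, Any] = field(default_factory=dict)


@dataclass
class ToySession:
    """Hypothetical stand-in for a session: an event log plus a state dict."""
    id: str
    events: list[ToyEvent] = field(default_factory=list)
    state: dict[str, Any] = field(default_factory=dict)


class ToyRunner:
    """Hypothetical, synchronous model of the per-turn bookkeeping."""

    def __init__(self, agent):
        self.agent = agent
        self.sessions: dict[str, ToySession] = {}

    def run_turn(self, session_id: str, new_message: str) -> list[ToyEvent]:
        # 1. Fetch or create the session.
        session = self.sessions.setdefault(session_id, ToySession(id=session_id))
        # 2. Persist the user's message as a USER_MESSAGE event.
        session.events.append(ToyEvent("USER_MESSAGE", new_message))
        # 3. Build the context; here the session itself plays that role.
        produced = []
        for event in self.agent(session):
            # 4. Persist every agent event to the session.
            session.events.append(event)
            # 5. Apply the event's state delta to the session state.
            session.state.update(event.state_delta)
            produced.append(event)
        return produced


def echo_agent(session: ToySession):
    """Toy agent: echoes the latest message and counts user turns so far."""
    turns = sum(1 for e in session.events if e.type == "USER_MESSAGE")
    latest = session.events[-1].text
    yield ToyEvent("AGENT_MESSAGE", f"turn {turns}: {latest}", {"turns": turns})
```

Because the agent reads session.events on every turn, a second call with the same session_id sees the first exchange, which is the property that makes multi-turn conversations work.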

Using sessions directly

from orxhestra import InMemorySessionService

svc = InMemorySessionService()
session = await svc.create_session(app_name="demo", user_id="user-1")

# All sessions for a user
sessions = await svc.list_sessions(app_name="demo", user_id="user-1")

# Delete
await svc.delete_session(session.id)
Implement BaseSessionService to back sessions with any database. See Architecture for an example.
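As a rough picture of what a database-backed store involves, the sketch below keeps session records in SQLite using only the standard library. It is an assumption-laden illustration: SqliteSessionStore is a hypothetical class that does not subclass BaseSessionService (whose abstract methods are not listed on this page), it returns plain session IDs instead of session objects, and it makes blocking sqlite3 calls inside async methods, which a real implementation would replace with an async driver.

```python
import asyncio
import sqlite3
import uuid


class SqliteSessionStore:
    """Hypothetical store for illustration; NOT a BaseSessionService subclass."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "id TEXT PRIMARY KEY, app_name TEXT NOT NULL, user_id TEXT NOT NULL)"
        )

    async def create_session(self, app_name: str, user_id: str) -> str:
        session_id = uuid.uuid4().hex
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT INTO sessions VALUES (?, ?, ?)",
                (session_id, app_name, user_id),
            )
        return session_id

    async def list_sessions(self, app_name: str, user_id: str) -> list[str]:
        rows = self._db.execute(
            "SELECT id FROM sessions WHERE app_name = ? AND user_id = ? "
            "ORDER BY rowid",
            (app_name, user_id),
        ).fetchall()
        return [row[0] for row in rows]

    async def delete_session(self, session_id: str) -> None:
        with self._db:
            self._db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
```

The three methods mirror the calls shown above (create, list, delete); event persistence and state updates would need additional tables and methods.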

Database-backed sessions

For production persistence, use DatabaseSessionService (requires pip install orxhestra[database]):
from orxhestra.sessions import DatabaseSessionService

svc = DatabaseSessionService("sqlite+aiosqlite:///sessions.db")
await svc.initialize()

session = await svc.create_session(app_name="demo", user_id="user-1")
DatabaseSessionService supports any SQLAlchemy async backend (SQLite via aiosqlite, PostgreSQL via asyncpg, etc.).

Session compaction

Long conversations accumulate events that eventually exceed the LLM’s context window. The Runner supports automatic compaction: it summarizes old events into a single condensed event while keeping recent events intact.
from orxhestra import Runner, InMemorySessionService
from orxhestra.sessions.compaction import CompactionConfig

runner = Runner(
    agent=agent,
    app_name="my-app",
    session_service=InMemorySessionService(),
    compaction_config=CompactionConfig(
        char_threshold=100_000,  # compact when content exceeds ~25k tokens
        retention_chars=20_000,  # keep ~5k tokens of recent events raw
        model=model,             # optional: LLM for summarization
    ),
)
After each invocation, the Runner estimates the total character count of non-compacted events. If it exceeds char_threshold:
  1. The most recent events totalling retention_chars characters are kept as-is
  2. Older events are summarized into a single compaction event
  3. The compaction event is appended to the session; the original events are preserved
Compaction is non-destructive: raw events are never deleted. Instead, LlmAgent calls apply_compaction() at the view layer to swap compacted ranges for their summaries when building the LLM context. If model is provided, the summary is generated by an LLM call; otherwise, a simple text-extraction fallback is used.
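The threshold check and the retention split described above can be sketched as follows. This is a minimal illustration under stated assumptions: ToyEvent and split_for_compaction are hypothetical names, character counts are taken from the event text only, and an event that would push the kept total past retention_chars is treated as old. The library's exact boundary rule may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyEvent:
    """Hypothetical stand-in for a session event."""
    timestamp: float
    text: str


def split_for_compaction(events, char_threshold, retention_chars):
    """Return (to_summarize, to_keep) for a chronologically ordered event list.

    Nothing is summarized while the total size stays at or below the threshold.
    """
    total = sum(len(e.text) for e in events)
    if total <= char_threshold:
        return [], list(events)

    kept_chars = 0
    split = len(events)  # index of the oldest event that is kept raw
    # Walk from newest to oldest, keeping events while the budget allows.
    for i in range(len(events) - 1, -1, -1):
        size = len(events[i].text)
        if kept_chars + size > retention_chars:
            break
        kept_chars += size
        split = i
    return list(events[:split]), list(events[split:])
```

With five 40-character events, a threshold of 100, and a retention budget of 90, the two newest events are kept raw and the three oldest are handed to the summarizer.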

Safety

  • Compaction never runs mid-stream; it runs only after all agent events have been yielded
  • Events with unresolved tool calls are never compacted
  • Re-compaction is guarded by a timestamp boundary, so previously compacted events are not re-summarized
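
The last two guards amount to an eligibility filter over the event log. The sketch below is one way to express it, with assumptions throughout: ToyEvent, its tool_call_ids and tool_result_ids fields, and compactable are hypothetical, and how orxhestra actually represents tool calls and their results is not described on this page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyEvent:
    """Hypothetical event carrying the IDs of tool calls it requests or answers."""
    timestamp: float
    tool_call_ids: list[str] = field(default_factory=list)
    tool_result_ids: list[str] = field(default_factory=list)


def compactable(events, last_compaction_end: Optional[float]):
    """Return the events that may be summarized in the next compaction."""
    requested = {cid for e in events for cid in e.tool_call_ids}
    answered = {cid for e in events for cid in e.tool_result_ids}
    unresolved = requested - answered

    eligible = []
    for e in events:
        # Timestamp boundary: skip anything a previous compaction already covers.
        if last_compaction_end is not None and e.timestamp <= last_compaction_end:
            continue
        # Skip events whose tool calls have not received a result yet.
        if unresolved & set(e.tool_call_ids):
            continue
        eligible.append(e)
    return eligible
```

Passing None as last_compaction_end models a session that has never been compacted, in which case only the unresolved-tool-call guard applies.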

Compaction events

Compaction events are regular AGENT_MESSAGE events with an EventActions.compaction field:
event.actions.compaction.summary        # the summary text
event.actions.compaction.event_count    # how many events were compacted
event.actions.compaction.start_timestamp
event.actions.compaction.end_timestamp
LlmAgent automatically detects compaction events and includes the summary in the LLM context via apply_compaction() from orxhestra.events.filters. This filter replaces raw events in the compacted range with the summary, while preserving all events after the compaction boundary.
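To make the view-layer idea concrete, here is a stand-alone sketch of a filter that swaps compacted ranges for their summaries without touching the stored events. It is not the library's apply_compaction(): Compaction, ToyEvent, and compacted_view are hypothetical names, the Compaction fields simply mirror the attributes listed above, and placing each summary at the position of the first event it covers is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Compaction:
    """Hypothetical mirror of the compaction fields listed above."""
    summary: str
    event_count: int
    start_timestamp: float
    end_timestamp: float


@dataclass
class ToyEvent:
    """Hypothetical event; compaction is set only on compaction events."""
    timestamp: float
    text: str
    compaction: Optional[Compaction] = None


def compacted_view(events):
    """Build the text sequence an LLM would see; the input list is not modified."""
    ranges = [e.compaction for e in events if e.compaction is not None]
    view = []
    emitted = set()  # indices of ranges whose summary is already in the view
    for e in events:
        if e.compaction is not None:
            continue  # the carrier event is represented by its summary
        hit = next(
            (i for i, c in enumerate(ranges)
             if c.start_timestamp <= e.timestamp <= c.end_timestamp),
            None,
        )
        if hit is None:
            view.append(e.text)  # after the compaction boundary: keep raw
        elif hit not in emitted:
            emitted.add(hit)
            view.append(f"[summary] {ranges[hit].summary}")
    return view
```

Because the function only reads the event list, the raw events stay available for auditing or for a later re-summarization with a different model.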

Composer YAML

Enable compaction in the runner section:
runner:
  app_name: my-app
  session_service: memory
  compaction:
    char_threshold: 100000   # ~25k tokens
    retention_chars: 20000   # ~5k tokens
The Composer uses the default model for summarization when available.