Nusomi has a small core model. Five primitives, in three layers.
The model in one diagram

                   ┌──────────────┐
     real work ──> │   Session    │ <── you create one per workflow run
                   └──────┬───────┘
                          │
          ┌───────────────┼────────────────┐
          ▼               ▼                ▼
     ┌─────────┐   ┌─────────────┐  ┌──────────────┐
     │ Frames  │   │   Events    │  │ Memory graph │
     │ (video) │   │  (actions)  │  │ (cross-run)  │
     └────┬────┘   └──────┬──────┘  └──────┬───────┘
          │               │                │
          └────────┬──────┴────────────────┘
                   ▼
     ┌─────────────────────────────────┐
     │  Replay · Recovery · Datasets   │
     └─────────────────────────────────┘

Capture layer
| Primitive | What it is |
|---|---|
| Session | A single workflow run. Owns its frames, events, and metadata. Created when recording starts, sealed when it stops. |
| Frames | The raw video — screen captures at ~30 fps, plus DOM snapshots, browser metadata, and timestamps. The ground truth. |
| Events | Structured actions extracted from the recording: click_button, input_text, navigate, validation_error, retry, success. Each event is anchored to a specific frame. |
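The ownership rules in the capture layer can be sketched as a small data model. This is an illustration only, not the Nusomi SDK: the class names, fields, and methods (`Frame`, `Event`, `Session`, `add_frame`, `add_event`, `seal`) are assumptions made for this example. It shows a session owning its frames and events, each event anchored to a frame, and a sealed session rejecting further capture.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Frame:
    """One captured frame (hypothetical shape; real frames also carry DOM snapshots)."""
    index: int
    timestamp_ms: int


@dataclass(frozen=True)
class Event:
    """A structured action anchored to the frame where it was observed."""
    kind: str          # e.g. "click_button", "input_text", "navigate"
    frame_index: int


@dataclass
class Session:
    """A single workflow run that owns its frames and events."""
    session_id: str
    frames: list = field(default_factory=list)
    events: list = field(default_factory=list)
    sealed: bool = False

    def _require_open(self) -> None:
        if self.sealed:
            raise RuntimeError("session is sealed; no further capture allowed")

    def add_frame(self, timestamp_ms: int) -> Frame:
        self._require_open()
        frame = Frame(index=len(self.frames), timestamp_ms=timestamp_ms)
        self.frames.append(frame)
        return frame

    def add_event(self, kind: str, frame: Frame) -> Event:
        self._require_open()
        # Reject anchors that do not belong to this session.
        if frame.index >= len(self.frames) or self.frames[frame.index] != frame:
            raise ValueError("event must be anchored to a frame in this session")
        event = Event(kind=kind, frame_index=frame.index)
        self.events.append(event)
        return event

    def seal(self) -> None:
        self.sealed = True


session = Session("run-001")
session.add_frame(timestamp_ms=0)
frame = session.add_frame(timestamp_ms=33)   # roughly 30 fps spacing
session.add_event("click_button", frame)
session.seal()
```

After `seal()`, any further `add_frame` or `add_event` call raises, which mirrors "sealed when it stops" in the table above.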
Indexing layer
| Primitive | What it is |
|---|---|
| Memory graph | A queryable graph across every session in your workspace. Nodes are workflow states, edges are transitions, and leaves are outcomes. Use it to find similar runs, prior failures, and recovery points. |
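To make the graph shape concrete, here is a minimal in-memory sketch. It is not how Nusomi stores or queries the memory graph: the `MemoryGraph` class, its methods, and the state names are all hypothetical. Each run is recorded as a path of states ending in an outcome leaf, and two simple queries find prior failures and runs that passed through the same state.

```python
from collections import defaultdict


class MemoryGraph:
    """Toy cross-run graph: states are nodes, transitions are edges, outcomes are leaves."""

    def __init__(self) -> None:
        self.edges = defaultdict(set)   # state -> set of next states or outcomes
        self.runs = {}                  # session_id -> full path, outcome last

    def add_run(self, session_id: str, states: list, outcome: str) -> None:
        path = list(states) + [outcome]
        self.runs[session_id] = path
        # Record every consecutive pair on the path as a transition.
        for src, dst in zip(path, path[1:]):
            self.edges[src].add(dst)

    def runs_ending_in(self, outcome: str) -> list:
        """Sessions whose leaf is the given outcome, e.g. prior failures."""
        return sorted(sid for sid, path in self.runs.items() if path[-1] == outcome)

    def runs_through(self, state: str) -> list:
        """Sessions that visited the given state, a crude notion of 'similar runs'."""
        return sorted(sid for sid, path in self.runs.items() if state in path[:-1])


graph = MemoryGraph()
graph.add_run("run-001", ["login", "form", "submit"], "success")
graph.add_run("run-002", ["login", "form"], "validation_error")

prior_failures = graph.runs_ending_in("validation_error")
similar_runs = graph.runs_through("form")
```

Here `prior_failures` contains only `run-002`, while both runs pass through the `form` state.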
Output layer
| Primitive | What it is |
|---|---|
| Replay | Re-execute a session in one of three modes: deterministic, LLM-guided when the UI has shifted, or partial resume from any frame. |
| Recovery | Resume from the frame just before a workflow broke, with the prior state attached. |
| Datasets | Export frame/action pairs as Parquet, WebDataset, or raw JSONL — for training computer-use models. |
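The recovery rule, "the frame just before a workflow broke", can be sketched as a scan over a session's events. This is a sketch under stated assumptions rather than Nusomi's recovery logic: the `recovery_frame` helper is hypothetical, events are simplified to `(kind, frame_index)` pairs, and treating `validation_error` as the only failure kind is a choice made for this example.

```python
from typing import Optional, Sequence, Tuple

# Assumption for this sketch: only validation_error counts as a break.
FAILURE_KINDS = {"validation_error"}


def recovery_frame(events: Sequence[Tuple[str, int]]) -> Optional[int]:
    """Return the frame index just before the first failure, or None if nothing broke."""
    for kind, frame_index in events:
        if kind in FAILURE_KINDS:
            # Step back one frame, but never before the start of the recording.
            return max(frame_index - 1, 0)
    return None


events = [
    ("navigate", 0),
    ("input_text", 12),
    ("click_button", 30),
    ("validation_error", 31),
]
resume_at = recovery_frame(events)   # 30: the frame before the error appeared
```

A run with no failure events returns `None`, so callers can tell "nothing to recover" apart from "resume at frame 0".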
Mental model
The capture happens once (a Session). What you do with it splits three ways:
- Replay the path the next time the work needs doing.
- Recover from a failure with the prior state attached.
- Train by exporting frames and actions as a dataset.
If you understand sessions, frames, and events, you understand the whole product. The rest is downstream.
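As an illustration of the "Train" path, the sketch below writes frame/action pairs as JSONL using only the Python standard library. The record fields (`frame`, `action`), the frame paths, and the `export_jsonl` helper are assumptions for this example; the actual export schema is not described on this page.

```python
import io
import json


def export_jsonl(frames: dict, events: list, out) -> int:
    """Write one JSON object per event, pairing the action with its anchored frame.

    frames maps a frame index to a frame reference (here, a hypothetical file path).
    events is a list of (kind, frame_index) pairs.
    Returns the number of records written.
    """
    written = 0
    for kind, frame_index in events:
        record = {"frame": frames[frame_index], "action": kind}
        out.write(json.dumps(record) + "\n")
        written += 1
    return written


frames = {0: "frames/000000.png", 30: "frames/000030.png"}
events = [("navigate", 0), ("click_button", 30)]

buffer = io.StringIO()
count = export_jsonl(frames, events, buffer)
lines = buffer.getvalue().splitlines()
```

Each line of the output is an independent JSON object, so the file can be streamed or sharded without parsing it whole.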