Exports turn sealed sessions into Parquet, WebDataset, JSONL, or Arrow datasets for downstream training. See datasets concepts for the data shape.
Create an export
POST /v1/exports
{
"workflow": "process_invoice",
"filter": {
"outcome": "success",
"since": "2026-01-01"
},
"format": "webdataset",
"frame_sampling": { "mode": "event_only", "keep_action_frames": true },
"destination": {
"kind": "s3",
"bucket": "my-training-bucket",
"prefix": "nusomi/process_invoice/v1/",
"region": "us-east-1",
"role_arn": "arn:aws:iam::123456789012:role/NusomiExport"
},
"tag": "process_invoice@v1"
}
Body
| Field | Required | Notes |
|---|---|---|
| workflow | yes | One or more workflow slugs. |
| filter | no | See Filters below. |
| format | yes | One of parquet, webdataset, jsonl, or arrow. |
| frame_sampling | no | See Frame sampling. |
| destination | yes | See Destinations. |
| tag | no | Free-form label for versioning. |
Response 202
{
"id": "exp_01HZ...",
"status": "queued",
"manifest_url": null,
"created_at": "2026-05-07T14:08:11Z"
}
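A minimal sketch of building this request from Python with only the standard library. The base URL `https://api.nusomi.example`, the bearer-token `Authorization` header, and the helper name `build_create_export_request` are assumptions for illustration; this page does not describe the host or authentication. Only the `POST /v1/exports` path and the body fields come from this page.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the real API host.
BASE_URL = "https://api.nusomi.example"


def build_create_export_request(body: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/exports request."""
    # Fields marked "yes" in the Body table.
    for field in ("workflow", "format", "destination"):
        if field not in body:
            raise ValueError(f"missing required field: {field}")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/exports",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption, not documented on this page.
            "Authorization": f"Bearer {token}",
        },
    )


body = {
    "workflow": "process_invoice",
    "filter": {"outcome": "success", "since": "2026-01-01"},
    "format": "webdataset",
    "destination": {
        "kind": "s3",
        "bucket": "my-training-bucket",
        "prefix": "nusomi/process_invoice/v1/",
        "region": "us-east-1",
        "role_arn": "arn:aws:iam::123456789012:role/NusomiExport",
    },
    "tag": "process_invoice@v1",
}
request = build_create_export_request(body, token="YOUR_TOKEN")
```

Sending it would be `json.load(urllib.request.urlopen(request))`, after which the `id` field of the 202 response identifies the export for later status checks.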
Get an export
{
"id": "exp_01HZ...",
"status": "completed",
"rows": 184312,
"shards": 24,
"shard_size_bytes": 268435456,
"frames": 184312,
"actions": 184312,
"manifest_url": "s3://my-training-bucket/.../manifest.json"
}
status progresses from queued to running, then ends in either completed or failed.
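Because an export is created asynchronously, a client typically polls until the status is terminal. The sketch below assumes a caller-supplied `fetch` function that returns the export object for an id; the function name `wait_for_export`, the poll interval, and the poll limit are illustrative choices, not part of the API.

```python
import time
from typing import Callable


def wait_for_export(
    fetch: Callable[[str], dict],
    export_id: str,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the export is completed, raising if it fails or times out."""
    for _ in range(max_polls):
        export = fetch(export_id)
        status = export["status"]
        if status == "completed":
            return export
        if status == "failed":
            raise RuntimeError(f"export {export_id} failed")
        # queued or running: wait and ask again.
        sleep(poll_seconds)
    raise TimeoutError(f"export {export_id} not finished after {max_polls} polls")


# Stand-in fetcher that replays the documented status sequence.
_responses = iter([
    {"id": "exp_demo", "status": "queued"},
    {"id": "exp_demo", "status": "running"},
    {"id": "exp_demo", "status": "completed", "shards": 24},
])
result = wait_for_export(lambda _id: next(_responses), "exp_demo", sleep=lambda s: None)
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.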
List exports
GET /v1/exports?workflow=process_invoice&status=completed
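As a small sketch, the query string can be assembled with `urllib.parse.urlencode` so values are escaped correctly. The helper name `list_exports_path` is hypothetical, and only the `workflow` and `status` parameters shown above are known to be supported.

```python
from typing import Optional
from urllib.parse import urlencode


def list_exports_path(**params: Optional[str]) -> str:
    """Return the path and query string for GET /v1/exports."""
    # Drop unset parameters so they are not sent as empty values.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return "/v1/exports" + (f"?{query}" if query else "")


path = list_exports_path(workflow="process_invoice", status="completed")
# path == "/v1/exports?workflow=process_invoice&status=completed"
```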
Filters
| Field | Notes |
|---|---|
| outcome | success, error, or abandoned. |
| since / until | Time bounds, given as ISO 8601 or relative values. |
| actor.kind | human, model, or script. |
| min_duration_ms / max_duration_ms | Wall-clock duration bounds, in milliseconds. |
| tag | Sessions carrying a specific tag. |
| path | Memory-graph subpath id. |
| exclude_session_ids | Session ids to drop manually. |
Frame sampling
{
"mode": "event_only" | "every_n_ms" | "keyframes_only" | "all",
"interval_ms": 100,
"keep_action_frames": true
}
| Mode | Notes |
|---|---|
| event_only | Default. Frames where an event fires. |
| every_n_ms | Sample at a fixed cadence. |
| keyframes_only | Only frames where the screen changed materially. |
| all | Every captured frame. Costly. |
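To make the modes concrete, here is a toy simulation of one plausible reading of them. Everything in it is an assumption for illustration: the `Frame` fields, the rule that `every_n_ms` keeps a frame once at least `interval_ms` has passed since the last sampled frame, and the rule that `keep_action_frames` adds action frames on top of whatever the mode selects. The actual sampler may behave differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    # Hypothetical frame record; field names are not from the API.
    t_ms: int
    has_event: bool = False
    is_keyframe: bool = False
    is_action: bool = False


def select_frames(frames, mode="event_only", interval_ms=100, keep_action_frames=True):
    """Return the timestamps of the frames kept under the assumed semantics."""
    kept = []
    last_sampled = None
    for frame in sorted(frames, key=lambda f: f.t_ms):
        if mode == "all":
            keep = True
        elif mode == "event_only":
            keep = frame.has_event
        elif mode == "keyframes_only":
            keep = frame.is_keyframe
        elif mode == "every_n_ms":
            keep = last_sampled is None or frame.t_ms - last_sampled >= interval_ms
            if keep:
                last_sampled = frame.t_ms
        else:
            raise ValueError(f"unknown mode: {mode}")
        # Assumed: action frames are kept in addition to the mode's own picks.
        if keep or (keep_action_frames and frame.is_action):
            kept.append(frame.t_ms)
    return kept


frames = [
    Frame(0, has_event=True, is_keyframe=True),
    Frame(40),
    Frame(80, is_action=True),
    Frame(120, has_event=True),
    Frame(160, is_keyframe=True),
    Frame(200),
]
event_only = select_frames(frames, "event_only")            # [0, 80, 120]
fixed_cadence = select_frames(frames, "every_n_ms", 100, False)  # [0, 120]
```

Under these assumptions, `event_only` with `keep_action_frames` retains the action frame at 80 ms even though no event fired there, which is the behaviour the flag's name suggests.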
Destinations
S3
{
"kind": "s3",
"bucket": "my-bucket",
"prefix": "nusomi/v1/",
"region": "us-east-1",
"role_arn": "arn:aws:iam::...:role/NusomiExport"
}
Cross-account assume-role is the recommended pattern. Nusomi assumes the role with the external ID set to your workspace ID (external_id = workspace_id).
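On the AWS side, this pattern usually means the role's trust policy allows `sts:AssumeRole` only when the caller presents the expected external ID. The sketch below generates such a trust policy. The principal ARN and workspace id are placeholders: this page does not state which AWS principal Nusomi uses, so that value must come from your Nusomi workspace settings or support.

```python
import json


def build_trust_policy(nusomi_principal_arn: str, workspace_id: str) -> dict:
    """Trust policy that lets one principal assume the role with an external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": nusomi_principal_arn},
                "Action": "sts:AssumeRole",
                # Reject assume-role calls that do not carry this external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": workspace_id}},
            }
        ],
    }


policy = build_trust_policy(
    nusomi_principal_arn="arn:aws:iam::000000000000:root",  # placeholder, not Nusomi's real account
    workspace_id="ws_example",  # placeholder workspace id
)
policy_json = json.dumps(policy, indent=2)
```

The role also needs a separate permissions policy granting access to the destination bucket and prefix; the exact actions required are not listed on this page.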
GCS
{
"kind": "gcs",
"bucket": "my-bucket",
"prefix": "nusomi/v1/",
"service_account": "nusomi-export@acme-prod.iam.gserviceaccount.com"
}
Workload-identity federation is supported.
Azure Blob
{
"kind": "azure_blob",
"account": "myaccount",
"container": "training",
"prefix": "nusomi/v1/"
}
Signed URL (Nusomi-hosted)
Returns time-limited URLs in the manifest. Useful for one-off pulls.
Determinism
Exports are deterministic for a given filter: re-run the same POST /v1/exports with the same filter and you'll get the same row set. Pass an explicit until to freeze the upper bound; without it, sessions sealed after the first run can match the filter and change the result.
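One way to make this a habit is to pin `until` at the moment a request body is first built and store the pinned body alongside the export. The helper below is a sketch; the name `pin_until` is hypothetical, and the timestamp format simply mirrors the `created_at` value shown earlier on this page.

```python
from datetime import datetime, timezone
from typing import Optional


def pin_until(body: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the request body whose filter has an explicit `until`."""
    # `now` should be timezone-aware; it is normalised to UTC before formatting.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    pinned = dict(body)
    flt = dict(body.get("filter", {}))
    # Keep a caller-supplied upper bound; only fill it in when absent.
    flt.setdefault("until", now.strftime("%Y-%m-%dT%H:%M:%SZ"))
    pinned["filter"] = flt
    return pinned


body = {
    "workflow": "process_invoice",
    "filter": {"outcome": "success", "since": "2026-01-01"},
}
pinned = pin_until(body, now=datetime(2026, 5, 7, 14, 8, 11, tzinfo=timezone.utc))
# pinned["filter"]["until"] == "2026-05-07T14:08:11Z"
```

Re-submitting `pinned` later uses the same upper bound, whereas re-submitting `body` would leave the window open.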
Manifest
manifest_url points to a JSON object listing every shard, its size, its row count, and a SHA-256 hash. Use it to reproduce or version your training set.
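A downloaded shard can be checked against its manifest entry before training. The manifest schema is not shown on this page, so the field names used here (`shards`, `path`, `size_bytes`, `rows`, `sha256`) are assumptions; adjust them to match the real manifest.

```python
import hashlib


def verify_shard(entry: dict, data: bytes) -> bool:
    """Check a shard's bytes against its manifest entry (size, then SHA-256)."""
    if len(data) != entry["size_bytes"]:
        return False
    return hashlib.sha256(data).hexdigest() == entry["sha256"].lower()


def total_rows(manifest: dict) -> int:
    """Sum the per-shard row counts listed in the manifest."""
    return sum(shard["rows"] for shard in manifest["shards"])


# Stand-in manifest built locally so the example is self-contained.
payload = b"example shard bytes"
manifest = {
    "shards": [
        {
            "path": "shard-00000.tar",  # hypothetical shard name
            "size_bytes": len(payload),
            "rows": 3,
            "sha256": hashlib.sha256(payload).hexdigest(),
        }
    ]
}
ok = verify_shard(manifest["shards"][0], payload)
```

Comparing `total_rows(manifest)` with the `rows` value returned by the export gives a cheap consistency check before any shard is downloaded.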