Exports turn sealed sessions into Parquet, WebDataset, JSONL, or Arrow datasets for downstream training. See datasets concepts for the data shape.

Create an export

POST /v1/exports
{
  "workflow": "process_invoice",
  "filter": {
    "outcome": "success",
    "since": "2026-01-01"
  },
  "format": "webdataset",
  "frame_sampling": { "mode": "event_only", "keep_action_frames": true },
  "destination": {
    "kind": "s3",
    "bucket": "my-training-bucket",
    "prefix": "nusomi/process_invoice/v1/",
    "region": "us-east-1",
    "role_arn": "arn:aws:iam::123456789012:role/NusomiExport"
  },
  "tag": "process_invoice@v1"
}

Body

| Field | Required | Notes |
| --- | --- | --- |
| workflow | yes | One or more workflow slugs. |
| filter | no | See filters below. |
| format | yes | parquet, webdataset, jsonl, or arrow. |
| frame_sampling | no | See frame sampling. |
| destination | yes | See destinations. |
| tag | no | Free-form label for versioning. |

Response 202

{
  "id": "exp_01HZ...",
  "status": "queued",
  "manifest_url": null,
  "created_at": "2026-05-07T14:08:11Z"
}
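
A minimal Python sketch of submitting the request above with only the standard library. The base URL `https://api.nusomi.com`, the bearer-token `Authorization` header, and the helper name `build_export_request` are assumptions made for illustration; this reference documents only the path and the body.

```python
import json
import urllib.request

# Assumed base URL; this reference does not state the API host.
BASE_URL = "https://api.nusomi.com"


def build_export_request(body: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/exports request."""
    # These three fields are marked required in the Body table.
    for field in ("workflow", "format", "destination"):
        if field not in body:
            raise ValueError(f"missing required field: {field}")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/exports",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "workflow": "process_invoice",
    "filter": {"outcome": "success", "since": "2026-01-01"},
    "format": "webdataset",
    "destination": {"kind": "signed_url"},
    "tag": "process_invoice@v1",
}
request = build_export_request(body, api_key="YOUR_API_KEY")

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as resp:
#     export = json.load(resp)  # a 202 response carries "status": "queued"
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything is queued.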

Get an export

GET /v1/exports/{id}
{
  "id": "exp_01HZ...",
  "status": "completed",
  "rows": 184312,
  "shards": 24,
  "shard_size_bytes": 268435456,
  "frames": 184312,
  "actions": 184312,
  "manifest_url": "s3://my-training-bucket/.../manifest.json"
}
status progresses queued → running → completed | failed.
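
Because an export starts as queued and finishes asynchronously, a client typically polls GET /v1/exports/{id} until the status is terminal. The sketch below shows one way to do that; `wait_for_export`, its `fetch` callback, and the interval and timeout defaults are hypothetical choices, not part of the API.

```python
import time
from typing import Callable

# The two terminal states named in the status progression.
TERMINAL = {"completed", "failed"}


def wait_for_export(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() repeatedly until the export reaches a terminal status."""
    deadline = time.monotonic() + timeout_s
    while True:
        export = fetch()
        if export["status"] in TERMINAL:
            return export
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"export still {export['status']!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Stand-in fetcher that replays the documented status sequence.
_responses = iter(
    [{"status": "queued"}, {"status": "running"}, {"status": "completed"}]
)
result = wait_for_export(lambda: next(_responses), sleep=lambda _s: None)
```

In real use, `fetch` would wrap an authenticated GET for a specific export id. Passing it in as a callback keeps the loop independent of any HTTP client.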

List exports

GET /v1/exports?workflow=process_invoice&status=completed
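
A small sketch of building the list URL with properly encoded query parameters. Only `workflow` and `status` appear as query parameters in this reference; the base URL and the helper name `list_exports_url` are assumptions.

```python
from typing import Optional
from urllib.parse import urlencode


def list_exports_url(base_url: str, **params: Optional[str]) -> str:
    """Return the GET /v1/exports URL, skipping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}/v1/exports" + (f"?{query}" if query else "")


# Assumed host; substitute the real API base URL.
url = list_exports_url(
    "https://api.nusomi.com", workflow="process_invoice", status="completed"
)
```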

Filters

| Field | Notes |
| --- | --- |
| outcome | success, error, or abandoned. |
| since / until | ISO or relative. |
| actor.kind | human, model, or script. |
| min_duration_ms / max_duration_ms | Wall-clock bounds. |
| tag | Sessions carrying a specific tag. |
| path | Memory-graph subpath id. |
| exclude_session_ids | Manually drop sessions. |
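
A client-side check can catch typos in a filter before an export is queued. The sketch below validates only what the table above states. The helper `check_filter` is hypothetical, and treating `actor.kind` as a flat dotted key (rather than a nested object) is an assumption.

```python
ALLOWED_OUTCOMES = {"success", "error", "abandoned"}
ALLOWED_ACTOR_KINDS = {"human", "model", "script"}
KNOWN_FIELDS = {
    "outcome", "since", "until", "actor.kind",  # dotted key is an assumption
    "min_duration_ms", "max_duration_ms", "tag", "path", "exclude_session_ids",
}


def check_filter(f: dict) -> list:
    """Return a list of problems found in a filter dict (empty if none)."""
    problems = [f"unknown field: {k}" for k in f if k not in KNOWN_FIELDS]
    if "outcome" in f and f["outcome"] not in ALLOWED_OUTCOMES:
        problems.append(f"bad outcome: {f['outcome']}")
    if "actor.kind" in f and f["actor.kind"] not in ALLOWED_ACTOR_KINDS:
        problems.append(f"bad actor.kind: {f['actor.kind']}")
    lo, hi = f.get("min_duration_ms"), f.get("max_duration_ms")
    if lo is not None and hi is not None and lo > hi:
        problems.append("min_duration_ms exceeds max_duration_ms")
    return problems


problems = check_filter({"outcome": "success", "since": "2026-01-01"})
```

The server remains the authority on what is valid; this only flags obvious mistakes early.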

Frame sampling

{
  "mode": "event_only" | "every_n_ms" | "keyframes_only" | "all",
  "interval_ms": 100,
  "keep_action_frames": true
}

| Mode | Notes |
| --- | --- |
| event_only | Default. Frames where an event fires. |
| every_n_ms | Sample at fixed cadence. |
| keyframes_only | Only frames where the screen changed materially. |
| all | Every captured frame. Costly. |
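
The following sketch builds a `frame_sampling` object and rejects unknown modes. It assumes that `interval_ms` is only meaningful for `every_n_ms` and so omits it for the other modes; the reference does not say this explicitly, and the helper name `frame_sampling` is hypothetical.

```python
MODES = ("event_only", "every_n_ms", "keyframes_only", "all")


def frame_sampling(mode="event_only", interval_ms=None, keep_action_frames=True):
    """Return a frame_sampling dict for the export request body."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    config = {"mode": mode, "keep_action_frames": keep_action_frames}
    if mode == "every_n_ms":
        # Assumption: a fixed cadence needs a positive interval.
        if interval_ms is None or interval_ms <= 0:
            raise ValueError("every_n_ms needs a positive interval_ms")
        config["interval_ms"] = interval_ms
    return config


default_sampling = frame_sampling()
fixed_cadence = frame_sampling("every_n_ms", interval_ms=100)
```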

Destinations

S3

{
  "kind": "s3",
  "bucket": "my-bucket",
  "prefix": "nusomi/v1/",
  "region": "us-east-1",
  "role_arn": "arn:aws:iam::...:role/NusomiExport"
}
Cross-account assume-role is the recommended pattern. Nusomi will assume the role with external_id = workspace_id.
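
On the AWS side, a cross-account role of this kind normally carries a trust policy that allows `sts:AssumeRole` only when the caller presents the expected external ID. The sketch below generates such a policy. The principal ARN that Nusomi assumes the role from is not given in this reference, so `nusomi_principal_arn` and the sample values are placeholders.

```python
import json


def export_role_trust_policy(nusomi_principal_arn: str, workspace_id: str) -> dict:
    """Trust policy letting one principal assume the role with an external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": nusomi_principal_arn},
                "Action": "sts:AssumeRole",
                # Matches external_id = workspace_id as described above.
                "Condition": {"StringEquals": {"sts:ExternalId": workspace_id}},
            }
        ],
    }


# Placeholder values; use the principal ARN and workspace id for your account.
policy = export_role_trust_policy("arn:aws:iam::111111111111:root", "ws_example")
policy_json = json.dumps(policy, indent=2)
```

The role also needs a permissions policy that allows writes to the destination bucket and prefix, which is not shown here.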

GCS

{
  "kind": "gcs",
  "bucket": "my-bucket",
  "prefix": "nusomi/v1/",
  "service_account": "nusomi-export@acme-prod.iam.gserviceaccount.com"
}
Workload-identity federation is supported.

Azure Blob

{
  "kind": "azure_blob",
  "account": "myaccount",
  "container": "training",
  "prefix": "nusomi/v1/"
}

Signed URL (Nusomi-hosted)

{ "kind": "signed_url" }
Returns time-limited URLs in the manifest. Useful for one-off pulls.

Determinism

Exports are deterministic for a given filter: re-run the same POST /v1/exports with the same filter and you'll get the same row set. Pass an explicit until to freeze the upper bound; otherwise sessions sealed later may fall inside the window.
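
One way to make a filter reproducible is to pin `until` at the moment the export is first requested and store the resulting filter alongside the export id. The sketch below does that; `pin_upper_bound` is a hypothetical helper, and it assumes `until` accepts a full UTC timestamp in the same form as `created_at`.

```python
from datetime import datetime, timezone
from typing import Optional


def pin_upper_bound(filter_: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the filter with an explicit `until` if none is set."""
    if "until" in filter_:
        return dict(filter_)
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Assumed timestamp format, mirroring created_at in the responses above.
    return {**filter_, "until": now.strftime("%Y-%m-%dT%H:%M:%SZ")}


pinned = pin_upper_bound(
    {"outcome": "success", "since": "2026-01-01"},
    now=datetime(2026, 5, 7, 14, 8, 11, tzinfo=timezone.utc),
)
```

Re-submitting `pinned` later should then select the same sessions, since the upper bound no longer moves.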

Manifest

manifest_url points to a JSON object listing every shard, its size, its row count, and a SHA-256 hash. Use it to reproduce or version your training set.
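
After downloading shards, the manifest's sizes and SHA-256 hashes can be used to confirm that each shard arrived intact. The manifest's exact JSON schema is not shown in this reference, so the key names used below (`shards`, `path`, `size_bytes`, `rows`, `sha256`) are assumptions, as is the helper `verify_shard`.

```python
import hashlib


def verify_shard(entry: dict, data: bytes) -> bool:
    """Check a shard's bytes against its manifest entry (size and SHA-256)."""
    return (
        len(data) == entry["size_bytes"]
        and hashlib.sha256(data).hexdigest() == entry["sha256"]
    )


# Stand-in manifest built from in-memory bytes; key names are assumed.
payload = b"example shard bytes"
manifest = {
    "shards": [
        {
            "path": "shard-00000.tar",
            "size_bytes": len(payload),
            "rows": 1,
            "sha256": hashlib.sha256(payload).hexdigest(),
        }
    ]
}

all_ok = all(verify_shard(entry, payload) for entry in manifest["shards"])
total_rows = sum(entry["rows"] for entry in manifest["shards"])
```

Recording the manifest (or a hash of it) next to a training run gives a compact way to state exactly which data the run used.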