Your Playwright suite already exercises every important path through your app. Nusomi turns those runs into recorded sessions you can train on.

Wrap each test in a session

// playwright/fixtures.ts
import { test as base } from "@playwright/test";
import { Nusomi } from "@nusomi/sdk";

export const test = base.extend<{ session: Awaited<ReturnType<Nusomi["sessions"]["create"]>> }>({
  session: async ({ context }, use, testInfo) => {
    const nusomi = new Nusomi({ apiKey: process.env.NUSOMI_API_KEY });

    const session = await nusomi.sessions.create({
      workflow: testInfo.title.replace(/[^a-z0-9]+/gi, "_").toLowerCase(),
      metadata: {
        suite: testInfo.titlePath.slice(0, -1).join(" > "),
        run_id: process.env.GITHUB_RUN_ID ?? "local",
        commit: process.env.GITHUB_SHA ?? "local",
      },
    });

    await session.attachPlaywright(context);
    await session.start();

    try {
      await use(session);

      // Playwright reports test failures through testInfo rather than by
      // rejecting use(), so inspect the status during teardown.
      if (testInfo.status === testInfo.expectedStatus) {
        await session.tag("success");
      } else {
        await session.tag("error", { message: testInfo.error?.message ?? "unknown failure" });
      }
    } finally {
      await session.stop();
    }
  },
});
Use it in your tests:
import { test } from "./fixtures";

test("user can submit an invoice", async ({ page, session }) => {
  await page.goto("https://app.example.com/invoices/new");
  await page.fill("[name=vendor]", "Acme");
  await page.fill("[name=amount]", "4500");
  await page.click("button:has-text('Submit')");
  await page.locator("text=Submitted for approval").waitFor();
});
Every test run produces a Nusomi session with frames + events.

Filtering CI noise

If you only want sessions from suites that run against your staging environment, skip the tests everywhere else:
test.beforeEach(async ({}, testInfo) => {
  if (process.env.PLAYWRIGHT_ENV !== "staging") {
    testInfo.skip();
  }
});
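Skipping drops those tests entirely. If you want the tests to run in every environment but record only on staging, you can gate session creation on a small predicate instead. A sketch, not SDK API: `shouldRecord` is a name invented here, and the fixture wiring assumes the same session fixture shown above.

```typescript
// Pure predicate: record only when the suite targets staging.
// Kept separate from the fixture so it can be unit-tested.
export function shouldRecord(env: string | undefined): boolean {
  return env === "staging";
}

// In the fixture, bail out before creating a session:
//   session: async ({ context }, use, testInfo) => {
//     if (!shouldRecord(process.env.PLAYWRIGHT_ENV)) {
//       await use(undefined); // test still runs, nothing is recorded
//       return;
//     }
//     ...create / attachPlaywright / start as above...
//   },
```

This keeps environment policy in one testable function instead of scattering `process.env` checks across hooks.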

Diffing flake from regression

When a test fails, the Nusomi session is a video of the failure. Memory-graph similarity lets you tell flake from real regression:
const failures = await nusomi.sessions.list({
  workflow: "user_can_submit_an_invoice",
  outcome: "error",
  since: "24h",
});

for (const f of failures) {
  const similar = await nusomi.memory.similar(f.id, { limit: 5 });
  const allSimilarFailed = similar.every((s) => s.outcome === "error");
  console.log(f.id, allSimilarFailed ? "REGRESSION" : "FLAKE");
}
A failing test whose nearest-neighbor sessions all succeeded is probably flake. Failing alongside structurally identical neighbors suggests a real regression.
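The all-failed check in the loop above can be factored into a small pure helper, which makes the heuristic easy to unit-test. The names here are illustrative, not SDK API; it also treats an empty neighbor set as flake rather than regression, since there is no evidence either way.

```typescript
type Outcome = "success" | "error";

// Classify a failure by its nearest-neighbor sessions: if every
// structurally similar session also failed, call it a regression;
// if any neighbor succeeded (or there are none), call it flake.
export function classifyFailure(neighborOutcomes: Outcome[]): "REGRESSION" | "FLAKE" {
  const allFailed =
    neighborOutcomes.length > 0 && neighborOutcomes.every((o) => o === "error");
  return allFailed ? "REGRESSION" : "FLAKE";
}
```

In the loop above you would pass `similar.map((s) => s.outcome)` instead of computing `allSimilarFailed` inline.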

Training from your suite

Playwright runs are deterministic by design — they end up as compact, high-signal training examples:
await nusomi.exports.create({
  workflow: ["user_can_submit_an_invoice", "user_can_void_an_invoice", ...],
  filter: { outcome: "success", since: "30d" },
  format: "webdataset",
  destination: { kind: "s3", bucket: "acme-training", prefix: "ci/" },
});
This is why we recommend the same workflow slug per test: it gives the trainer one cluster per intent.
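One way to guarantee the fixture and the export call agree on slugs is to put the transformation the fixture applies to `testInfo.title` in a single shared helper (the helper name is ours; the regex is the one from the fixture above):

```typescript
// Same transformation the fixture applies to testInfo.title:
// collapse each run of non-alphanumerics to "_", then lowercase.
export function workflowSlug(title: string): string {
  return title.replace(/[^a-z0-9]+/gi, "_").toLowerCase();
}
```

Then `workflow: testInfo.title.replace(...)` in the fixture becomes `workflow: workflowSlug(testInfo.title)`, and the export call can build its workflow list from the same function.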

Anchor + Browserbase

If your suites run on Browserbase or Anchor (or any remote-browser CDP endpoint):
await session.attachCdp({ wsEndpoint: process.env.BROWSERBASE_WS_ENDPOINT! });
Same shape, same downstream — just the browser lives somewhere else.
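End to end, pointing both Playwright and Nusomi at the same remote endpoint might look like this. A sketch under assumptions: the `remote_smoke` workflow name is made up, and the session calls mirror the fixture above; `chromium.connectOverCDP` is standard Playwright.

```typescript
import { chromium } from "@playwright/test";
import { Nusomi } from "@nusomi/sdk";

const nusomi = new Nusomi({ apiKey: process.env.NUSOMI_API_KEY });
const session = await nusomi.sessions.create({ workflow: "remote_smoke" });

// Both Playwright and Nusomi talk to the same remote CDP endpoint.
const wsEndpoint = process.env.BROWSERBASE_WS_ENDPOINT!;
const browser = await chromium.connectOverCDP(wsEndpoint);
await session.attachCdp({ wsEndpoint });
await session.start();

try {
  const page = await browser.contexts()[0].newPage();
  await page.goto("https://app.example.com");
  // ...drive the page as usual...
} finally {
  await session.stop();
  await browser.close();
}
```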