Your Playwright suite already exercises every important path through your app. Nusomi turns those runs into recorded sessions you can train on.

Wrap each test in a session

// playwright/fixtures.ts
import { test as base } from "@playwright/test";
import { Nusomi } from "@nusomi/sdk";

export const test = base.extend<{ session: Awaited<ReturnType<Nusomi["sessions"]["create"]>> }>({
  session: async ({ context }, use, testInfo) => {
    const nusomi = new Nusomi({ apiKey: process.env.NUSOMI_API_KEY });

    const session = await nusomi.sessions.create({
      workflow: testInfo.title.replace(/[^a-z0-9]+/gi, "_").toLowerCase(),
      metadata: {
        suite: testInfo.titlePath.slice(0, -1).join(" > "),
        run_id: process.env.GITHUB_RUN_ID ?? "local",
        commit: process.env.GITHUB_SHA ?? "local",
      },
    });

    await session.attachPlaywright(context);
    await session.start();

    try {
      await use(session);

      // Playwright reports test failures through testInfo rather than by
      // rejecting use(), so inspect the status during teardown.
      if (testInfo.status === testInfo.expectedStatus) {
        await session.tag("success");
      } else {
        await session.tag("error", { message: testInfo.error?.message ?? "unknown failure" });
      }
    } finally {
      await session.stop();
    }
  },
});
Use it in your tests:
import { test } from "./fixtures";

test("user can submit an invoice", async ({ page, session }) => {
  await page.goto("https://app.example.com/invoices/new");
  await page.fill("[name=vendor]", "Acme");
  await page.fill("[name=amount]", "4500");
  await page.click("button:has-text('Submit')");
  await page.locator("text=Submitted for approval").waitFor();
});
Every test run produces a Nusomi session with frames + events.

Filtering CI noise

If you only want sessions from suites that run against your staging environment, skip the tests everywhere else:
test.beforeEach(async ({}, testInfo) => {
  if (process.env.PLAYWRIGHT_ENV !== "staging") {
    testInfo.skip();
  }
});
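Skipping drops those tests entirely. If you want the tests to run in every environment but record only on staging, you can gate session creation on a small predicate instead. A sketch, not SDK API: `shouldRecord` is a name invented here, and the fixture wiring assumes the same session fixture shown above.

```typescript
// Pure predicate: record only when the suite targets staging.
// Kept separate from the fixture so it can be unit-tested.
export function shouldRecord(env: string | undefined): boolean {
  return env === "staging";
}

// In the fixture, bail out before creating a session:
//   session: async ({ context }, use, testInfo) => {
//     if (!shouldRecord(process.env.PLAYWRIGHT_ENV)) {
//       await use(undefined); // test still runs, nothing is recorded
//       return;
//     }
//     ...create / attachPlaywright / start as above...
//   },
```

This keeps environment policy in one testable function instead of scattering `process.env` checks across hooks.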

Diffing flake from regression

When a test fails, the Nusomi session is a video of the failure. Memory-graph similarity lets you tell flake from real regression:
const failures = await nusomi.sessions.list({
  workflow: "user_can_submit_an_invoice",
  outcome: "error",
  since: "24h",
});

for (const f of failures) {
  const similar = await nusomi.memory.similar(f.id, { limit: 5 });
  const allSimilarFailed = similar.every((s) => s.outcome === "error");
  console.log(f.id, allSimilarFailed ? "REGRESSION" : "FLAKE");
}
A failing test whose nearest-neighbor sessions all succeeded is probably flake. Failing alongside structurally identical neighbors suggests a real regression.
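The all-failed check in the loop above can be factored into a small pure helper, which makes the heuristic easy to unit-test. The names here are illustrative, not SDK API; it also treats an empty neighbor set as flake rather than regression, since there is no evidence either way.

```typescript
type Outcome = "success" | "error";

// Classify a failure by its nearest-neighbor sessions: if every
// structurally similar session also failed, call it a regression;
// if any neighbor succeeded (or there are none), call it flake.
export function classifyFailure(neighborOutcomes: Outcome[]): "REGRESSION" | "FLAKE" {
  const allFailed =
    neighborOutcomes.length > 0 && neighborOutcomes.every((o) => o === "error");
  return allFailed ? "REGRESSION" : "FLAKE";
}
```

In the loop above you would pass `similar.map((s) => s.outcome)` instead of computing `allSimilarFailed` inline.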

Training from your suite

Playwright runs are deterministic by design — they end up as compact, high-signal training examples:
await nusomi.exports.create({
  workflow: ["user_can_submit_an_invoice", "user_can_void_an_invoice", ...],
  filter: { outcome: "success", since: "30d" },
  format: "webdataset",
  destination: { kind: "s3", bucket: "acme-training", prefix: "ci/" },
});
This is why we recommend the same workflow slug per test: it gives the trainer one cluster per intent.
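One way to guarantee the fixture and the export call agree on slugs is to put the transformation the fixture applies to `testInfo.title` in a single shared helper (the helper name is ours; the regex is the one from the fixture above):

```typescript
// Same transformation the fixture applies to testInfo.title:
// collapse each run of non-alphanumerics to "_", then lowercase.
export function workflowSlug(title: string): string {
  return title.replace(/[^a-z0-9]+/gi, "_").toLowerCase();
}
```

Then `workflow: testInfo.title.replace(...)` in the fixture becomes `workflow: workflowSlug(testInfo.title)`, and the export call can build its workflow list from the same function.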

Anchor + Browserbase

If your suites run on Browserbase or Anchor (or any remote-browser CDP endpoint):
await session.attachCdp({ wsEndpoint: process.env.BROWSERBASE_WS_ENDPOINT! });
Same shape, same downstream — just the browser lives somewhere else.
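End to end, pointing both Playwright and Nusomi at the same remote endpoint might look like this. A sketch under assumptions: the `remote_smoke` workflow name is made up, and the session calls mirror the fixture above; `chromium.connectOverCDP` is standard Playwright.

```typescript
import { chromium } from "@playwright/test";
import { Nusomi } from "@nusomi/sdk";

const nusomi = new Nusomi({ apiKey: process.env.NUSOMI_API_KEY });
const session = await nusomi.sessions.create({ workflow: "remote_smoke" });

// Both Playwright and Nusomi talk to the same remote CDP endpoint.
const wsEndpoint = process.env.BROWSERBASE_WS_ENDPOINT!;
const browser = await chromium.connectOverCDP(wsEndpoint);
await session.attachCdp({ wsEndpoint });
await session.start();

try {
  const page = await browser.contexts()[0].newPage();
  await page.goto("https://app.example.com");
  // ...drive the page as usual...
} finally {
  await session.stop();
  await browser.close();
}
```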