playwright

End-to-End Testing

Running tests

Spin up a full local E2E environment (backend, frontend, docker services, Playwright UI):

hogli test:e2e

This uses bin/mprocs-e2e.yaml under the hood. If you need to reset the E2E database, trigger the reset-db process in the phrocs UI.

To run tests against an already-running PostHog instance:

LOGIN_USERNAME='[email protected]' LOGIN_PASSWORD="the-password" BASE_URL='http://localhost:8010' pnpm --filter=@posthog/playwright exec playwright test --ui

You might need to install Playwright first: pnpm --filter=@posthog/playwright exec playwright install

Writing tests with Claude Code

Use the /playwright-test skill to have Claude Code write and validate end-to-end tests for you. It will explore the UI with Playwright MCP tools, plan the tests, implement them, and run them in a loop until they pass reliably (including a flakiness check with --repeat-each 10).

Writing tests

What belongs in this suite

This suite is expensive: it runs the full stack and a real browser, so every spec costs PR runtime, runner credits, and a slice of the team's flake budget. Use these principles to decide whether a test is worth that cost.

Test what only a browser can prove

E2E is uniquely suited to multi-step journeys where the frontend, network, backend, and datastores all have to agree at once. If a regression would surface in a Jest + kea test, a Storybook story, an API integration test, or a ClickHouse unit test, write it there instead: those layers are cheaper to run, easier to debug, and don't tie up an 8-vCPU runner.

Prefer the cheapest layer that can catch the bug

"Page renders", "button is present", "heading reads X", "tab is active": these are smoke checks, not E2E, and they belong in Jest or Storybook.
Visual regressions belong in Storybook visual review.
If a failure can be diagnosed without reading a backend log, the test probably didn't need the backend.

Each test should earn its slot

The suite stays small on purpose; the bigger it gets, the noisier the flake signal becomes, and we drift back into "ignore the red, it's probably flake". Treat adding a spec like adding a CI job. Justify it in the PR description (which cross-stack flow it covers, why a lower layer won't catch it, how it sits next to the existing specs, etc.). Reviewers should push back when that justification is thin. "Nice to have coverage" isn't enough, but "this flow has broken in prod and nothing below this layer would have caught it" is.

Best practices

Don't use CSS selectors. Prefer accessibility roles (getByRole) or getByTestId(), which maps to data-attr in our config. Add data-attr to components if needed.
Write fewer, longer tests that do multiple things. Split logical steps with test.step().
Use page object models for common tasks and accessing common elements (see page-models/).
After UI interactions, assert on UI changes — don't assert on network requests resolving.
Never put conditional logic (if) in a test.
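
As a rough illustration of the practices above, the sketch below shows a small page object that finds elements by role and test id rather than by CSS. The class InsightPage, the test id insight-name, and the Save label are hypothetical and do not refer to real components. PageLike and LocatorLike are trimmed-down stand-ins for Playwright's Page and Locator, declared only so the snippet runs without the Playwright package; the assumption is that a real Page satisfies them structurally.

```typescript
// Minimal structural stand-ins for the parts of Playwright's Page and Locator
// used below (assumption: the real types are compatible with these shapes).
interface LocatorLike {
    click(): Promise<void>
    fill(value: string): Promise<void>
}

interface PageLike {
    getByRole(role: string, options?: { name?: string }): LocatorLike
    getByTestId(testId: string): LocatorLike
}

// Hypothetical page object: names, test ids, and labels are illustrative only.
class InsightPage {
    readonly nameInput: LocatorLike
    readonly saveButton: LocatorLike

    constructor(page: PageLike) {
        // getByTestId is expected to resolve against data-attr via the suite's config.
        this.nameInput = page.getByTestId('insight-name')
        // Role plus accessible name instead of a CSS class selector.
        this.saveButton = page.getByRole('button', { name: 'Save' })
    }

    // One common task wrapped once, so specs do not repeat the locators.
    async saveAs(name: string): Promise<void> {
        await this.nameInput.fill(name)
        await this.saveButton.click()
    }
}
```

In a spec, a call such as `await test.step('save the insight', () => insightPage.saveAs('Weekly signups'))` followed by `await expect(page.getByRole('heading', { name: 'Weekly signups' })).toBeVisible()` would keep the step visible in the report and assert on the UI change rather than on a network call. The heading locator is, again, only an assumption about what the page shows.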

Gotchas

Flaky tests are almost always due to not waiting for the right thing. Consider adding a better selector, an intermediate step like waiting for the URL or page title to change, or waiting for a critical network request to complete.
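
One way to add such an intermediate step is sketched below: after a click that navigates, the helper waits for the URL to change before anything else runs. waitForURL is a method on Playwright's Page; the helper name openDashboard, the link role, and the URL pattern are hypothetical, and NavigablePageLike is a narrowed stand-in for Page so the snippet runs on its own.

```typescript
// Narrowed stand-ins for the Playwright surface used here (assumption: a real
// Page is structurally compatible).
interface ClickableLike {
    click(): Promise<void>
}

interface NavigablePageLike {
    getByRole(role: string, options?: { name?: string }): ClickableLike
    waitForURL(url: string | RegExp): Promise<void>
}

// Hypothetical helper: the link name and URL pattern are illustrative only.
async function openDashboard(page: NavigablePageLike, name: string): Promise<void> {
    await page.getByRole('link', { name }).click()
    // Intermediate wait: callers only continue once the navigation has landed,
    // so later assertions do not race against the page transition.
    await page.waitForURL(/\/dashboard\/\d+/)
}
```

A spec would then assert on something visible on the destination page, which keeps the wait as a stabiliser rather than the thing under test.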

Loose selectors cause strict mode violations. If a selector matches multiple elements, Playwright lists every match together with a stricter locator for each one (after "aka"). Use that output to narrow the selector down:

Error: locator.click: Error: strict mode violation: locator('text=Set a billing limit') resolved to 2 elements:
1) <span class="LemonButton__content">Set a billing limit</span> aka getByTestId('billing-limit-input-wrapper-product_analytics').getByRole('button', { name: 'Set a billing limit' })
2) <span class="LemonButton__content">Set a billing limit</span> aka getByTestId('billing-limit-input-wrapper-session_replay').getByRole('button', { name: 'Set a billing limit' })
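
The locator chains printed after "aka" in the output above can be used almost verbatim. The sketch below scopes the button lookup to one product's wrapper; the test ids and button name come from the error message, while the helper name billingLimitButton and the stand-in interfaces (used so the snippet runs without the Playwright package) are assumptions.

```typescript
// Narrowed stand-ins for Playwright's Page and Locator (assumption: the real
// types are structurally compatible).
interface ScopedLocatorLike {
    getByRole(role: string, options?: { name?: string }): ScopedLocatorLike
}

interface TestIdPageLike {
    getByTestId(testId: string): ScopedLocatorLike
}

type BillingProduct = 'product_analytics' | 'session_replay'

// Hypothetical helper: returns a locator that matches exactly one of the two
// "Set a billing limit" buttons by scoping to the product's wrapper first.
function billingLimitButton(page: TestIdPageLike, product: BillingProduct): ScopedLocatorLike {
    return page
        .getByTestId(`billing-limit-input-wrapper-${product}`)
        .getByRole('button', { name: 'Set a billing limit' })
}
```

Scoping by the wrapper's test id first means the role lookup only sees one button, so the strict mode check passes without resorting to position-based selection.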
