Playwright in GitLab CI: A Complete Setup and Configuration Guide

Running Playwright in GitLab CI requires the right Docker image, artifact config, and job structure. This guide covers everything from your first .gitlab-ci.yml to parallel execution and debugging failed runs.

Most teams writing end-to-end tests today are also trying to run those tests automatically on every push. GitLab CI/CD powers roughly 19% of organizational pipelines according to the JetBrains 2025 Developer Ecosystem Survey, and that number climbs higher in mid-to-large companies that rely on GitLab as their single DevOps platform.

The real challenge is not writing the tests. It is getting Playwright in GitLab CI to run reliably where browsers need system libraries and a 200-test suite blocks merge requests.

This guide walks through every step of setting up Playwright in GitLab CI, from your first                          .gitlab-ci.yml file to Docker images and artifact storage. By the end, you will have a pipeline that runs Playwright test automation reliably on every commit.

What is Playwright and why run it in GitLab CI?

What is Playwright and Gitlab CI?

Playwright is an open-source cross-browser testing framework by Microsoft. It controls Chromium, Firefox, and WebKit through a single API, runs browsers in headless mode by default, and handles auto-waiting out of the box.

GitLab CI/CD is GitLab's continuous integration platform that runs automated pipeline jobs inside Docker containers, triggered by code pushes or schedules. You define pipeline jobs in a YAML file called .gitlab-ci.yml, store it in your repository root, and GitLab spins up a runner every time a trigger fires. The runner executes your jobs inside Docker containers, shared runners, or self-hosted machines.

GitLab CI/CD is the continuous integration platform built into GitLab. Here is why running Playwright in GitLab CI makes sense:

  • Docker-first architecture. GitLab CI runs every job inside a Docker container by default. Playwright publishes official Docker images with all browser binaries and system dependencies pre-installed. No manual library setup.
  • Built-in artifact storage. GitLab lets you attach files to any job using the artifacts keyword. Screenshots, videos, HTML reports, and Playwright trace viewer files get stored and stay downloadable from the pipeline UI.
  • Native parallel sharding. GitLab's parallel keyword splits a single job into N identical jobs, each receiving a CI_NODE_INDEX and CI_NODE_TOTAL variable. Playwright's --shard flag maps directly onto these variables.
  • Integrated merge request feedback. When a pipeline runs on a merge request, GitLab shows the pass/fail status directly on the MR page. Failed Playwright tests block the merge until the issues are fixed.
See CI test results instantly
Stop downloading artifacts. View reports in one click.
Try free CTA Graphic

Prerequisites before you start

Before writing any YAML, make sure the following are in place.

Local environment:

  • Node.js 18 or later installed (LTS recommended)
  • A working Playwright project with at least one test file
  • @playwright/test added to your package.json dev dependencies

GitLab repository:

  • Your project code pushed to a GitLab repository
  • Access to create or edit .gitlab-ci.yml in the repository root
  • GitLab shared runners enabled (they are enabled by default on GitLab.com)

If you are starting from scratch, run this command to scaffold a new Playwright project:

terminal
npm init playwright@latest

The CLI will ask whether you want TypeScript or JavaScript, which browsers to install, and whether to add a sample test file. Once the setup completes, push your project to GitLab and you are ready to write the pipeline.

If you already have a project and just need the pipeline file, the next section walks through building one from scratch.

Writing your first .gitlab-ci.yml for Playwright

The pipeline file lives at the root of your repository as .gitlab-ci.yml. Here is a complete, working configuration for running Playwright in GitLab CI followed by a line-by-line explanation.

.gitlab-ci.yml
stages:
  - test
playwright_tests:
  stage: test
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  script:
    - npm ci
    - npx playwright test
  artifacts:
    when: always
    paths:
      - playwright-report/
      - test-results/
    expire_in: 7 days

Breaking down each section

Stages

The stages block defines the pipeline structure. A single test stage is enough for most setups. You can add build or deploy stages later as your pipeline grows.

Job name (playwright_tests)

This is the name that appears in the GitLab pipeline UI. Use something descriptive so your team can identify what the job does at a glance.

Docker image (image)

The image keyword tells GitLab which Docker container to run the job inside. The official Playwright Docker image (mcr.microsoft.com/playwright) comes with all three browser engines and every system dependency pre-installed. No need for a separate browser install step.

Script

npm ci performs a clean install from package-lock.json. It is faster than npm install and guarantees reproducible builds. npx playwright test runs your entire test suite using the configuration in playwright.config.ts.

Artifacts

The when: always setting is critical. Without it, GitLab skips artifact upload when tests fail, and you lose the very report you need to debug the failure. The paths array tells GitLab which directories to save. expire_in: 7 days controls how long the artifacts remain downloadable.

Choosing the right Docker image and installing browsers

The Docker image you pick when running Playwright in GitLab CI determines whether your tests work out of the box. Choosing the wrong image often leads to missing library errors. There are two main approaches.

Playwright publishes Docker images on Microsoft Container Registry for every release. The image tag format is mcr.microsoft.com/playwright:v{VERSION}-{DISTRO}.

.gitlab-ci.yml
playwright_tests:
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  script:
    - npm ci
    - npx playwright test

This image ships with:

  • Chromium, Firefox, and WebKit binaries matching that Playwright version
  • All required OS-level system dependencies (fonts, media codecs, shared libraries)
  • Node.js pre-installed

You do not need to run npx playwright install or npx playwright install --with-deps because everything is already in the image. This saves 30 to 60 seconds per pipeline run.

Option 2: use a generic Node.js image with manual browser install

If your organization restricts Docker images to an approved registry, you can use a standard Node.js image and install browsers manually:

.gitlab-ci.yml
playwright_tests:
  image: node:20
  script:
    - npm ci
    - npx playwright install --with-deps
    - npx playwright test

The --with-deps flag is essential here. It installs both the browser binaries and the OS-level system dependencies (like libgbm, libnss3, and font packages) that headless browsers need to launch on Linux. Without it, you will see errors like browserType.launch: Executable doesn't exist or error while loading shared libraries.

Version pinning

Always pin your Docker image version to match the Playwright version in your package.json. A mismatch between the installed @playwright/test npm package and the browsers in the Docker image causes unpredictable failures.

Approach Pros Cons
Official Playwright image Zero browser install time, all deps included, version-matched Larger image pull on first run (~1.2 GB)
Generic Node.js image + install Smaller base image, more control 30-60s extra per run, manual dep management

For most teams, the official image is the better choice. It removes an entire class of "missing dependency" bugs from your pipeline.

Tip: Check the Microsoft Container Registry for the latest Playwright Docker image tags. Always match the image tag version to your @playwright/test npm package version to avoid browser-binary mismatches.

Running tests, storing artifacts, and debugging failures

Once your pipeline runs, the next priorities are collecting test output, making reports accessible, and knowing how to debug when things break.

Running Playwright tests in headless mode

Playwright runs in headless mode by default. No special flag is needed. The npx playwright test command inside a GitLab CI job automatically launches browsers without a visible window.

If you need headed mode for debugging on a self-hosted runner that has a display, you can pass the --headed flag. On Linux-based GitLab runners without a display server, headed mode requires Xvfb:

.gitlab-ci.yml
playwright_tests:
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  script:
    - npm ci
    - xvfb-run npx playwright test --headed

In practice, headless mode is what you want in CI. It is faster and requires no display server configuration.

Storing test artifacts: screenshots, traces, and reports

GitLab CI artifacts are the primary way to preserve test output between pipeline runs. Here is how to configure them for maximum debugging value:

.gitlab-ci.yml
playwright_tests:
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  script:
    - npm ci
    - npx playwright test
  artifacts:
    when: always
    paths:
      - playwright-report/
      - test-results/
    reports:
      junit: test-results/results.xml
    expire_in: 7 days

Key points about the artifacts configuration:

  • when: always saves artifacts regardless of job outcome. This is non-negotiable. Failed test reports are the ones you need most.
  • playwright-report/ contains the HTML report generated by Playwright's default reporter.
  • test-results/ contains screenshots, videos, and trace files for failed tests.
  • reports: junit enables GitLab's built-in test report integration. When you add JUnit output, GitLab parses the XML and displays pass/fail results directly in the merge request UI.

Warning: Never omit when:always from your artifact configuration. Without it, GitLab skips artifact uploads on failed jobs, and you lose the screenshots, traces, and reports you need most for debugging. This is the single most common GitLab CI artifact mistake.

To generate JUnit output alongside the HTML report, update your playwright.config.ts:

playwright.config.ts
import { defineConfig } from "@playwright/test";
export default defineConfig({
  reporter: [
    ["html", { open: "never" }],
    ["junit", { outputFile: "test-results/results.xml" }],
  ],
  use: {
    trace: "on-first-retry",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
});

The trace: 'on-first-retry' setting records a full trace only when a test fails and is about to be retried. This gives you detailed debugging data without wasting storage on passing tests.

Debugging failed Playwright runs in GitLab

When tests fail in your Playwright GitLab CI pipeline, follow this standard debugging sequence:

  1. Check the job log. Click on the failed job in the GitLab pipeline UI. The console output shows which tests failed and the error messages.
  2. Download artifacts. Scroll to the job page and click the artifact download button. Extract the zip and open playwright-report/index.html locally.
  3. Use the trace viewer. If you enabled trace: 'on-first-retry', failed tests will have .zip trace files in test-results/. Open them at trace.playwright.dev or run npx playwright show-trace <trace-file> locally.

Check environment differences, most "passes locally, fails in CI" issues stem from:

  • Missing environment variables
  • Different screen resolution or font rendering
  • Network timeouts due to slower CI runners
  • Race conditions exposed by different CPU/memory constraints

For a systematic approach to diagnosing CI-only failures, the Playwright debugging guide covers trace analysis and common pitfalls in detail.

Configuring parallel test execution with sharding

When your Playwright in GitLab CI suite grows past 50 tests, running everything on a single runner starts blocking merge requests. A 200-test suite can easily take 15 to 20 minutes on a shared runner with 2 vCPUs. Sharding solves this by splitting the suite across multiple parallel jobs.

How GitLab CI parallel sharding works

GitLab's parallel keyword creates N copies of a job. Each copy receives two predefined variables:

  • CI_NODE_INDEX: the current job's index (1-based, so 1, 2, 3, ... N)
  • CI_NODE_TOTAL: the total number of parallel jobs

Playwright's --shard flag accepts the exact same format: --shard=INDEX/TOTAL. This makes the integration straightforward.

The sharding configuration

Here is a complete .gitlab-ci.yml that shards tests across 4 parallel jobs:

.gitlab-ci.yml
stages:
  - test
  - report
playwright_tests:
  stage: test
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  parallel: 4
  script:
    - npm ci
    - npx playwright test --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  artifacts:
    when: always
    paths:
      - blob-report/
    expire_in: 1 day
merge_reports:
  stage: report
  when: always
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  script:
    - npm ci
    - npx playwright merge-reports --reporter html ./blob-report
  artifacts:
    when: always
    paths:
      - playwright-report/
    expire_in: 7 days

Configuring the blob reporter for sharding

When you shard tests, each job produces its own partial report. To merge them into a single HTML report, configure Playwright to use the blob reporter in CI:

playwright.config.ts
import { defineConfig } from "@playwright/test";
export default defineConfig({
  reporter: process.env.CI ? "blob" : "html",
  // ... rest of your config
});

The blob reporter produces a binary format optimized for merging. The merge_reports job downloads all blob files from the test stage, combines them, and generates a single HTML report.

When to shard and how many shards to use

Sharding introduces complexity (extra YAML, blob reporters, a merge job), so do not add it prematurely. Here is a practical guideline:

  • Under 50 tests: Probably not worth sharding. Optimize Playwright workers first.
  • 50 to 200 tests: 2 to 4 shards usually gives the best return.
  • 200+ tests: 4 to 8 shards are common. Beyond 8, the overhead of spinning up runners often outweighs the time savings.

Each shard has fixed overhead: pulling the Docker image, running npm ci, and initializing the runner. On GitLab shared runners, this overhead is roughly 30 to 60 seconds per job. So 10 shards adds up to 5 to 10 minutes of pure overhead, which is counterproductive for small suites.

Source: Playwright official documentation on sharding .Note: Values based on a 200-test suite running on standard GitHub-hosted ubuntu-latest runners with 2 vCPUs each. Actual results vary by test complexity.

For a deeper look at parallel execution strategies and how workers interact with CI runner CPU cores, the Playwright parallelism documentation explains the relationship between workers, fullyParallel, and sharding.

Multi-browser sharding with parallel:matrix

GitLab also supports parallel:matrix for more advanced setups. This lets you shard across both browsers and test partitions simultaneously:

.gitlab-ci.yml
playwright_tests:
  stage: test
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  parallel:
    matrix:
      - PROJECT: ["chromium", "webkit"]
        SHARD: ["1/4", "2/4", "3/4", "4/4"]
  script:
    - npm ci
    - npx playwright test --project=$PROJECT --shard=$SHARD
  artifacts:
    when: always
    paths:
      - blob-report/
    expire_in: 1 day

This creates 8 parallel jobs (2 browsers x 4 shards). Each job runs one-quarter of the tests for one browser. The parallel:matrix approach is powerful for teams running cross-browser Playwright testing at scale.

Track flaky tests across shards
Detect patterns in failures that manual review misses.
Start now CTA Graphic

Best practices for a production-ready Playwright CI pipeline

After setting up the basics, these best practices separate a Playwright in GitLab CI setup that "works sometimes" from one that runs reliably at scale.

Pin your Docker image and Playwright versions

Always match the Playwright version in package.json with the Docker image tag. A mismatch causes browser launch failures because the installed npm package expects binaries that do not exist in the container.

package.json
{
  "devDependencies": {
    "@playwright/test": "1.52.0"
  }
}

The Docker image tag should be mcr.microsoft.com/playwright:v1.52.0-noble. When you upgrade Playwright, update both the npm package and the image tag in the same commit.

Configure your playwright.config.ts for CI

A production-ready configuration should behave differently in CI than on a developer's laptop:

playwright.config.ts
import { defineConfig, devices } from "@playwright/test";
export default defineConfig({
  testDir: "./tests",
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: process.env.CI ? "blob" : "html",
  use: {
    baseURL: process.env.BASE_URL || "http://localhost:3000",
    trace: "on-first-retry",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
  projects: [
    {
      name: "chromium",
      use: { ...devices["Desktop Chrome"] },
    },
    {
      name: "firefox",
      use: { ...devices["Desktop Firefox"] },
    },
    {
      name: "webkit",
      use: { ...devices["Desktop Safari"] },
    },
  ],
});

Config option Purpose
fullyParallel: true Distributes individual tests across workers, not just files
forbidOnly: !!process.env.CI Fails the build if someone accidentally commits test.only()
retries: 2 (in CI) Auto-retries failed tests to catch transient flakiness
workers: 1 (in CI) Prioritizes stability on shared runners with limited CPU
reporter: 'blob' (in CI) Produces mergeable reports for the sharding workflow
trace: 'on-first-retry' Records a trace only when a test fails, reducing storage
screenshot: 'only-on-failure' Captures a screenshot at the moment of failure
video: 'retain-on-failure' Records video but only keeps it for failed tests

Playwright's official documentation recommends setting workers to 1 in CI environments to prioritize stability and reproducibility. If you have a powerful self-hosted runner, you can increase workers.

Handle environment variables and secrets

Real-world test suites need credentials, API keys, and environment-specific URLs. Hardcoding them is a security risk. GitLab CI/CD variables solve this:

  1. Go to your GitLab project
  2. Navigate to Settings > CI/CD > Variables
  3. Add your variables (e.g., BASE_URL, AUTH_TOKEN)
  4. Mark sensitive values as Masked and Protected

Reference these variables in your .gitlab-ci.yml:

.gitlab-ci.yml
playwright_tests:
  script:
    - npm ci
    - npx playwright test
  variables:
    BASE_URL: $STAGING_URL
    CI: "true"

Access them in your tests through process.env.BASE_URL or configure them in playwright.config.ts via the use.baseURL property.

Handle flaky tests and retries

Flaky tests undermine confidence in the entire pipeline. Playwright has built-in retry support that helps separate genuine failures from intermittent flakiness:

playwright.config.ts
export default defineConfig({
  retries: process.env.CI ? 2 : 0,
});

Tests that fail then pass on retry are marked as "flaky" in the report. This is a diagnostic tool, not a fix. Track your flaky test rate over time. Industry data from the flaky test benchmark report shows that teams with flaky rates above 15% spend significantly more engineering hours on test maintenance.

Note: Retries are a diagnostic signal, not a permanent fix. If a test consistently needs 2 retries to pass, the root cause is unresolved. Prioritize stabilizing these flaky tests rather than masking the problem with more retries.

Common causes of flaky tests in CI:

  • Hardcoded waits. Replace page.waitForTimeout(5000) with Playwright's auto-waiting locators like toBeVisible() or toBeEnabled(). This is one of the key Playwright mistakes to avoid.
  • Shared state between tests. Each test should create its own data. Use Playwright's browser context isolation or the Page object model pattern.
  • Network-dependent assertions. Mock external API calls using Playwright's route interception instead of hitting real endpoints.
  • Timing issues. CI runners have fewer CPU cores than developer machines. Tests that barely pass locally often fail under CI resource constraints.

For teams managing flaky test detection at scale, automated tools that track failure patterns across pipelines and branches save significant debugging time compared to manual triage.

Speed up slow pipelines

If your Playwright in GitLab CI pipeline exceeds 15 minutes, try these optimizations in order:

  1. Use the official Docker image to skip browser installation (saves 30-60s)
  2. Cache node_modules to avoid re-downloading npm packages:

.gitlab-ci.yml
playwright_tests:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push

         3. Enable fullyParallel: true to distribute individual tests across workers

         4. Reduce browser scope. If you only need Chromium for CI, run                                                                        npx playwright test --project=chromium instead of all three browsers

         5. Shard the suite across multiple parallel jobs (covered in the sharding section)

For a deeper analysis of execution speed optimization, the guide on slow playwright tests covers profiling techniques and config changes.

Complete production-ready .gitlab-ci.yml

Here is the full pipeline configuration combining sharding, caching, artifacts, JUnit reporting, and a merge job:

.gitlab-ci.yml
stages:
  - test
  - report
variables:
  CI: "true"
playwright_tests:
  stage: test
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  parallel: 4
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push
  script:
    - npm ci
    - npx playwright test --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  variables:
    BASE_URL: $STAGING_URL
  artifacts:
    when: always
    paths:
      - blob-report/
      - test-results/
    reports:
      junit: test-results/results.xml
    expire_in: 1 day
merge_reports:
  stage: report
  when: always
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull
  script:
    - npm ci
    - npx playwright merge-reports --reporter html ./blob-report
  artifacts:
    when: always
    paths:
      - playwright-report/
    expire_in: 30 days

This production-ready Playwright in GitLab CI configuration follows the patterns recommended in the Playwright automation checklist and works well for teams running Playwright CI/CD integrations across staging and production environments.

Scaling reports beyond artifact downloads

For teams running hundreds of tests across multiple shards and branches, downloading artifact zips from GitLab becomes unsustainable. This is where test automation reporting platforms add real value.

Tools like TestDino integrate directly with your CI pipeline and provide:

  • Pass/fail trends across branches and commits
  • Flaky test detection with automatic pattern recognition
  • Screenshots, videos, and traces viewable in the browser
  • Historical run comparisons to catch regressions early

The Playwright reporting ecosystem also includes reporters like list, json, junit, and the blob reporter used in sharding. You can stack multiple reporters in your config to serve different needs simultaneously.

Tracking reporting metrics like pass rate, flake rate, and average execution time gives your team data-driven signals about pipeline health rather than relying on gut feelings.

Debug CI failures faster
View traces, screenshots, and logs without downloading zips.
Get started CTA Graphic

Conclusion

Setting up Playwright in GitLab CI is not just about dropping a YAML file in your repository root. A production-grade pipeline involves making real decisions about Docker images, artifact storage, parallel sharding, environment variable management, and retry configuration.

Start simple. The basic single-job configuration from the first section is enough for small projects with under 50 tests. As your suite grows, add caching to speed up npm installs. When runs start exceeding 10 to 15 minutes, introduce sharding with GitLab's parallel keyword.

When your team grows to the point where multiple engineers triage failures every morning, invest in a proper test reporting tool that surfaces trends without manual artifact downloads. The key takeaway: treat your Playwright GitLab CI pipeline as a first-class part of your testing infrastructure and optimize it just like you would optimize the tests themselves.

FAQ

How do I run Playwright tests only on merge requests in GitLab CI?
Add a rules block to your job configuration that limits execution specifically to merge request pipelines. You can achieve this by setting an if condition where the pipeline source equals a merge request event. This prevents tests from running on direct branch pushes and only triggers them during MR updates.
How many shards should I use for my test suite in GitLab CI?
Divide your total test count by 50 as a reliable starting point. A 200-test suite works perfectly with 4 shards. Each shard carries fixed overhead like Docker image pulls and runner initialization. Going beyond 10 shards usually hits diminishing returns on shared runners.
Can I use GitLab CI cache for Playwright browser binaries?
Playwright's official documentation does not recommend caching browser binaries. The time to restore a large cache is comparable to downloading the binaries fresh. Instead, use the official Playwright Docker image to eliminate the download step entirely.
Why do my Playwright tests pass locally but fail in GitLab CI?
The most common cause is missing environment variables or using hardcoded waits that fail on slower runners. It can also happen if you use a generic Node.js image without installing system dependencies. Always check for font rendering differences between your local OS and the Ubuntu CI container.
How do I view the Playwright report from a failed GitLab CI run?
Go to the failed pipeline in GitLab, click on the job, and find the "Job artifacts" section. Click "Download" to get the zip file containing your test report. Extract it and run the Playwright show-report CLI command to view the interactive HTML report.
Is it possible to run only specific test files in GitLab CI?
Yes, you can target specific files or directories by modifying the script command in your pipeline. Instead of running the entire suite, append the directory path directly to the test command. You can also use the --grep flag to execute tests matching a specific tag or title pattern.
Savan Vaghani

Product Developer

Savan Vaghani builds the frontend at TestDino, a SaaS platform that turns Playwright test data into something teams actually want to look at.

His day to day sits at the intersection of product and engineering. He designs multi tenant dashboards that help QA and dev teams track test runs, surface flaky tests, and monitor CI health without forcing anyone to dig through raw logs.

The stack is React and TypeScript, but the real work is in the product decisions. He works on onboarding flows that reduce time to value, GitHub integrations that meet teams where they already work, and interface details that make complexity feel simple.

He thinks a lot about the gap between "technically correct" and "actually usable", and tends to close it.

Get started fast

Step-by-step guides, real-world examples, and proven strategies to maximize your test reporting success