AI-powered e2e testing: Getting started with Shortest


End-to-end (e2e) testing is essential for ensuring software applications function correctly. However, traditional testing tools like Selenium and Cypress can be difficult to use because they have steep learning curves, fragile tests, and require a lot of maintenance.

AI-powered testing tools like Shortest, Testim, Mabl, and Functionize directly address these problems. They use natural language processing (NLP) and self-healing tests, making it easier to create and maintain tests, which means you don’t need to be a coding expert to use them.

This article looks at how AI-powered testing tools compare to traditional ones and their main benefits. We’ll take a close look at Shortest, an open source AI-powered testing library, its features, and how it simplifies the testing process.

Challenges with traditional testing frameworks

Traditional end-to-end testing frameworks are important for automated testing, but they have several drawbacks:

  • Steep learning curve: To write and maintain test scripts in tools like Selenium or Cypress, team members need coding skills. This makes it hard for those without a technical background (e.g., business analysts or product managers) to take part in the testing process
  • High maintenance overhead: When applications change, tests often fail and need manual updates, which leads to high maintenance costs
  • Slow test creation: Creating tests takes a lot of time when the scenarios are complex. This slows down development cycles

How AI-powered tools address these challenges

AI-driven testing solutions help solve common problems by introducing several key features:

  • Natural Language Processing (NLP): Users can write test cases in plain language. This makes it easier for those without coding skills to participate and improves teamwork among developers, QA engineers, and product teams
  • Self-healing tests: These tools can adjust to changes in the user interface, which means less manual work is needed to keep tests up to date
  • Smart test generation: AI creates test cases based on how users behave and use the application

AI-powered end-to-end (e2e) testing tools offer several benefits compared to traditional frameworks:

Time saving

AI testing tools help reduce the time needed to create and maintain tests. What once took hours or days can now often be done in minutes. You don’t need to write custom code for every test case. Less time is spent debugging tests that break easily, test maintenance becomes automatic when the application changes, and you get immediate feedback on whether tests are valid while you create them.

Studies show that switching to AI-powered tools can cut test creation time by up to 80%. This gives developers more time to focus on building features instead of maintaining tests.

Reduced maintenance overhead

AI testing tools have self-healing features that lower the maintenance load for teams, especially those dealing with fragile test suites that often break during development. When user interface elements change, these tools can automatically spot the changes, use machine learning to find replacement elements, continue running tests without needing manual fixes, and learn from successful changes to improve future performance.
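
To make the self-healing idea concrete, here is a minimal TypeScript sketch of one simple strategy: keep several candidate locators per element and fall back to the next one when the preferred locator stops matching. The UiElement shape, the resolveWithFallback helper, and the locator labels are hypothetical, and commercial tools rely on richer signals (including machine learning) than this ordered fallback.

```typescript
// Hypothetical, simplified model of a rendered UI element.
interface UiElement {
  id?: string;
  text?: string;
}

// A candidate locator: a label plus a predicate that decides whether an element matches.
interface Candidate {
  label: string;
  matches: (el: UiElement) => boolean;
}

interface Resolution {
  element: UiElement;
  usedLabel: string;
  healed: boolean; // true when the preferred locator failed and a fallback matched
}

// Try each candidate in priority order and return the first element that matches.
function resolveWithFallback(
  dom: UiElement[],
  candidates: Candidate[],
): Resolution | undefined {
  let index = 0;
  for (const candidate of candidates) {
    const element = dom.find(candidate.matches);
    if (element) {
      return { element, usedLabel: candidate.label, healed: index > 0 };
    }
    index++;
  }
  return undefined; // nothing matched, so the test should fail loudly
}

// Example: the button's id was renamed, but its visible text is unchanged.
const domAfterRedesign: UiElement[] = [
  { id: "task-input" },
  { id: "submit-task", text: "Add" },
];

const addButton = resolveWithFallback(domAfterRedesign, [
  { label: "id=add-btn", matches: (el) => el.id === "add-btn" },
  { label: "text=Add", matches: (el) => el.text === "Add" },
]);
```

Here the preferred id locator no longer matches, so the lookup falls back to the visible text and flags the result as healed. A real tool could use that flag to update the stored locator.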

Improved collaboration

AI testing tools help team members, both technical and non-technical, work better together. They help product managers check that tests accurately reflect user experiences. QA specialists and business stakeholders can create and maintain tests without coding skills. They also allow developers to concentrate on complex issues rather than basic tests. This teamwork ensures that testing aligns with business needs and that everyone shares responsibility for maintaining quality.

Scalability and reliability

AI tools make it easier to scale testing for complex applications. They help teams create tests faster, run them in parallel on different devices in the cloud, choose the right tests intelligently, and reduce test failures, which leads to more reliable results. With this scalability, teams can keep their tests thorough even as applications grow and change.

Here’s a simple overview of four popular tools: Shortest, Testim, Mabl, and Functionize, each offering AI-driven end-to-end testing.

Shortest

Shortest is an open source testing framework that uses NLP to understand test descriptions. This makes it easy for anyone, even those with limited technical skills, to create tests. Built on Playwright, Shortest can automate browser tasks with little coding. Shortest is great for teams looking for quick and easy test creation, though using an external API might slow it down.

Key features of Shortest include:

  • Natural language testing: Write tests in plain English (e.g., “Log in to the app using email and password.”), and the AI will take care of the interaction
  • Advanced features: Chain tests for workflows (e.g., login followed by updates) and conduct API testing using natural language
  • Integrations: Supports GitHub 2FA, CI/CD pipelines, and email validation through Mailosaur for secure testing
  • Ease of use: The shortest init command sets up a project quickly, and tests can run in headless or visible modes

Testim

Testim by Tricentis is a testing platform that speeds up the creation and maintenance of tests for web and mobile apps. It uses machine learning to make tests stable and less flaky.

Testim is ideal for agile teams needing strong regression testing, but its pricing can be a hurdle for smaller projects. Some of its key features include:

  • AI-powered stabilizers: Its smart locators analyze UI elements, adapting tests to changes in layout
  • Low-code authoring: Its tests can be recorded visually or coded, so both non-technical users and developers can use it
  • Scalability: It runs thousands of tests quickly across different browsers, with detailed reports on failures
  • CI/CD integration: It easily fits into DevOps pipelines for continuous testing

Mabl

Mabl is an AI-based test automation platform for web, mobile, and API testing. It focuses on accessibility and collaboration. Mabl is great for teams that want speed and minimal coding, but some of its advanced features may take some time to learn.

Key features of Mabl include:

  • Intuitive AI: Quickly creates tests, auto-fixes tests for UI changes, and uses computer vision to find visual issues
  • Comprehensive testing: Supports functional, performance, and accessibility testing, along with API tests through Postman
  • Performance insights: Monitors page load times and test runs to catch problems early
  • Team collaboration: Works with CI/CD tools and communication platforms like Slack for smoother teamwork

Functionize

Functionize is a high-end testing platform that uses machine learning and computer vision for functional, performance, and visual testing. It features self-healing tests and scalability. Functionize is ideal for large projects that change often, but its costs and Windows-centric design might make it less accessible for smaller teams.

Key features of Functionize include:

  • Self-healing tests: Automatically updates tests when the UI or functions change, cutting down on maintenance
  • Visual AI: Uses computer vision for accurate recognition of elements, so tests adapt to changing interfaces
  • Parallel testing: Runs tests on multiple browsers and devices at the same time for faster execution
  • Root cause analysis: Helps find the reasons behind test failures, making debugging easier for complex systems

| Feature | Shortest | Testim | Mabl | Functionize |
| --- | --- | --- | --- | --- |
| Core technology | AI-powered (Anthropic Claude API), built on Playwright | Machine Learning (Smart Locators), cloud-based | AI-native, low-code, uses ML and computer vision | AI and ML with NLP and computer vision, cloud-based |
| Test creation | Natural language descriptions (e.g., “Login with email”) | Record-and-replay, low-code visual editor, supports coded enhancements | Low-code, AI-powered action words, visual recorder | NLP for scriptless tests, visual test editor |
| Ease of use | High: Plain English tests, minimal setup with shortest init | High: Codeless for non-technical users, intuitive UI | High: Codeless focus, accessible for beginners | Moderate: Scriptless but may require learning for advanced features |
| Self-healing tests | Limited: Relies on AI to adapt to minor changes, no explicit self-healing | Yes: Smart Locators auto-update element references | Yes: Auto-heals tests for UI/data changes | Yes: Strong self-healing with ML-driven updates |
| Supported test types | Functional, API, UI, GitHub 2FA authentication | Functional, UI, mobile (web/native), visual testing | Functional, performance, accessibility, API, visual regression | Functional, performance, load, visual, API |
| Integration | GitHub, Mailosaur, basic CI/CD support | CI/CD (Jenkins, Azure DevOps), Jira, Slack, Tricentis Device Cloud | CI/CD (GitHub, Azure, Bitbucket), Postman, Slack | CI/CD (Jenkins, GitLab), third-party apps via API Explorer |
| Cross-browser/device support | Yes: Playwright-based, supports multiple browsers | Yes: Real browsers, iOS/Android native apps | Yes: Web, mobile, cross-browser/devices | Yes: Extensive browser/device coverage, parallel testing |
| Pricing model | Open source and depends on Anthropic API usage | Free tier, Essentials/Pro plans, custom pricing | Pay-as-you-go, subscription plans, custom pricing | Custom pricing, potentially high for small teams |
| Learning curve | Low: Natural language reduces technical barriers | Low: Codeless options, moderate for coded enhancements | Low: Intuitive GUI, low-code approach | Moderate: Advanced features require familiarity |
| Scalability | Moderate: Suitable for small to medium projects, API dependency | High: Scales for agile teams, parallel testing | High: Cloud-based, scales for continuous testing | High: Enterprise-grade, supports large-scale parallel testing |
| Unique strength | Natural language simplicity, GitHub 2FA support | Smart Locators for flaky test reduction, mobile native app support | AI-driven test generation, performance insights | Visual AI, comprehensive test coverage for complex apps |
| Best for | Teams wanting simple, scriptless E2E testing with minimal coding | Agile teams needing fast test creation and maintenance | DevOps teams prioritizing codeless, continuous testing | Enterprises with complex apps needing robust, scalable testing |
| Limitations | External API reliance, limited performance/accessibility testing | Less focus on performance, pricing complexity | Limited customization for advanced users, higher cost | High cost, Windows-centric design, less flexible for small teams |

Testing with Shortest: A case study

In this section, we’ll look at how to test a demo application using Shortest. We’ll cover setup, writing a natural language test, and demonstrate advanced features like test chaining and API testing. Our demo app will be a simple React-based to-do list application using Next.js, which allows users to add, view, and delete tasks. The application will have a frontend UI and a basic API endpoint to fetch tasks.

To follow along, clone the GitHub repo, cd into the project directory, and run npm install && npm run dev. The app provides a simple UI where users can add and delete tasks, which are stored in the component’s state, and an API that returns a static list of tasks, simulating a backend response.

Adding Shortest to our application

To set up Shortest in a new or existing project, run the command below:

npx @antiwork/shortest init

This command will:

  • Install @antiwork/shortest as a dev dependency
  • Create a shortest.config.ts file
  • Generate a .env.local file with placeholders
  • Update .gitignore to include .env.local and .shortest/

Now edit shortest.config.ts to match the application setup:

import type { ShortestConfig } from "@antiwork/shortest";
export default {
  headless: false,
  baseUrl: "http://localhost:3000",
  browser: {
    contextOptions: {
      ignoreHTTPSErrors: true
    },
  },
  testPattern: "**/*.test.ts",
  ai: {
    provider: "anthropic",
    apiKey: process.env.ANTHROPIC_API_KEY
  },
} satisfies ShortestConfig;

Edit .env.local and add your Anthropic API key (you’ll need to sign up with Anthropic to get one). You can also configure browser behavior through the browser.contextOptions property in your config file, which passes custom options to the Playwright browser context.

Ensure .env.local is in .gitignore to avoid committing sensitive data.
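
A missing key tends to surface only once a test tries to reach the API, so it can help to fail early. The helper below is a minimal sketch: requireApiKey is a hypothetical name rather than part of Shortest, and it takes the environment as a plain object so that it can be checked without touching real secrets.

```typescript
// Hypothetical helper (not part of Shortest): fail fast when the API key is missing.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env, name = "ANTHROPIC_API_KEY"): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`${name} is not set. Add it to .env.local before running tests.`);
  }
  return value;
}

// Usage sketch: in shortest.config.ts you could pass the real environment, e.g.
//   apiKey: requireApiKey(process.env)
// The placeholder key below is for illustration only.
const exampleKey = requireApiKey({ ANTHROPIC_API_KEY: " sk-example " });
```

Surrounding whitespace is trimmed, so a key pasted with a stray space still resolves, while an empty or absent value stops the run with a clear message.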

Writing and executing a natural language test

In this section, we’ll explore how to write and execute tests using Shortest. We’ll write a test to verify adding a task to the to-do list.



Create a test file that matches the testPattern in the config file, for example app/todo.test.ts:

import { shortest } from '@antiwork/shortest';

shortest('Add a new task to the to-do list', {
  task: 'Buy groceries',
});

This test instructs Shortest to add a task with the text “Buy groceries” to the list. Now run the test using this command:

npx shortest app/todo.test.ts 

Here’s what happens:

  • Shortest launches a browser (Playwright-based) in non-headless mode (headless: false)
  • It navigates to http://localhost:3000
  • The Anthropic Claude API interprets the natural language description, identifies the input field and “Add” button, enters “Buy groceries,” and clicks the button
  • A screenshot is saved in .shortest/ for verification

The test passes if the task appears in the list. You’ll see the browser perform the actions live, and the console will report success:

Found 1 test file(s)
❯ app/todo.test.ts (1)
  ● Add a new task to the to-do list
    ✓ passed
  ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ 

   Tests          1 passed (1)
   Duration       10.84s
   Started at     3:47:47 PM
   Tokens         0 tokens (≈ $0.00)

 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯

Demonstrating advanced features like test chaining and API testing

Let’s demonstrate test chaining and API testing to showcase Shortest’s advanced capabilities. We’ll chain tests to add a task and then delete it. Edit the app/todo.test.ts file and execute the test:

import { shortest } from '@antiwork/shortest';

shortest([
  'Add a new task to the to-do list with text Buy groceries',
  'Delete the task with text Buy groceries from the to-do list',
]);

Shortest will make sure that:

  • The first test adds “Buy groceries” to the list
  • The second test locates the task and clicks its “Delete” button
  • Shortest’s AI ensures the sequence executes correctly, maintaining browser state between tests
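
Since a chain is just an array of strings, longer workflows can be assembled from small reusable pieces. The sketch below is one possible way to do that; the addTask, deleteTask, and addThenDeleteChain helpers are hypothetical names introduced for illustration.

```typescript
// Hypothetical helpers that build the natural language steps used in a chain.
const addTask = (text: string): string =>
  `Add a new task to the to-do list with text ${text}`;
const deleteTask = (text: string): string =>
  `Delete the task with text ${text} from the to-do list`;

// Build an add-then-delete pair for each task, preserving order.
function addThenDeleteChain(tasks: string[]): string[] {
  const steps: string[] = [];
  for (const task of tasks) {
    steps.push(addTask(task), deleteTask(task));
  }
  return steps;
}

const chain = addThenDeleteChain(["Buy groceries"]);
// In app/todo.test.ts the array would then be passed to Shortest:
//   shortest(chain);
```

For a single task, this produces the same two steps as the hand-written chain above, and additional tasks only require extending the input array.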

Now, let’s test the /api/tasks endpoint to ensure it returns the expected tasks. Update the app/todo.test.ts file so that it contains the code below, then execute the test:

import { shortest } from '@antiwork/shortest';

const API_BASE_URI = 'http://localhost:3000/api';

// UI Test Chain
shortest([
  'Add a new task to the to-do list with text Buy groceries',
  'Delete the task with text Buy groceries from the to-do list',
]);
// API Test
shortest(`
  Test the API GET endpoint ${API_BASE_URI}/tasks
  Expect the response to contain a list of tasks including Sample Task 1
`);

Here’s what happens:

  • The UI tests run as before
  • The API test sends a GET request to /api/tasks
  • Shortest’s AI verifies that the response includes “Sample Task 1” (from our static tasks array)
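
For comparison, the check that the natural language test describes can also be written by hand. This is a minimal sketch under assumptions: the article does not show the response body, so the Task shape with id and title fields and the includesTask helper are hypothetical.

```typescript
// Assumed response shape; the demo API's real fields may differ.
interface Task {
  id: number;
  title: string;
}

// Narrow unknown JSON to Task[] so that a malformed response fails the check.
function isTaskList(body: unknown): body is Task[] {
  return (
    Array.isArray(body) &&
    body.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        typeof (item as Task).id === "number" &&
        typeof (item as Task).title === "string",
    )
  );
}

function includesTask(body: unknown, title: string): boolean {
  return isTaskList(body) && body.some((task) => task.title === title);
}

// Stand-in for the parsed JSON that GET /api/tasks might return.
const sampleBody: unknown = [
  { id: 1, title: "Sample Task 1" },
  { id: 2, title: "Sample Task 2" },
];
const found = includesTask(sampleBody, "Sample Task 1");
```

The hand-written version has to commit to a response shape up front, whereas the natural language test leaves that interpretation to the model.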

The API test passes if the response contains the expected task. Shortest then logs the API response details, and the test suite completes successfully:

  ● Test the API GET endpoint http://localhost:3000/api/tasks Expect the response to contain a list of tasks including Sample Task 1
    ✓ passed
    ↳ 6,414 tokens (≈ $0.02)
  ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ 

   Tests          1 passed (1)
   Duration       19.65s
   Started at     4:32:52 PM
   Tokens         6,414 tokens (≈ $0.02)

 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
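
Each AI-driven run consumes tokens, so it is worth estimating what repeated runs might cost. The arithmetic below is a rough sketch that takes the figures logged above (6,414 tokens, about $0.02) as given. The run schedule is a made-up example, and actual pricing depends on the model and on the split between input and output tokens.

```typescript
interface LoggedRun {
  tokens: number;
  usd: number; // rounded cost as printed in the console summary
}

// Implied price per 1,000 tokens from a single logged run.
function usdPerThousandTokens(run: LoggedRun): number {
  if (run.tokens <= 0) throw new Error("tokens must be positive");
  return (run.usd / run.tokens) * 1000;
}

// Project spend for repeated runs of the same test (hypothetical schedule).
function projectedUsd(run: LoggedRun, runsPerDay: number, days: number): number {
  return run.usd * runsPerDay * days;
}

const apiTestRun: LoggedRun = { tokens: 6414, usd: 0.02 };
const rate = usdPerThousandTokens(apiTestRun); // roughly $0.003 per 1,000 tokens
const monthly = projectedUsd(apiTestRun, 20, 30); // 20 runs a day for 30 days
```

Under these assumptions, a single API test that runs 20 times a day would cost about $12 over 30 days, which is a useful order-of-magnitude figure when deciding which tests to run on every commit.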

Using callbacks for custom assertions

Shortest allows you to use callback functions for custom checks and actions after your browser tests run. This feature lets you create more complex test scenarios, like checking your database or making API calls, to verify your application’s state after user interactions.

To demonstrate callbacks, let’s add a test with a custom assertion to verify the task count after adding a task. Add this to the app/todo.test.ts file and run the test:

shortest('Add a task and verify task count', {
  task: 'Learn TypeScript',
}).after(async ({ page }) => {
  const taskCount = await page.locator('li').count();
  if (taskCount < 1) {
    throw new Error('No tasks found in the list');
  }
});

The callback confirms that the task was added, layering custom logic on top of the AI-driven steps. Here’s what happens:

  • The test adds “Learn TypeScript” to the list
  • The .after callback uses Playwright’s API to count <li> elements (tasks)
  • If at least one task exists, the assertion passes

The console reports the result:

  ● Add a task and verify task count
    ✓ passed
    ↳ 40,100 tokens (≈ $0.13)
  ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ 

   Tests          1 passed (1)
   Duration       55.93s
   Started at     4:35:45 PM
   Tokens         40,100 tokens (≈ $0.13)
  
 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
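
Counting li elements only proves that the list is not empty. A stricter check could confirm that the new task’s text is actually present. The sketch below assumes the tasks render as li elements, as in the test above. The assertTaskPresent helper and the ListLocator interface are hypothetical; the interface mirrors only Playwright’s Locator.allTextContents() method so that the helper can be exercised without a browser.

```typescript
// Minimal structural type covering the one Locator method this helper needs.
interface ListLocator {
  allTextContents(): Promise<string[]>;
}

// Throws unless at least one list item contains the expected task text.
async function assertTaskPresent(items: ListLocator, expected: string): Promise<void> {
  const texts = await items.allTextContents();
  const found = texts.some((text) => text.includes(expected));
  if (!found) {
    throw new Error(`Task "${expected}" not found. Saw: ${JSON.stringify(texts)}`);
  }
}

// Stand-in locator for illustration. In a real callback you would pass page.locator('li'):
//   .after(async ({ page }) => assertTaskPresent(page.locator('li'), 'Learn TypeScript'));
const fakeList: ListLocator = {
  allTextContents: async () => ["Learn TypeScript Delete"],
};
const check = assertTaskPresent(fakeList, "Learn TypeScript");
```

Because the helper depends only on allTextContents(), it accepts a real Playwright locator in the test and a simple stub elsewhere.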

Using lifecycle hooks

Lifecycle hooks let you run code before and after tests. This helps with tasks like setting up the task list, navigating to the app, cleaning the UI state, and more. In your app/todo.test.ts file, add the code below and run the test:

shortest.beforeAll(async ({ page }) => {
  await page.goto('http://localhost:3000');
  // Clear any existing tasks by deleting all visible tasks
  while (await page.locator('button:text("Delete")').count() > 0) {
    await page.locator('button:text("Delete")').first().click();
  }
});

shortest.beforeEach(async ({ page }) => {
  await page.reload();
});

shortest.afterEach(async ({ page }) => {
  // Clear the input field to prevent carryover
  await page.locator('input[placeholder="Enter a new task"]').fill('');
});

shortest.afterAll(async ({ page }) => {
  await page.close();
});

Here are the lifecycle hooks Shortest provides:

  • beforeAll: Executes once before all tests. Ideal for initial setup, such as navigating to the app and clearing any pre-existing tasks by clicking all “Delete” buttons
  • beforeEach: Executes before each test. Useful for resetting the UI state, like reloading the page to clear tasks stored in the component’s state
  • afterEach: Executes after each test. Handy for cleanup, such as clearing the input field to ensure no text persists between tests
  • afterAll: Executes once after all tests. Suitable for final cleanup, like closing the browser to free system resources

The hooks ensure a consistent and isolated testing environment. In the code above, each test starts with an empty task list, the input field is cleared post-test, and the browser is closed at the end, preventing state leakage and ensuring reliable test execution.
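
The ordering described above can be illustrated with a tiny stand-alone model. This is not Shortest’s implementation: runSuite and the Hooks type are hypothetical and only record the order in which each hook and test would fire.

```typescript
type Step = () => void;

interface Hooks {
  beforeAll?: Step;
  beforeEach?: Step;
  afterEach?: Step;
  afterAll?: Step;
}

// Run tests sequentially, wrapping them with hooks in the conventional order.
function runSuite(hooks: Hooks, tests: Step[]): void {
  hooks.beforeAll?.();
  for (const test of tests) {
    hooks.beforeEach?.();
    test();
    hooks.afterEach?.();
  }
  hooks.afterAll?.();
}

// Record every call so that the resulting order can be inspected.
const order: string[] = [];
runSuite(
  {
    beforeAll: () => order.push("beforeAll"),
    beforeEach: () => order.push("beforeEach"),
    afterEach: () => order.push("afterEach"),
    afterAll: () => order.push("afterAll"),
  },
  [() => order.push("test 1"), () => order.push("test 2")],
);
// order: beforeAll, beforeEach, test 1, afterEach, beforeEach, test 2, afterEach, afterAll
```

The model makes the isolation guarantee visible: every test is bracketed by its own beforeEach and afterEach, while beforeAll and afterAll run exactly once.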

Comparing Shortest with traditional testing frameworks

Shortest has many advantages over traditional frameworks like Selenium and Cypress.

Traditional testing tools require long and complicated code for browser tasks, and they don’t have built-in AI support, which makes writing tests slow and error-prone. For example, Cypress, while modern, still requires tests to be written in JavaScript. Even though it has started to implement some AI features like automatic test creation for missing UI elements, it is not primarily AI-driven.

Shortest’s AI features offer a different approach, allowing testers to write shorter, human-friendly tests and reducing the time and technical skills required to set up tests. For example, a login test in Selenium can take dozens of lines of code to navigate the website and manage waits, while Shortest can achieve this with just one simple sentence. Similarly, Cypress simplifies some tasks, but still needs specific commands like cy.get() and .click() to do so.

Shortest uses Playwright to provide performance similar to Cypress, but it also integrates with the Claude API to handle complex tasks automatically, such as managing dynamic forms or validating API responses. These are tasks for which traditional frameworks require manual coding.

It is important to note, however, that Shortest relies on Anthropic’s API, which means it depends on an external service. This is different from Selenium and Cypress, which are self-contained. Another thing to consider is that Shortest’s natural language method might feel less precise for developers who want detailed control over their tests.


However, for most end-to-end testing scenarios, Shortest’s ease of use and AI features make it a strong option, especially for teams that value speed and accessibility over deep customization.

Conclusion

AI-driven testing tools like Shortest, Testim, Mabl, and Functionize are changing how we do end-to-end testing. These tools use automation to help teams spend less time on maintenance and allow non-coders to take part in testing, resulting in higher quality software. While traditional tools like Selenium and Cypress are still effective, AI-powered tools offer a strong option for teams that want to improve their testing processes.

As AI technology advances, we will likely see even more improvements that simplify testing and strengthen software reliability.

