Testing on Thin Ice: Chipping Away at Test Unpredictability

A presentation at DevNexus in March 2025 in Atlanta, GA, USA by Brian Demers

Slide 1

Testing on Thin Ice: Chipping Away at Test Unpredictability

Slide 2

DevNexus 2025 | Testing on Thin Ice

LOTS OF RERUN TESTS

Brian Demers, Java Champion, Developer Advocate, @bdemers
François Martin, Senior Full Stack Software Engineer, @martinfrancois

Slide 3

$36 Billion/year (USD): Lost Productivity
$40 Billion/year (USD): End World Hunger by 2030

🦋bdemers.io | bdemers

Slide 4

Topics

● What is a Flaky Test?
● How to find them
● How to deal with them
● How to fix them

Slides & eBook

🦋fmartin.ch | FrançoisMartin

Slide 5

What is a Flaky Test?

No Flake News!

Slide 6

Definition

A test that exhibits both passing and failing results with the same code when run multiple times.

Slide 7

Timing related

Example:
1. Enter search term into textbox
2. Click on button “Search”
3. Wait for one second
4. Click on first result

Slide 8

Test data related

Examples:
● Input field for first name doesn’t allow special characters
  ○ Some randomly generated first names in tests contain special characters (“François”)
● Tests reliant on predefined test data in the database
  ○ Dependencies between tests
  ○ Failing tests cause inconsistencies

Slide 9

Infrastructure related

● Unstable infrastructure
● Tests time out
● Most difficult to reproduce

Examples:
● Build runners under heavy load
● Microservices with networking issues

Slide 10

Resource contention related

Mostly when running tests in parallel

Examples: Multiple tests trying to…
● … write to the same file at the same time
  ○ File lock
  ○ Concurrent modification errors
● … create a user with the same username at the same time
  ○ Unique constraint violations
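One common way to sidestep the username collision described above is to give every test its own generated name. The following is a minimal sketch, not something shown in the talk; the class `TestNames` and the method `uniqueUsername` are hypothetical names chosen for illustration.

```java
import java.util.UUID;

// Hypothetical helper (not from the talk): builds usernames that are
// extremely unlikely to collide across tests running in parallel.
class TestNames {

    /** Returns the given prefix followed by a random UUID. */
    static String uniqueUsername(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Each call yields a different name, so two parallel tests that both
        // "create user alice" no longer compete for the same database row.
        System.out.println(uniqueUsername("alice"));
        System.out.println(uniqueUsername("alice"));
    }
}
```

The same idea applies to file names: deriving them from a unique suffix keeps parallel tests from writing to the same file.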

Slide 11

Why Worry about Flaky Tests?

● Re-running tests wastes a lot of time
● Repetitive investigation efforts to determine whether a failure is flaky or legitimate
● Team starts losing trust in test results
● Bad software quality
● Pilots frequently ignoring an alarm in the cockpit of a Boeing 737 resulted in a fatal plane crash in 2005

Slide 12

Unit Tests vs Integration Tests

UT vs IT

Slide 13

Testing Pyramid

● End-to-End Tests
● Integration Tests
● Unit Tests

Slide 14

Flaky Foundations

● Unit Tests
● Integration Tests
● End-to-End Tests

Slide 15

[Figure: testing pyramid with layers Unit Tests, Integration Tests, and End-to-End Tests; axes labeled Complexity and Flakiness]

Slide 16

How to find Flaky Tests

Flake Busters

Slide 17

Running tests in parallel

Can reveal many issues:
● Resource contention
● Dependency on test data
● Dependency on test execution order

Side benefit: Tests execute faster

● Playwright & WebdriverIO: parallel by default
● Cypress: only in cloud (paid)
● JUnit and Gradle: through configuration

Slide 18

Rerun Tests

Most test frameworks allow for executing a test multiple times.

Slide 19

Maven Surefire Plugin

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <rerunFailingTestsCount>3</rerunFailingTestsCount>
    <failOnFlakeCount>0</failOnFlakeCount>
  </configuration>
</plugin>

Slide 20

Rerun Gotcha

Increased time to fail

Slide 21


Slide 22

Dealing with Flaky Tests

Slide 23

Prevention

In Pull / Merge Requests, run new or changed tests 3, 10, 50, or 1000 times (depending on resources and test type).

Alternatively, run all tests multiple times nightly.
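The idea of running a test many times can be sketched in plain Java. This is an illustration only, not the speakers' tooling: `RepeatRunner`, `Verdict`, and `classify` are hypothetical names, and a real setup would more likely use the repeat or rerun feature of the test framework or build tool. The sketch applies the definition from earlier in the talk: a check that both passes and fails on unchanged code is flaky.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not from the talk): repeats a check and classifies the outcome.
class RepeatRunner {

    enum Verdict { STABLE_PASS, STABLE_FAIL, FLAKY }

    /** Runs the check `runs` times; a thrown exception or assertion error counts as a failure. */
    static Verdict classify(BooleanSupplier check, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        int passes = 0;
        for (int i = 0; i < runs; i++) {
            boolean passed;
            try {
                passed = check.getAsBoolean();
            } catch (RuntimeException | AssertionError e) {
                passed = false;
            }
            if (passed) {
                passes++;
            }
        }
        if (passes == runs) {
            return Verdict.STABLE_PASS;
        }
        if (passes == 0) {
            return Verdict.STABLE_FAIL;
        }
        // Mixed results on the same code: flaky by definition.
        return Verdict.FLAKY;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated flaky check: fails on every third call.
        System.out.println(classify(() -> ++calls[0] % 3 != 0, 10)); // prints FLAKY
    }
}
```

More runs raise the chance of catching a rare failure at the cost of build time, which is why the slide ties the count to available resources and test type.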

Slide 24

Tell Your Team!

● Don’t let them waste time
● Uncover deeper issues
● Saves morale

Slide 25

New Sprint Checklist

● Fix a flaky test
● Pull a new issue

Slide 26

Flaky Test Days

Slide 27

Temporarily Quarantine Tests

// JUnit
@Disabled("quarantine: reason or link to issue here")
@Test
void flakyTest() { }

// Jest
// quarantine: reason or link to issue here
it.skip('should throw an error', () => {
  expect(response).toThrowError(expected_error);
});

Slide 28

Quarantine

● Quarantine a test after its first failure
● Fix quarantined tests promptly
● Set a limit for the number of tests in quarantine
● Define a time limit for quarantined tests

Slide 29

Delete Flaky Tests

● Does the test provide value?
● Can it be fixed?

Keeping a flaky test provides negative value

Slide 30

SSH into CI Agent

$ ssh ci-agent
cd /your-project
# rerun build
./mvnw clean install

Slide 31

How to fix Flaky Tests

Defeating Murphy’s Law

Slide 32

Test Data Consistency

// JUnit 5
@BeforeEach
void setup() {
    // Create test data
}

@AfterEach
void teardown() {
    // Delete test data
}

// Jest
beforeEach(() => {
    // Create test data
});

afterEach(() => {
    // Delete test data
});

Slide 33

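Test Data Consistency

// Using Testcontainers

// build.gradle (the PostgreSQL module's artifact id is "postgresql")
testImplementation "org.testcontainers:postgresql:1.19.8"
testImplementation "org.testcontainers:junit-jupiter:1.19.8"

// JUnit Test
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers
public class IntegrationTest {
    // Non-static field: a fresh container is started for each test method
    @Container
    PostgreSQLContainer<?> postgresContainer =
        new PostgreSQLContainer<>("postgres:16.3");
}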

Slide 34

Isolate Tests

Ports Source Data Tests Data

Slide 35

Don’t Reuse System Resources

● Setting System Properties
  System.setProperty("key", "value");
● Use unique files for each test
  temp directories, output files, etc.
● Use a random free ephemeral port (49152 - 65535)
  new ServerSocket(0).getLocalPort()
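The file and port advice above can be sketched with the Java standard library alone. This is a minimal illustration under stated assumptions, not code from the talk: `IsolatedResources`, `findFreePort`, and `newTempDir` are hypothetical names. Two caveats apply. The range the operating system actually picks from may differ from the one on the slide, and a port reported as free can be taken by another process before the test binds it, so passing port 0 directly to the server under test is safer where that is possible.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not from the talk): per-test ports and directories.
class IsolatedResources {

    /** Binds to port 0 so the OS assigns a free port, then releases the socket. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a new, uniquely named directory under the system temp directory. */
    static Path newTempDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("free port: " + findFreePort());
        Path dir = newTempDir("flaky-demo-");
        System.out.println("temp dir created: " + Files.isDirectory(dir));
        Files.delete(dir); // clean up so the demo leaves nothing behind
    }
}
```

Unlike the one-liner on the slide, the try-with-resources block closes the probing socket, so the port is actually available afterwards.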

Slide 36

In E2E Tests

● Use deep links to navigate directly to the page to be tested
● Start the test with the test user already logged in
● Make it possible to run each test spec independently
● Do not wait for fixed time intervals; wait for a condition to be fulfilled, with a timeout
● Use stable test ids in selectors
● Consider mocking external APIs
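The advice to wait for a condition rather than a fixed interval can be sketched as a small polling helper. This is an illustration, not the speakers' code: `Waiter` and `waitUntil` are hypothetical names, and E2E frameworks usually ship their own waiting mechanisms that should be preferred when available.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper (not from the talk): condition-based waiting with a timeout.
class Waiter {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "search results are visible": becomes true after about 50 ms.
        boolean ready = waitUntil(
                () -> System.currentTimeMillis() - start >= 50,
                Duration.ofSeconds(2),
                Duration.ofMillis(10));
        System.out.println("ready: " + ready);
    }
}
```

Compared with the "wait for one second" step in the timing example, this returns as soon as the condition holds and only fails once the full timeout has passed, which makes the test both faster and less sensitive to slow infrastructure.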

Slide 37

Convert ITs and E2E into UTs

[Figure: testing pyramid with layers Unit Tests, Integration Tests, and End-to-End Tests; axis labeled Flakiness]

Slide 38

Convert ITs and E2E into UTs

[Figure: testing pyramid with layers Unit Tests, Integration Tests, and End-to-End Tests; axis labeled Stability]

Slide 39

Fix the Culture of Flaky Tests

● Brown Bag / Lunch & Learn
● Make flaky tests visible
● Track them
● Use manager terms 💰

Slide 40

THANKS! QUESTIONS?

bdemers.io | bdemers
fmartin.ch | FrançoisMartin

CREDITS: This presentation template was created by Slidesgo and includes icons by Flaticon, infographics and images by Freepik, and images generated by OpenAI’s DALL-E model via ChatGPT.

Slides & eBook