Maybe it’s because I work in automated testing, but I get an ad for Rainforest QA and their self-healing tests on LinkedIn every single day.
This idea of a self-healing test caught my eye, especially as someone working on automated game testing in Unity. Games are often an order of magnitude more dynamic than that of web or app software. Even small changes in game mechanics and graphics can cause tests to break, not to mention the inherent randomness and non-deterministic behavior that make games replayable and fun.
Can we implement an effective self-healing test system for games? Let's explore this idea further and see where it leads.
At its core, a self-healing test is a test in which AI can rewrite portions of the test when it fails by identifying elements in a piece of software based on the intent of the original test.
It seems like the space has grown quite a bit over the past few years in the wake of LLMs. Before, pure machine learning approaches would focus on classifying new UI elements on a page when the expected elements could not be located. With LLMs, training complicated models is no longer necessary, and tools like Healenium and Rainforest QA can generally write new rules on the fly using natural-language descriptions of the desired tests. These tools primarily focus on regenerating the rules for finding elements on the page (e.g. the expected login button tag was missing, but we see this new login button - let’s use that), with quite a few limitations on wildly changing the actions taken as part of the test.
Not everyone is sold on the idea of self-healing tests. For instance, take this quote from an excellent article by Eliya Hasan regarding the practical downsides of this approach:
... the reliability of self-healing locators comes into question due to their susceptibility to false positives — misinterpreting changes and leading to incorrect adjustments in tests. This not only undermines the integrity of test results but also complicates the debugging and maintenance processes ...
We still have a long way to go, but if implemented in a way that gives humans the ultimate control over how these tests are accepted, this has the potential to really change how testing is done.
Self-healing tests in games is an interesting concept because the problems that healing tests face in regular software are even more pronounced in this domain. In a game test, it's not often just a "missing element" causing the test to become flaky.
These facts of game testing often cause tests to break and become flaky, and this is cited as a common reason why developers abandon the idea of automated tests. Of course, developers need to write better tests, but this fact alone leads me to think that a system that can heal game tests could be beneficial. Just see this post from Reddit when someone asked why game developers don’t create tests early in the game development process.
I’ve seen this firsthand in Unity Test Runner files as well, where quality assurance teams will start with writing a simple script for automating their gameplay, but it eventually gets removed due to the difficulty in maintaining that file as the game grows. Especially in Unity, the support for dynamic and conditional behavior in tests is lacking (a problem we are looking to solve with smarter agents).
I suspect that a combination of multiple test runs, using humans in the loop, and providing better testing tools that support the dynamics of a game could be a decent way to support self-healing game tests and solve some of these issues.
There are a few things I want to think more deeply about and experiment with at Regression Games:
There is a lot more to say on this subject - let's chat about it as a community! Join the conversation on our Discord.
References
https://help.rainforestqa.com/docs/how-to-generate-self-healing-tests
https://www.linkedin.com/pulse/automation-self-healing-healenium-venkata-goday/
https://www.testim.io/test-stability/