My work at Google
I worked as an intern on the Chromium infra team. Chromium has a continually running test suite (https://build.chromium.org/p/chromium/waterfall), and the (very large) codebase is continually updated with new versions. When you notice that a test is flaky (its success rate is strictly between 0% and 100%) at a certain version of the code, it is non-trivial to identify the version at which the test became flaky. Knowing this version is very valuable to the engineer, because she can compare it with the previous version to figure out what is causing the flaky behavior (or an automated script can point at what it thinks might be the culprit), and the test can then be fixed.
My project was to automate the task of finding the version of the code at which a test became flaky. Earlier versions of the code are explored with a sequential search whose step size increases up to a cap, until the state of the test is found to switch between flaky and stable; the exact version is then narrowed down with a sequential search within that last step. Both the cap and the step increment were tuned on existing flaky tests.
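To make the search concrete, here is a minimal sketch of the idea in Python. It is an illustration, not the code that runs in production: the function name `find_flake_start`, the predicate `is_flaky`, the integer version numbers, and the default values for `step_increment` and `max_step` are all assumptions made for this example. It also assumes that a test stays flaky from the culprit version onward, and that a single call to `is_flaky` gives a reliable answer (in practice, deciding whether a test is flaky at a given version means running it many times).

```python
from typing import Callable


def find_flake_start(
    known_flaky: int,
    is_flaky: Callable[[int], bool],
    step_increment: int = 1,   # hypothetical default, not the tuned value
    max_step: int = 8,         # hypothetical cap, not the tuned value
    oldest: int = 0,
) -> int:
    """Return the earliest version at which the test is flaky.

    Assumes the test is stable before that version and flaky from it
    up to `known_flaky`.
    """
    step = 1
    upper = known_flaky  # oldest version confirmed flaky so far

    # Phase 1: walk backwards with a growing (but capped) step until
    # a stable version is found.
    while True:
        candidate = max(upper - step, oldest)
        if not is_flaky(candidate):
            lower = candidate  # newest version confirmed stable
            break
        upper = candidate
        if candidate == oldest:
            return oldest  # flaky all the way back to the oldest version
        step = min(step + step_increment, max_step)

    # Phase 2: sequentially check the versions between the stable and
    # the flaky endpoint to pin down the exact switch.
    for version in range(lower + 1, upper):
        if is_flaky(version):
            return version
    return upper


# Usage with a made-up history in which version 37 introduced the flake.
culprit = find_flake_start(100, lambda version: version >= 37)
```

The cap keeps the second phase cheap: because the step never exceeds `max_step`, the sequential scan checks fewer than `max_step` versions.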
The project is live and the dashboard can be found here.
The code can be found in the Chromium codesearch. The code in the flake directory is all related to this project (but the project is actively being developed by the Chromium infra team, so not all work is mine).