Background: currently we use dependency analysis ("analyze") to determine which targets to build at a given revision/commit. This does reduce the number of targets compared to a full build, but it is still not fast enough (over an hour on average).
With information from ninja, we could rerun the compile at a finer granularity. When a compile failure occurs, ninja knows which edges failed. An edge here could be a CXX or CC that compiles a source file into an object file, a LINK that links object files into a library or executable, an ACTION that runs Python to generate code, etc. We could retrieve the output nodes (object files, libraries/executables, generated code, etc.) of the failed edges from ninja, and then feed them back into ninja to rerun the compile step. In this way, we minimize the number of tasks ninja has to execute; e.g., if a CXX/CC compile fails, we don't have to compile all the other source files or run the LINK for the executable, which takes much longer than a single CXX.
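As a rough illustration of the idea (not the actual Findit implementation; until the ninja 1.7 reporting lands, one plausible interim approach is to parse ninja's stdout, which prints a "FAILED: <outputs>" line for each failed edge listing that edge's output nodes):

```python
def failed_output_nodes(ninja_stdout):
    """Extract the output nodes of failed edges from ninja's stdout.

    Ninja prints a line of the form 'FAILED: <output1> <output2> ...'
    for each edge that fails; the paths on that line are the edge's
    output nodes (object files, libs, generated files, etc.).
    """
    nodes = []
    for line in ninja_stdout.splitlines():
        if line.startswith('FAILED:'):
            nodes.extend(line[len('FAILED:'):].split())
    return nodes
```

The collected nodes can then be fed back to ninja directly, since ninja accepts output paths as targets (build directory here is hypothetical): `ninja -C out/Release obj/foo/bar.o`.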
Experiments show that compile time could be reduced from over an hour to about 10 minutes. (Time for bot_update and gclient sync is excluded here.)
CXX/CC failures account for the majority of all compile failures. (The statistics cover UNIQUE* failures in compile steps from April 2015 to February 2016.)
Failure type          | # of occurrences | Percentage
compile (CXX/CC)      | 1272             | 70.70%
link (LINK)           | 191              | 10.62%
others (ACTION, etc.) | 336              | 18.68%
*On the same builder, a first-time failure of the compile step after a green build is treated as unique.
The main tasks to achieve this are:
1. Collect the output nodes of failed edges from ninja on the main waterfall.
Note: this needs to wait for the next ninja release (1.7).
2. Pass this info to the culprit-finding recipe findit/chromium/compile.
3. Check whether the given nodes still exist at the revision/commit against which compile is to be re-run.
E.g., a source file could be added or deleted by some revision in the regression range.
4. Re-run compile for the existing nodes only.
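Steps 3 and 4 can be sketched as follows. This is a sketch, not the recipe code; the existence check assumed here uses `ninja -t query`, which exits non-zero for a target unknown to the build graph, and the function and build-directory names are made up for illustration:

```python
import subprocess


def node_in_build_graph(build_dir, node):
    """Ask ninja whether a node exists in the build graph at the
    currently checked-out revision ('ninja -t query' fails for
    unknown targets)."""
    with open('/dev/null', 'w') as devnull:
        return subprocess.call(
            ['ninja', '-C', build_dir, '-t', 'query', node],
            stdout=devnull, stderr=devnull) == 0


def filter_existing_nodes(nodes, node_exists):
    """Step 3: keep only the nodes that still exist at this revision.

    node_exists is a callable, e.g.
    lambda n: node_in_build_graph('out/Release', n).
    """
    return [n for n in nodes if node_exists(n)]


def rerun_compile(build_dir, nodes):
    """Step 4: re-run compile for the existing nodes only, by passing
    their output paths to ninja as targets."""
    if not nodes:
        return 0
    return subprocess.call(['ninja', '-C', build_dir] + nodes)
```

Separating the existence check into a predicate keeps the filtering logic testable without a checkout or a ninja binary.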
Comment 1 by bugdroid1@chromium.org, Mar 10 2016