Luci builds should be using the same path on every build |
|||||||
Issue descriptionUnfortunately, some build systems like CMake hardcode the path to the source tree in the generated Ninja files. This poses a problem when using named caches, since these are currently symlinked to a different location on every build which causes CMake builds to fail because it cannot find the original source tree on incremental rebuilds.
,
Mar 2 2017
,
Mar 2 2017
Would it be possible to add realpath method to PathApi in recipes?
,
Mar 2 2017
I think it would be valid. +iannucci as owner of recipes in general. Feel free to send a CL to iannucci
,
Mar 6 2017
maruel, what was the rationale behind using a random run dir? I assume handles on Windows?
,
Mar 7 2017
Random directory is intentional: - Works around zombie processes interference (on all OSes). - Works around directory deletion on Windows. - Remove the possibility of hardcoding the path in the tasks, this is bound to happen. - Enforces the use of relative path throughout the system, like it was detected here. I'm primarily concerned about the third point. I quick search led me to CMAKE_CURRENT_SOURCE_DIR as documented https://cmake.org/Wiki/CMake_Useful_Variables which should be leveraged if possible at low cost. That will save a lot of pain down the road. If this is deemed too much work, we can work at other options but they are all worse IMHO. Last point: the named cache *has* the same path for every task, it's not being renamed. It is the symlink that is different. So the workaround is to use realpath and move on but I'd recommend trying to remove absolute path first if possible.
,
Mar 7 2017
The problem is that CMake always uses absolute paths in generated Makefiles/Ninja files, see https://cmake.org/Wiki/CMake_FAQ#Why_does_CMake_use_full_paths.2C_or_can_I_copy_my_build_tree.3F, so this is not something we can change without changing the design of CMake. We can work around this by using realpath, but what about issue 694911 ? We would really like to use the Goma cache, especially for clean builds. Shall we always use the named cache (a different one than for the incremental build) with realpath and clean up the content of the cache after the build? That somewhat defeats the purpose of the cache, but it's the only solution that's going to work at the moment.
,
Mar 7 2017
Fair enough, I guess realpath it is. This doesn't "defeat" the purpose of the cache: the bot still manages it and can delete it whenever needed, so we still have the value of automatic self-cleaning.
,
Mar 7 2017
Would it be possible to add an option to clean up the cache after each task? Doing it outside of recipes is going to be more reliable, in recipes we'd have wrap the entire recipe in the try block to make sure the cache always gets cleaned up at the end.
,
Mar 7 2017
I don't understand your last comment, I think I'm confused about the overall constraints, let's chat over IM.
,
Mar 27 2017
phosek, what's the status of this bug? did realpath help?
,
Apr 3 2017
in issue 706224 maruel agreed to implement this.
,
Apr 4 2017
realpath helped with CMake builds but not Goma, 706224 should address this.
,
Apr 4 2017
Forgot to generalize this bug to cover run dir
,
Apr 4 2017
,
Apr 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/external/github.com/luci/luci-py.git/+/15283a543acb4912643539215968f04fbc4f7e02 commit 15283a543acb4912643539215968f04fbc4f7e02 Author: maruel <maruel@chromium.org> Date: Fri Apr 07 15:39:20 2017 Switch run_isolated.py to use a constant path 'ir'. This means that for a Swarming bot running at /b/s, the run dir root will be /b/s/w/ir/. The output and temporary directories remain random. All a variable to map_and_run() in run_isolated.py, as maybe it could be made a task option (?) Still undecided. R=nodir@chromium.org BUG= 697759 Review-Url: https://codereview.chromium.org/2800643003 [modify] https://crrev.com/15283a543acb4912643539215968f04fbc4f7e02/client/run_isolated.py [modify] https://crrev.com/15283a543acb4912643539215968f04fbc4f7e02/client/tests/run_isolated_test.py
,
Apr 7 2017
2736-15283a5 was deployed to prod. /b/s/w/ir/ is the common root now.
,
Apr 8 2017
phosek, can you verify it helped?
,
Apr 8 2017
I think it did help Goma. I'm looking at https://luci-scheduler.appspot.com/jobs/manifest/linux-x86-64-release. Before this change, there were consistently 0 cache hits (see 'build Fuchsia.goma_stat' step): request: total=9387 success=9386 failure=0 compiler_proxy: fail=0 compiler_info: stores=28 store_dups=0 miss=0 fail=0 goma: finished=9314 cache_hit=0 local_cachehit=0 aborted=17 retry=10 fail=1 After workdir change I see goma: finished=9055 cache_hit=1619 local_cachehit=0 aborted=32 retry=3 fail=1 So, hurray, 18% cache hit is better than 0%.
,
Apr 8 2017
Yes, seems like it did help, but I'm still trying to understand why the cache hit is so low, locally we usually see around 70%.
,
Apr 8 2017
Thank you. I'm glad to hear that we stared to observe cache hits. This bug is about constant rundir. Bug 706224 is specifically about Goma on LUCI. Let's continue the discussion there. |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by no...@chromium.org
, Mar 2 2017Labels: -Restrict-View-Google luci Pri-1 Type-Feature