New issue
Advanced search Search tips

Issue 672841 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Sign in to add a comment

CQ failed because base_unittests.exe is dirty

Project Member Reported by primiano@chromium.org, Dec 9 2016

Issue description

crrev.com/2549103003 failed CQ twice back to back on win_chromium_compile_gdb with this:

I'm trying to figure out if this is a flake or I am doing something wrong in my CL. Any idea?
I think that the linker warnings few lines should be unrelated (See  Issue 669352 , they have been there for a while)


From full log in [1]

[1/2] LINK base_unittests.exe base_unittests.exe.pdb
[2/2] STAMP obj/base/base_unittests_run.stamp
ninja explain: output base_unittests.exe doesn't exist
ninja explain: base_unittests.exe is dirty
ninja explain: base_unittests.exe is dirty
ninja explain: obj/base/base_unittests_run.stamp is dirty
ninja explain: base_unittests.exe is dirty
ninja explain: obj/base/base_unittests_run.stamp is dirty
<Thread(Thread-1, started 2928)> ProcessRead: proc.stdout finished.
<Thread(Thread-1, started 2928)> ProcessRead: cleaning up.
<Thread(Thread-2, started daemon 3868)> TimedFlush: Finished
<Thread(Thread-1, started 2928)> ProcessRead: finished.
Failing build because ninja reported work to do.
This means that after completing a compile, another was run and
it resulted in still having work to do (that is, a no-op build
wasn't a no-op). Consult the first "ninja explain:" line for a
likely culprit.


[1] https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin_chromium_compile_dbg_ng%2F311875%2F%2B%2Frecipes%2Fsteps%2Fcompile__with_patch_%2F0%2Fstdout
 
Actually it is not true that it failed twice, only once in the last patchset.
The previous win failure on the win box was another unrelated flake ( Issue 672838 )
Cc: scottmg@chromium.org
(https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin_chromium_compile_dbg_ng%2F311875%2F%2B%2Frecipes%2Fsteps%2Fcompile__with_patch_%2F0%2Fstdout has been "loading streams" for like a minute now without any output, do you have a normal stdout link too?)

`ninja explain: output base_unittests.exe doesn't exist`

So I guess something deleted base_unittests.exe between the first and the second build? Weird.
> has been "loading streams" for like a minute now without any output, do you have a normal stdout link too?)
Aha so it's not slow only in EMEA.

Old style link (until it becomes deprecated :P)
https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_compile_dbg_ng/builds/311875/steps/compile%20%28with%20patch%29/logs/stdio
[776/13752] LINK base_unittests.exe base_unittests.exe.pdb
trace_config_unittest.obj : error LNK2019: unresolved external symbol "public: bool __thiscall base::trace_event::TraceConfig::EventFilterConfig::GetArgAsSet(char const *,class std::unordered_set<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,struct std::hash<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >,struct std::equal_to<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >,class std::allocator<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > > *)const " (?GetArgAsSet@EventFilterConfig@TraceConfig@trace_event@base@@QBE_NPBDPAV?$unordered_set@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@U?$hash@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@U?$equal_to@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@@std@@@Z) referenced in function "private: virtual void __thiscall base::trace_event::TraceConfigTest_TraceConfigFromValidString_Test::TestBody(void)" (?TestBody@TraceConfigTest_TraceConfigFromValidString_Test@trace_event@base@@EAEXXZ)

./base_unittests.exe : fatal error LNK1120: 1 unresolved externals


But I have no idea why the build didn't fail there.


(new output thing is painfully slow for me too)
Cc: brucedaw...@chromium.org
Oh! I might actually. Some goof did a bad job of reviewing https://codereview.chromium.org/2558153002/diff/20001/build/toolchain/win/tool_wrapper.py .

I think it might require the Popen.wait() still Bruce? (that was in an earlier patchset)
Ahh ouch, that's definitely me missing a BASE_EXPORT.
Managed to spot another link error but not that one. Thanks.

But, yes, usually it stops there.
I am used to grep for "FAILED" in the log, but in this log there is no failed message anywhere.

I'm reverting that CL for now https://codereview.chromium.org/2564893002/ just in case, as unexpectedly passing links are bad news (and the change was made for a relatively niche problem that doesn't affect everyone).
My assumption about link.wait() was that if reading from .stdout stopped returning data (as opposed to blocking while waiting for more data) then that must mean that the process is done, although I guess what stage of "done" is unclear, so maybe the link.wait() is needed. I'll do some local testing to see if I can reproduce the failure to notice the failure.

Long-term I definitely want the unlimited log-size capability but nobody will be harmed by a temporary reverting.

That would make sense, but it is Python! You could try a clobber tryjob with an assert(link.returncode != None). From skimming c:\Python27\Lib\subprocess.py, I think the wait() might be required (and in fact, it returns the returncode, so the code doesn't need to get any longer.)
Yep, all true. I confirmed that the .wait() is needed (else ninja returns %errorlevel% == 0) and that return link.wait() works.

That's what I get for simplifying code too much. “Everything should be made as simple as possible, but no simpler.”
Project Member

Comment 11 by bugdroid1@chromium.org, Dec 9 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/81dfeb7be03b9b3b7311ede93908fd8159761446

commit 81dfeb7be03b9b3b7311ede93908fd8159761446
Author: brucedawson <brucedawson@chromium.org>
Date: Fri Dec 09 23:39:05 2016

Change ExecLinkWrapper to not buffer all tool output

/verbose linking of chrome.dll creates over one GB of output. This
causes ExecLinkWrapper to consume over two GB of memory which leads to
an OOM failure in the 32-bit depot_tools python, and the loss of all
of the valuable output. This change modifies ExecLinkWrapper to
process the output one line at a time, thus avoiding the OOM.

I've tested that this handles the 1.1 GB of output which the previous
version of this function failed on and I've visually confirmed that the
output looks the same - no extraneous blank lines, for instance, when
displaying warnings, errors, or 1.9 million lines of verbose output.

I also verified that the script stays idle when waiting for output -
blocking on .readline().

This fixes the previous attempt which omitted the vital link.wait() call
which meant that error codes were not propagated.

R=scottmg@chromium.org
BUG= 672182 , 672841 

Review-Url: https://codereview.chromium.org/2568563002
Cr-Commit-Position: refs/heads/master@{#437696}

[modify] https://crrev.com/81dfeb7be03b9b3b7311ede93908fd8159761446/build/toolchain/win/tool_wrapper.py

Owner: brucedaw...@chromium.org
Status: Fixed (was: Untriaged)

Sign in to add a comment