I think that CL was intended to fix another problem we had where the "without patch" results would get written into the same directory as the "with patch" results, overwriting things.
Nowadays we store things under the step name, so we shouldn't have that issue any more. I don't think there's another good reason not to upload the "without patch" results for webkit_layout_tests.
I have a similar issue with "webkit_layout_tests (with patch)" on win7_chromium_rel_ng.
It failed last 4 times for angle rolls:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/82861https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/82937https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/83046https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/83135
The problem seems to be in "Merge script log" step:
2018-09-11 12:12:05,206 - root: [WARNING] Output directory exists 'C:\\b\\swarming\\w\\ir\\kitchen-workdir'
2018-09-11 12:12:05,206 - blinkpy.common.system.filesystem: [INFO] Removing contents of C:\b\swarming\w\ir\kitchen-workdir
2018-09-11 12:12:05,206 - blinkpy.common.system.filesystem: [DEBUG] Removing directory C:\b\swarming\w\ir\kitchen-workdir\recipe_cleanup
2018-09-11 12:12:05,237 - blinkpy.common.system.filesystem: [ERROR] Failed at <built-in function remove> C:\b\swarming\w\ir\kitchen-workdir\recipe_cleanup\bot_update\src_20dbeaa87ce64704939ee59ea978f4aa\.git\objects\00\031f291c967520f83243ff4cdb5a46aa781455: (<type 'exceptions.WindowsError'>, WindowsError(5, 'Access is denied'), <traceback object at 0x0000000003209FC8>)
Traceback (most recent call last):
File "C:\b\swarming\w\ir\cipd_bin_packages\bin\Lib\shutil.py", line 250, in rmtree
os.remove(fullname)
WindowsError: [Error 5] Access is denied: 'C:\\b\\swarming\\w\\ir\\kitchen-workdir\\recipe_cleanup\\bot_update\\src_20dbeaa87ce64704939ee59ea978f4aa\\.git\\objects\\00\\031f291c967520f83243ff4cdb5a46aa781455'
Re #3:
* If the test output is not in valid format, we could leave it out for now. I filed a separate bug 883113 for those perf gtests.
* For other cases where we have test results in valid json format, we'd better upload them. Otherwise, the upper-layer toolings or services have to handle them specially and that introduces complexity and is hard to maintain.
I tried turning this on for layout tests, and this broke. It broke because layout tests have a special results viewer hosted on the test results server, which doesn't know about with and without patch. I'll fix that, and then try re-landing.
I had to revert the CL in #12, because (I'm guessing) it broke rebaseline_cl. vmpstr@ in person asked me for some help because a rebaseline was failing, and I guessed it had to do with this CL landing.
It's a bit unclear what the correct path forward is. I guess that we should keep the current behavior that there's an existing default set of results for a build. Currently anything uploaded in a build which isn't a weird version of layout tests (site per process) is uploaded to this default path. This broke when I enabled without patch uploads, because both with and without results were being uploaded to the same path, which obviously broke.
I can either change everything to explicitly reference the step name, which would have to land after a reland of #12 lands, or special case the first run of "webkit_layout_tests (with patch)" to always be uploaded to the default path, and anything else gets uploaded with a step name. There might be other solutions I'm not seeing.
Anyone have opinions about this?
I would probably change rebaseline-cl to explicitly reference the step name, and not worry about handling results from older try jobs, if that was an issue. People don't re-run the command that often, and the workaround should be straightforward: rebase your CL and re-run the try jobs.
But, you could be paranoid and check for "with patch" first and then fall back to "" if you didn't find it.
The latest problem is that the test results from the bots triggered by rebaseline_cl.py are uploaded to the staging test results server (test-results-test.appspot.com). I had thought it was uploaded to the main one.
We should probably change this to either upload them to the main test results server, or change rebaseline_cl.py to point at the staging server. Both of these options are doable.
I also noticed the results URLs were incorrect and dug into the code a bit. I think it's just the logging that's incorrect. The actual network requests do include step names properly.
Locally with all the fixes above, I can now rebaseline Linux & Windows. Mac still doesn't work; we need to land https://crrev.com/c/1273687 .
Re #31, unfortunately in your case Linux & Windows are also broken, because your try jobs were started before the change in #28.
Are you sure you're running the correct (latest) code locally? I'd expect the mac bots to not be working; the CL I mention in #33 are supposed to fix that. rebaseline_cl.py currently should be looking at step names, so I'm surprised it's fetching results from the URLs you list. It should be trying to fetch results from a URL like https://test-results.appspot.com/data/layout_results/linux-blink-rel/1180/webkit_layout_tests%20%28with%20patch%29/layout-test-results/results.html, but it's not, which looks maybe like a bug.
Can you re-run rebaseline-cl, and tell me what git revision you're running it from locally?
I have to go for ~1.5 hours, but I'll take a look when I get back. You'll probably need to re-run your tryjobs once https://crrev.com/c/1273687 lands, unfortunately, due to a bug with running layout tests on mac bots.
Comment 1 by dpranke@chromium.org
, Sep 11