
Issue metadata

Status: WontFix
Closed: Apr 16
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug


android-container-nyc failure got into tree due to ebuild skipping testing

Project Member Reported by, Apr 12

Issue description

I have a CL that modifies an Autotest test:

android-container-nyc fails:
=== Start output for job android-container-nyc-4717008-r1 (0m13.7s) ===
android-container-nyc-4717008-r1: >>> Emerging (1 of 1) chromeos-base/android-container-nyc-4717008-r1::cheets-private for /build/betty/
android-container-nyc-4717008-r1:  * SHA256 SHA512 WHIRLPOOL size ;-) ...            [ ok ]
android-container-nyc-4717008-r1: 13:24:12: INFO: RunCommand: /mnt/host/source/.cache/common/gsutil_4.30.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' cp -v -- gs://chromeos-arc-images/builds/git_nyc-mr1-arc-linux-static_sdk_tools/4717008/aapt /var/cache/chromeos-cache/distfiles/target/aapt.tmp
android-container-nyc-4717008-r1: !!! Fetched file: aapt VERIFY FAILED!
android-container-nyc-4717008-r1: !!! Reason: Failed on SHA256 verification
android-container-nyc-4717008-r1: !!! Got:      a3d82bca505fdded001a33cc347697cdb8c5f50ec3281f65b4e60c05f01d39b5
android-container-nyc-4717008-r1: !!! Expected: 1f314e854fca30643c3a244ce55a75ca4b963f8ef45092d8d28d37e3a5852e17
android-container-nyc-4717008-r1: Refetching... File renamed to '/var/cache/chromeos-cache/distfiles/target/aapt._checksum_failure_.C6ey7E'
android-container-nyc-4717008-r1: !!! Couldn't download 'aapt'. Aborting.
android-container-nyc-4717008-r1:  * Fetch failed for 'chromeos-base/android-container-nyc-4717008-r1', Log file:
android-container-nyc-4717008-r1:  *  '/build/betty/tmp/portage/logs/chromeos-base:android-container-nyc-4717008-r1:20180412-202359.log'
android-container-nyc-4717008-r1: >>> Failed to emerge chromeos-base/android-container-nyc-4717008-r1 for /build/betty/, Log file:
android-container-nyc-4717008-r1: >>>  '/build/betty/tmp/portage/logs/chromeos-base:android-container-nyc-4717008-r1:20180412-202359.log'
android-container-nyc-4717008-r1:  * Messages for package chromeos-base/android-container-nyc-4717008-r1 merged to /build/betty/:
android-container-nyc-4717008-r1:  * Fetch failed for 'chromeos-base/android-container-nyc-4717008-r1', Log file:
android-container-nyc-4717008-r1:  *  '/build/betty/tmp/portage/logs/chromeos-base:android-container-nyc-4717008-r1:20180412-202359.log'
=== Complete: job android-container-nyc-4717008-r1 (0m13.7s) ===
Failed chromeos-base/android-container-nyc-4717008-r1 (in 0m13.7s). Your build has failed.

This seems to be blocking the PreCQ for me.
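
The "VERIFY FAILED" lines above come from a digest comparison: the SHA256 of the fetched file did not match the recorded one. A minimal Python sketch of that kind of check follows. The helper names `sha256_of` and `verify_sha256` and the byte strings are hypothetical stand-ins, not Portage's actual code or the real aapt binaries.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Hypothetical helper: True if the data matches the recorded digest."""
    return sha256_of(data) == expected_hex.lower()


# Illustrative contents only; these are not the real aapt binaries.
old_tool = b"aapt from the previous Android build"
new_tool = b"aapt from the new Android build"

# Digest recorded while the old file was the one being hashed.
recorded = sha256_of(old_tool)

old_ok = verify_sha256(old_tool, recorded)  # the old file still verifies
new_ok = verify_sha256(new_tool, recorded)  # a different file does not
```

Any change to the file contents, however small, yields a different digest, so a recorded digest only ever matches the exact file it was computed from.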
Labels: -Pri-1 Pri-0
Status: Assigned (was: Untriaged)
The fix is being tested here:


But we need to record the series of related events, and figure out how things broke.
Labels: Hotlist-CrOS-Sheriffing
The change that introduced the problem (or at least the one whose revert fixed it) was
Can we explain why the Android PFQ passed and performed an uprev after the initial change landed?

I was curious about the same thing that Don is asking about in c#5.
Labels: -Pri-0 Chase-Pending Pri-1
Owner: ----
Status: Available (was: Assigned)
Summary: android-container-nyc failure got into tree (was: android-container-nyc failing BuildPackages on an unrelated CL)
A good question, which needs followup.
Components: -Infra>Client>ChromeOS Infra>Client>ChromeOS>CI
Labels: -Chase-Pending
Status: Assigned (was: Available)
-> jclinton@
Need more data. How can be related to the SHA256 of aapt changing?
The Comment #1 fix being tested doesn't really tell me anything either.
That was the point, essentially. A change in Autotest should not fail in BuildPackages.  The relevant CL was Reverting that CL fixed the problem. So, how did that CL (604790) make it in the first place? Aviv seems to think you should own figuring this out.
I don't understand this section of OS platform. What's going on here? How can these two things possibly be related? You made the change to the software; do you have any guesses?
Labels: OS-Chrome
From offline conversation: the initial bug report was simply that the PreCQ was broken, not that the CL mentioned was the cause of that breakage. It seems that Don knew at the time what the root cause was (Comment #1). Over to him to see if he knows of another bug that tracked that root cause or more information. Assign back to me once info. is added. Thank you.
From #4, this is the CL that caused the break.

I don't know if the script change or the ebuild change caused the problem (no idea what the actual problem was), just got dragged into reverting a .9999 ebuild revert.

.9999 ebuilds are really just templates that are unused until the next time the relevant ebuild is uprevved. Since the ebuild in question is only uprevved by the Android PFQ, it couldn't have caused a problem until the Android PFQ passed.

So.... how did the Android PFQ pass when the result broke everything else?

Or was the script change the problem?

Or is my analysis wrong somehow?

I think those are the right questions, especially "how did the Android PFQ pass when the result broke everything else?".
Components: -Infra>Client>ChromeOS>CI Platform>ARC
Summary: android-container-nyc failure got into tree due to ebuild skipping testing (was: android-container-nyc failure got into tree)
This ebuild specifically skips SRC_URI in the CQ if the 9999.ebuild is being modified. Later, when the uprev happens, we are no longer hitting the skip logic.

Over to the team who owns this ebuild and the CL in question to fix this. This ebuild should not have a 9999 PV short-circuit of all testing in the CQ. Just delete that entire if statement; I can't think of any reason it should be there.
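
To make the skip described above concrete, here is a small Python model of it. This is only a sketch under assumptions: the real logic lives in the ebuild (bash), and the function name `sources_to_fetch` is hypothetical. The URI is the one shown in the build log.

```python
from typing import List


def sources_to_fetch(pv: str, src_uri: List[str]) -> List[str]:
    """Hypothetical model of the short-circuit: a 9999 version fetches nothing."""
    if pv == "9999":
        # The template ebuild is what the CQ sees when the 9999 ebuild is modified.
        return []
    return list(src_uri)


SRC_URI = [
    "gs://chromeos-arc-images/builds/"
    "git_nyc-mr1-arc-linux-static_sdk_tools/4717008/aapt",
]

# CQ run against the modified 9999 ebuild: no fetch, so no checksum is exercised.
cq_fetches = sources_to_fetch("9999", SRC_URI)

# After the uprev the version is a real build ID, so the fetch path runs.
uprev_fetches = sources_to_fetch("4717008", SRC_URI)
```

In this model the CQ never touches the download or verification path, which is why a problem there can only surface after the uprev.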

>> So.... how did the Android PFQ pass when the result broke everything else?

The Android PFQ had a non-clean $DISTDIR and did not fetch fresh tools (aapt) from the new Android build. That is why the Manifest was not updated. However, during the build_packages phase the build system fetched the new tool and failed because the Manifest pointed to the old file.

I traced how the ebuild updates the Manifest, and this is Portage behavior. On one hand, it uses $DISTDIR during the Manifest update. On the other hand, it tries to download new binaries during the build_packages phase.
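
The interaction can be modelled in a few lines of Python. This is a simplified sketch, not Portage's implementation: $DISTDIR and the remote bucket are plain dictionaries, and `update_manifest` and `fetch_and_verify` are hypothetical names.

```python
import hashlib
from typing import Dict


def digest(data: bytes) -> str:
    """Return the hex SHA256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def update_manifest(name: str, distdir: Dict[str, bytes],
                    remote: Dict[str, bytes]) -> Dict[str, str]:
    """Model: hash the cached copy if one exists; fetch only when it is missing."""
    if name not in distdir:
        distdir[name] = remote[name]
    return {name: digest(distdir[name])}


def fetch_and_verify(name: str, distdir: Dict[str, bytes],
                     remote: Dict[str, bytes], manifest: Dict[str, str]) -> bool:
    """Model: fetch into distdir when missing, then compare with the manifest."""
    if name not in distdir:
        distdir[name] = remote[name]
    return digest(distdir[name]) == manifest[name]


remote = {"aapt": b"aapt from the new Android build"}
pfq_distdir = {"aapt": b"aapt from the previous Android build"}  # stale cache

# The manifest update on the PFQ hashes the stale cached file.
manifest = update_manifest("aapt", pfq_distdir, remote)

# The PFQ builder verifies its own cached copy, so it passes.
pfq_ok = fetch_and_verify("aapt", pfq_distdir, remote, manifest)

# A builder with a clean cache downloads the new file and fails verification.
clean_ok = fetch_and_verify("aapt", {}, remote, manifest)
```

Under these assumptions the PFQ is self-consistent with its stale cache, while every builder that actually downloads the file sees a mismatch.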

We have an alternative solution: use a build suffix for such tools. That should avoid this problem.
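
One way to read the build-suffix idea, sketched in Python: if the cached file name includes the build ID, a file left over from an older build can never be mistaken for the new one. The naming scheme `distfile_name` and the older build ID `4717007` are assumptions made for illustration, not the format actually adopted.

```python
from typing import Dict


def distfile_name(tool: str, build_id: str) -> str:
    """Hypothetical naming scheme: append the build ID to the tool name."""
    return f"{tool}-{build_id}"


# Stale cache entry from an earlier build (the build ID here is made up).
distdir: Dict[str, bytes] = {
    distfile_name("aapt", "4717007"): b"aapt from the previous Android build",
}

# The new build asks for a different file name, so the stale entry is not reused.
wanted = distfile_name("aapt", "4717008")
needs_fetch = wanted not in distdir
```

With a per-build name, a cache miss forces a fresh download, so the recorded digest and the fetched file come from the same build.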

Status: WontFix (was: Assigned)
I think the problem with the CQ was resolved last week. I also explained why the Android PFQ passed but the build failed.
