Issue 913078

Issue metadata

Status: Fixed
Owner:
Closed: Dec 11
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




coral-paladin: CQ build failing with "file collision"

Project Member Reported by dburger@google.com, Dec 7

Issue description

In recent CQ runs:

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8927773249130514032

coral-paladin build is failing and showing "file collision" in the build packages log:

Started chromeos-base/chromeos-config-bsp-coral-0.0.1-r37 (logged in /tmp/chromeos-config-bsp-coral-0.0.1-r37-lpYCqG)
=== Start output for job chromeos-config-bsp-coral-0.0.1-r37 (0m6.9s) ===
chromeos-config-bsp-coral-0.0.1-r37: >>> Emerging binary (1 of 1) chromeos-base/chromeos-config-bsp-coral-0.0.1-r37::coral for /build/coral/
chromeos-config-bsp-coral-0.0.1-r37:  * Running stacked hooks for pre_pkg_setup
chromeos-config-bsp-coral-0.0.1-r37:  *    sysroot_build_bin_dir ...
chromeos-config-bsp-coral-0.0.1-r37:  [ ok ]
chromeos-config-bsp-coral-0.0.1-r37:  * Running stacked hooks for post_pkg_setup
chromeos-config-bsp-coral-0.0.1-r37:  *    python_eclass_hack ...
chromeos-config-bsp-coral-0.0.1-r37:  [ ok ]
chromeos-config-bsp-coral-0.0.1-r37: pbzip2: *WARNING: Trailing garbage after EOF ignored!
chromeos-config-bsp-coral-0.0.1-r37: >>> Installing (1 of 1) chromeos-base/chromeos-config-bsp-coral-0.0.1-r37::coral to /build/coral/
chromeos-config-bsp-coral-0.0.1-r37:  * This package will overwrite one or more files that may belong to other
chromeos-config-bsp-coral-0.0.1-r37:  * packages (see list below). You can use a command such as `portageq
chromeos-config-bsp-coral-0.0.1-r37:  * owners / <filename>` to identify the installed package that owns a
chromeos-config-bsp-coral-0.0.1-r37:  * file. If portageq reports that only one package owns a file then do
chromeos-config-bsp-coral-0.0.1-r37:  * NOT file a bug report. A bug report is only useful if it identifies at
chromeos-config-bsp-coral-0.0.1-r37:  * least two or more packages that are known to install the same file(s).
chromeos-config-bsp-coral-0.0.1-r37:  * If a collision occurs and you can not explain where the file came from
chromeos-config-bsp-coral-0.0.1-r37:  * then you should simply ignore the collision since there is not enough
chromeos-config-bsp-coral-0.0.1-r37:  * information to determine if a real problem exists. Please do NOT file
chromeos-config-bsp-coral-0.0.1-r37:  * a bug report at http://bugs.gentoo.org unless you report exactly which
chromeos-config-bsp-coral-0.0.1-r37:  * two packages install the same file(s). See
chromeos-config-bsp-coral-0.0.1-r37:  * http://wiki.gentoo.org/wiki/Knowledge_Base:Blockers for tips on how to
chromeos-config-bsp-coral-0.0.1-r37:  * solve the problem. And once again, please do NOT file a bug report
chromeos-config-bsp-coral-0.0.1-r37:  * unless you have completely understood the above message.
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Detected file collision(s):
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * 	/build/coral/tmp/chromeos-config/config_dump.json
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Searching all installed packages for file collisions...
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Press Ctrl-C to Stop
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * chromeos-base/chromeos-config-bsp-coral-private-0.0.1-r1097:0::coral-private
chromeos-config-bsp-coral-0.0.1-r37:  * 	/build/coral/tmp/chromeos-config/config_dump.json
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Package 'chromeos-base/chromeos-config-bsp-coral-0.0.1-r37' NOT merged
chromeos-config-bsp-coral-0.0.1-r37:  * due to file collisions. If necessary, refer to your elog messages for
chromeos-config-bsp-coral-0.0.1-r37:  * the whole content of the above message.
chromeos-config-bsp-coral-0.0.1-r37: >>> Failed to install chromeos-base/chromeos-config-bsp-coral-0.0.1-r37 to /build/coral/, Log file:
chromeos-config-bsp-coral-0.0.1-r37: >>>  '/build/coral/tmp/portage/logs/chromeos-base:chromeos-config-bsp-coral-0.0.1-r37:20181207-204434.log'
chromeos-config-bsp-coral-0.0.1-r37: 
chromeos-config-bsp-coral-0.0.1-r37:  * Messages for package chromeos-base/chromeos-config-bsp-coral-0.0.1-r37 merged to /build/coral/:
chromeos-config-bsp-coral-0.0.1-r37: 
chromeos-config-bsp-coral-0.0.1-r37:  * This package will overwrite one or more files that may belong to other
chromeos-config-bsp-coral-0.0.1-r37:  * packages (see list below). You can use a command such as `portageq
chromeos-config-bsp-coral-0.0.1-r37:  * owners / <filename>` to identify the installed package that owns a
chromeos-config-bsp-coral-0.0.1-r37:  * file. If portageq reports that only one package owns a file then do
chromeos-config-bsp-coral-0.0.1-r37:  * NOT file a bug report. A bug report is only useful if it identifies at
chromeos-config-bsp-coral-0.0.1-r37:  * least two or more packages that are known to install the same file(s).
chromeos-config-bsp-coral-0.0.1-r37:  * If a collision occurs and you can not explain where the file came from
chromeos-config-bsp-coral-0.0.1-r37:  * then you should simply ignore the collision since there is not enough
chromeos-config-bsp-coral-0.0.1-r37:  * information to determine if a real problem exists. Please do NOT file
chromeos-config-bsp-coral-0.0.1-r37:  * a bug report at http://bugs.gentoo.org unless you report exactly which
chromeos-config-bsp-coral-0.0.1-r37:  * two packages install the same file(s). See
chromeos-config-bsp-coral-0.0.1-r37:  * http://wiki.gentoo.org/wiki/Knowledge_Base:Blockers for tips on how to
chromeos-config-bsp-coral-0.0.1-r37:  * solve the problem. And once again, please do NOT file a bug report
chromeos-config-bsp-coral-0.0.1-r37:  * unless you have completely understood the above message.
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Detected file collision(s):
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * 	/build/coral/tmp/chromeos-config/config_dump.json
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Searching all installed packages for file collisions...
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Press Ctrl-C to Stop
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * chromeos-base/chromeos-config-bsp-coral-private-0.0.1-r1097:0::coral-private
chromeos-config-bsp-coral-0.0.1-r37:  * 	/build/coral/tmp/chromeos-config/config_dump.json
chromeos-config-bsp-coral-0.0.1-r37:  * 
chromeos-config-bsp-coral-0.0.1-r37:  * Package 'chromeos-base/chromeos-config-bsp-coral-0.0.1-r37' NOT merged
chromeos-config-bsp-coral-0.0.1-r37:  * due to file collisions. If necessary, refer to your elog messages for
chromeos-config-bsp-coral-0.0.1-r37:  * the whole content of the above message.
=== Complete: job chromeos-config-bsp-coral-0.0.1-r37 (0m6.9s) ===

 
Labels: -Pri-3 OS-Chrome Pri-1
Status: Assigned (was: Untriaged)
Mark the builder experimental for now. Seems that ToT is broken; not a bad CL.
Owner: athilenius@google.com
See comment in:
https://chromium-review.googlesource.com/c/chromiumos/overlays/board-overlays/+/1369731

and bug:
b/120774883

for context

This file was deleted by chromeos-config-bsp-coral-private a long time ago.  I don't understand why the portage package is so stale in the first place.

to current bobby
IIUC, when you move a file between ebuilds, you need to add a blocker between them. I see no such blocker.

Also, this isn't just a CQ issue. I see it on my local build with build_packages. My local build root still has -r1100 installed:

chromeos-base/chromeos-config-bsp-coral-private-0.0.1-r1100::coral-private

And you killed config_dump.json in the private overlay here:

https://chrome-internal-review.googlesource.com/c/chromeos/overlays/overlay-coral-private/+/721813/

which should show up in r1102.
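
To make the version arithmetic concrete, here is a rough Python sketch of how a versioned blocker would separate the stale private revisions from the fixed ones. The atom `!<chromeos-base/chromeos-config-bsp-coral-private-0.0.1-r1102` is an assumption inferred from the r1102 remark above, not the text of any actual change, and the helpers `parse_cpv` and `is_blocked` are hypothetical. The parsing only handles this one `category/name-version-rN` shape; it is not Portage's full atom or version grammar.

```python
import re

# Hypothetical blocker atom; a real ebuild change may use a different bound.
BLOCKER = "!<chromeos-base/chromeos-config-bsp-coral-private-0.0.1-r1102"

# Simplified pattern: category/name, dotted numeric version, mandatory -rN.
CPV_RE = re.compile(r"^(?P<cp>[\w+./-]+?)-(?P<ver>\d+(?:\.\d+)*)-r(?P<rev>\d+)$")


def parse_cpv(cpv):
    """Split 'cat/name-1.2.3-r4' into ('cat/name', ((1, 2, 3), 4))."""
    match = CPV_RE.match(cpv)
    if match is None:
        raise ValueError(f"unsupported package string: {cpv!r}")
    version = tuple(int(part) for part in match.group("ver").split("."))
    return match.group("cp"), (version, int(match.group("rev")))


def is_blocked(installed_cpv, blocker=BLOCKER):
    """Return True if the installed package falls under a '!<' blocker."""
    if not blocker.startswith("!<"):
        raise ValueError("this sketch only understands '!<' blockers")
    blocked_cp, bound = parse_cpv(blocker[2:])
    cp, installed = parse_cpv(installed_cpv)
    # Same package, and strictly older than the bound named in the blocker.
    return cp == blocked_cp and installed < bound


PRIVATE = "chromeos-base/chromeos-config-bsp-coral-private"
for rev in (1097, 1100, 1102, 1103):
    print(f"-r{rev} blocked:", is_blocked(f"{PRIVATE}-0.0.1-r{rev}"))
```

Under this assumed bound, the -r1097 seen on the paladin and the -r1100 in the local build root both count as blocked, while -r1102 and later do not.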
Summary: coral-paladin: CQ build failing with "file collision (was: coral-padlin: CQ build failing with "file collision")
I'm testing this on my local build:

https://chromium-review.googlesource.com/c/chromiumos/overlays/board-overlays/+/1370424
Is the missing blocker what caused the ebuild to be stale? I'm looking into this but still very disoriented.
Yes, I believe that is the root cause of the failure. For example, in the above linked paladin run, I see:

  [ebuild     U  ] chromeos-base/chromeos-config-bsp-coral-private-0.0.1-r1103:0/chromeos-config-bsp-coral-private-0.0.1-r1103::coral-private [0.0.1-r1097:0/chromeos-config-bsp-coral-private-0.0.1-r1097::coral-private] to /build/coral/ USE="-cros_host" 0 KiB

So, the -private build was slated for upgrade, but portage didn't know it needed to *really* upgrade it before the public ebuild, because we didn't have the blocker.
> Is the missing blocker what caused the ebuild to be stale?

Sorry, I don't think I answered the "stale" part completely. IIUC, some builders do incremental builds, and this means they can still have "stale" ebuilds (depending on your definition of "stale") installed from old runs. The current build would eventually upgrade the package, but if we don't have an explicit reason (e.g., blockers) then it might not be done at the appropriate time.

Still, I'm not sure of the exact reason this particular builder still had -r1097 installed. That revision is a few weeks old (-r1097 -> -r1098 was committed on Nov 26), so one might not be faulted for thinking that it should have been upgraded by now.
Thank you for the info Brian! I'm still unclear on whether this is a CI issue though; it sounds a lot like malformed ebuilds. Is it expected that CI would uprev packages it doesn't think are depended on? If I follow, it only worked because there was likely already a previous version of the ebuild installed locally on the builders, so the real error should have been 'you're trying to use a file from an ebuild you didn't declare a dependency on'?
File collision means two packages are providing the same file, which is illegal. So this isn't a case of a missing "dependency" in the colloquial sense, but a missing *blocker* (two packages that can't be installed together at the same time).

So:

> Is it expected that CI would uprev packages it doesn't think are depended on?

Not exactly. Yes, they should get upgraded at some point in the build (we tell portage to upgrade everything, not just stuff that's required via dependency). But we don't guarantee a particular ordering or safe upgrade-handling if the ebuilds didn't declare a dependency (or in this case, a type of dependency called a blocker).

> it only worked because there was likely already a previous version of the ebuild installed locally on the builders

No, the opposite: it only *failed* because there was a previous version installed. If it was a clean build, things would have been fine.

> the real error should have been 'you're trying to use a file from an ebuild you didn't declare a dependency on'?

No, not using a file you didn't declare, but providing a replacement file without an appropriate "blocker."

FWIW, this is all a problem inherent to incremental builds, where we don't have 100% test coverage of "incremental build from version (X-N) to version (X)", where N is...anything. Dunno if there is something that can be done to improve CI around this. It's definitely a recurring problem.

HTH.
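
The clean-versus-incremental distinction above can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions, not how Portage actually implements collision protection or dependency ordering: the sysroot is a plain dict from package name to owned paths, `merge`, `build`, and `try_build` are hypothetical helpers, and any files the two packages install besides config_dump.json are left out.

```python
class FileCollision(Exception):
    """Raised when a package would install a path another package owns."""


def merge(sysroot, package, files):
    """Install or upgrade `package` in `sysroot` (package name -> owned paths)."""
    for owner, owned in sysroot.items():
        if owner == package:
            continue  # upgrading a package may replace its own files
        clash = sorted(owned & files)
        if clash:
            raise FileCollision(f"{package} collides with {owner}: {clash}")
    sysroot[package] = set(files)


CONFIG_DUMP = "/tmp/chromeos-config/config_dump.json"
PUBLIC = "chromeos-config-bsp-coral"
PRIVATE = "chromeos-config-bsp-coral-private"

# File lists at tip-of-tree: the file now belongs to the public package.
# Any other files these packages install are omitted from the model.
NEW_FILES = {PUBLIC: {CONFIG_DUMP}, PRIVATE: set()}


def build(sysroot, order):
    """Merge the tip-of-tree packages into `sysroot` in the given order."""
    for package in order:
        merge(sysroot, package, NEW_FILES[package])
    return sysroot


def try_build(label, sysroot, order):
    """Run a build and report whether it hit a collision."""
    try:
        build(sysroot, order)
        print(f"{label}: ok")
    except FileCollision as err:
        print(f"{label}: {err}")


# Clean sysroot: nothing stale is installed, so nothing can collide.
try_build("clean", {}, [PUBLIC, PRIVATE])
# Stale private package still owns the file when the public one merges.
try_build("incremental, public first", {PRIVATE: {CONFIG_DUMP}}, [PUBLIC, PRIVATE])
# Upgrading the private package first releases the file.
try_build("incremental, private first", {PRIVATE: {CONFIG_DUMP}}, [PRIVATE, PUBLIC])
```

In this model only the middle scenario fails, which matches the description above: the failure depends on a stale package being present and on the order in which the two upgrades happen, and a blocker is the ebuild's way of telling the package manager that the order matters.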
Owner: briannorris@chromium.org
Ahh! Okay that makes way more sense now.

So, to reiterate, in an ideal world incremental builds would reliably be equivalent to clean builds (-looks longingly at g3 Blaze-) which seems like the real answer to this. Detecting that two ebuilds failed to declare blockers on each other despite not both being installed seems outside the scope of any build system; it's only the build system's responsibility to deterministically fail when you try to build both of them and to explain why. Portage is falling short here because the incremental build didn't clean up something it should have.

I don't see anything actionable for a CI on-call here so I'm going to assign this over to you. Feel free to bounce it back if you're the wrong owner and I'll ask around.

Interesting problem though. I played with an idea a year or so ago to run all ebuilds inside a FuseFS and watch all the things they open/read/write and build dep graphs from that. It would allow us to replicate a lot of ObjFS/SrcFS and Forge for ChromeOS because we could build 'guaranteed correct' (compile time) dep graphs and short-circuit entire builds by simply mounting their outputs into the FuseFS. That would also fix the incremental builds not being the same as clean builds problem.
Cc: athilenius@chromium.org
Yeah, I've got a handle on the $subject bug.

Agreed with most of your comment. (Ambivalent about the g3/blaze/etc. stuff, as I'm not familiar.)
Project Member

Comment 12 by bugdroid1@chromium.org, Dec 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/ebef192c708cc5245e036ed5e3538e494228ee00

commit ebef192c708cc5245e036ed5e3538e494228ee00
Author: Brian Norris <briannorris@chromium.org>
Date: Tue Dec 11 11:50:20 2018

coral: add blocker for private chromeos-config-bsp

/tmp/chromeos-config/config_dump.json has moved from the private to
public overlays, but there was no blocker added. This means upgrades
aren't always smooth and various builders might see file conflicts.

Add the blocker, so the private ebuild will be removed/upgraded cleanly.

BUG= chromium:913078 , b:120774883
TEST=build coral

Change-Id: I15a1ada5eae5bfaf6a311b26934c316c532efd52
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1370424
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Reviewed-by: Jesse Schettler <jschettler@chromium.org>
Reviewed-by: C Shapiro <shapiroc@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[rename] https://crrev.com/ebef192c708cc5245e036ed5e3538e494228ee00/overlay-coral/chromeos-base/chromeos-config-bsp-coral/chromeos-config-bsp-coral-0.0.1-r38.ebuild
[modify] https://crrev.com/ebef192c708cc5245e036ed5e3538e494228ee00/overlay-coral/chromeos-base/chromeos-config-bsp-coral/chromeos-config-bsp-coral-0.0.1.ebuild

Status: Fixed (was: Assigned)
Should be fixed.
