New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 667549 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 665691



Sign in to add a comment

LogDog Butler panic during build.

Project Member Reported by erikc...@chromium.org, Nov 22 2016

Issue description

This is on the 10.11 Force Mac Toolchain bot. 50% of builds are failing at the "compile" step.

https://build.chromium.org/p/chromium.fyi/builders/Chromium%20Mac%2010.11%20Force%20Mac%20Toolchain/builds/14746

https://build.chromium.org/p/chromium.fyi/builders/Chromium%20Mac%2010.11%20Force%20Mac%20Toolchain/builds/14746/steps/compile/logs/stdio

But stdio is reporting success?
"""
started: Mon Nov 21 14:41:26 2016
ended: Mon Nov 21 14:41:27 2016
duration: 0 secs
status: SUCCESS
status reason: 
"""
 

Comment 1 by aga...@chromium.org, Nov 23 2016

Components: -Infra Infra>Platform>LogDog
Owner: d...@chromium.org
Status: Assigned (was: Untriaged)
Something went incredibly wrong with that build; it's not just compile. The whole thing ended up hanging and timing out after a long time with no output.

I'm putting this on dnj@'s plate because a naive understanding says that logdog has something to do with it. Feel free to reassign or move components if all those red logdog failures are just a red herring.

Comment 2 by d...@chromium.org, Nov 23 2016

Summary: LogDog Butler panic during build. (was: Compile steps reports "success", but marked as "failure".)
Looks like the LogDog Butler panic'd:

[signal 0xb code=0x1 addr=0x1007094006 pc=0x299b3d]

goroutine 25 [running]:
runtime.throw(0x772858, 0x5)
	/tmp/_chrome_infra_build/golang/go/src/runtime/panic.go:547 +0x90 fp=0xc820503b40 sp=0xc820503b28
runtime.sigpanic()
	/tmp/_chrome_infra_build/golang/go/src/runtime/sigpanic_unix.go:27 +0x2ba fp=0xc820503b90 sp=0xc820503b40
sync/atomic.CompareAndSwapUint32(0x1007094006, 0x100000000)
	/tmp/_chrome_infra_build/golang/go/src/sync/atomic/asm_amd64.s:39 +0xd fp=0xc820503b98 sp=0xc820503b90
sync.(*Mutex).Lock(0x1007094006)
	/tmp/_chrome_infra_build/golang/go/src/sync/mutex.go:44 +0x40 fp=0xc820503be8 sp=0xc820503b98
golang.org/x/net/context.WithDeadline(0xf08710, 0xc82006cda8, 0xecfc56efe, 0x1c1d28a2, 0xb2a480, 0x0, 0x0, 0x0)
	/tmp/_chrome_infra_build/infra/go/.vendor/src/golang.org/x/net/context/pre_go17.go:219 +0x458 fp=0xc820503cf8 sp=0xc820503be8
github.com/luci/luci-go/logdog/client/butler/bundler.(*Bundler).makeBundles(0xc8203e0140)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/bundler/bundler.go:263 +0x829 fp=0xc820503fa8 sp=0xc820503cf8
runtime.goexit()
	/tmp/_chrome_infra_build/golang/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc820503fb0 sp=0xc820503fa8
created by github.com/luci/luci-go/logdog/client/butler/bundler.New
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/bundler/bundler.go:90 +0x24d

goroutine 1 [chan receive]:
github.com/luci/luci-go/common/system/ctxcmd.(*CtxCmd).Wait(0xc820167720, 0x0, 0x0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/common/system/ctxcmd/proc.go:189 +0x170
main.(*runCommandRun).Run.func2(0xf45548, 0xc8203e0240, 0xc8202c4000, 0x0, 0x0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/subcommand_run.go:317 +0x6ad
main.(*application).runWithButler(0xc82017e1e0, 0xf45588, 0xc82017e1e0, 0xf0b5a8, 0xc820068240, 0xc82014f7f8, 0x0, 0x0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/main.go:195 +0x44f
main.(*runCommandRun).Run(0xc8201bc680, 0xf45678, 0xc82017e1e0, 0xc820092160, 0x12, 0x12, 0x0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/subcommand_run.go:341 +0x203f
github.com/maruel/subcommands.Run(0xf45678, 0xc82017e1e0, 0xc8200920d0, 0x1b, 0x1b, 0x0)
	/tmp/_chrome_infra_build/infra/go/.vendor/src/github.com/maruel/subcommands/subcommands.go:262 +0x3e5
main.mainImpl(0xf44ff8, 0xc8201cc660, 0xc820092010, 0x27, 0x27, 0x0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/main.go:324 +0x12df
main.main.func2()
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/main.go:341 +0x8c
github.com/luci/luci-go/common/runtime/paniccatcher.Do(0xc82014ff18, 0xc82014fef8)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/common/runtime/paniccatcher/catch.go:45 +0x44
main.main()
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/main.go:347 +0x128

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
	/tmp/_chrome_infra_build/golang/go/src/runtime/asm_amd64.s:1998 +0x1

goroutine 19 [syscall]:
os/signal.signal_recv(0x0)
	/tmp/_chrome_infra_build/golang/go/src/runtime/sigqueue.go:116 +0x132
os/signal.loop()
	/tmp/_chrome_infra_build/golang/go/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
	/tmp/_chrome_infra_build/golang/go/src/os/signal/signal_unix.go:28 +0x37

goroutine 98 [select]:
github.com/luci/luci-go/logdog/client/butler.(*Butler).runStreams.func2(0xc820171bc0, 0xc8203e4ee0, 0xc82035cc00, 0xc8202c4000)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/butler.go:456 +0x23b
created by github.com/luci/luci-go/logdog/client/butler.(*Butler).runStreams
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/butler.go:468 +0x21a

goroutine 35 [chan receive]:
main.mainImpl.func2(0xc820170360, 0xc82017e1e0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/main.go:297 +0x56
created by main.mainImpl
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/cmd/logdog_butler/main.go:309 +0xec3

goroutine 34 [select, locked to thread]:
runtime.gopark(0x8c3cc0, 0xc820028728, 0x775aa0, 0x6, 0x8c3c18, 0x2)
	/tmp/_chrome_infra_build/golang/go/src/runtime/proc.go:262 +0x163
runtime.selectgoImpl(0xc820028728, 0x0, 0x18)
	/tmp/_chrome_infra_build/golang/go/src/runtime/select.go:392 +0xa67
runtime.selectgo(0xc820028728)
	/tmp/_chrome_infra_build/golang/go/src/runtime/select.go:215 +0x12
runtime.ensureSigM.func1()
	/tmp/_chrome_infra_build/golang/go/src/runtime/signal1_unix.go:279 +0x32c
runtime.goexit()
	/tmp/_chrome_infra_build/golang/go/src/runtime/asm_amd64.s:1998 +0x1

goroutine 36 [IO wait]:
net.runtime_pollWait(0xf46940, 0x72, 0xf040f0)
	/tmp/_chrome_infra_build/golang/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8201c4e60, 0x72, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8201c4e60, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).accept(0xc8201c4e00, 0x0, 0xf09360, 0xc8203fb840)
	/tmp/_chrome_infra_build/golang/go/src/net/fd_unix.go:426 +0x27c
net.(*UnixListener).AcceptUnix(0xc8201d4680, 0xc8203fb820, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/unixsock_posix.go:305 +0x53
net.(*UnixListener).Accept(0xc8201d4680, 0x0, 0x0, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/unixsock_posix.go:315 +0x41
github.com/luci/luci-go/logdog/client/butler/streamserver.(*selfCleaningUNIXListener).Accept(0xc8201cd140, 0x0, 0x0, 0x0, 0x0)
	<autogenerated>:28 +0x6c
github.com/luci/luci-go/logdog/client/butler/streamserver.(*listenerStreamServer).serve(0xc820170600)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/streamserver/base.go:116 +0x122
created by github.com/luci/luci-go/logdog/client/butler/streamserver.(*listenerStreamServer).Listen
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/streamserver/base.go:76 +0x1f7

goroutine 59 [IO wait]:
net.runtime_pollWait(0xf46700, 0x72, 0xc82059e000)
	/tmp/_chrome_infra_build/golang/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8201c4060, 0x72, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8201c4060, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc8201c4000, 0xc82059e000, 0x1000, 0x1000, 0x0, 0xf040f0, 0xc82006c0b0)
	/tmp/_chrome_infra_build/golang/go/src/net/fd_unix.go:250 +0x23a
net.(*conn).Read(0xc820516008, 0xc82059e000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/tmp/_chrome_infra_build/golang/go/src/net/net.go:172 +0xe4
github.com/luci/luci-go/common/iotools.(*DeadlineReader).Read(0xc8203f20a0, 0xc82059e000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/common/iotools/deadlinereader.go:50 +0x1ec
github.com/luci/luci-go/logdog/client/butler.(*stream).readChunk(0xc82039cb00, 0x8c3a00)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/stream.go:33 +0x12a
github.com/luci/luci-go/logdog/client/butler.(*Butler).runStreams.func2.2(0xc820171f80, 0xc82039cb00)
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/butler.go:450 +0x48
created by github.com/luci/luci-go/logdog/client/butler.(*Butler).runStreams.func2
	/tmp/_chrome_infra_build/infra/go/src/github.com/luci/luci-go/logdog/client/butler/butler.go:452 +0xbc


Comment 3 by d...@chromium.org, Nov 23 2016

Cc: vadimsh@chromium.org iannucci@chromium.org
+iannucci@, +vadimsh@

This is interesting. This is a segmentation fault in the middle of Go's context=>mutex=>atomic library.

The Butler code in question:
https://github.com/luci/luci-go/blob/164a2f5b9e304d3487b162679d3f409fbc1ce4cd/logdog/client/butler/bundler/bundler.go#L263

I don't see a path for "condC" to be nil. The panic line in Context is:
https://github.com/golang/net/blob/4971afdc2f162e82d185353533d3cf16188a9f4e/context/pre_go17.go#L219

That should only panic if "c" or "c.mu" (static) is nil. "c" is generated from newCancelCtx (https://github.com/golang/net/blob/4971afdc2f162e82d185353533d3cf16188a9f4e/context/pre_go17.go#L71), which returns a new static struct with its own mutex. I don't see how that could be nil.

This could be a bug in the Go runtime. This binary is pretty old IIRC. Any insights? I'm not seeing anything at the moment.
build9-m1 is OSX 10.12.1 despite what builder name says.

Go binaries built with go 1.6 are extremely unstable on OSX 10.12. 

Please switch to newer version of butler build with go 1.7.

Comment 5 by d...@chromium.org, Nov 23 2016

eep okay, I'll do that in the morning.
Project Member

Comment 6 by bugdroid1@chromium.org, Nov 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build.git/+/3729cb1adc3a0850d4d766eb7d47a52933fe96ea

commit 3729cb1adc3a0850d4d766eb7d47a52933fe96ea
Author: Dan Jacques <dnj@chromium.org>
Date: Wed Nov 23 16:56:07 2016

Canary new Butler version.

Canary the Butler version tagged at:
git_revision:dd611fe7c2ee1214713f30c5a9dab8316242d23c

This is built with Go 1.7 and should avoid some of the instability
issues that are being seen with Butler versions compiled with older Go
versions.

TBR=agable@chromium.org
BUG= chromium:667549 
TESt=None

Change-Id: I6a52d0807fa5986319eac997cba74f51bb3f755a
Reviewed-on: https://chromium-review.googlesource.com/414065
Reviewed-by: Daniel Jacques <dnj@chromium.org>
Commit-Queue: Daniel Jacques <dnj@chromium.org>

[modify] https://crrev.com/3729cb1adc3a0850d4d766eb7d47a52933fe96ea/scripts/slave/logdog-params.pyl

Comment 7 by d...@chromium.org, Nov 24 2016

Status: Fixed (was: Assigned)
The new Go 1.7-compiled version has been instituted.
Project Member

Comment 8 by bugdroid1@chromium.org, Nov 24 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build.git/+/348b7b282e6413be5210eb8877faccd57fff9050

commit 348b7b282e6413be5210eb8877faccd57fff9050
Author: Dan Jacques <dnj@chromium.org>
Date: Thu Nov 24 04:58:02 2016

Update stable LogDog bootstrap revision.

Update stable to: dd611fe7c2ee1214713f30c5a9dab8316242d23c

TBR=agable@chromium.org
BUG= chromium:667549 
TEST=canary
  - Canaried on "chromium.infra" waterfall, all architectures passed
    a round of green.

Change-Id: Ia8ce1c880fd23312beb9c0e9dd841da05cef0c97
Reviewed-on: https://chromium-review.googlesource.com/414185
Reviewed-by: Daniel Jacques <dnj@chromium.org>
Commit-Queue: Daniel Jacques <dnj@chromium.org>

[modify] https://crrev.com/348b7b282e6413be5210eb8877faccd57fff9050/scripts/slave/logdog_bootstrap.py

Project Member

Comment 9 by bugdroid1@chromium.org, Nov 30 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build.git/+/50bcf670c0522243f05b1fe54db193c211c3d593

commit 50bcf670c0522243f05b1fe54db193c211c3d593
Author: Robbie Iannucci <iannucci@chromium.org>
Date: Wed Nov 30 02:43:06 2016

Revert "Update stable LogDog bootstrap revision."

This reverts commit 348b7b282e6413be5210eb8877faccd57fff9050.

Reason for revert: May be related to https://bugs.chromium.org/p/chromium/issues/detail?id=669194. Reverting for now to see if it has an appreciable effect. Recognize that this will break OS X 10.12 builders in the mean time.

Original change's description:
> Update stable LogDog bootstrap revision.
> 
> Update stable to: dd611fe7c2ee1214713f30c5a9dab8316242d23c
> 
> TBR=agable@chromium.org
> BUG= chromium:667549 
> TEST=canary
>   - Canaried on "chromium.infra" waterfall, all architectures passed
>     a round of green.
> 
> Change-Id: Ia8ce1c880fd23312beb9c0e9dd841da05cef0c97
> Reviewed-on: https://chromium-review.googlesource.com/414185
> Reviewed-by: Daniel Jacques <dnj@chromium.org>
> Commit-Queue: Daniel Jacques <dnj@chromium.org>
> 

TBR=agable@chromium.org,dnj@chromium.org,chromium-reviews@chromium.org,infra-reviews+build@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true

Change-Id: I21882e681251664c3b9f876b653c590f0039f74a
Reviewed-on: https://chromium-review.googlesource.com/414641
Commit-Queue: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>

[modify] https://crrev.com/50bcf670c0522243f05b1fe54db193c211c3d593/scripts/slave/logdog_bootstrap.py

Status: Assigned (was: Fixed)
Re-opening the bug, since the fix has been reverted.
What's the status here? There is a long chain of bugs being blocked by this.

Comment 12 by d...@chromium.org, Dec 6 2016

I'm canarying an updated revision now, and should have it pushed put fleet-wide mid-day.
Haven't seen this error since Dec 6. Is this fixed?

Comment 14 by d...@chromium.org, Dec 10 2016

Status: Fixed (was: Assigned)
D'oh, yes, bumped past the broken revision shortly after #12.

Sign in to add a comment