Concurrent map read/write in jslog package |
|||
Issue descriptionI got this error running a modified version of the test in memory_pressure.go (under review at https://crrev.com/c/1330081/18, without ExtraArgs passed to chrome.New so it went to live sites instead of WPR). This is just the beginning of the log. The full log is attached. There are LOTS of active goroutines, which makes me suspect I am leaking them (which may be a separate bug). The error is NOT reproducible. 2018/12/12 09:40:11 Got global error mid-test: runner.go:136: Bundle /usr/local/libexec/tast/bundles/local_pushed/cros failed: exit status 2 (fatal error: concurrent map read and map write goroutine 54 [running]: runtime.throw(0xb6f5ec, 0x21) /usr/lib/go/x86_64-cros-linux-gnu/src/runtime/panic.go:616 +0x81 fp=0xc4203a7e90 sp=0xc4203a7e70 pc=0x42bc91 runtime.mapaccess1_faststr(0xa50680, 0xc42037d8c0, 0xc4202c8720, 0x20, 0xc4201dea30) /usr/lib/go/x86_64-cros-linux-gnu/src/runtime/hashmap_fast.go:181 +0x421 fp=0xc4203a7f00 sp=0xc4203a7e90 pc=0x40c831 chromiumos/tast/local/chrome/jslog.(*Master).run(0xc42000d0e0) /home/semenzato/trunk/src/platform/tast-tests/src/chromiumos/tast/local/chrome/jslog/jslog.go:123 +0x6c fp=0xc4203a7fd8 sp=0xc4203a7f00 pc=0x7d679c runtime.goexit() /usr/lib/go/x86_64-cros-linux-gnu/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc4203a7fe0 sp=0xc4203a7fd8 pc=0x457d31 created by chromiumos/tast/local/chrome/jslog.NewMaster /home/semenzato/trunk/src/platform/tast-tests/src/chromiumos/tast/local/chrome/jslog/jslog.go:71 +0xd8 goroutine 1 [select, 2 minutes]: chromiumos/tast/testing.runStages(0xbea360, 0xc42037d6b0, 0xc42037d680, 0xc4201e88a0, 0x3, 0x4, 0x2) /home/semenzato/trunk/src/platform/tast/src/chromiumos/tast/testing/stage.go:46 +0x199 chromiumos/tast/testing.(*Test).Run(0xc4201cafd0, 0xbea360, 0xc42037d6b0, 0xc420326480, 0xc42007c400, 0xc42021c230) /home/semenzato/trunk/src/platform/tast/src/chromiumos/tast/testing/test.go:227 +0x3f9 chromiumos/tast/bundle.runTest(0xbea2e0, 0xc4200a6030, 0xc4200ab3b0, 0xc4201d0000, 0xc4202a1f00, 0xc4201cafd0, 0x0, 0x0, 0xc4201c4d80, 0xc4201d0050) /home/semenzato/trunk/src/platform/tast/src/chromiumos/tast/bundle/bundl:
,
Dec 12
Uploaded https://crrev.com/c/1374117 as a likely fix, but nya@ knows this code much better than I do.
,
Dec 12
Thanks for catching the issue. Reassigning to derat@ since he already prepared a fix.
,
Dec 12
Great, thanks! So I don't have an obvious goroutine leak?
,
Dec 12
It is possible your code has a goroutine leak. The crash indicates Save() and NewWorker() are called concurrently, which means an unusual situation that a new chrome.Conn is created while chrome.Chrome is being Closed. Though we should fix the crash anyway, you might also want to check goroutine leaks.
,
Dec 12
I don't explicitly create goroutines, but I've been wondering if I am supposed to clean up after myself (I've used "defer" in my other go program) and specifically what. (Just in case, the code that produced this error is https://crrev.com/c/1330081/18)
,
Dec 13
I looked at the full stack trace (thanks for it) and I found I was wrong. It was data race between NewWorker() and run(), which will happen in usual situations. So in short, you're innocent. :) Goroutine leak would be caused by not calling Conn.Close() (and Chrome.Close()).
,
Dec 13
Thank you very much for checking and for the advice. I need all those connections until the test ends, so I don't think I am leaking goroutines there. I will add the Close() calls to be clean.
,
Dec 14
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/fe31fd707088cc14f5f06022e50de97dcb3855b5 commit fe31fd707088cc14f5f06022e50de97dcb3855b5 Author: Daniel Erat <derat@chromium.org> Date: Fri Dec 14 03:28:05 2018 tast-tests: Fix data race in jslog. Introduce a mutex to avoid concurrent reads and writes to jslog.Master.targets. BUG= chromium:914420 TEST=unit tests still pass Change-Id: I6c0cd416b1064cf0e8ae42eaa7f9cf5b10820411 Reviewed-on: https://chromium-review.googlesource.com/1374117 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Shuhei Takahashi <nya@chromium.org> [modify] https://crrev.com/fe31fd707088cc14f5f06022e50de97dcb3855b5/src/chromiumos/tast/local/chrome/jslog/jslog.go
,
Dec 14
|
|||
►
Sign in to add a comment |
|||
Comment 1 by derat@chromium.org
, Dec 12Owner: nya@chromium.org
Status: Assigned (was: Untriaged)
Summary: Concurrent map read/write in jslog package (was: tast: global error mid-test: concurrent map read/write)
From jslog.go: 116 // run is executed on a background goroutine to collect logs from workers. 117 func (m *Master) run() { 118 for { 119 select { 120 case <-m.finCh: 121 return 122 case e := <-m.logCh: 123 buf := &m.targets[e.targetID].buf 124 e.writeTo(buf) 125 } 126 } 127 } nya@, m.targets likely needs to be protected by a mutex. It looks like it's inserted into by (probably) the main goroutine in NewWorker, so Master.run (which is documented as running in a background goroutine) shouldn't be doing unprotected reads from it. (Okay, I'll admit that Rust probably would've caught this during compilation. :-P)