New issue
Advanced search Search tips

Issue 914420 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Dec 14
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Concurrent map read/write in jslog package

Project Member Reported by semenzato@chromium.org, Dec 12

Issue description

I got this error running a modified version of the test in memory_pressure.go (under review at https://crrev.com/c/1330081/18, without ExtraArgs passed to chrome.New so it went to live sites instead of WPR).

This is just the beginning of the log.  The full log is attached.  There are LOTS of active goroutines, which makes me suspect I am leaking them (which may be a separate bug).

The error is NOT reproducible.

2018/12/12 09:40:11 Got global error mid-test: runner.go:136: Bundle /usr/local/libexec/tast/bundles/local_pushed/cros failed: exit status 2 (fatal error: concurrent map read and map write

goroutine 54 [running]:
runtime.throw(0xb6f5ec, 0x21)
        /usr/lib/go/x86_64-cros-linux-gnu/src/runtime/panic.go:616 +0x81 fp=0xc4203a7e90 sp=0xc4203a7e70 pc=0x42bc91
runtime.mapaccess1_faststr(0xa50680, 0xc42037d8c0, 0xc4202c8720, 0x20, 0xc4201dea30)
        /usr/lib/go/x86_64-cros-linux-gnu/src/runtime/hashmap_fast.go:181 +0x421 fp=0xc4203a7f00 sp=0xc4203a7e90 pc=0x40c831
chromiumos/tast/local/chrome/jslog.(*Master).run(0xc42000d0e0)
        /home/semenzato/trunk/src/platform/tast-tests/src/chromiumos/tast/local/chrome/jslog/jslog.go:123 +0x6c fp=0xc4203a7fd8 sp=0xc4203a7f00 pc=0x7d679c
runtime.goexit()
        /usr/lib/go/x86_64-cros-linux-gnu/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc4203a7fe0 sp=0xc4203a7fd8 pc=0x457d31
created by chromiumos/tast/local/chrome/jslog.NewMaster
        /home/semenzato/trunk/src/platform/tast-tests/src/chromiumos/tast/local/chrome/jslog/jslog.go:71 +0xd8

goroutine 1 [select, 2 minutes]:
chromiumos/tast/testing.runStages(0xbea360, 0xc42037d6b0, 0xc42037d680, 0xc4201e88a0, 0x3, 0x4, 0x2)
        /home/semenzato/trunk/src/platform/tast/src/chromiumos/tast/testing/stage.go:46 +0x199
chromiumos/tast/testing.(*Test).Run(0xc4201cafd0, 0xbea360, 0xc42037d6b0, 0xc420326480, 0xc42007c400, 0xc42021c230)
        /home/semenzato/trunk/src/platform/tast/src/chromiumos/tast/testing/test.go:227 +0x3f9
chromiumos/tast/bundle.runTest(0xbea2e0, 0xc4200a6030, 0xc4200ab3b0, 0xc4201d0000, 0xc4202a1f00, 0xc4201cafd0, 0x0, 0x0, 0xc4201c4d80, 0xc4201d0050)
        /home/semenzato/trunk/src/platform/tast/src/chromiumos/tast/bundle/bundl:

 
full.txt
234 KB View Download
Labels: -Pri-2 Pri-1
Owner: nya@chromium.org
Status: Assigned (was: Untriaged)
Summary: Concurrent map read/write in jslog package (was: tast: global error mid-test: concurrent map read/write)
From jslog.go:

116 // run is executed on a background goroutine to collect logs from workers.
117 func (m *Master) run() {
118     for {
119         select {
120         case <-m.finCh:
121             return
122         case e := <-m.logCh:
123             buf := &m.targets[e.targetID].buf
124             e.writeTo(buf)
125         }
126     }   
127 }

nya@, m.targets likely needs to be protected by a mutex. It looks like it's inserted into by (probably) the main goroutine in NewWorker, so Master.run (which is documented as running in a background goroutine) shouldn't be doing unprotected reads from it.

(Okay, I'll admit that Rust probably would've caught this during compilation. :-P)
Uploaded https://crrev.com/c/1374117 as a likely fix, but nya@ knows this code much better than I do.
Owner: derat@chromium.org
Thanks for catching the issue. Reassigning to derat@ since he already prepared a fix.

Great, thanks!  So I don't have an obvious goroutine leak?
It is possible your code has a goroutine leak. The crash indicates Save() and NewWorker() are called concurrently, which means an unusual situation that a new chrome.Conn is created while chrome.Chrome is being Closed. Though we should fix the crash anyway, you might also want to check goroutine leaks.


I don't explicitly create goroutines, but I've been wondering if I am supposed to clean up after myself (I've used "defer" in my other go program) and specifically what.  (Just in case, the code that produced this error is https://crrev.com/c/1330081/18)
I looked at the full stack trace (thanks for it) and I found I was wrong. It was data race between NewWorker() and run(), which will happen in usual situations. So in short, you're innocent. :)

Goroutine leak would be caused by not calling Conn.Close() (and Chrome.Close()).

Thank you very much for checking and for the advice.  I need all those connections until the test ends, so I don't think I am leaking goroutines there.  I will add the Close() calls to be clean.
Project Member

Comment 9 by bugdroid1@chromium.org, Dec 14

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/fe31fd707088cc14f5f06022e50de97dcb3855b5

commit fe31fd707088cc14f5f06022e50de97dcb3855b5
Author: Daniel Erat <derat@chromium.org>
Date: Fri Dec 14 03:28:05 2018

tast-tests: Fix data race in jslog.

Introduce a mutex to avoid concurrent reads and writes to
jslog.Master.targets.

BUG= chromium:914420 
TEST=unit tests still pass

Change-Id: I6c0cd416b1064cf0e8ae42eaa7f9cf5b10820411
Reviewed-on: https://chromium-review.googlesource.com/1374117
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/fe31fd707088cc14f5f06022e50de97dcb3855b5/src/chromiumos/tast/local/chrome/jslog/jslog.go

Status: Fixed (was: Assigned)

Sign in to add a comment