
Issue 839465

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: ----

Swarming cannot delete run directory: dummy_benchmark.noisy_benchmark_1 failing on chromium.perf/Win 10 Perf

Project Member Reported by sheriff-...@appspot.gserviceaccount.com, May 3 2018

Issue description

Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of sullivan@google.com

This failed on 4 of the last 8 builds. Hopefully it's a reduced test case for figuring out the problem?

Builders failed on: 
- Win 10 Perf: 
  https://ci.chromium.org/buildbot/chromium.perf/Win%2010%20Perf

Sample log:
https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_Perf%2F2444%2F%2B%2Frecipes%2Fsteps%2Fdummy_benchmark.noisy_benchmark_1_on_Intel_GPU_on_Windows_on_Windows-10%2F0%2Fstdout


[  PASSED  ] 1 test.
View result at file://c:\b\s\w\itrbon3x\tmpbdwijztelemetry\histograms.json
View result at file://c:\b\s\w\itrbon3x\tmpbdwijztelemetry\test-results.json
Running ['c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython\\8dc308\\Scripts\\python.exe', '../../tools/perf/run_benchmark', 'dummy_benchmark.noisy_benchmark_1', '-v', '--upload-results', '--browser=release_x64', '--output-format=histograms', '--output-dir', 'c:\\b\\s\\w\\itrbon3x\\tmpbdwijztelemetry', '--output-format=json-test-results'] in None (env: {'TMP': 'C:\\Users\\CHROME~1\\AppData\\Local\\Temp', 'LC_NUMERIC': 'English_United States.UTF-8', 'COMPUTERNAME': 'BUILD190-A9', 'VS140COMNTOOLS': 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\Common7\\Tools\\', 'USERDOMAIN': 'LABS', 'LC_CTYPE': 'English_United States.UTF-8', 'PSMODULEPATH': 'C:\\Program Files\\WindowsPowerShell\\Modules;C:\\WINDOWS\\system32\\WindowsPowerShell\\v1.0\\Modules', 'COMMONPROGRAMFILES': 'C:\\Program Files\\Common Files', 'PROCESSOR_IDENTIFIER': 'Intel64 Family 6 Model 158 Stepping 9, GenuineIntel', 'PROGRAMFILES': 'C:\\Program Files', 'PROCESSOR_REVISION': '9e09', 'HOME': 'c:\\Users\\chrome-bot', 'BOTO_CONFIG': 'c:\\Users\\chrome-bot\\.boto', 'SYSTEMROOT': 'C:\\WINDOWS', 'SWARMING_BOT_ID': 'build190-a9', 'PROGRAMFILES(X86)': 'C:\\Program Files (x86)', 'LANG': 'en_US.UTF-8', 'VIRTUAL_ENV': 'c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython\\8dc308', 'SWARMING_SERVER': 'https://chromium-swarm.appspot.com', 'TEMP': 'c:\\b\\s\\w\\itrbon3x', 'LC_MONETARY': 'English_United States.UTF-8', 'CHROME_DEVEL_SANDBOX': '/opt/chromium/chrome_sandbox', 'COMMONPROGRAMFILES(X86)': 'C:\\Program Files (x86)\\Common Files', 'PROCESSOR_ARCHITECTURE': 'AMD64', 'SWARMING_HEADLESS': '1', 'ALLUSERSPROFILE': 'C:\\ProgramData', 'USERPROFILE': 'C:\\Users\\chrome-bot', 'LOCALAPPDATA': 'C:\\Users\\chrome-bot\\AppData\\Local', 'PYTHONNOUSERSITE': '1', 'HOMEPATH': '\\Users\\chrome-bot', 'LUCI_CONTEXT': 'c:\\b\\s\\w\\itrbon3x\\luci_ctx.tt0xu7.json', 'CIPD_CACHE_DIR': 'c:\\b\\s\\cipd_cache\\cache', 'PROGRAMW6432': 'C:\\Program Files', 'USERNAME': 'chrome-bot', 'LC_ALL': 'English_United States.UTF-8', 
'LOGONSERVER': '\\\\AD1-A', 'PROMPT': '$P$G', 'COMSPEC': 'C:\\WINDOWS\\system32\\cmd.exe', 'PROGRAMDATA': 'C:\\ProgramData', 'USERDOMAIN_ROAMINGPROFILE': 'LABS', 'ONEDRIVE': 'C:\\Users\\chrome-bot\\OneDrive', 'VPYTHON_VIRTUALENV_ROOT': 'c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython', 'PATH': 'c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython\\8dc308\\Scripts;c:\\b\\s\\w\\ir\\.swarming_module;c:\\b\\s\\w\\ir\\.swarming_module\\bin;c:\\b\\s\\cipd_cache\\bin;C:\\WINDOWS\\system32;C:\\WINDOWS;C:\\WINDOWS\\System32\\Wbem;C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program Files (x86)\\Windows Kits\\8.1\\Windows Performance Toolkit\\;C:\\Tools;C:\\b\\depot_tools;C:\\CMake\\bin;C:\\Program Files\\Puppet Labs\\Puppet\\bin;C:\\Users\\chrome-bot\\AppData\\Local\\Microsoft\\WindowsApps;;c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython\\8dc308\\lib\\site-packages\\pywin32_system32;c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython\\8dc308\\lib\\site-packages\\pywin32_system32', 'USERDNSDOMAIN': 'LABS.CHROMIUM.ORG', 'NO_GCE_CHECK': 'False', 'CHROME_HEADLESS': '1', 'PATHEXT': '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC', 'SESSIONNAME': 'Console', 'SWARMING_TASK_ID': '3d3c14535aeb3d11', 'GIT_USER_AGENT': 'git/2.15.1.windows.2 win32 build190-a9.labs.chromium.org', 'HOMEDRIVE': 'C:', 'APPDATA': 'C:\\Users\\chrome-bot\\AppData\\Roaming', 'SYSTEMDRIVE': 'C:', 'PUBLIC': 'C:\\Users\\Public', 'NUMBER_OF_PROCESSORS': '8', 'PROCESSOR_LEVEL': '6', 'LC_TIME': 'English_United States.UTF-8', 'COMMONPROGRAMW6432': 'C:\\Program Files\\Common Files', 'OS': 'Windows_NT', 'LC_COLLATE': 'English_United States.UTF-8', 'WINDIR': 'C:\\WINDOWS'})
Command ['c:\\b\\s\\w\\ir\\.swarming_module_cache\\vpython\\8dc308\\Scripts\\python.exe', '../../tools/perf/run_benchmark', 'dummy_benchmark.noisy_benchmark_1', '-v', '--upload-results', '--browser=release_x64', '--output-format=histograms', '--output-dir', 'c:\\b\\s\\w\\itrbon3x\\tmpbdwijztelemetry', '--output-format=json-test-results'] returned exit code 0
Failed to delete c:\b\s\w\ir (8 files remaining).
  Maybe the test has a subprocess outliving it.
  Sleeping 2 seconds.
Failed to delete c:\b\s\w\ir (8 files remaining).
  Maybe the test has a subprocess outliving it.
  Sleeping 4 seconds.
Failed to delete c:\b\s\w\ir. The following files remain:
- \\?\c:\b\s\w\ir
- \\?\c:\b\s\w\ir\out
- \\?\c:\b\s\w\ir\out\Release_x64
- \\?\c:\b\s\w\ir\out\Release_x64\chrome.exe
- \\?\c:\b\s\w\ir\out\Release_x64\chrome_elf.dll
- \\?\c:\b\s\w\ir\out\Release_x64\chrome_watcher.dll
- \\?\c:\b\s\w\ir\out\Release_x64\initialexe
- \\?\c:\b\s\w\ir\out\Release_x64\initialexe\chrome.exe
Enumerating processes:
- pid 1280; Handles: 149; Exe: c:\b\s\w\ir\out\Release_x64\chrome.exe; Cmd: c:\b\s\w\ir\out\Release_x64\chrome.exe --type=crashpad-handler --user-data-dir=c:\b\s\w\itrbon3x\tmpeyrqne /prefetch:7 --monitor-self-annotation=ptype=crashpad-handler --database=c:\b\s\w\itrbon3x\tmpwgnlse --metrics-dir=c:\b\s\w\itrbon3x\tmpeyrqne --url=https://clients2.google.com/cr/report --annotation=channel= --annotation=plat=Win64 --annotation=prod=Chrome --annotation=ver=68.0.3418.0 --initial-client-data=0x1f8,0x1fc,0x200,0x1f4,0x204,0x7ffedace4228,0x7ffedace4238,0x7ffedace4248
- pid 6212; Handles: 137; Exe: c:\b\s\w\ir\out\Release_x64\chrome.exe; Cmd: "c:\b\s\w\ir\out\Release_x64\chrome.exe" --type=watcher --main-thread-id=624 --on-initialized-event-handle=680 --parent-handle=684 /prefetch:6
Terminating 2 processes:
- 1280 killed
- 6212 killed
*** Swarming tried multiple times to delete the run directory and failed ***
*** Hard failing the task ***
Swarming detected that your testing script ran an executable, which may have
started a child executable, and the main script returned early, leaving the
children executables playing around unguided.
You don't want to leave children processes outliving the task on the Swarming
bot, do you? The Swarming bot doesn't.
How to fix?
- For any process that starts children processes, make sure all children
  processes terminated properly before each parent process exits. This is
  especially important in very deep process trees.
  - This must be done properly both in normal successful task and in case of
    task failure. Cleanup is very important.
- The Swarming bot sends a SIGTERM in case of timeout.
  - You have 30.0 seconds to comply after the signal was sent to the process
    before the process is forcibly killed.
- To achieve not leaking children processes in case of signals on timeout, you
  MUST handle signals in each executable / python script and propagate them to
  children processes.
  - When your test script (python or binary) receives a signal like SIGTERM or
    CTRL_BREAK_EVENT on Windows), send it to all children processes and wait for
    them to terminate before quitting.
See
https://github.com/luci/luci-py/blob/master/appengine/swarming/doc/Bot.md#graceful-termination-aka-the-sigterm-and-sigkill-dance
for more information.
*** May the SIGKILL force be with you ***
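
For reference, the "How to fix?" advice in the log above (forward the termination signal to children and wait for them before exiting) could be sketched roughly as follows. This is a minimal Python 3 illustration, not the actual Telemetry or Swarming code: `terminate_children`, `run_with_cleanup`, and the 10-second grace period are hypothetical names and values chosen for the example. It only covers direct children; processes spawned further down the tree (such as the crashpad-handler and watcher processes listed above) would need the same handling at each level, as the log notes.

```python
import signal
import subprocess
import sys


def terminate_children(children, grace_seconds=10.0):
    """Ask every still-running child to stop, wait, then force-kill stragglers.

    Returns the exit codes in the same order as `children`.
    The grace period is applied per child, so the total wait can be longer.
    """
    for child in children:
        if child.poll() is None:
            # SIGTERM on POSIX; TerminateProcess() on Windows.
            child.terminate()
    exit_codes = []
    for child in children:
        try:
            exit_codes.append(child.wait(timeout=grace_seconds))
        except subprocess.TimeoutExpired:
            child.kill()
            exit_codes.append(child.wait())
    return exit_codes


def run_with_cleanup(argv):
    """Run argv as a child and make sure it is gone before returning."""
    child = subprocess.Popen(argv)

    def on_signal(signum, frame):
        # Propagate the shutdown request, then exit this script.
        terminate_children([child])
        sys.exit(128 + signum)

    signal.signal(signal.SIGTERM, on_signal)
    if hasattr(signal, "SIGBREAK"):
        # Windows only: delivered for CTRL_BREAK_EVENT.
        signal.signal(signal.SIGBREAK, on_signal)

    try:
        return child.wait()
    finally:
        # Runs on normal exit, on exceptions, and after on_signal's sys.exit().
        terminate_children([child])
```

The `finally` clause is there so cleanup happens on both the success and the failure path, which is the point the log stresses. Whether this alone would let Swarming delete `c:\b\s\w\ir` in this bug is not established here.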
 
Components: Speed>Telemetry
Owner: nednguyen@chromium.org
Ned, can you help triage?
Status: Assigned (was: Available)

Comment 3 by benhenry@google.com, Jan 16

Components: Test>Telemetry

Comment 4 by benhenry@google.com, Jan 16

Components: -Speed>Telemetry
