
Issue 671359

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



security_SandboxedServices is flaky: fix or remove

Project Member Reported by semenzato@chromium.org, Dec 5 2016

Issue description

As seen in issue 658945 comment 5, security_SandboxedServices appears to be inherently flaky.  Even though it can be potentially useful, we can't have knowingly flaky tests in our waterfalls, so it should be fixed.  (Maybe use lastcomm?)
 
i don't think comment 5 over there suggests the test is flaky.  they had a service, even if it was short lived, running as root.  we want those services to either not exist (be properly sandboxed) or be explicitly whitelisted.

flaky tests imply the test sometimes passes, but then sometimes (wrongly) fails.  in this case, the test sometimes passes, but then sometimes (correctly) fails.

the only way to make this robust and always fail would be to somehow collect perfect process spawning information and then process it all after the fact.  that's somewhat feasible with the audit functionality in the kernel, but seems like it might be overkill.

do we have cases where it's wrongly failing?

#1: no known cases of wrong failures, only wrong successes.

Ideally these test failures should be caught in the CQ.  If the test is flaky in this direction (wrongly passing), the bad CL will go through, and create more work later (build failures, inexperienced sheriffs spending time tracking down the culprits, etc.)  It also confuses the picture of build reliability, as we have to keep track of good vs. bad flake.

Also, this test might not notice security violations for a long time (or ever).

I don't think we need to change the kernel in order to run lastcomm.  It may be that we just need to add the acct package (to test images) and run accton early at boot.

I am OK either way but right now I have to advocate for build/infra/sheriffs.
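
A minimal sketch of the after-the-fact check that process accounting would allow, assuming the accounting log has already been parsed into (command, user) records. The `AcctRecord` type, the `find_unsandboxed` helper and the allowlist contents are hypothetical names for illustration; this does not parse real lastcomm output and is not how security_SandboxedServices works today.

```python
from collections import namedtuple

# Hypothetical record: one entry per process that has exited, as a
# process-accounting tool could report it. Field names are assumptions.
AcctRecord = namedtuple("AcctRecord", ["command", "user"])


def find_unsandboxed(records, allowlist, privileged_user="root"):
    """Return sorted commands that ran as privileged_user and are not allowlisted."""
    offenders = set()
    for rec in records:
        # Short-lived processes are included too, because the records
        # describe every process that ran, not only those alive right now.
        if rec.user == privileged_user and rec.command not in allowlist:
            offenders.add(rec.command)
    return sorted(offenders)
```

Because the check runs over a full history instead of a snapshot of live processes, a short-lived root service would be reported on every run rather than only when the timing happens to line up.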



i'm not familiar with lastcomm or how it works.  auditing in the kernel doesn't require modifying the kernel, "just" invoking the audit userland to get the info we want.

Hi Kees, there are actually security issues on this bug, please let us know if you have any comments.

Comment 5 by vapier@chromium.org, Jun 22 2018

in the meantime, we could have security_SandboxedServices poll the boot state until init scripts have quiesced.  perhaps even going as far as waiting for all "tasks" to complete.

if we simply wait until boot-complete or system-services have finished, there are still a number of jobs that kick off in parallel.  for example, crash-boot-collect.conf runs after system-services starts so that it can collect any boot crashes in the background and w/out blocking the UI.  then userfeedback/init/firmware-version.conf will run after that process finishes to collect/cache some feedback related logs.
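
The polling idea could look roughly like the sketch below: keep sampling the boot state until it has stayed unchanged for several consecutive polls, or give up after a timeout. `wait_until_quiescent`, its parameters and the `snapshot_fn` callable are hypothetical; how the snapshot is taken (for example, the set of init jobs still running) is left to the caller and is an assumption here.

```python
import time


def wait_until_quiescent(snapshot_fn, stable_polls=3, interval=1.0,
                         timeout=60.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll snapshot_fn until it returns the same value stable_polls times in a row.

    Returns True if the state settled before the timeout, False otherwise.
    """
    deadline = clock() + timeout
    previous = snapshot_fn()
    unchanged = 1  # the first snapshot counts as one observation
    while unchanged < stable_polls:
        if clock() >= deadline:
            return False
        sleep(interval)
        current = snapshot_fn()
        if current == previous:
            unchanged += 1
        else:
            # State moved: restart the count from the new snapshot.
            previous = current
            unchanged = 1
    return True
```

Since jobs can chain off one another as described above, a stable snapshot is only a heuristic: a job that starts after a quiet period would still be missed, so the stable_polls and interval values would need tuning against real boots.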
