
Issue 863473

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




Add all_tests and disabled_tests to json test result format

Project Member Reported by chanli@chromium.org, Jul 13

Issue description

In gtest results, there is a field 'all_tests' that lists all the tests run in the task (the whole task, not just one shard), and a field 'disabled_tests' that lists all the disabled tests. With these fields, it's very easy to check whether a given test exists or is disabled by looking at the test results of just one sample shard.

But these fields are not provided in json_test_results, so the same kind of check requires downloading and parsing the test results of ALL shards.

Is it possible to also add these fields to json test results? 
 
How is this different from just having the test framework always write out all the test entries it's supposed to run, with the missing tests' values set to:
{ 'test_foo': {
     'expected': 'PASS',
     'actual': 'SKIP'
} }
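Under that alternative scheme, the same existence/enabled check could be derived from the per-test entries instead of separate list fields. A small sketch, assuming a flat mapping of test name to entry (the helper name and flat layout are assumptions for illustration):

```python
def status_from_entry(results, test_name):
    """Derive existence/enabled status from per-test entries, where a
    missing test is absent and a disabled test is expected/actual SKIP."""
    entry = results.get(test_name)
    if entry is None:
        return 'missing'
    if entry.get('actual') == 'SKIP':
        return 'disabled'
    return 'enabled'
```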
Components: Blink>Infra
Re #2: yeah I think that'd be the implementation.

It won't be hard to do, but as I commented in the CL there will be some overhead: we need to walk the directory and scan the WPT manifest even when a test list is provided, and it will bloat the size of the output. The result merger would also need to be changed.
Re #2: what I meant by disabled_tests is the tests that are expected to be skipped.

But it looks like it's easy to add that info, right? I guess now we only need to discuss whether it's worth adding it.
I am concerned by the size bloat Robert mentioned. Why would we want to check "whether a given test exists or is disabled by checking the test results of just one sample shard"? Can't we use the merged results to check?
Re #5: it seems a little bit overkill to download and merge *all* shards' results and then parse them just to check whether one test exists/is enabled. That's definitely doable (in fact that's what the CL is for), but it's worth checking whether we can have the same information as gtest results, so that we don't have to do that.

Or maybe we can store all_tests and disabled_tests in a new file?
Yeah that's also an approach to think about. 

But I have some concerns:
  1. Findit also triggers swarming tasks itself, so it needs to download swarming task outputs from isolate anyway. If we download the 'json.output' of the test step from logdog for waterfall builds, we'll have two sources for the same kind of data.
  2. We need to rely on json.output being uploaded to logdog so that we can download it, meaning one added dependency.
  3. The merged result might be too big to download as a whole (just my guess, not confirmed).
robertocn@ and I just had a discussion offline: what do you think about isolating a new file with all_tests and disabled_tests? If that sounds good, we might also change gtest results to produce the same file.

And going further, is it possible to have a unified test result format for all test types?
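As a sketch of the separate-file idea, the producer side could emit a small JSON file next to the full results, so consumers fetch only this file instead of all shards. The file schema and function name below are assumptions, not an agreed format:

```python
import json

def write_test_list_file(out_path, all_tests, disabled_tests):
    """Write a small, self-contained file carrying only the task-wide
    test lists, separate from the full (and much larger) results."""
    with open(out_path, 'w') as f:
        json.dump({'all_tests': sorted(all_tests),
                   'disabled_tests': sorted(disabled_tests)}, f)
```

Keeping this file tiny sidesteps the size-bloat concern above, since the full per-test results stay in the existing output.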
#8 Thanks for explaining the FindIt usage; now I have a better understanding of why you wanted this feature.

+1 to adding an extra flag to produce a file with all_tests & disabled_tests. 

I also think it's a goal to have a unified test result format for all test types. The fact that they are not unified is just a consequence of these test frameworks not being owned by the same team.
I don't think I understand the flow of control that leads to this problem.

Why do you need to look at individual shard output rather than the merged result of the step? When you are trying to reproduce failures, are you trying to re-run only the shards that failed, rather than either all shards or just the tests that failed?
The use case here is that we want to find out whether a test exists or is enabled.

Sure, we can look at the merged results. It's just that for gtest we have the benefit of getting what we want by checking only one shard, and we cannot do the same for layout tests. So we want to ask whether it's possible to have similar information in each set of test results, so we don't need to handle them differently on the Findit side.
