New issue
Advanced search Search tips

Issue 863106 link

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature



Sign in to add a comment

Help analyzers with path filtering

Project Member Reported by la...@chromium.org, Jul 12

Issue description

Currently if a single file matches an analyzer's path_filters then the analyzer will be run on all files in the input. While it is possible for the analyzer to do its own filtering, this duplicates logic and is a potential source of confusion.

A couple of ideas for improvement:

- Add a new e.g. $TRICIUM_FILTERED_FILES value to the function's env. This seems relatively straightforward to implement by extending tricium.Cmd.

- Push a new per-worker file with this kind of metadata.

Related: crbug.com/837024#c2
 
Cc: juliehockett@google.com mar...@chromium.org
Status: Available (was: Untriaged)
Excellent point.

As you you suggest, the thing I would expect is, if path_filters is "*.sh", then only .sh files are included in the metadata available to that analyzer; however, the list of files is the same for all analyzers, because it is the same isolated Files data for all analyzers.

 - Setting an environment variable does seem like a simpler change to make, but it makes me feel iffy for a few reasons:
  - What if a filename contains a comma (or whatever other character we use for delimeter)?
  - What if the list of files is super long (e.g. thousands of files, each with a long path)? This isn't just theoretical, some Blink rebaselining changes can have thousands of files with long paths...
  - In general using environment variables seems hacky for some reason.

Including a separate file seems like it might involve adding a separate isolated input, which complicates things a bit. Currently every "function" (both isolator and analyzer) has one isolated input and one isolated output, and they must be one of the supported tricium datatypes, and each intermediate data type like Files only needs to be produced once for a run.

So, anyway, we want some kind of way to including the *filtered* file list, not just all the files.

+ maruel and juliehockett in case you have thoughts as well... what would be ideal as far as analyze writers are concerned? It should be in the input JSON that's read in, right? Maybe we really do need to make a separate isolated input for each analyzer?
I'm fine with always isolating the whole thing and expect the analyzer to process the relevant subset. I think it's less work overall than trying to isolate for every analyzer.
Another approach would be to include in the input a map of metadata for every function keyed on function name and then pass the function name in the env. 
Project Member

Comment 4 by bugdroid1@chromium.org, Jul 14

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/17a5a6a7d8cff1912c0163a781e77dcdeaeb36e7

commit 17a5a6a7d8cff1912c0163a781e77dcdeaeb36e7
Author: Quinten Yearsley <qyearsley@chromium.org>
Date: Sat Jul 14 01:35:29 2018

[tricium] Extract helper function to filter files and use it in Pylint analyzer

This isn't quite as ideal as Tricium providing the filtered files for
the analyzer, but at least it makes it so it's less code duplicated
between different analyzers, since most analyzers will want to do some
filtering of the input file list.

Bug:  863600 , 863106
Change-Id: Iba82eb176535c6aa7e5465b9955f012bf2fdeee0
Reviewed-on: https://chromium-review.googlesource.com/1137212
Reviewed-by: Lann Martin <lannm@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/17a5a6a7d8cff1912c0163a781e77dcdeaeb36e7/go/src/infra/tricium/api/v1/data_helpers.go
[modify] https://crrev.com/17a5a6a7d8cff1912c0163a781e77dcdeaeb36e7/go/src/infra/tricium/functions/pylint/pylint_parser.go
[modify] https://crrev.com/17a5a6a7d8cff1912c0163a781e77dcdeaeb36e7/go/src/infra/tricium/api/v1/data_helpers_test.go
[modify] https://crrev.com/17a5a6a7d8cff1912c0163a781e77dcdeaeb36e7/go/src/infra/tricium/functions/shellcheck/main.go

Possibly related to bug 873202 (organize common helper code for analyzers).

Sign in to add a comment