Support skipping tricium for some files (e.g. generated files, expectations) |
||||||||||||
Issue descriptionSince they're generally not written by hand.
,
Jul 24
Mentioned by Diego today. If we want to do this by keeping a list of file path patterns, there are multiple options: - Tricium could add a flag to file metadata for each file - A helper function could be added for analyzers to filter Note, we don't want to unconditionally always filter out all possibly generated files for all analyzers, since it's possible that some analyzers would want to check these files.
,
Aug 13
,
Sep 27
,
Sep 28
It's quite annoying since Tricium keeps posting comments on each patchset. This adds noise and hides useful comments from reviewers.
,
Sep 28
I agree, this will continue be a source of noise in general, and should be done sooner rather than later. I propose adding a helper function that is available to all analyzers to filter different kinds of generated or expectation files; use it in most or all of the existing analyzers (second option from #2). Alternative: Add this function in Tricium itself, and always hide these files from all analyzers (first option from #2). Reason: I think that it makes sense to add this as an optional thing that analyzer can use, because it's similar to the binary file filtering: it's conceivable that some analyzer will want to look at the full actual list of files, and specific decisions about filtering files seems like something that should be explicitly done in the analyzer. Always just giving the analyzer the full list of files makes it easier to reason about what the input of the analyzer is going to be. Note: whichever way we choose, we can always change our mind later, the cost of changing to the other way doesn't seem very high. There may be different new types of generated files over time, and there's no sure way to tell all possible generated file types from the content or path of the file, although there is almost always a way to guess a particular type of generated file from the path. For example: Recipe expectations: .expected/*.json Web test expectatations: LayoutTests/*-expected.* or in the future: web_tests/*-expected.* Generated proto files: *_pb2.py or *.pb.go Expectation files for milo frontend tests: expectations/*.html So the helper function might just be composed of a bunch of checks to see whether it matches any of these patterns.
,
Oct 1
We'll also want to make it easy to skip (most) third party files, which may be modified in rolls etc. but not directly edited. One exception is third_party/blink.
,
Oct 24
Robbie had an idea about a way to do this that wouldn't require that all of these file patterns are kept in a list with the Tricium code: Config could be kept in .gitattributes (or a similar file) and GitFileIsolator could potentially read that file and just not copy files which are to be ignored. Config about what files to skip could be alongside the files to skip. This way, for example, the config saying to skip different sets of third party code could be put alongside that third party code; config saying to skip layout test expectations could be with the layout tests; it would be easier to skip new sets of files that should be skipped. The easiest way to do this now would be to only support skipping all analyzers; but potentially, if GitFileIsolator could output separate sets of files for each analyzer then we'd be able to support skipping particular analyzers.
,
Oct 29
This needs to be addressed as soon as possible. :/ Assuming Quinten is there next week I'd like this to be looked at in priority, since it's currently very noisy for a fair number of kinds of changes. I'm fine with the .gitattributes idea, or autodetecting on a "generated via" / "autogenerated" word in the first 3 lines in a comment, or anything that gets the desired result.
,
Nov 6
I'm in favor of the .gitattributes idea, modifying GitFileIsolator to filter out files. This would mean that, at first, all analyzers would filter out the same set of files. Potentially in the future we could support per-analyzer skipping, but I'm not sure whether we need to do that. We could start out initially skipping based on path patterns. To skip based on the contents of .gitattributes, I believe that GitFileIsolator must additionally check out .gitattributes files (right now it's running `git checkout FETCH_HEAD -- file1 file2 file3 etc` to check out only files in the change).
,
Nov 9
,
Nov 9
thestig@ pointed out another category of files we want to skip: files ending in _expected.txt, such as //src/third_party/pdfium/testing/resources/javascript/annot_props_expected.txt Uploaded CL https://chromium-review.googlesource.com/c/infra/infra/+/1329751
,
Nov 9
,
Nov 10
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/b027de3bd60c5a6920d2eeb78705f48b1c30efc1 commit b027de3bd60c5a6920d2eeb78705f48b1c30efc1 Author: Quinten Yearsley <qyearsley@chromium.org> Date: Sat Nov 10 00:06:35 2018 [tricium git-file-isolator] Skip known generated/expectation files. This would be an initial quick change, to be followed up by actually reading .gitattributes files to decide what to skip. Bug: 862770 Change-Id: I965f637af921e609fe5f51f23c56ead5a9670cec Reviewed-on: https://chromium-review.googlesource.com/c/1329751 Commit-Queue: Quinten Yearsley <qyearsley@chromium.org> Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org> Cr-Commit-Position: refs/heads/master@{#18914} [add] https://crrev.com/b027de3bd60c5a6920d2eeb78705f48b1c30efc1/go/src/infra/tricium/functions/git-file-isolator/isolator_test.go [modify] https://crrev.com/b027de3bd60c5a6920d2eeb78705f48b1c30efc1/go/src/infra/tricium/functions/git-file-isolator/isolator.go
,
Nov 12
New version of GitFileIsolator deployed, and it seems to now be filtering out generated pb files as expected, e.g. https://chromium-review.googlesource.com/c/infra/infra/+/1332331. Next steps: - check and and use .gitattributes files - add .gitattributes files in the various repos - remove hard-coded list in git-file-isolator
,
Jan 11
,
Jan 15
The new version is bad, and produces the error: Failed to run command: exit status 1, cmd: [git checkout FETCH_HEAD -- 100644 blob 9cc142c8ed9c01fc7874eaa4c0018ec890fccafe .gitattributes 100644 blob 7c091e9e5dfdb1aac2556d4e9b5e99c9e415e101 third_party/blink/renderer/core/.gitattributes] That is, it's not just using the filenames in the list of .gitattributes files as it should.
,
Jan 16
(6 days ago)
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/3f776d393633a5e64f9db3d7ba4b47e477f01b75 commit 3f776d393633a5e64f9db3d7ba4b47e477f01b75 Author: Quinten Yearsley <qyearsley@chromium.org> Date: Wed Jan 16 17:41:47 2019 When checking .gitattributes files, use --name-only. In the initially committed version of the the .gitattributes code, --name-only was accidentally left out, so the other fields output by ls-tree, including file modes etc. were kept, which causes git checkout to fail. This wasn't caught in testing because the unit tests don't cover any of the command invocation, and test that involves running the analyzer (`make test`) didn't catch it because there were no .gitattributes files to check out in the sample change. Bug: 862770 Change-Id: Ic61d898bf81ff186d4f1b08a091a51ffc167f13c Reviewed-on: https://chromium-review.googlesource.com/c/1412048 Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org> Commit-Queue: Quinten Yearsley <qyearsley@chromium.org> Cr-Commit-Position: refs/heads/master@{#20025} [modify] https://crrev.com/3f776d393633a5e64f9db3d7ba4b47e477f01b75/go/src/infra/tricium/functions/git-file-isolator/test/tricium/data/git_file_details.json [modify] https://crrev.com/3f776d393633a5e64f9db3d7ba4b47e477f01b75/go/src/infra/tricium/functions/git-file-isolator/isolator.go
,
Jan 16
(6 days ago)
New version of isolator deployed, seems to be working (e.g. https://chromium-swarm.appspot.com/task?id=427095bcf1c2f810&refresh=10&show_raw=1). |
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by qyears...@chromium.org
, Jul 11Status: Available (was: Untriaged)
Summary: Skip running Tricium on generated or expectation files (was: Running tricium on expectation or generated files generally isn't as helpful)