
Issue 736188

Starred by 2 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Chrome , Mac
Pri: 3
Type: Bug

Blocked on:
issue 595493


Participants' hotlists:
HSTS-Preload



Introduce an HSTS preload list auto-roller

Project Member Reported by lgar...@chromium.org, Jun 23 2017

Issue description

Now that Issue 595493 is resolved, updates to the preload list are simple JSON updates.

Here is a simple strawman for an auto-roller to handle *additions*:
1. Pull the list of domains for HSTS that are 1) completely new to the list, and 2) still preloadable.
2. Add entries above the "// END OF BULK ENTRIES" line, alphabetized, and possibly preceded by a date comment.
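Step 2 above can be sketched roughly as follows. This is only an illustration, not the roller's actual code: the entry layout produced by the hypothetical `format_entry` helper and the wording of the date comment are assumptions, and "alphabetized" is interpreted here as sorting the new batch among itself.

```python
from datetime import date

END_MARKER = "// END OF BULK ENTRIES"  # marker line named in the issue


def format_entry(domain):
    # Assumed entry layout; the real list's fields may differ.
    return ('    { "name": "%s", "include_subdomains": true, '
            '"mode": "force-https" },' % domain)


def add_bulk_entries(lines, new_domains, today=None):
    """Return a copy of `lines` with new entries inserted above END_MARKER."""
    lines = list(lines)
    if not new_domains:
        return lines
    # Locate the marker line, ignoring indentation.
    index = next(
        (i for i, line in enumerate(lines) if line.strip() == END_MARKER), None)
    if index is None:
        raise ValueError("marker line not found: " + END_MARKER)
    stamp = (today or date.today()).isoformat()
    # Date comment first, then the new batch in alphabetical order.
    block = ["    // Added " + stamp]
    block.extend(format_entry(d) for d in sorted(set(new_domains)))
    return lines[:index] + block + lines[index:]
```

Working on a list of lines rather than parsed JSON keeps the existing comments in the file untouched.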

Strawman A for #1:
- Pull the full pending list [1].
- Run ScanPending() [2] and remove the domains that fail.
- Remove domains that are already preloaded in the list (e.g. for HPKP only). Possibly put them in the CL description to highlight the need for manual attention.
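A minimal sketch of the filtering in Strawman A, assuming the pending list, the scan failures, and the current preload list have already been reduced to plain domain names. The function names `select_new_domains` and `cl_description` are hypothetical, as is the wording of the CL description.

```python
def select_new_domains(pending, failed_scan, already_listed):
    """Split pending domains into (to_add, needs_attention).

    pending: iterable of domain names from the pending list
    failed_scan: set of names that failed the preloadability scan
    already_listed: set of names already present in the preload list
    """
    to_add, needs_attention = [], []
    for name in sorted(set(pending)):
        if name in failed_scan:
            continue  # no longer preloadable: drop it
        if name in already_listed:
            # e.g. listed for HPKP only; leave for a human to handle
            needs_attention.append(name)
        else:
            to_add.append(name)
    return to_add, needs_attention


def cl_description(to_add, needs_attention):
    """Build a CL description that highlights domains needing manual work."""
    text = ["Roll HSTS preload list: add %d domains." % len(to_add)]
    if needs_attention:
        text.append("")
        text.append("Already in the list, needs manual attention:")
        text.extend("  " + name for name in needs_attention)
    return "\n".join(text)
```

Keeping the selection a pure function of three name collections makes it easy to test without network access or a Chromium checkout.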

Strawman B for #1:
- Introduce a new hstspreload.org API endpoint that filters the pending domains as desired, e.g. by daily cron job (so the auto-roller doesn't have to run the Go scanner itself, and the filtering is handled by the same infrastructure as submission).
- Give the auto-roller an API key to that endpoint (to 

I'm happy to manually LGTM weekly rolls in the near future.

Future steps:
- handle removals
- handle conversion of non-HSTS preloaded domains that are submitted for HSTS, or prevent them from being handled through the automated process


[1] https://hstspreload.org/api/v2/pending
[2] https://github.com/chromium/hstspreload/blob/master/cmd/hstspreload/scan.go
 
Components: Internals>Network>DomainSecurityPolicy

Comment 2 by mart...@martijnc.be, Aug 24 2017

I looked at this for a bit but have not found much documentation on auto-rollers, only a couple of Python auto-rollers in the tree. Are there rules or guidelines for the auto-rollers that are used by Chromium?

The existing auto-rollers appear to manage a full Chromium checkout and invoke git cl commands to create changes; this means the auto-roller needs to run on some kind of (virtual) server.

  - Does the auto-roller code need to live in the Chromium repo (and thus be Python or C++)?
  - Is it possible for a non-Googler to work on an auto-roller (they appear to need some infrastructure)?

For the HSTS list, maybe we could annotate (bulk) entries with a flag or move them under a separate top-level key in the JSON. That way we can add/remove bulk entries without having to search for the comments in normal text. But then we would lose the comments when we write the updated list. Maybe we could move the bulk entries to a separate file and tell the generator to read both files and merge the entries?
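To illustrate the separate-file idea, here is a rough sketch of what the merge step could look like. It assumes both files are comment-free JSON with a top-level "entries" array of objects that each carry a "name" key, and that a manually maintained entry should win over a bulk entry for the same domain. `parse_entries` and `merge_entries` are hypothetical names, not part of the existing generator.

```python
import json


def parse_entries(text):
    """Parse one list file (assumed comment-free) and return its entries."""
    return json.loads(text)["entries"]


def merge_entries(manual, bulk):
    """Merge two entry lists; manual entries take precedence over bulk ones."""
    seen = {entry["name"] for entry in manual}
    merged = list(manual)
    for entry in bulk:
        if entry["name"] not in seen:
            seen.add(entry["name"])
            merged.append(entry)
    return merged
```

With this split, a roller would only ever rewrite the bulk file, so the hand-written comments in the manual file would never pass through a JSON serializer.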

waffles@ knows about auto-rollers, and was interested in this.

- If I recall correctly, most auto-rollers are written in Python. Not sure if C++ is okay.
- Hopefully waffles@ can answer if a non-Googler can easily work on it.

I tried annotating entries at [1] but abandoned that because I can work around it for now [2]. I also started writing an auto-roller that can take comments into account, but I should probably not land that without a proper grammar that allows reserializing comments inline with JSON.

Moving the bulk entries to a separate file is risky because an unknown number of bots consume the current file directly. (This is also why I haven't updated the format [3] yet, nor moved the file from tsss.json to tsss.json5.)

[1] https://chromium-review.googlesource.com/c/chromium/src/+/588344
[2] https://github.com/chromium/hstspreload.org/blob/92194ed1e15495131c0d8b2b736c7509a553a4c4/scripts/update_bulk_preloaded.py#L17
[3] https://github.com/chromium/hstspreload.org/issues/79
Cc: dpranke@chromium.org
https://groups.google.com/a/chromium.org/d/msg/chromium-dev/HMd0Ah9Qt8o/fs-6wNb9AwAJ essentially sums up everything I know, sorry. Dirk will be better able to help than I can here.

I believe auto-rollers are typically written in Python, and yes, Dirk can say more about whether an infra or infra-internals checkout is needed.
FWIW, I recently wrote myself a primitive auto-roller at [1].

It should probably not run without adult supervision in its current state, but I'm hoping it will be reliable enough for manual rolls. I plan to run it locally to create a CL once a week, for now.
(Also note that it relies on the result of a recent scan [2].)

[1] https://github.com/chromium/hstspreload.org/blob/6f85aeeb3fd009f0c95074f4d78e6af950711569/scripts/roll_preload_list.py
[2] https://github.com/chromium/hstspreload/blob/d1118fbf01d23af010108c7ec333e5681c86e18d/cmd/hstspreload/scan.go#L42
dpranke@: ping :-D
Cc: phajdan.jr@chromium.org borenet@chromium.org
Sorry for the delay in responding.

The auto-rollers can be public code, and should be written in Python. We in fact have code that we can reuse for this, and we've been cleaning up the Skia auto-roller to make it more general.

borenet@, can you talk to lgarron@ about your recent work and whether or not it'd be helpful?

Comment 8 by bore...@google.com, Oct 4 2017

I've only skimmed the conversation above, so please excuse my unfamiliarity with your goals. If I understand correctly, you want to periodically update a file checked into the Chromium repo based on an external source which is not itself a Git repo.

The Skia autoroller is a server written in Go and run on Skia's infrastructure. The general assumption is that you want to update the revision of a dependency of a parent repo (e.g. chromium/src.git) on a child repo (e.g. skia.git). The typical case is that this dependency is pinned in the DEPS file checked in to the parent repo, although we can also roll into Fuchsia and Android. There are currently 13 instances of this server, rolling various projects into various others (most roll into Chromium, or roll Skia into other downstream projects). They are all managed by the Skia infrastructure team.

Given the above, I don't think Skia's autoroller is a good fit for what you want to do, since the assumption is that we're updating the hash for a child repo.  We do have a handful of bots which run periodically and commit into Skia, so that's not unheard of.
Oh, for some reason I thought this was a file in a repo that was being updated.

I agree, if the file isn't in a git repo, the skia auto_roller doesn't make any sense.

If the "auto_roller" is basically a Python script that needs to run periodically, check whether the list has changed, and then update the file in the checkout, the easiest approach is to put that script in the Chromium repo. We can then add a builder that runs the script periodically.
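The "check whether the list has changed, then update the file" step could be as small as the following sketch. The function name `update_if_changed` is hypothetical, and how the new contents are produced (and how a CL is uploaded afterwards) is left out.

```python
def update_if_changed(path, new_contents):
    """Write new_contents to path only if it differs from what is on disk.

    Returns True when the file was written, False when it was already
    up to date, so a periodic builder can skip creating an empty change.
    """
    try:
        with open(path, encoding="utf-8") as f:
            if f.read() == new_contents:
                return False
    except FileNotFoundError:
        pass  # no file yet: fall through and create it
    with open(path, "w", encoding="utf-8") as f:
        f.write(new_contents)
    return True
```

A builder running this on a schedule would upload a change only when the function returns True.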

Is the file we're talking about //net/http/transport_security_state_static.json ? As agable@ said on the thread, the file is kinda large, so we could store it in a Google Cloud Storage bucket and then download it via a hook.

On the other hand, it's well-formatted JSON, i.e., text, so it probably compresses relatively well and stores reasonably well in Git, so we can probably also just keep it where it is.

However, it would be nice to have a format where it didn't need to all be in one file.
> However, it would be nice to have a format where it didn't need to all be in one file.

There are a variety of important tools that pull the list from its canonical (current) location in the Chrome source, in its current format. I think we should separate the question of where to store the file from an auto-roller.
> I think we should separate the question of where to store the file
> from an auto-roller.

That's fine.
Cc: -mart...@martijnc.be
Owner: marti...@chromium.org
Status: Assigned (was: Available)
Status: Started (was: Assigned)
