
Issue 629397

Starred by 3 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Feature




[Factory] .device_id will be deprecated

Project Member Reported by itspeter@google.com, Jul 19 2016

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36

Steps to reproduce the problem:
Hi Wei-Ning,

event_log is planned to be deprecated in favor of testlog, so the .device_id file used by Overlord will be deprecated as well.
The new device ID lives at /var/factory/log/device_id. We would need your help to make sure we don't break backward compatibility.
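
One way to keep backward compatibility during the transition is to try the new location first and fall back to the legacy file. The following is only a minimal sketch, not the actual Overlord or Goofy code: the helper name `read_device_id` is hypothetical, and the legacy path `/var/factory/.device_id` is an assumption (this issue only spells out the new path).

```python
# Assumed locations. The new path is quoted in this issue; the legacy path
# is a guess and may differ on a real device.
NEW_DEVICE_ID_PATH = '/var/factory/log/device_id'
LEGACY_DEVICE_ID_PATH = '/var/factory/.device_id'


def read_device_id(paths=(NEW_DEVICE_ID_PATH, LEGACY_DEVICE_ID_PATH)):
    """Returns the first non-empty device ID found in paths, or None."""
    for path in paths:
        try:
            with open(path) as f:
                device_id = f.read().strip()
        except OSError:
            continue  # Missing or unreadable file: try the next candidate.
        if device_id:
            return device_id
    return None
```

A caller would invoke `read_device_id()` and treat `None` as "no device ID available yet".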

In addition, I'd like your opinion on whether overloading the device_id as the station name is still a good approach going forward.

What is the expected behavior?

What went wrong?
We want to replace .device_id with device_id, but doing so will break Overlord.

Did this work before? N/A 

Chrome version: 51.0.2704.103  Channel: n/a
OS Version: OS X 10.11.5
Flash Version: Shockwave Flash 22.0 r0
 
Cc: hungte@chromium.org kitching@chromium.org itspeter@chromium.org
Components: Factory
Labels: -Via-Wizard -OS-Mac OS-Chrome
Owner: wnhuang@chromium.org
Status: Assigned (was: Unconfirmed)
Summary: [Factory] .device_id is going to deprecating. (was: .device_id is going to deprecating.)

Comment 2 by hungte@chromium.org, Jul 19 2016

Hi Peter, when looking at the new path again, with the fact that "other programs also use this file", I wonder if we should review the hierarchy and location of these files?

I mean, "/var/factory/log" sounds like some folder that you can safely delete and restart without changing anything. Maybe we should find a new home for device_id so people won't accidentally delete it?

 /var/factory/
   log/ # anything managed by testlog
   init/ # maybe a new place to hold startup options and identities, or etc?
     device_id
     run_goofy_presenter
   run/ # holds session data that may change?
     active_test_list

Comment 3 by hungte@chromium.org, Jul 19 2016

Summary: [Factory] .device_id is going to be deprecated. (was: [Factory] .device_id is going to deprecating.)

Comment 4 by sheriffbot@chromium.org, Jul 19 2016

Labels: Hotlist-Google
I'm fine with that. We can also reuse the .device_id path if that makes more sense.
From a testlog perspective, it should be decoupled from Goofy; how Goofy stores device_id is handled separately in the testlog_goofy module.

Comment 6 by hungte@chromium.org, Jul 20 2016

Just realized there's one good thing about naming it .device_id: we can implement restart_factory -a as "rm -rf /var/factory/*", which will NOT include files starting with "." as long as they are in the top-level folder.

However, I'm not sure that's a strong enough reason. There are other possibilities too, for example having a .something folder, or carefully modifying restart_factory so it won't clear the place where device_id lives.
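
The wildcard behavior mentioned above can be checked with a small sketch. It uses Python's `glob` module, which, like a shell's default `*` expansion, does not match names beginning with a dot. The directory layout and the helper name `top_level_wildcard_matches` are hypothetical and only for illustration.

```python
import glob
import os
import tempfile


def top_level_wildcard_matches(directory):
    """Returns the sorted base names matched by `directory/*`."""
    return sorted(os.path.basename(p)
                  for p in glob.glob(os.path.join(directory, '*')))


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one hidden file and one ordinary directory.
    open(os.path.join(root, '.device_id'), 'w').close()
    os.mkdir(os.path.join(root, 'log'))
    print(top_level_wildcard_matches(root))  # ['log']
```

Note that a shell configured to include dotfiles in globs (for example bash with `dotglob` set) would behave differently, so relying on this is somewhat fragile.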

I've created a doc for discussion of file hierarchy.
Summary: [Factory] .device_id will be deprecated (was: [Factory] .device_id is going to be deprecated.)
To provide some more background on this issue...

device_id: A (preferably) deterministic, unique identifier of the device in question.  If the device is ever reimaged, the device_id should be the same without having to manually set it.  There should never be two devices with the same device_id.

station_name: For station-based testing, identifies the name of the station in question.  Multiple devices may have the same station_name (e.g. a backup machine could be used to replace a production machine if it fails), but their device_id should still be different.
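
To make the two definitions concrete, here is a rough sketch of the invariant they imply: station_name values may repeat, device_id values may not. The record layout, the sample IDs, and the function name `find_duplicate_device_ids` are all hypothetical.

```python
from collections import Counter


def find_duplicate_device_ids(devices):
    """Returns the sorted device_id values that appear more than once."""
    counts = Counter(d['device_id'] for d in devices)
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical inventory: a production machine and its backup share a
# station_name but must keep distinct device_id values.
devices = [
    {'device_id': 'aaaa', 'station_name': 'VSWRStation'},
    {'device_id': 'bbbb', 'station_name': 'VSWRStation'},
]
print(find_duplicate_device_ids(devices))  # []
```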

Currently when there are multiple stations performing the same test for pipelining, there are two cases.

  (1) The configuration between stations is identical.  In this case, station_name should be the same.

  (2) The configuration between stations is different (e.g. for conductive test, there is usually a "left" configuration, and a "right" configuration).  In this case, station_name should be different.

As for Overlord, since it has only been used for builds configured with station mode in the past, I suggest that it reads from the file station_name instead.

Comment 8 by hungte@chromium.org, Jul 20 2016

BTW, should we keep reimageid ?
That was used to prevent duplication of log path; but if we can find a reliable way to generate device ID, and if instalog can handle logs correctly, I wonder if we still need reimage id.

Maybe device_id + device_name (station_name) would be enough.
I think reimageid is more useful in DUT-based testing, but I'm not totally sure of the use case?
I did a survey of ways to compute a deterministic, unique ID and forwarded it to you.
Cc: cychiang@chromium.org
+cychiang who introduced reimageid.

Jimmy, can you share with us why you introduced reimage ID?

If my assumption is correct, it was only introduced to solve the log upload path problem (for example, rsync to logs), right?
In fact, reimage_id was named image_id and it was introduced in the very beginning of event log.

https://chromium-review.googlesource.com/#/c/23195/4/client/cros/factory/event_log.py

In the event log we use device_id to identify device, and reimage_id to identify image version.

At that time, there was no factory updater nor factory toolkit, so that image version actually identifies ChromeOS image version + factory bundle on it.
Since that image id is stored in state directory, if user clears all the states, it will get a new image id.
So this image id identifies the "one life of full test flow."

Image id changes when
1. reimage
2. clear factory state.

For example, in the factory, device 310001 gets a ChromeOS image version 1000 + factory bundle md5sum abcd.
After imaging and installation, it gets image id a1b1.

When there is something wrong with this device, it goes to failure analysis.
It might or might not get reimaged after failure analysis.
If it is not reimaged, factory operator should clear the state and run through the test flow again.
Then, on the same device 310001, same image version 1000 + factory bundle md5sum abcd, we get a new image id c2d2.

In event log, we can separate the events of 
device id 310001 + image id a1b1.
device id 310001 + image id c2d2.

image id a1b1 is the first life of factory process of that device.
image id c2d2 is the second life of factory process of that device.

We can then compute many different metrics from event logs based on one life of factory process of one device.
Otherwise, there will be multiple duplicate events if that device is re-tested / re-imaged many times. 
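
The separation described above amounts to grouping events by the pair (device id, image id). The sketch below illustrates that with the IDs from the example; the event dictionaries and field names are hypothetical and do not reflect the real event_log schema.

```python
from collections import defaultdict


def group_by_life(events):
    """Groups events by (device_id, image_id), preserving input order."""
    lives = defaultdict(list)
    for event in events:
        lives[(event['device_id'], event['image_id'])].append(event)
    return dict(lives)


# Hypothetical events; field names are illustrative only.
events = [
    {'device_id': '310001', 'image_id': 'a1b1', 'event': 'start_test'},
    {'device_id': '310001', 'image_id': 'a1b1', 'event': 'fail_test'},
    {'device_id': '310001', 'image_id': 'c2d2', 'event': 'start_test'},
]
lives = group_by_life(events)
for (device_id, image_id), life_events in lives.items():
    print(device_id, image_id, len(life_events))
```

Metrics can then be computed per key, so a device that was re-tested shows up as two separate lives instead of one run with duplicate events.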

Its current usage for event log is in
https://cs.corp.google.com/chromeos_public/src/platform/factory/py/test/event_log.py?type=cs&q=GetReimageId+package:%5Echromeos_public$&l=499

Goofy's system_log_manager also uses device id and image id to sync system logs to the server, as shown in
https://cs.corp.google.com/chromeos_public/src/platform/factory/py/goofy/system_log_manager.py?type=cs&q=event_log.GetReimageId+package:%5Echromeos_public$&l=191

Hope this helps!

 

For now we set device_id to a meaningful name (e.g. VSWRStation) in station mode. If .device_id is going to be deprecated, then we probably don't want to use 'Device ID' as the Overlord machine ID, since it will be a long hash and thus not meaningful.

My new proposal is to use the active test_list name as the Overlord machine ID. That way we don't have to worry about the device ID, as long as the test list is selected correctly on the toolkit.
Another thought to consider --

Testlog has a value called Station Name
Instalog has a value called Node ID
Overlord has a value called Machine ID

Maybe it's not important for these three to be aligned along nomenclature (station name, node ID, machine ID), but I think it would simplify a lot if they all pulled from the same value.
Is there any RPC method (say, JSON-RPC) in the testlog API that we can use to get the Station Name or Node ID? If so, Overlord can be modified to read it as the Machine ID.
Testlog doesn't have one; Instalog might have a plug-in that serves that purpose.
Instalog also does not have this functionality.
Owner: chuntsen@chromium.org
Re-assign to chuntsen
