[Factory] .device_id will be deprecated
Issue description
UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Steps to reproduce the problem:
Hi Wei-Ning, event_log is going to be deprecated in favor of testlog, so the .device_id file used by Overlord is going to be deprecated as well. The new device ID lives at /var/factory/log/device_id, and we need your help to make sure we don't break backward compatibility. In addition, I would like your opinion on whether overloading device_id as the station name is still a good approach going forward.
What is the expected behavior?
What went wrong?
We want to replace .device_id with device_id, but doing so will break Overlord.
Did this work before? N/A
Chrome version: 51.0.2704.103 Channel: n/a
OS Version: OS X 10.11.5
Flash Version: Shockwave Flash 22.0 r0
,
Jul 19 2016
Hi Peter, when looking at the new path again, with the fact that "other programs also use this file", I wonder if we should review the hierarchy and location of these files?
I mean, "/var/factory/log" sounds like some folder that you can safely delete and restart without changing anything. Maybe we should find a new home for device_id so people won't accidentally delete it?
/var/factory/
  log/    # anything managed by testlog
  init/   # maybe a new place to hold startup options and identities, etc.?
    device_id
    run_goofy_presenter
  run/    # holds session data that may change?
    active_test_list
,
Jul 19 2016
,
Jul 19 2016
,
Jul 19 2016
I'm fine with that; we can also reuse the .device_id path if that makes more sense. From a testlog perspective, it should be decoupled from Goofy, and how Goofy stores device_id is handled separately in the testlog_goofy module.
,
Jul 20 2016
I just realized there's one good thing about naming it .device_id: we can do restart_factory -a with "rm -rf /var/factory/*", which will NOT include files starting with "." if they are in the top-level folder. However, I'm not sure that's a strong enough reason. There are also other possibilities, for example having a .something folder, or carefully modifying restart_factory so it won't clear the place where device_id lives. I've created a doc for discussing the file hierarchy.
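The dotfile behavior mentioned above can be checked with a small sketch. It uses Python's glob module, which by default follows the same convention as an unmodified shell: `*` does not match names starting with ".". The temporary directory and the file names in it are stand-ins for /var/factory, not the real layout, and a shell with an option such as bash's dotglob enabled would behave differently.

```python
import glob
import os
import tempfile


def entries_matched_by_star(directory):
    """Names that a pattern like `directory/*` matches (dotfiles excluded)."""
    pattern = os.path.join(directory, "*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


def entries_left_behind(directory):
    """Names the `*` pattern does not match, so a wipe would keep them."""
    matched = set(entries_matched_by_star(directory))
    return sorted(name for name in os.listdir(directory) if name not in matched)


def make_demo_tree(root):
    """Create a hypothetical miniature of the top-level factory folder."""
    for name in (".device_id", "active_test_list"):
        with open(os.path.join(root, name), "w") as f:
            f.write("demo\n")
    os.mkdir(os.path.join(root, "log"))


with tempfile.TemporaryDirectory() as factory_root:
    make_demo_tree(factory_root)
    print(entries_matched_by_star(factory_root))  # ['active_test_list', 'log']
    print(entries_left_behind(factory_root))      # ['.device_id']
```

Only a top-level dotfile survives this way; a device_id placed inside a matched folder such as log/ would be removed along with the folder.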
,
Jul 20 2016
To provide some more background on this issue...

device_id: A (preferably) deterministic, unique identifier of the device in question. If the device is ever reimaged, the device_id should be the same without having to manually set it. There should never be two devices with the same device_id.

station_name: For station-based testing, identifies the name of the station in question. Multiple devices may have the same station_name (e.g. a backup machine could be used to replace a production machine if it fails), but their device_id should still be different.

Currently, when there are multiple stations performing the same test for pipelining, there are two cases.
(1) The configuration between stations is identical. In this case, station_name should be the same.
(2) The configuration between stations is different (e.g. for conductive test, there is usually a "left" configuration and a "right" configuration). In this case, station_name should be different.

As for Overlord, since it has only been used for builds configured with station mode in the past, I suggest that it read from the station_name file instead.
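As a rough illustration of the suggestion above, here is a minimal sketch of how a client could pick its machine ID: prefer a station_name file and fall back to device_id, then to the legacy .device_id, so older installs keep working. The function names, the flat file layout, and the fallback order are all assumptions made for this example; this is not Overlord's actual code.

```python
import os
import tempfile


def read_first_line(path):
    """Return the stripped first line of a file, or None if missing or empty."""
    try:
        with open(path) as f:
            return f.readline().strip() or None
    except OSError:
        return None


def resolve_machine_id(root, candidates=("station_name", "device_id", ".device_id")):
    """Return the first non-empty value among the candidate files under root.

    The candidate names and their order are hypothetical.
    """
    for name in candidates:
        value = read_first_line(os.path.join(root, name))
        if value:
            return value
    return None


with tempfile.TemporaryDirectory() as root:
    # Legacy install: only the old dotfile exists.
    with open(os.path.join(root, ".device_id"), "w") as f:
        f.write("VSWRStation\n")
    print(resolve_machine_id(root))

    # Once a station_name file is present, it takes precedence.
    with open(os.path.join(root, "station_name"), "w") as f:
        f.write("VSWRStation-left\n")
    print(resolve_machine_id(root))
```

Keeping the legacy path last in the list is one way to address the backward-compatibility concern without making it the primary source.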
,
Jul 20 2016
BTW, should we keep reimageid? It was used to prevent duplicate log paths, but if we can find a reliable way to generate the device ID, and if instalog can handle logs correctly, I wonder if we still need the reimage ID. Maybe device_id + device_name (station_name) would be enough.
,
Jul 20 2016
I think reimageid is more useful in DUT-based testing, but I'm not totally sure of the use case?
,
Jul 20 2016
I did a survey of how to compute a deterministic, unique ID and forwarded it to you.
,
Jul 20 2016
+cychiang who introduced reimageid. Jimmy, can you share with us why you introduced reimage ID? If my assumption is correct, that's only made to solve log uploading path (for example rsync to logs) right?
,
Jul 20 2016
In fact, reimage_id was originally named image_id, and it was introduced at the very beginning of event log.
https://chromium-review.googlesource.com/#/c/23195/4/client/cros/factory/event_log.py

In the event log we use device_id to identify the device, and reimage_id to identify the image version. At that time, there was no factory updater or factory toolkit, so the image version actually identified the ChromeOS image version + the factory bundle on it. Since the image id is stored in the state directory, if the user clears all the state, the device gets a new image id. So this image id identifies "one life of the full test flow."

The image id changes on:
1. reimage
2. clear factory state

For example, in the factory, device 310001 gets ChromeOS image version 1000 + factory bundle md5sum abcd. After imaging and installation, it gets image id a1b1. When something goes wrong with this device, it goes to failure analysis. It might or might not get reimaged after failure analysis. If it is not reimaged, the factory operator should clear the state and run through the test flow again. Then, on the same device 310001, with the same image version 1000 + factory bundle md5sum abcd, we get a new image id c2d2.

In the event log, we can separate the events of
device id 310001 + image id a1b1
device id 310001 + image id c2d2
Image id a1b1 is the first life of the factory process of that device. Image id c2d2 is the second life of the factory process of that device.

We can then compute many different metrics from event logs based on one life of the factory process of one device. Otherwise, there will be many duplicate events if that device is re-tested / re-imaged many times.

Its current usage for event log is in
https://cs.corp.google.com/chromeos_public/src/platform/factory/py/test/event_log.py?type=cs&q=GetReimageId+package:%5Echromeos_public$&l=499

Goofy's system_log_manager also uses device id and image id to sync system logs to the server, as shown in
https://cs.corp.google.com/chromeos_public/src/platform/factory/py/goofy/system_log_manager.py?type=cs&q=event_log.GetReimageId+package:%5Echromeos_public$&l=191

Hope this helps!
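The "one life" idea described above can be sketched roughly as follows. This is not the code in event_log.py; the file name, the use of a random UUID, and the event dictionary shape are assumptions for illustration. The sketch shows an image id that persists as long as the state directory does, and events grouped by the (device id, image id) pair.

```python
import os
import tempfile
import uuid
from collections import defaultdict


def get_or_create_image_id(state_dir, filename="image_id"):
    """Return the stored image id, generating one if the state was cleared.

    The file name and the UUID format are hypothetical.
    """
    path = os.path.join(state_dir, filename)
    try:
        with open(path) as f:
            existing = f.read().strip()
        if existing:
            return existing
    except FileNotFoundError:
        pass
    os.makedirs(state_dir, exist_ok=True)
    new_id = uuid.uuid4().hex
    with open(path, "w") as f:
        f.write(new_id)
    return new_id


def group_by_life(events):
    """Group event names by (device_id, image_id), i.e. one factory life."""
    lives = defaultdict(list)
    for event in events:
        lives[(event["device_id"], event["image_id"])].append(event["name"])
    return dict(lives)


with tempfile.TemporaryDirectory() as state_dir:
    first_life = get_or_create_image_id(state_dir)
    assert get_or_create_image_id(state_dir) == first_life  # stable across reads
    os.remove(os.path.join(state_dir, "image_id"))          # "clear factory state"
    second_life = get_or_create_image_id(state_dir)
    print(first_life != second_life)

events = [
    {"device_id": "310001", "image_id": "a1b1", "name": "start_test"},
    {"device_id": "310001", "image_id": "a1b1", "name": "fail_test"},
    {"device_id": "310001", "image_id": "c2d2", "name": "start_test"},
]
print(group_by_life(events))
```

With the sample events, the two start_test entries for device 310001 land in different groups, which is what keeps a retested device from producing duplicate events in per-life metrics.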
,
Jul 20 2016
For now we set device_id to a meaningful name (e.g. VSWRStation) in station mode. If .device_id is going to be deprecated, then we probably don't want to use 'Device ID' as the Overlord machine ID, since it will be a long hash and thus not meaningful. My new proposal is to use the active test_list name as the Overlord machine ID. That way we don't have to worry about the device ID, as long as the test list is selected correctly on the toolkit.
,
Jul 21 2016
Another thought to consider:
Testlog has a value called Station Name
Instalog has a value called Node ID
Overlord has a value called Machine ID
Maybe it's not important for these three to be aligned in nomenclature (station name, node ID, machine ID), but I think it would simplify a lot if they all pulled from the same value.
,
Jul 21 2016
Is there any RPC method (say, JSON RPC) in the testlog API that we can use to get the Station Name or Node ID? If so, Overlord can be modified to read it as the Machine ID.
,
Jul 21 2016
Testlog doesn't have one; Instalog might have a plug-in that serves that purpose.
,
Jul 22 2016
Instalog also does not have this functionality.
,
Feb 8 2017
Re-assign to chuntsen
Comment 1 by itspeter@chromium.org
, Jul 19 2016
Components: Factory
Labels: -Via-Wizard -OS-Mac OS-Chrome
Owner: wnhuang@chromium.org
Status: Assigned (was: Unconfirmed)
Summary: [Factory] .device_id is going to deprecating. (was: .device_id is going to deprecating.)