Add UMA for CPU/power/memory usage of workers.
Issue description
We want to understand how much power and memory workers use. From an internal thread:
"Regarding memory, it's hard to measure the memory usage of a service worker in general because many things are shared among all threads in the renderer process. On the other hand, it's possible to measure some specific things like v8::Isolate's heap consumption, memory usage of Resources fetched by a service worker etc."
"Using CPU-usage as a proxy for power, our metrics[1] are still mainly focused on the renderer main thread. [...] we do have a worker scheduler[2] but it's pretty simplistic. It wouldn't be too hard to extend it to log some worker-specific CPU metrics."
[1] RendererScheduler.ForegroundRendererMainThreadLoad
[2] https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h?rcl=0&l=18
,
Feb 20 2017
,
Mar 16 2017
From internal thread: it should be pretty easy to add UMA about CPU usage for worker threads here: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h?l=19&ct=xref_jump_to_def&gsn=WorkerSchedulerImpl For reference, here's how we report the CPU usage UMA on the main thread: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/renderer/renderer_scheduler_impl.cc?l=53&gs=cpp%253Ablink%253A%253Ascheduler%253A%253A%253Canonymous-namespace%253E%253A%253AReportForegroundRendererTaskLoad(base%253A%253ATimeTicks%252C%2Bdouble)%2540chromium%252F..%252F..%252Fthird_party%252FWebKit%252FSource%252Fplatform%252Fscheduler%252Frenderer%252Frenderer_scheduler_impl.cc%257Cdef&gsn=ReportForegroundRendererTaskLoad&ct=xref_usages
,
Mar 17 2017
I tried adding ThreadLoadTracker to WorkerSchedulerImpl and ran it locally for a while to see how it would look, but I started to wonder whether it is what we want to collect. Workers and ServiceWorkers are basically created when the page needs them and killed / closed once they are no longer needed, so if we just measure load as 'task run time' / 'thread uptime', it often ends up around 100%. We probably want to use a different denominator, say renderer process uptime or something akin, and maybe we should also expect that it can go higher than 100% when multiple worker threads are running all the time? Any thoughts?
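To make the denominator question concrete, here is a minimal Python sketch. It is not the ThreadLoadTracker implementation; the function name and all numbers are hypothetical and only illustrate why a short-lived worker reports close to 100% load against its own uptime, and how a process-uptime denominator changes the picture.

```python
def thread_load(task_run_time_s: float, denominator_s: float) -> float:
    """Return load as a fraction: time spent running tasks / reference time."""
    if denominator_s <= 0:
        raise ValueError("denominator must be positive")
    return task_run_time_s / denominator_s


# Hypothetical numbers: a worker that lives for 2 s and is busy for 1.9 s,
# inside a renderer process that has been up for 60 s.
worker_busy_s = 1.9
worker_uptime_s = 2.0
process_uptime_s = 60.0

# Denominator = thread uptime: the worker looks almost fully loaded.
load_vs_thread_uptime = thread_load(worker_busy_s, worker_uptime_s)

# Denominator = renderer process uptime: the same work looks small.
load_vs_process_uptime = thread_load(worker_busy_s, process_uptime_s)

# Three workers that are busy for the whole process lifetime sum to more
# than 100% when measured against process uptime.
three_busy_workers = sum(thread_load(60.0, process_uptime_s) for _ in range(3))

print(f"{load_vs_thread_uptime:.0%} vs {load_vs_process_uptime:.1%}")
print(f"summed load of three busy workers: {three_busy_workers:.0%}")
```

With a process-uptime denominator the per-worker value stays comparable across short-lived and long-lived workers, at the cost of the summed value no longer being bounded by 100%.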
,
Mar 17 2017
I'd personally be interested in seeing (task run time on a worker thread / task run time on the main thread). Sami and Altimin might have ideas.
,
Mar 17 2017
Hmm. Would it maybe make more sense to add (task run time on thread X / task run time on all threads)? I guess what we want to understand is how much of the renderer process's time/power is going to which thread.
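As an illustration of the (task run time on thread X / task run time on all threads) idea, a small Python sketch follows. The thread names and run times are made up for the example, and the helper is hypothetical rather than anything in the scheduler code.

```python
def run_time_shares(task_run_time_s: dict[str, float]) -> dict[str, float]:
    """Map each thread to its share of the total task run time (0.0 to 1.0)."""
    total = sum(task_run_time_s.values())
    if total == 0:
        # No thread ran any task; report a zero share for every thread.
        return {name: 0.0 for name in task_run_time_s}
    return {name: t / total for name, t in task_run_time_s.items()}


# Hypothetical task run times (seconds) for threads in one renderer process.
run_times = {"main": 6.0, "dedicated_worker": 3.0, "service_worker": 1.0}
shares = run_time_shares(run_times)

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Note that the shares always sum to 1 regardless of how busy the process was, so this breakdown would need to be paired with an absolute measure to say anything about overall load.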
,
Mar 17 2017
Dividing by the task run time of a different thread feels a bit weird to me. #6 makes more sense and could be interesting to collect, but it doesn't seem to give us good info about how busy the process itself was.
,
Mar 17 2017
#4: Yes, ThreadLoadTracker was designed with the main thread in mind and is not a perfect match for workers. However, even with 100% load being the most common scenario, this data is still useful: by looking at the number of samples we can get the ratio of worker CPU usage to main thread CPU usage.
,
Mar 17 2017
Right, I think we basically want some top level breakdown of how CPU time is distributed between the main thread and the worker thread. What kind of insight/decisions are we hoping to get out of this data and what would we need to collect in order to do that?
,
Mar 17 2017
I think that we can make this a part of per-task type metrics (crbug.com/702318) with special *Worker task types.
,
Mar 22 2017
#8: OK, it's true that the number of samples can give us some important metadata too. I can polish my patch and put it up for review then.
,
Mar 24 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c71427629a142302c08179360116f7a0125b90cc
commit c71427629a142302c08179360116f7a0125b90cc
Author: kinuko <kinuko@chromium.org>
Date: Fri Mar 24 03:06:24 2017
Add UMA to record WorkerThread runtime
We have one for ServiceWorker but don't have the stat for other workers. Having it would be useful to add other stats like WorkerThread CPU Load UMA (https://codereview.chromium.org/2749383003/)
BUG=692906
Review-Url: https://codereview.chromium.org/2766263005
Cr-Commit-Position: refs/heads/master@{#459349}
[modify] https://crrev.com/c71427629a142302c08179360116f7a0125b90cc/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc
[modify] https://crrev.com/c71427629a142302c08179360116f7a0125b90cc/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h
[modify] https://crrev.com/c71427629a142302c08179360116f7a0125b90cc/tools/metrics/histograms/histograms.xml
,
Mar 24 2017
I'll wait a few days and look at the UMA stats to decide on a good ReportingInterval, then land the other CL.
,
Mar 28 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c802941a91a0b6427e8cda5775f7bf728939e7de
commit c802941a91a0b6427e8cda5775f7bf728939e7de
Author: kinuko <kinuko@chromium.org>
Date: Tue Mar 28 06:07:54 2017
Fix WorkerThread.Runtime UMA
BUG=692906
R=nhiroki@chromium.org
Review-Url: https://codereview.chromium.org/2781663003
Cr-Commit-Position: refs/heads/master@{#460025}
[modify] https://crrev.com/c802941a91a0b6427e8cda5775f7bf728939e7de/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc
,
Apr 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/d81b3970f7e6120f5b145af37d330db86b458dba
commit d81b3970f7e6120f5b145af37d330db86b458dba
Author: kinuko <kinuko@chromium.org>
Date: Wed Apr 05 07:41:37 2017
WorkerThread CPU Load UMA
I feel we might want to separate out these UMAs per worker-types and for background/foreground cases, but starting with the simplest one.
BUG=692906
Review-Url: https://codereview.chromium.org/2749383003
Cr-Commit-Position: refs/heads/master@{#462006}
[modify] https://crrev.com/d81b3970f7e6120f5b145af37d330db86b458dba/third_party/WebKit/Source/platform/BUILD.gn
[add] https://crrev.com/d81b3970f7e6120f5b145af37d330db86b458dba/third_party/WebKit/Source/platform/scheduler/base/time_converter.h
[modify] https://crrev.com/d81b3970f7e6120f5b145af37d330db86b458dba/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc
[modify] https://crrev.com/d81b3970f7e6120f5b145af37d330db86b458dba/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.h
[modify] https://crrev.com/d81b3970f7e6120f5b145af37d330db86b458dba/third_party/WebKit/Source/platform/scheduler/renderer/renderer_scheduler_impl.cc
[modify] https://crrev.com/d81b3970f7e6120f5b145af37d330db86b458dba/tools/metrics/histograms/histograms.xml
,
Apr 5 2017
,
Jun 26 2017
We now have stats from the Stable branch. Roughly 98.92% of samples log 0% load (i.e. the worker was idle in 98.92% of samples), while roughly 0.35% of samples log 100% load. The reporting interval is 1 sec (while the renderer scheduler's interval is 1 min). Comparing sample counts for renderers and workers (after multiplying the renderer samples by 60), worker loads are sampled more often than renderer loads, though the interval difference might introduce some skew, and for workers we don't filter out 30+ sec tasks. Let me tentatively re-assign this to altimin@ so that he can analyze the stats further. I think a possible next step would be to break down workloads by worker type.
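The interval normalization used in that comparison can be sketched in Python as follows. The two intervals come from the comment above; the sample counts are invented purely for illustration and are not the real UMA numbers.

```python
# Reporting intervals taken from the comment above.
WORKER_INTERVAL_S = 1     # worker load is reported once per second
RENDERER_INTERVAL_S = 60  # renderer main thread load is reported once per minute


def normalize_samples(sample_count: int, interval_s: int, target_interval_s: int) -> int:
    """Convert a sample count to the equivalent count at a shorter target interval."""
    if interval_s % target_interval_s != 0:
        raise ValueError("interval must be a multiple of the target interval")
    return sample_count * (interval_s // target_interval_s)


# Hypothetical sample counts, NOT real data.
renderer_samples = 1_000
worker_samples = 90_000

# One renderer sample covers 60 s, i.e. as much time as 60 worker samples.
renderer_equivalent = normalize_samples(
    renderer_samples, RENDERER_INTERVAL_S, WORKER_INTERVAL_S
)
worker_to_renderer_ratio = worker_samples / renderer_equivalent

print(renderer_equivalent, worker_to_renderer_ratio)
```

This only equalizes the time covered per sample; it does not correct for the other skew mentioned in the comment, such as 30+ sec tasks not being filtered out for workers.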
,
Jun 26 2017
I think measuring is useful so we can see the amount of impact. But we should throttle worker types that we can associate with a frame/tab when we throttle the frame/tab, regardless of what the metrics show us as today's impact. The worker shouldn't be a loophole where a page can use a lot of a user's battery. The exception to this is workers that are shared across frames/pages (SharedWorker and ServiceWorker are the only ones, right?). For SharedWorker we should only throttle it if all the tabs talking to it are backgrounded. For ServiceWorker, I don't think we should throttle just on principle, but we might need to if we encounter abuses in the wild. If we wait until there are significant abuses in the wild, it will be harder to curb them then, because more people will have built code that depends on workers not being throttled, particularly as we get more aggressive about throttling the main thread. They will be extra upset if they move to a worker and then that gets throttled later and their site breaks twice.
,
Jun 26 2017
,
Jul 14 2017
,
Apr 3 2018
CPU usage is now measured in the RendererScheduler.TaskDurationPerThreadType and RendererScheduler.TaskDurationPerTaskType.DedicatedWorker histograms. We also believe that it's a good approximation for power usage. Marking this as available to see if anyone wants to look into memory usage.
,
May 8 2018
Comment 1 by nhiroki@chromium.org, Feb 16 2017
Components: Blink>Scheduling