New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 685733 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Xoogler
Closed: Feb 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Feature



Sign in to add a comment

Investigate performance of trace events via virtual dispatch

Project Member Reported by primiano@chromium.org, Jan 26 2017

Issue description

In the context of discussing bit.ly/tracing-unbundling with base owners, turned out that Option A (playing include-path tricks) might cause some headaches.
So it's time to evaluate whether Option B (using a clean interface injection) would introduce any measurable cost.

Quick recap. TRACE_EVENTx macros today boil down to this:
- Each macro creates a unique static local variable (per event) that points to the category enabled-state word (the non-moveable uint8_t that tells whether a given category is enabled or not).
- The first time the macro is ever hit, we need to get the pointer to the category enabled-state word and cache it into the static local var above.
- if(UNLIKELY(*category_enabled_work))) to check if the macro should record an event or be a no-op
- if the check above is true, we invoke the code in TraceLog that actually records the event.

So, the pseudo-code for a TRACe_EVENTx macro instance, after folding all the macro layers looks like this:

1. static uint8_t* category_enabled_ptr = nullptr;
2. if (!category_enabled_ptr__line_1234)
3.  category_enabled_ptr__line_1234 = GetPointerForCategory("category-name")
4. if (*category_enabled_ptr__line_1234) {
5.  TraceLog::GetInstance()->AddEvent("event-name", ...other arguments passed to the macro)
6. }


The things that would change if we go for Option B are lines 3 and 5 of the pseudo code above. Instead of being direct calls they would become vptr calls, as follows:
1. static uint8_t* category_enabled_ptr = nullptr;
2. if (!category_enabled_ptr)
3.  category_enabled_ptr = GetTraceInterface()->GetPointerForCategory("category-name")
4. if (*category_enabled_ptr) {
5.  GetTraceInterface()->AddEvent("event-name", ...other arguments passed to the macro)
6. }

Now, some considerations:
- The fastpath should not be affected at all. With fastpath I mean the case when tracing is disabled and the code is hit for the 2nd (or more) time.
  In this case, the only thing in the fastpath is the UNLIKELY(), non-barriered, if of line 2. This is not different from today
- When tracing is still disabled and we hit the event when the first time ever can be affected, as we introduce a vptr vs direct call. However, unless we do something utterly stupid, this shouldn't make any practical difference, as this happens only once per call site.
- The case of hitting the event when tracing is enabled is the most interesting and this bug is about it.
  Is the vptr call going to be slower than a straight call?

This is the thing worth investigating here.
We have a benchmark (tracing_perftests) and should be quite easy to use that to create a testing TRACE_EVENT_WITH_VIRTUAL_DISPATCH macro and see if that has any measurable impact.
 
Cc: picksi@chromium.org
Performance test and discussion: https://codereview.chromium.org/2656303003
Probably test should be improved to get more precise result.
Status: Fixed (was: Assigned)
Results (see https://crrev.com/2656303003 for details):

Desktop:
  Median "today": 1.60 us per event
  Median with virtual calls: 1.66 us per event
  Tradeoff: 3.75%

Android (Nexus 5):
  Median "today": 50.2 us per event
  Median with virtual calls: 51.6 us per event
  Tradeoff: 2.79%

Sign in to add a comment