New issue
Advanced search Search tips

Issue 869442 link

Starred by 1 user

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

trousers: fix context handles

Project Member Reported by apronin@chromium.org, Jul 31

Issue description

[Triggered by http://b/111417778 fixes]

If I understand the trousers design right:

There are at least two types of context handles: external (see src/tcs/tcs_context.c) and internal (e.g. see at the top of src/tcs/tcs_utils.c).

The internal context handle used in tcs_utils.c is 0x30000000. 0x30 in the upper byte can be assumed to be the mark of internal handles.

For the external context handles, the upper byte should either just be 0xA0 or in the range 0xA0..0xFF. The current code just starts with 0xA0xxyy00 and increments the counter and changes yy based on the current time. If we just wait for the overflow, we may end up with 0x0000yy00, with non-zero yy, and continue incrementing.

The 2nd byte (xx), being dependent on the time of when it was first initialized, is seemingly tasked with preventing using stale handles if tcsd restarts from underneath its users. 

So, we have the lower 16 bits (+possibly 0xA0->0xFF in the upper portion) for the actual handle in this tcsd session. If a new context is requested every second, and we just go through them in order, we'll go through all 2^16 contexts in under 2 days.

Having the 3rd byte (yy) dependent on time (srand(time(NULL)), then rand()) was supposed to prevent repeating the same handle in subsequent requests, but as we know from b/111417778, collisions naturally can and do still happen if some handles are open once and kept open, and others are open and closed frequently.

However, we know that at any given time we only have a limited number of handles open: each user of tcsd - daemon or utility - uses a few, typically 1-2; and there's a limited number of users; so we shouldn't often see more than 10 contexts open at the same time (note: theorizing, not measured). 

So, it is suggested to have a different approach to generating external handles.
1) Set the upper 4 bits to 0xA.
2) Set the next 16 bits to xx - a random number generated from current time when tcsd starts - as it is now, but shifted further 4 bits left.
3) Use the lower 22 bits as the counter - start with 0 and increment for each request, allow to overflow.
4) Keep the code from CL https://crrev.com/c/1153656 that checks for duplicates and re-request the next one if such handle is already in use. At worst, it will go through ~10-20 already existing consecutive long-living handles (less in practice, probably just 1-2) to come up with a unique one - which should still be not worse than doing time(), srand() and rand() for each requested context.

If we generate a new context every second, going through 2^22 lower bits would take 48+ days. So, in practice (where we don't need it every second, we probably won't overflow until reboot anyways).
 
Err, correction, of course:
...
2) Set the next 8 bits to xx ...
3) Use the lower 20 bits as the counter ...

So, it wraps not in 48+ days, but in 12+ days in a 'context every second' scenario. Or 2 years in a more realistic 'context every minute' scenario.
Components: OS>Systems>Security
1) The current code also makes sure that the handle is randomized (3rd byte ORed with random value), which complicates attacks relying on predicting the next handle.
2) The bug descr is incorrect. The underlying counter will start with 0xA0xx0000 and then will keep incrementing until it wraps at 0x00000000 (after at least 95*2^24 increments). Then we'll enter the endless loop at 
	if (nextContextHandle == 0)
		return getNextHandle();

So, I'll drop my original CL https://crrev.com/c/1157030, which removes randomization, and will just focus on preventing the endless loop.
If we really care about the randomization, then I guess we still want to change the implementation of handles generation. The Trousers' one is pretty weak in this sense.
First, as you say, randomization only touches one byte, which by itself means that a few hundred of attempts would be enough to guess the right value.
Second (and, IMO, even worse), the current implementation calls srand(time(NULL)) before *every* handle generation, which means that two handles generated within one second will receive the same "random" value.

Sign in to add a comment