
Issue 809882

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Feature




Install: Update partition sequentially

Project Member Reported by gwendal@chromium.org, Feb 7 2018

Issue description

Looking at performance numbers on a particular eMMC, I noticed that there is a 3 ms latency penalty when doing 512 kB reads on data that has been written randomly instead of sequentially. The read latency is doubled.

https://docs.google.com/spreadsheets/d/1BxxT5iODmcYev_BKKw8EyQ2tyoec4ylpuStwFVrtpuM/edit?usp=sharing

As a consequence, since updates are delta updates, after a few updates the sequentially written data will be scattered across many different pages and will read more slowly over time.

After an update that contains a lot of deltas, or after a given number of updates, we should completely rewrite the updated partition sequentially.

To help the SSD find free erase blocks [esp. when the device is full], we can also do a block discard first, but this is not required.
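For reference, here is a rough sketch of what a metadata-only discard of the whole target partition could look like if issued directly through the Linux BLKDISCARD ioctl instead of shelling out to blkdiscard. This is only an illustration; the helper name and device path are placeholders, not the updater's actual code:

import fcntl
import os
import struct

BLKDISCARD = 0x1277        # _IO(0x12, 119) from <linux/fs.h>
BLKGETSIZE64 = 0x80081272  # _IOR(0x12, 114, size_t) from <linux/fs.h>

def discard_whole_device(dev_path):
    # Metadata-only discard of the full device/partition, equivalent to
    # running `blkdiscard <dev>` without -s. Data is not overwritten, so
    # this is fast; it just tells the device these blocks are free.
    fd = os.open(dev_path, os.O_WRONLY)
    try:
        size_bytes = struct.unpack('Q', fcntl.ioctl(fd, BLKGETSIZE64, b'\0' * 8))[0]
        fcntl.ioctl(fd, BLKDISCARD, struct.pack('QQ', 0, size_bytes))
    finally:
        os.close(fd)

# Hypothetical usage, e.g. on the inactive root partition:
# discard_whole_device('/dev/mmcblk0p5')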

 
Thanks for filing the bug.

The block discard before the update can definitely happen.

Delta updates try to write sequentially, but that doesn't always happen, and at this point we cannot make it happen unless we are willing to give up some update size.

Full updates write sequentially, but since full and delta updates happen interchangeably on the client, and since we currently don't do a block discard, I guess this really won't help, right?

Comment 2 by gwendal@google.com, Feb 7 2018

blkdiscard helps the SSD find free erase blocks. If we don't do it, we will still see the benefit of sequential writes; the SSD will have to work harder to find erase blocks, but it will get there.
What is important is to write sequentially from the first LBA to the last LBA of the files/filesystem. Leaving gaps in a stream of writes, even with increasing LBAs, is equivalent to random writes.
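To make the "no gaps" point concrete, a toy sketch (class and method names are made up for illustration, not taken from update_engine) of a writer that only accepts the next block in order, so the device sees one strictly sequential stream:

import os

class SequentialWriter(object):
    def __init__(self, fd, block_size=4096):
        self.fd = fd
        self.block_size = block_size
        self.next_block = 0  # the next block index the device expects

    def write_block(self, block_index, data):
        if block_index != self.next_block:
            # Skipping ahead leaves a gap; even though the LBAs still
            # increase, the eMMC ends up treating the stream like random
            # writes, so the caller must reorder/buffer instead.
            raise ValueError('non-sequential write: got block %d, expected %d'
                             % (block_index, self.next_block))
        os.pwrite(self.fd, data, block_index * self.block_size)
        self.next_block += 1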

Comment 3 by de...@google.com, Feb 14 2018

The number of jumps when writing the target partition is not that high; the only time we do that is when there's fragmentation on the "file" we want to write.
The cros payload tool can show some stats about this for a given payload.

Doing a discard on the whole partition can take quite a bit of time. Is there any real benefit in doing a full discard if you are going to write the partition sequentially anyway?

Comment 4 by gwendal@google.com, Feb 14 2018

blkdiscard (without the -s option) is fast, as only the metadata is affected.
Anyhow, as I meant to say in #2 but forgot: it is only beneficial when the disk is full or almost full.
Cc: grundler@chromium.org
Sorry for the late reply, I had some other P1 work to do.

+grundler@ as he had some ideas on this a while back.

> The number of jumps when writing the target partition is not that high, the only time we do that is when there's fragmentation on the "file" we want to write. 

Actually, I think we need to know what exactly 'high' means in this situation. Here are some numbers from a delta payload between two major releases:

cros payload show payload.delta
Payload version:         1
Manifest length:         837388
Number of operations:    13082
Number of kernel ops:    13
Block size:              4096
Minor version:           5


Number of operations    Fragmented extents per operation
11300	1
1545	2
89	3
63	4
42	5
14	6
15	7
6	8
5	9
3	10
3	11
2	15
2	16
1	17
1	19
1	20
1	28
1	70
1	105

> the only time we do that is when there's fragmentation on the "file" we want to write. 
Another time is when we remove blocks from a file because they were equal in source and target (to use for a SOURCE_COPY operation). Then the file 'becomes' fragmented. And as you can see in the results above, this happens a lot.


I remember that gwendal@ said at some point that even if we write just a few non-sequential blocks, it will have the same problem.

If you guys don't think blkdiscard of the entire partition helps that much, we can abandon crrev.com/c/917608.

To be more clear: we always write sequentially for full updates, but delta updates normally (well, almost always) have fragmentation. Per an offline discussion with gwendal@, if we get a full update once in a while, then everything is reset to zero, which normally does happen since roughly 30% of all updates are full updates.

Also, I had a document a while back proposing some changes, but that is kind of outdated now (https://docs.google.com/document/d/1Zrd2SHr69FlSxIWhJ3n2Hsd-RoW30vQAUzyUBMQZ8_s).

But I thought about it more, and I think it may be possible to do sequential writes for delta updates too. The only operations that cause fragmentation now are SOURCE_BSDIFF, BROTLI_BSDIFF, and PUFFIN (which all use bsdiff internally). I think deymo@ pointed out a while back that it is possible to break a bsdiff operation into N bsdiff operations (N being the number of destination extents) by keeping the source the same for all operations and splitting their destinations. So basically, break the operation:

op: <src_ext1, src_ext2, src_ext3> -> <tgt_ext1, tgt_ext2>

to two operations:

op1: <src_ext1, src_ext2, src_ext3> -> <tgt_ext1>
op2: <src_ext1, src_ext2, src_ext3> -> <tgt_ext2>

This will not affect the payload size by much, but will allow us to write sequentially. However, this will increase the payload generation time.
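As a sketch of that splitting idea (the field names src_extents/dst_extents mirror the payload manifest but are assumptions here, and the real change would happen at payload-generation time, where each sub-diff is regenerated):

from dataclasses import dataclass, replace
from typing import List, Tuple

Extent = Tuple[int, int]  # (start_block, num_blocks)

@dataclass
class Operation:
    op_type: str                 # e.g. SOURCE_BSDIFF, BROTLI_BSDIFF, PUFFDIFF
    src_extents: List[Extent]
    dst_extents: List[Extent]

def split_by_destination(op):
    # One sub-operation per destination extent, each keeping the full source,
    # so the target partition can be written in strictly increasing block
    # order. Only the operation metadata is duplicated, but each sub-diff has
    # to be regenerated, which is what increases payload generation time.
    if len(op.dst_extents) <= 1:
        return [op]
    return [replace(op, dst_extents=[ext]) for ext in op.dst_extents]

# The example from above, op: <src_ext1..3> -> <tgt_ext1, tgt_ext2>:
op = Operation('SOURCE_BSDIFF',
               src_extents=[(10, 4), (30, 2), (50, 8)],
               dst_extents=[(100, 6), (200, 8)])
print([o.dst_extents for o in split_by_destination(op)])
# [[(100, 6)], [(200, 8)]]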



Attached is an example of the multi-extent files in chromeos_10176.68.0_veyron-minnie_recovery_beta-channel_minnie-mp-v4-zero.bin


file-fragments.txt (49.4 KB)
Cc: norvez@chromium.org
Cc: -ahass...@chromium.org
Owner: ahass...@chromium.org
Status: Assigned (was: Untriaged)
Cc: tbrindus@chromium.org
