Make Courgette use libdivsufsort |
||
Issue descriptionCourgette uses BSDiff, which uses QSufSort to generate suffix arrays. deymo@ suggested that we switch to replace QSufSort with libdivsufsort, which is faster and uses less memory; indeed, AOSP as been doing this since 2011: https://android.googlesource.com/platform/external/bsdiff/+/c2c68a78ac72e056cf23b7017206f8a530c0aa63 We want try this in Courgette. Another issue: Creating chrome.7z patch requires suffix array with 270+ MB entries, i.e., ~1.1 GB of RAM! To be robust against memory fragmentation, Courgette uses PagedArray, which allocates memory in chunks and pretend it's one big array. We have to revisit this issue, in light of the fact that in production we use courgette64 anyway.
,
May 3 2016
The last case of "libdivsufsort + PagedArray" would need more involved code change. Meanwhile, we should decide whether to replace PagedArray with std::vector. If so then it should be a separate change. If not (e.g., to be conservative, or usabilitiy of 32-bit version) then we would need to change to libdivsufsort, and get less gain.
,
May 3 2016
nit: CrOS has been doing this since 2011. AOSP merged CrOS' changes with a git merge, so the changes look like from 2011 but they indeed landed in AOSP last year. CrOS uses bsdiff on basically every incremental update since 2011 and there were corner-case fixes to bsdiff at the time. tbao@ tested that version on Android updates and also had some improvement. Also, note that libdivsuffsort compiles two different variants: 32 and 64 bits; regardless of the arch you are building for, these are size of the suffix array indices. If your data is less than 2GB, then it is safe to use the 32 bits version (the 64 bit one requires twice as much RAM because each index is 64 bit instead of 32).
,
May 4 2016
Thanks for the info! I'll likely stick with 32-bit (index size); nearly halving RAM usage unwound the death-clock of 32-bit Courgette-gen. I want to reformat libdivsort to conform to our style (with tests), and try to adapt PagedArray. It would also be interesting to investigate using Courgette on CrOS.
,
May 4 2016
divsufsort()'s third param |n| seems to be documented wrong. deymo@: This might create a benign bug in AOSP's usage? I'm adapting https://code.google.com/p/chromium/codesearch#chromium/src/courgette/third_party/qsufsort_unittest.cc To get it to pass I had to +1 on length, e.g., change: courgette::qsuf::qsufsort<int*>(&I[0], &V[0], s, len); to divsufsort(s, &I[0], len + 1); ^^^ Digging deeper: the interface claims: /** * Constructs the suffix array of a given string. * @param T[0..n-1] The input string. * @param SA[0..n] The output array of suffixes. * @param n The length of the given string. * @return 0 if no error occurred, -1 or -2 otherwise. */ saint_t divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n); But really you should pass length of |T| + 1 as |n|, i.e., one more than qsufsort()'s interface. Indeed, if you examine divsufsort()'s implementation: saint_t divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n) { ... /* Check arguments. */ if((T == NULL) || (SA == NULL) || (n < 0)) { return -1; } else if(n == 0) { return 0; } else if(n == 1) { SA[0] = 0; return 0; } else if(n == 2) { m = (T[0] < T[1]); SA[m ^ 1] = 0, SA[m] = 1; return 0; } For T = ['A'], we'd expect SA = [0, 1], for [[], ['A']]. However, passing n = 1 == len(T) yields SA = [0]. This appears to apply for longer inputs.
,
May 4 2016
re: comment 4 we can still use the 32-bit index variant but compile for 64-bit - I think other parts of courgette might still need the extra memory space that 64-bit gives us - or is the sort the part with the largest mem usage?
,
May 5 2016
re #4: dgarrett@ had an intern 2 or 3 years ago who worked on courgette for arm to see if we could use it in Chrome OS. The TL;DR answer is that courgette requires a LOT of RAM that we are not ok with for background updates. I would recommend against modifying libdivsufsort and I would treat it as an external library instead. re #5: https://github.com/y-256/libdivsufsort in the readme shows something a little different WRT SA: /* * Constructs the suffix array of a given string. * @param T[0..n-1] The input string. * @param SA[0..n-1] The output array or suffixes. * @param n The length of the given string. * @return 0 if no error occurred, -1 or -2 otherwise. */ saint_t divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n); I don't know where did you get the interface documentation, but https://github.com/y-256/libdivsufsort/blob/master/include/divsufsort.h.cmake#L67 also says SA[0..n-1] and not SA[0..n]. For T=['A'], the suffix array of that string is an array of length 1: SA = [0] by definition and that's what we get from divsufsort().
,
May 5 2016
Re. #4: Yes we're working on the client-side RAM problem. The intermediate data can take 100's of MB, but these currently use memory-mapped IO so don't need RAM (needs address space though). As for peak RAM for supporting data structures, we have a CL underway to reduce it by 25%, and can probably reduce by 15% more with another change. There's a more ambitious goal to also reduce the 100's of MB mentioned earlier, but that's lofty. Regarding modifying libdivsufsort, I want to measure performance impact of adapting it to use PagedArray. Re. #5: I looked back a my original cmake-generated divsufsort.h and found "n-1", so you're right; sorry for the noise (accidentally deleted??). Meanwhile, it still stands that BSDiff followed QSufSort's convention of including empty string in suffix array. Indeed, your modified version still have I[] allocated with size |oldsize+1|. And since you continued to use BSDiff's search(), I don't know if there will be some edge case. Anyway, don't worry about this for now.
,
May 17 2016
I think we hit the edge-case you mention recently only when enabling bsdiff improvements in imgdiff's android implementation. I haven't seen this problem in CrOS, maybe it just manifests as a worse-than-ideal patch size when you are searching for the lexicographically largest string, but still valid patch. We test all the bsdiff patches on the server side with bspatch. Anyway, the fix is here: https://android-review.googlesource.com/#/c/223480/
,
May 17 2016
Thanks for the update! The convention used by qsufsort and search() is that to include '', thus giving rise to n+1 suffixes, which search() relies on. Meanwhile, divsufsort excludes '', and its accompanying sa_search() uses the convention. However, sa_search() applies lower_bound() rather than finding longest match. Although it's more efficient, some work is needed to get it to do what search() does. So right now Android's bsdiff (and soon Courgette) applies a "Frankenstein" fusion of divsufsort() and (BSDiff's) search(). Re. Bug 223480 , I don't think the "oldsize" -> "oldsize-1" fix is correct. I'd propose to leave search() alone, but adapt divsufsort() to produce the same output as qsufsort. So instead of if(divsufsort(old_buf, I, oldsize)) err(1, "divsufsort"); perhaps do: I[0] = oldsize; // Empty string if(divsufsort(old_buf, &I[1], oldsize)) err(1, "divsufsort"); Meanwhile here's experiment for Courgette: https://codereview.chromium.org/1948843002/ This is complicated by Courgette's usage of PagedArray (and my decision to keep using it to extend the usability of 32-bit Courgette). I followed your advice and sought to minimize code changes to divsufsort(). :)
,
May 17 2016
And if you want to add unit tests, please feel free to take Courgette's tests as a starting point. https://codereview.chromium.org/1948843002/ as a version for Franken-divsufsort.
,
Jun 2 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/804ed8a1e0f236f22e19abef7013e9641a3ad49d commit 804ed8a1e0f236f22e19abef7013e9641a3ad49d Author: huangs <huangs@chromium.org> Date: Thu Jun 02 06:26:36 2016 [Courgette] PagedArray: Add Iterators and Parametrize Page Size as int Template. This is a refactoring CL to enable PagedArray usage by libdivsufsort. In addition to overloading operator[], for more general usage we need need pointer-like accessors to PagedArray. To this end we implement PagedArray_const_iterator and PagedArray_const_iterator, which merely wraps a PagedArray pointer along with an index. We also add various operators needed by libdivsufsort. For optimization, '<' and '<=' operators omits pointer checks. By default PagedArray page size is 2**18 elements (1 MiB for int32_t). To enable better testing, we made (log) page size a tepmlate parameter. BUG= 608885 Review-Url: https://codereview.chromium.org/2008553007 Cr-Commit-Position: refs/heads/master@{#397311} [modify] https://crrev.com/804ed8a1e0f236f22e19abef7013e9641a3ad49d/courgette/third_party/bsdiff/paged_array.h [modify] https://crrev.com/804ed8a1e0f236f22e19abef7013e9641a3ad49d/courgette/third_party/bsdiff/paged_array_unittest.cc
,
Jun 14 2016
Third-party library inclusion proposal: go/courgette-use-libdivsufsort
,
Jul 26 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089 commit 7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089 Author: huangs <huangs@chromium.org> Date: Tue Jul 26 21:46:13 2016 [Courgette] Refactor BSDiff namespaces and bsdiff::search() interface. Details: - Move BSDiff (but not PagedArray) from namespace courgette to bsdiff. - Change namespace courgette::qsuf to qsuf. - Change bsdiff:search() to return struct {pos, size} so we don't need awkward pointer passing; update callers. - Updated BSDiff callers. Also fix weird hybrid usage by setup_util.cc, which calls Courgette's BSDiff, but using OK from BSPatch. BUG= 608885 Review-Url: https://codereview.chromium.org/2031193002 Cr-Commit-Position: refs/heads/master@{#407924} [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/chrome/installer/setup/setup_util.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/chrome/utility/chrome_content_utility_client.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/components/update_client/component_patcher_operation.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/bsdiff_memory_unittest.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/courgette_tool.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/simple_delta.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/README.chromium [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/bsdiff.h [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/bsdiff_apply.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/bsdiff_create.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/bsdiff_search.h [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/bsdiff_search_unittest.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/qsufsort.h [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/third_party/bsdiff/qsufsort_unittest.cc [modify] https://crrev.com/7054b5a2e3b2ce7f5b2e868e91d54b5ec0773089/courgette/versioning_unittest.cc
,
Jul 27 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4 commit d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4 Author: huangs <huangs@chromium.org> Date: Wed Jul 27 16:07:51 2016 [Courgette] Add third-party library: libdivsufsort. We wish to use third-party library libdivsufsort in Courgette to replace QSufSort in Courgette-gen. This will reduce time and memory usage. This CL focuses on adding libdivsufsort and tests. We will make the switch in a follow-up CL. Proposal: go/courgette-use-libdivsufsort Code source: https://github.com/y-256/libdivsufsort BUG= 631482 , 608885 Review-Url: https://codereview.chromium.org/2156223002 Cr-Commit-Position: refs/heads/master@{#408141} [modify] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/BUILD.gn [modify] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/courgette.gyp [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/LICENSE [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/README.chromium [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/divsufsort.cc [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/divsufsort.h [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/divsufsort_private.h [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/divsufsort_unittest.cc [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/sssort.cc [add] https://crrev.com/d63dcb01c3a8fc30a840ce5aaab8ea5e6c9107e4/courgette/third_party/divsufsort/trsort.cc
,
Aug 1 2016
Did more experiments, confirmed the 25% time reduction (for courgette64.exe): CL: http://crrev.com/2187953003/ Data: https://docs.google.com/a/chromium.org/spreadsheets/d/1gVyy5olmXVy7aGr0ibCLDGVmoD4ljxAwCI_WQKwZP8g/edit?usp=sharing Summary: - 32-bit Courgette-gen (courgette.exe) used to fail for 64-bit Chrome, that's fixed (due to significant memory reduction, ~1 GB). - 64-bit Courgette-gen: Time reduced by 25% (for 64-bit Chrome, 960s to 660s) - 32-bit Courgette-gen: Time reduced by 22% (32-bit only, since baseline failed). - Suffix sorting time: Reduced by 70% (for 64-bit Chrome, 400s to 100s). - courgette.exe and courgette64.exe each increased in size by 7%. Note: should not affect client-side code. Details: - 2 configurations: - Baseline - DivSufSort. - 2 executables: courgette.exe (32-bit) and courgette64.exe. - 3 data sets (patching chrome.7z): - A: 32-bit 50.0.2661.106 to 51.0.2704.101 - B: 51.0.2704.101 to 52.0.2743.39 - C: 51.0.2704.101 to 52.0.2743.39 - 6 trial runs. - Super nitty note: etiennep@ reduced time by 4% in http://crrev.com/2055343002 by an orthogonal fashion. - If we factor this *out*: Normalized to current baseline = 1, old time would be 1/(0.04) = 1.04167, and our reduction of 0.25 would become 1-(1.04167-0.25)/1.04167 = 24%. But we're already being conservative (64-bit Chrome had 31% reduction). So I think it's fair to claim 25% reduction.
,
Aug 2 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4efc71b655733edfe3373cac6d547dbc0d623bae commit 4efc71b655733edfe3373cac6d547dbc0d623bae Author: huangs <huangs@chromium.org> Date: Tue Aug 02 21:28:12 2016 [Courgette] Replace QSufSort with libdivsufsort. Last step of the effort to get Courgette to use libdivsufsort. BUG= 608885 Review-Url: https://codereview.chromium.org/2187953003 Cr-Commit-Position: refs/heads/master@{#409326} [modify] https://crrev.com/4efc71b655733edfe3373cac6d547dbc0d623bae/courgette/BUILD.gn [modify] https://crrev.com/4efc71b655733edfe3373cac6d547dbc0d623bae/courgette/courgette.gyp [modify] https://crrev.com/4efc71b655733edfe3373cac6d547dbc0d623bae/courgette/third_party/bsdiff/README.chromium [modify] https://crrev.com/4efc71b655733edfe3373cac6d547dbc0d623bae/courgette/third_party/bsdiff/bsdiff_apply.cc [modify] https://crrev.com/4efc71b655733edfe3373cac6d547dbc0d623bae/courgette/third_party/bsdiff/bsdiff_create.cc [modify] https://crrev.com/4efc71b655733edfe3373cac6d547dbc0d623bae/courgette/third_party/bsdiff/bsdiff_search_unittest.cc [delete] https://crrev.com/bf7ab24213d1c6b4d477d3ef0957f6d25d8677d0/courgette/third_party/bsdiff/qsufsort.h [delete] https://crrev.com/bf7ab24213d1c6b4d477d3ef0957f6d25d8677d0/courgette/third_party/bsdiff/qsufsort_unittest.cc
,
Aug 29 2016
|
||
►
Sign in to add a comment |
||
Comment 1 by hua...@chromium.org
, May 3 2016