TransactionFailedError: too much contention on these datastore entities
Issue description

This error spiked in the dashboard yesterday, with nearly 8000 occurrences within 24 hours:

TransactionFailedError: too much contention on these datastore entities. please try again. entity group key: (app=s~chromeperf, CachedPickledString, "externally_visible__list_tests_get_tests_v2_redacted")

The stack trace implicated a call to layered_cache.DeleteAsync(). Documentation suggests:

Every attempt to create, update, or delete an entity takes place in the context of a transaction. There is a write throughput limit of about one transaction per second within a single entity group.

The stack trace goes on to point to the DeleteAsync call in TestMetadata.CreateCallback. So it appears that several new master/bot/suite combinations started uploading, creating several new TestMetadata entities, all of which wanted to purge the CachedPickledString and MultipartEntity entities containing the cached list of tests for their respective master/bot/suite. However, since each master/bot/suite was new, it is unlikely that the CachedPickledString/MultipartEntity entities existed, either at first or on any subsequent attempt to purge them, so many of the delete attempts were unnecessary.

A fix is ready that checks whether the entities exist before attempting to delete them. This should reduce contention.
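The existence check described above can be sketched as follows. This is only a minimal illustration, not the actual layered_cache or stored_object code: FakeDatastore, delete_unconditionally, and delete_if_exists are hypothetical names, and the in-memory store simply counts write operations to show how many delete calls the check avoids when the entity was never created.

```python
class FakeDatastore:
    """Hypothetical in-memory stand-in for a datastore; counts writes."""

    def __init__(self):
        self._entities = {}
        self.write_count = 0

    def put(self, key, value):
        self.write_count += 1
        self._entities[key] = value

    def get(self, key):
        # Reads are not counted as writes in this sketch.
        return self._entities.get(key)

    def delete(self, key):
        # Every delete counts as a write, even when the key is absent.
        self.write_count += 1
        self._entities.pop(key, None)


def delete_unconditionally(store, key):
    """Always issues a delete, whether or not the entity exists."""
    store.delete(key)


def delete_if_exists(store, key):
    """Issues a delete only when a prior read finds the entity."""
    if store.get(key) is None:
        return False
    store.delete(key)
    return True


KEY = "externally_visible__list_tests_get_tests_v2_redacted"

# Simulate 100 purge attempts against an entity that was never created.
naive = FakeDatastore()
for _ in range(100):
    delete_unconditionally(naive, KEY)

guarded = FakeDatastore()
for _ in range(100):
    delete_if_exists(guarded, KEY)

print(naive.write_count, guarded.write_count)  # prints "100 0"
```

In this toy model the guarded version issues no writes at all for a missing entity. Against a real datastore the read and the delete are separate operations, so the check narrows the window for contention rather than eliminating it.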
Jan 11
The following revision refers to this bug:
https://chromium.googlesource.com/catapult/+/96320b515106e029ad6326b81cb6feef5660c6a8

commit 96320b515106e029ad6326b81cb6feef5660c6a8
Author: benshayden <benjhayden@chromium.org>
Date: Fri Jan 11 19:45:24 2019

Prevent contention errors in DeleteAsync using existence checks.

This error spiked in the dashboard yesterday, with nearly 8000 occurrences within 24 hours:

TransactionFailedError: too much contention on these datastore entities. please try again. entity group key: (app=s~chromeperf, CachedPickledString, "externally_visible__list_tests_get_tests_v2_redacted")

The stack trace implicated a call to layered_cache.DeleteAsync(). Documentation suggests:

Every attempt to create, update, or delete an entity takes place in the context of a transaction. There is a write throughput limit of about one transaction per second within a single entity group.

The stack trace goes on to point to the DeleteAsync call in TestMetadata.CreateCallback. So it appears that several new master/bot/suite started uploading, creating several new TestMetadata entities, all of which wanted to purge the CachedPickledString and MultipartEntity entities containing cached list of tests for their respective master/bot/suite. However, since the master/bot/suite was new, it is unlikely that the CachedPickledString/MultipartEntity entities existed, neither at first nor for each subsequent attempt to purge them.

This fix avoids attempting to delete the entities if they don't exist. Calling get() does not automatically create a transaction. This fix should reduce calls to delete entities, and thereby reduce contention on them.

This cached list of tests is only used by the V1 UI. V2spa uses a different set of cached test suite descriptors, which is not purged when new TestMetadata entities are created, so test suite descriptors may be stale for up to a day. That could change if users complain, in which case this bugfix would benefit v2spa as well as v1 ui.

Bug: chromium:920304
Change-Id: I9e826590208370f140819e3d94dbe428812c626b
Reviewed-on: https://chromium-review.googlesource.com/c/1403458
Reviewed-by: Sean McCullough <seanmccullough@chromium.org>
Commit-Queue: Ben Hayden <benjhayden@chromium.org>

[modify] https://crrev.com/96320b515106e029ad6326b81cb6feef5660c6a8/dashboard/dashboard/common/stored_object.py
[modify] https://crrev.com/96320b515106e029ad6326b81cb6feef5660c6a8/dashboard/dashboard/common/layered_cache.py
Jan 11
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/660d8d32dc59178b53de1dfd2623e298647f2447

commit 660d8d32dc59178b53de1dfd2623e298647f2447
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date: Fri Jan 11 23:45:39 2019

Roll src/third_party/catapult 6eeb1d2fc794..96320b515106 (1 commits)

https://chromium.googlesource.com/catapult.git/+log/6eeb1d2fc794..96320b515106

git log 6eeb1d2fc794..96320b515106 --date=short --no-merges --format='%ad %ae %s'
2019-01-11 benjhayden@chromium.org Prevent contention errors in DeleteAsync using existence checks.

Created with:
gclient setdep -r src/third_party/catapult@96320b515106

The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG= chromium:920304
TBR=sullivan@chromium.org

Change-Id: I6fab5daefaf1f04be32e530e95ceb99005c62270
Reviewed-on: https://chromium-review.googlesource.com/c/1407555
Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#622222}

[modify] https://crrev.com/660d8d32dc59178b53de1dfd2623e298647f2447/DEPS
Jan 16
Comment 1 by benjhayden@chromium.org, Jan 9