Swarming automatic TaskProperties fallback |
|||||||||||||||||
Issue descriptionImplement a way to create automatic fallback when a TaskProperties cannot be fulfilled in a pre-determinated amount of time. This requires updating the DB Schema, the API and the documentation, but requires no bot change. Related to issue 756295 but not exactly an 'or', in the sense that there is a strong linear progression between the options. Work in progress doc (internal only): http://go/swarming-automatic-fallback
,
Jan 12 2018
Issue 706586 will be implemented as part of this work.
,
Jan 12 2018
Issue 706586 has been merged into this issue.
,
Jan 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/521c67ca0b647037004b0e7f3831d47aa9c3771c commit 521c67ca0b647037004b0e7f3831d47aa9c3771c Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Fri Jan 12 20:54:13 2018 swarming: reduce the use of request.properties No functional change. request.properties will change of form. Do refactoring so follow up refactoring has less copy-paste. Fix an aliasing of a variable 'request' which made the code confusing. R=vadimsh@chromium.org Bug: 781021 Change-Id: I85edd9fa2a7105d6304c02ab10eeb066ae8a65ac Reviewed-on: https://chromium-review.googlesource.com/865414 Reviewed-by: Vadim Shtayura <vadimsh@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/521c67ca0b647037004b0e7f3831d47aa9c3771c/appengine/swarming/handlers_bot.py [modify] https://crrev.com/521c67ca0b647037004b0e7f3831d47aa9c3771c/appengine/swarming/handlers_endpoints.py [modify] https://crrev.com/521c67ca0b647037004b0e7f3831d47aa9c3771c/appengine/swarming/message_conversion.py
,
Feb 9 2018
keyboard: load shedding, shed, deny
,
Feb 16 2018
,
Feb 21 2018
,
Feb 21 2018
,
Mar 1 2018
,
Mar 21 2018
,
Mar 21 2018
,
Mar 22 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/821925d0b365357471252beac43c928161d7f506 commit 821925d0b365357471252beac43c928161d7f506 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Thu Mar 22 01:12:17 2018 Swarming: move assert_task() guts into subfunction There is no functional change, the code is just moved. Soon a TaskRequest will have multiple TaskProperties, each as part of a TaskSlice. Move this code without changing the logic, so that follow ups are easier to read. In a follow up I'll want to make it async but decided to just move code first. R=vadimsh@chromium.org Bug: 781021 Change-Id: I596d41d76a37baa4e5333e89761d8ad054d7ac83 Reviewed-on: https://chromium-review.googlesource.com/938779 Reviewed-by: Vadim Shtayura <vadimsh@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/821925d0b365357471252beac43c928161d7f506/appengine/swarming/server/task_queues.py
,
Mar 24 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/c5db5135a63f3db4452308b1535ec47f189f668b commit c5db5135a63f3db4452308b1535ec47f189f668b Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Sat Mar 24 03:54:44 2018 Swarming: one TaskToRun per attempt. - Change TaskToRun key id to the try number. It will contain more information once issue 781021 is implemented. - Do not store TaskToRun for deduped task. - Do not reuse TaskToRun for try #2. - Add TaskToRun.created_ts to know when a task was enqueued. It will be needed for issue 781021 . - Rename a few 'task' local variable to 'to_run' make it things clearer. This change temporarily breaks task cancelation and dead bot handling for the tasks that are on-going. Bug: 817831 , 781021 Change-Id: If252001ba8c8bcb499134b1d21ab2f10cd06203b Reviewed-on: https://chromium-review.googlesource.com/969760 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Vadim Shtayura <vadimsh@chromium.org> [modify] https://crrev.com/c5db5135a63f3db4452308b1535ec47f189f668b/appengine/swarming/doc/Schemas.md [modify] https://crrev.com/c5db5135a63f3db4452308b1535ec47f189f668b/appengine/swarming/server/task_result_test.py [modify] https://crrev.com/c5db5135a63f3db4452308b1535ec47f189f668b/appengine/swarming/server/task_scheduler.py [modify] https://crrev.com/c5db5135a63f3db4452308b1535ec47f189f668b/appengine/swarming/server/task_scheduler_test.py [modify] https://crrev.com/c5db5135a63f3db4452308b1535ec47f189f668b/appengine/swarming/server/task_to_run.py [modify] https://crrev.com/c5db5135a63f3db4452308b1535ec47f189f668b/appengine/swarming/server/task_to_run_test.py
,
Mar 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/d15fc3980b4541f6f6b384ba167db4af19179625 commit d15fc3980b4541f6f6b384ba167db4af19179625 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Mon Mar 26 22:28:58 2018 [swarming] overhaul handlers unit tests Will add new tests soon and wanted to clean it up a bit first. Removed 460 lines from the test code by creating reusable expectations. No functional change. Bug: 754390 , 781021 Change-Id: I1f8a5b84312c21e6d4578cae1beb61d281009a4e Reviewed-on: https://chromium-review.googlesource.com/979275 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/d15fc3980b4541f6f6b384ba167db4af19179625/appengine/swarming/handlers_bot_test.py [modify] https://crrev.com/d15fc3980b4541f6f6b384ba167db4af19179625/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/d15fc3980b4541f6f6b384ba167db4af19179625/appengine/swarming/handlers_test.py [modify] https://crrev.com/d15fc3980b4541f6f6b384ba167db4af19179625/appengine/swarming/test_env_handlers.py
,
Mar 28 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6 commit fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Mar 28 09:41:22 2018 [swarming] Create TaskSlice at the API level. It is correctly supported so that clients can start updating right away, but TaskSlice fallback is not implemented. No behavior change, except making the API return duplicate data, so clients have time to update. This permits simplifying the delta for the actual implementation, splitting PAI changes from the DB schema change, thus reducing the likelihood of breaking the whole thing and making the change almost reviewable for once. The client is not updated yet, will do a bit later. Bug: 781021 Change-Id: I4941e6869fae9ffaceb5fa1c074ce5699b9c1642 Reviewed-on: https://chromium-review.googlesource.com/979453 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Nodir Turakulov <nodir@chromium.org> [modify] https://crrev.com/fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6/appengine/swarming/local_smoke_test.py [modify] https://crrev.com/fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6/appengine/swarming/message_conversion.py [modify] https://crrev.com/fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6/appengine/swarming/swarming_rpcs.py [modify] https://crrev.com/fc0788b49dfdfdb1d7e2a438e2feb325df39c8d6/appengine/swarming/test_env_handlers.py
,
Mar 28 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/023d49b55cce35021d6e050322ec7367c77f9d55 commit 023d49b55cce35021d6e050322ec7367c77f9d55 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Mar 28 20:43:04 2018 [swarming] Compact task_scheduler_test by 80 lines. No functional change. Makes properties explicit in each unit test. The main reason for this change is that it'll make task_slices testing much easier, so that the properties value will eventually be changed for task_slices values. Make task_queue expectations stricter. Change-Id: I90bdfc967b073a16d24f75967ed8d21aa6f01080 Bug: 781021 Reviewed-on: https://chromium-review.googlesource.com/982833 Reviewed-by: Robbie Iannucci <iannucci@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/023d49b55cce35021d6e050322ec7367c77f9d55/appengine/swarming/server/task_scheduler_test.py
,
Apr 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/4587925615cde26a625b5a5b7b31bdeb446381aa commit 4587925615cde26a625b5a5b7b31bdeb446381aa Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 04 18:23:41 2018 [swarming] Rename task_slice_index to current_task_slice This is more meaningful, as this represent the current TaskSlice that is being enqueued or run. R=iannucci@chromium.org Bug: 781021 Change-Id: I51b7c4942c99aac97227325810ace65f31e1ca3b Reviewed-on: https://chromium-review.googlesource.com/996352 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/4587925615cde26a625b5a5b7b31bdeb446381aa/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/4587925615cde26a625b5a5b7b31bdeb446381aa/appengine/swarming/local_smoke_test.py [modify] https://crrev.com/4587925615cde26a625b5a5b7b31bdeb446381aa/appengine/swarming/message_conversion.py [modify] https://crrev.com/4587925615cde26a625b5a5b7b31bdeb446381aa/appengine/swarming/swarming_rpcs.py [modify] https://crrev.com/4587925615cde26a625b5a5b7b31bdeb446381aa/appengine/swarming/test_env_handlers.py
,
Apr 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18 commit aac2b76b53aa1b2158ac84b9325bf0e3b3265f18 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 04 18:35:41 2018 [swarming] Define current_task_slice in task result, TaskSlice task request Will soon™ be used. Doesn't use it yet, to reduce the delta and make the change more reviewable. TaskSlice will be embedded into TaskRequest as a repeated field as 'task_slices', to replace the current embedded TaskProperties as 'properties'. Bug: 781021 Change-Id: I009aa9436f7ed86ab77ec0a43027cf367f933c4a Reviewed-on: https://chromium-review.googlesource.com/980705 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18/appengine/swarming/message_conversion.py [modify] https://crrev.com/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18/appengine/swarming/server/task_request.py [modify] https://crrev.com/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18/appengine/swarming/server/task_result.py [modify] https://crrev.com/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18/appengine/swarming/server/task_result_test.py [modify] https://crrev.com/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18/appengine/swarming/server/task_scheduler.py [modify] https://crrev.com/aac2b76b53aa1b2158ac84b9325bf0e3b3265f18/appengine/swarming/server/task_scheduler_test.py
,
Apr 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/e295ecee83965943a08dea963dd76a3ec446931f commit e295ecee83965943a08dea963dd76a3ec446931f Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 04 19:34:21 2018 [swarming] Add TaskToRun.task_slice_index Remove TaskToRun._pre_put_hook, it's not useful. new_task_to_run() does the verifications. Use assert (instead of raising an exception) because this is an internal failure. Bug: 781021 Change-Id: Ia2b78f3684f5f6a69e1c39ed17df4290bfbeab33 Reviewed-on: https://chromium-review.googlesource.com/988595 Reviewed-by: Robbie Iannucci <iannucci@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/e295ecee83965943a08dea963dd76a3ec446931f/appengine/swarming/server/task_result_test.py [modify] https://crrev.com/e295ecee83965943a08dea963dd76a3ec446931f/appengine/swarming/server/task_scheduler.py [modify] https://crrev.com/e295ecee83965943a08dea963dd76a3ec446931f/appengine/swarming/server/task_scheduler_test.py [modify] https://crrev.com/e295ecee83965943a08dea963dd76a3ec446931f/appengine/swarming/server/task_to_run.py [modify] https://crrev.com/e295ecee83965943a08dea963dd76a3ec446931f/appengine/swarming/server/task_to_run_test.py
,
Apr 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a commit 945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Tue Apr 10 14:53:37 2018 [swarming] Add TaskRequest.manual_tags and prepare for TaskRequest.task_slices TaskRequest.manual_tags is to differentiate between tags specified by the user and tags that are automatically added. Tags on a TaskSlice will have the manual tags plus the tags relevant for this specific TaskSlice. E.g. if someone triggers first on 'gpu:foo:bar' then fallbacks on 'gpu:foo:baz', the TaskResultSummary shall have the automatic tag from the dimension of the TaskSlice that actually ran. Said in a different way: this means that the tags for a TaskResultSummary shall have only the relevant tags, hence the manual tags must be kept in a separate property in TaskRequest. Stop using TaskRequest.properties and use TaskRequest.task_slice(index) to prepare for TaskRequest.task_slices. Right now index must be 0, as there's only one TaskSlice (the one stored as TaskRequest.properties). This makes later CL much simpler. Redo task_request_test.py TaskRequest entity creation flow as this was based on a flow that was used in 2015. No behavior change from the user's PoV. Bug: 781021 Change-Id: If4e49e577ef9086cb1560e0b19c54070174c738e Reviewed-on: https://chromium-review.googlesource.com/995621 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/event_mon_metrics.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/handlers_backend.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/handlers_bot.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/handlers_endpoints.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/message_conversion.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/service_accounts_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_queues.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_queues_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_request.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_request_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_result.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_result_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_scheduler.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_scheduler_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_to_run.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/server/task_to_run_test.py [modify] https://crrev.com/945d5d7da6b9f32a9523d12e9cfc8eecbb87c79a/appengine/swarming/test_env_handlers.py
,
Apr 10 2018
,
Apr 11 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/27f3c7ed26c44319dcfefaee5bb51496a1194d6d commit 27f3c7ed26c44319dcfefaee5bb51496a1194d6d Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 11 14:32:16 2018 [swarming] Fix bug introduced in 945d5d7da6b9f32a952. This very rarely occurs, only during a very short window where the negative cache was not caught, and the item was not detected as reapeable even in the index. TBR=iannucci@chromium.org Bug: 781021 Change-Id: I1f4977fff59ac95ca81c6a2053da77d3d872ff52 Reviewed-on: https://chromium-review.googlesource.com/1007255 Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/27f3c7ed26c44319dcfefaee5bb51496a1194d6d/appengine/swarming/server/task_scheduler.py
,
Apr 11 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/afc02e9362a1007e1076db41788d74d4bcf272c6 commit afc02e9362a1007e1076db41788d74d4bcf272c6 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 11 14:50:16 2018 [swarming] Retry Tx more in /internal/cron/update_bot_info datastore_utils.transaction_async() sets the default number of retries to 1, which means a lot of TransactionFailedError: too much contention on these datastore entities. please try again. are thrown. This is a problem to the extent that the logs are filed with exceptions due to this cron job at a rate of ~50%, depending on the new number of dead bots. 'Fix' the problem by retrying more. TBR=iannucci@chromium.org Bug: 781021 Change-Id: I2a93b3380f17ca6c6e975df97a99b2832094c540 Reviewed-on: https://chromium-review.googlesource.com/1007334 Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/afc02e9362a1007e1076db41788d74d4bcf272c6/appengine/swarming/server/bot_management.py
,
Apr 11 2018
Oops the last CL was related to issue 826421 .
,
Apr 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/0b45c23341a5d0c2fceff571f3c906a531b41411 commit 0b45c23341a5d0c2fceff571f3c906a531b41411 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Thu Apr 12 00:32:32 2018 [swarming] Add TaskRequest.task_slices Do not store them yet, but add the ability to read it. This makes the code forward compatible to simplify deployment and make the CL smaller. Include unit tests to confirm only one TaskSlice is supported. Bug: 781021 Change-Id: I474e1299b2152dba4e39a4da7abdbd54d9328587 Reviewed-on: https://chromium-review.googlesource.com/1008542 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/0b45c23341a5d0c2fceff571f3c906a531b41411/appengine/swarming/doc/Schemas.md [modify] https://crrev.com/0b45c23341a5d0c2fceff571f3c906a531b41411/appengine/swarming/message_conversion.py [modify] https://crrev.com/0b45c23341a5d0c2fceff571f3c906a531b41411/appengine/swarming/server/task_request.py [modify] https://crrev.com/0b45c23341a5d0c2fceff571f3c906a531b41411/appengine/swarming/server/task_request_test.py [modify] https://crrev.com/0b45c23341a5d0c2fceff571f3c906a531b41411/appengine/swarming/server/task_scheduler_test.py [modify] https://crrev.com/0b45c23341a5d0c2fceff571f3c906a531b41411/appengine/swarming/server/task_to_run_test.py
,
Apr 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/4e55114a23538a9a91546f448ef65e4ae69eebb4 commit 4e55114a23538a9a91546f448ef65e4ae69eebb4 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Thu Apr 12 17:54:16 2018 [swarming] Store TaskRequest.task_slices instead of .properties This is backward incompatible before TaskRequest.task_slices support in 0b45c23341a5d0c2f. Still only allow a single TaskSlice as the fallback algorithm is not live yet. Bug: 781021 Change-Id: I0390123a6fa2570b56de2b80fe38fee650538f45 Reviewed-on: https://chromium-review.googlesource.com/1008667 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/4e55114a23538a9a91546f448ef65e4ae69eebb4/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/4e55114a23538a9a91546f448ef65e4ae69eebb4/appengine/swarming/message_conversion.py
,
Apr 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/b5572d91544ea860c6c9a9ae808acf9728c39459 commit b5572d91544ea860c6c9a9ae808acf9728c39459 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 18 21:50:11 2018 [swarming] refactor task_to_run No functional change. The reason for this change is to enable using task_slices in a follow up, and by doing refactoring up front, this will better highlight the actual changes. Change-Id: Ib15a1c9fc835ef09fc00222ce92d1fe0fc6d2634 Bug: 781021 Reviewed-on: https://chromium-review.googlesource.com/1015330 Reviewed-by: Robbie Iannucci <iannucci@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/b5572d91544ea860c6c9a9ae808acf9728c39459/appengine/swarming/server/task_to_run_test.py
,
Apr 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/1fc37f909ef6cb6a87db77bf71b45fe6d78a6e01 commit 1fc37f909ef6cb6a87db77bf71b45fe6d78a6e01 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Fri Apr 20 00:26:35 2018 [client] Use the new API task_slices for task creation Task result will be done later. In the meantime they can be queried with the command query. Do not add support for multiple task slices yet, I still don't know how to encode this in a user friendly way. But switching to task_slices is a prerequisite anyway. Once this tool (and all other clients) are updated, we'll be able to remove the compatibility layer in the server. Bug: 781021 Change-Id: Ic2b5d3002caa2747e456669a721c3ec4207ce45f Reviewed-on: https://chromium-review.googlesource.com/1019901 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Vadim Shtayura <vadimsh@chromium.org> [modify] https://crrev.com/1fc37f909ef6cb6a87db77bf71b45fe6d78a6e01/client/swarming.py [modify] https://crrev.com/1fc37f909ef6cb6a87db77bf71b45fe6d78a6e01/client/tests/swarming_test.py
,
Apr 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/74f8bd0ed7bcef244a0494fd7c05a2b788b87e3f commit 74f8bd0ed7bcef244a0494fd7c05a2b788b87e3f Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Fri Apr 20 12:40:36 2018 [client] Fix regression in 1fc37f909ef6cb6a87db Mainly working around a stupid assert that tests that timeout is a float. I could have also replaced the assert but it's simpler to fix the call site for now. TBR=vadimsh@chromium.org Bug: 781021 Bug: skia:7827 Change-Id: Ic82b462c159d9e0929be3532e4c97a9d1b4b891b Reviewed-on: https://chromium-review.googlesource.com/1021830 Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/74f8bd0ed7bcef244a0494fd7c05a2b788b87e3f/client/swarming.py
,
Apr 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/4653a4dc87ff3de19cde53fc261754615b40ca6e commit 4653a4dc87ff3de19cde53fc261754615b40ca6e Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Sat Apr 21 00:47:29 2018 [client] Update default value to be of the right type. In optparse, if you use type='float', default=0, the default value type will be int. That's cute. Bug: 781021 Change-Id: Iab9a7b16486c311bdf5cf2241c4963c1447aa2b2 Reviewed-on: https://chromium-review.googlesource.com/1022375 Reviewed-by: Vadim Shtayura <vadimsh@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/4653a4dc87ff3de19cde53fc261754615b40ca6e/client/swarming.py
,
Apr 23 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/d9ce22df5318640a761303259c47854d7932d986 commit d9ce22df5318640a761303259c47854d7932d986 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Mon Apr 23 14:48:41 2018 [client] Update swarming.py run with new task_slices This was broken by me in 1fc37f909ef6cb6a87db. I didn't realize because run is rarely used. - Fix swarming.py run to work. - Add a unit test to confirm that it works now (and didn't before) - Rename a few poorly named unit tests. Bug: 781021 Change-Id: Ic43f5b4021465209a5328086db203cb6cb523868 Reviewed-on: https://chromium-review.googlesource.com/1023712 Reviewed-by: Kevin Lubick <kjlubick@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/d9ce22df5318640a761303259c47854d7932d986/client/swarming.py [modify] https://crrev.com/d9ce22df5318640a761303259c47854d7932d986/client/tests/swarming_test.py
,
Apr 25 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/bf92440161b8ce8e3c99bc69a10f96a13aab9256 commit bf92440161b8ce8e3c99bc69a10f96a13aab9256 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 25 20:54:49 2018 [client] Add swarming.py post This command permits arbitrary post RPC. Think of a generic RPC CLI tool, except that it prepends the swarming API base resource. While not generally meant be used, it'll be useful to test TaskSlice support while iterating on the way this should be presented to the user in the CLI tool. Bug: 781021 Change-Id: Ie405178e9b4849d69751d67e24eb17f486075edc Reviewed-on: https://chromium-review.googlesource.com/1028461 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/bf92440161b8ce8e3c99bc69a10f96a13aab9256/client/swarming.py [modify] https://crrev.com/bf92440161b8ce8e3c99bc69a10f96a13aab9256/client/tests/swarming_test.py
,
Apr 25 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/febdf3b3d8818dfded4dae6c9aecab43de5ceadb commit febdf3b3d8818dfded4dae6c9aecab43de5ceadb Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 25 21:19:59 2018 [swarming] fix TaskSlice checks and tags creation - Due to a bug in ndb.Model where LocalStructuredProperty's _pre_put_hook is not called, the function needs to be called manually. - automatic tags depends on expiration_ts as TaskRequest.task_slice() is called, which was initialized after. Oops. Unit test refactoring come in the very next CL that confirms all that. I confirmed manually that the next CL wouldn't pass without these two fixes. Bug: 781021 Change-Id: Ie6ae974d539c372e0fa4c719f8d907b010a6f100 Reviewed-on: https://chromium-review.googlesource.com/1028632 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/febdf3b3d8818dfded4dae6c9aecab43de5ceadb/appengine/swarming/server/task_request.py
,
Apr 25 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/ae88212bc49ee866a650439f216742dd3d84910d commit ae88212bc49ee866a650439f216742dd3d84910d Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed Apr 25 22:24:59 2018 [swarming] Improve tests No functional changes After years of DB schema refactoring, the unit tests have started to become really tricky. This CL changes the unit tests code to be more 'typed', as much as python can be typed. This is done by changing many dict to the expected ndb.Model type instead. This makes reading the code saner. Change the unit tests to use more coherent utility function names across unit tests. This is in preparation of removing support for the old style TaskRequest.properties, as for TaskSlice support, TaskRequest.task_slices is going to be the only member used going on from (next CL). Intentionally do this unit test refactoring without touching actual code. Bug: 781021 Change-Id: I0da1bcf8d318909516121f7a4b612fa92e1ca786 Reviewed-on: https://chromium-review.googlesource.com/1028185 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/ae88212bc49ee866a650439f216742dd3d84910d/appengine/swarming/server/service_accounts_test.py [modify] https://crrev.com/ae88212bc49ee866a650439f216742dd3d84910d/appengine/swarming/server/task_queues_test.py [modify] https://crrev.com/ae88212bc49ee866a650439f216742dd3d84910d/appengine/swarming/server/task_request_test.py [modify] https://crrev.com/ae88212bc49ee866a650439f216742dd3d84910d/appengine/swarming/server/task_result_test.py [modify] https://crrev.com/ae88212bc49ee866a650439f216742dd3d84910d/appengine/swarming/server/task_scheduler_test.py [modify] https://crrev.com/ae88212bc49ee866a650439f216742dd3d84910d/appengine/swarming/server/task_to_run_test.py
,
Apr 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/effb9f51728b7b809003dc6f69ce35f60a8c5681 commit effb9f51728b7b809003dc6f69ce35f60a8c5681 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Thu Apr 26 15:38:20 2018 [swarming] Remove support to store old-style TaskRequest Now that all tests have been updated to TaskRequest.task_slices, the only remaining use was for terminate pseudo-task. Update this logic to use the new task_slices mechanism. All TaskRequest must now use task_slices. Having one way to do things makes everything much simpler. Rename TaskRequest.properties to TaskRequest.properties_old to signal that should not be used anymore. Bug: 781021 Change-Id: I5f53da00073ebd787c847cf0d819ff61c117b616 Reviewed-on: https://chromium-review.googlesource.com/1027807 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/effb9f51728b7b809003dc6f69ce35f60a8c5681/appengine/swarming/server/task_request.py [modify] https://crrev.com/effb9f51728b7b809003dc6f69ce35f60a8c5681/appengine/swarming/server/task_request_test.py
,
Apr 26 2018
,
Apr 26 2018
,
Apr 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/0f2e8630c078cb65ea310d97b0f6d938006ad8f0 commit 0f2e8630c078cb65ea310d97b0f6d938006ad8f0 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Fri Apr 27 19:31:46 2018 [swarming] add more test expectations on bot's valid task queues - Make assert_bot_async() return the number of updated entities - Add expectation in unit tests with the number of BotTaskDimensions entities updated. No functional change, as the return value was previously ignored and it is currently looked only in unit tests. This is done so when task slices are added, the expectations match. Bug: 781021 Change-Id: Ibf9387deb336b6ee6b6e6690826f866e061328b4 Reviewed-on: https://chromium-review.googlesource.com/1032796 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/0f2e8630c078cb65ea310d97b0f6d938006ad8f0/appengine/swarming/server/task_queues.py [modify] https://crrev.com/0f2e8630c078cb65ea310d97b0f6d938006ad8f0/appengine/swarming/server/task_queues_test.py [modify] https://crrev.com/0f2e8630c078cb65ea310d97b0f6d938006ad8f0/appengine/swarming/server/task_scheduler_test.py
,
Apr 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/883704490ac581b18fd9f259f8fd1c57a428dbda commit 883704490ac581b18fd9f259f8fd1c57a428dbda Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Fri Apr 27 19:37:36 2018 [swarming] Improve task_to_run - Inline task_to_run.gen_queue_number() inside new_task_to_run(), it doesn't need to be exported anymore. It used to be called in task_scheduler.py but this was when TaskToRun entities were reused. This was fixed as part of https://crbug.com/817831 . There is now exactly one TaskToRun per try (either try_number or task_slice_index) so no queue number mutation is possible now. \o/ - Rename task_key to to_run_key to be consistent with the rest of the source code. - Add utility functions task_to_run_key_slice_index() and task_to_run_key_try_number(), they will be needed soon to figure out the actual task_slice used for the TaskToRun, when only having access to its ndb.Key. - This is also in preparation for a bug fix with memcache-based negative cache that will be singled out as a separate CL. - Add unit tests that are ready for task slice fallback. Bug: 781021 Change-Id: Ic2c436b9de7de7d7f386408f2bd3585da2f02e54 Reviewed-on: https://chromium-review.googlesource.com/1032636 Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> [modify] https://crrev.com/883704490ac581b18fd9f259f8fd1c57a428dbda/appengine/swarming/server/task_to_run.py [modify] https://crrev.com/883704490ac581b18fd9f259f8fd1c57a428dbda/appengine/swarming/server/task_to_run_test.py
,
Apr 30 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/7c74da00be7dc2a080e8c89df08b4af4d648d140 commit 7c74da00be7dc2a080e8c89df08b4af4d648d140 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Mon Apr 30 15:13:37 2018 [swarming] Fix memcache-based TaskToRun negative cache It used to be only the TaskRequest task id. This meant that the try number 1 and 2 shared the same negative cache, which was fine. Adding TaskSlice fallback in the mix, this completely breaks down, as TaskToRun entities are not reused since https://crbug.com/817831 was fixed. When this CL is made live, this will cause a small hickup as the negative cache is being repopulated, as the keys changed. This won't be a big deal in practice, we cleared the whole memcache frequently and the service survived just fine. Add a unit test that test interference between try_number 1 and 2. It will be expanded once task fallback is added, i.e. task_slice_index > 0 is supported. Bug: 781021 Change-Id: I944ee067ed96ca3410e45a9c20cffadc68047a23 Reviewed-on: https://chromium-review.googlesource.com/1033292 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/7c74da00be7dc2a080e8c89df08b4af4d648d140/appengine/swarming/server/task_to_run.py [modify] https://crrev.com/7c74da00be7dc2a080e8c89df08b4af4d648d140/appengine/swarming/server/task_to_run_test.py
,
Apr 30 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/fec529e9d03b95a6f58fe8c0597f238cd6994c40 commit fec529e9d03b95a6f58fe8c0597f238cd6994c40 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Mon Apr 30 20:07:12 2018 [swarming] correctly initialize TaskRunResult.current_task_slice It went undetected because, well, current_task_slice != 0 was not supported. :) The current unit test only test for current_task_slice == 0 because it's the only supported value for now. Bug: 781021 Change-Id: I723cd7748433883c2490bc5641312ab1726d5918 Reviewed-on: https://chromium-review.googlesource.com/1033423 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/fec529e9d03b95a6f58fe8c0597f238cd6994c40/appengine/swarming/server/task_result.py [modify] https://crrev.com/fec529e9d03b95a6f58fe8c0597f238cd6994c40/appengine/swarming/server/task_result_test.py [modify] https://crrev.com/fec529e9d03b95a6f58fe8c0597f238cd6994c40/appengine/swarming/server/task_scheduler.py
,
May 1 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/cdcee247aa8b224788b89b287b3ff911a1e6050c commit cdcee247aa8b224788b89b287b3ff911a1e6050c Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Tue May 01 15:50:37 2018 [swarming] dedupe non-zero TaskSlice - When looking for task deduplication, also look for task slice fallback. This should be fairly fast as we don't expect task requests to have many idempotent task fallbacks. - Improve task_scheduler_test to be more specific when looking at TaskToRun entities than "the one that happens to be fetched by the query". - Remove spurious log statements. - Change the cron job to reenqueue task fallback to return the task ids affected, this will be leveraged in the unit test that confirms this to work. This change is currently a no-op as only a single TaskSlice is allowed. A unit test is already written for this specific use case but I can't commit it right away. :/ I still prefer to fix these bugs before, so once task fallback is enabled, the code is actually known to be good. Bug: 781021 Change-Id: Ie5943924a782994c717c4cb995ee2a9e07748b72 Reviewed-on: https://chromium-review.googlesource.com/1033443 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/cdcee247aa8b224788b89b287b3ff911a1e6050c/appengine/swarming/server/task_scheduler.py [modify] https://crrev.com/cdcee247aa8b224788b89b287b3ff911a1e6050c/appengine/swarming/server/task_scheduler_test.py
,
May 2 2018
,
May 2 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/f6e46dd8b9cb9130cc37f627334e04549c8a9530 commit f6e46dd8b9cb9130cc37f627334e04549c8a9530 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Wed May 02 21:07:20 2018 [swarming] last changes before enabling task fallback Most of the changes are test changes, to reduce the delta for the commit enabling the task fallback support. The exception is to add the check for duplicate task right away. Bug: 781021 Change-Id: I5ac377ea4058a96ea32e43c6f2c5d7ebe958f7cb Reviewed-on: https://chromium-review.googlesource.com/1039945 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/f6e46dd8b9cb9130cc37f627334e04549c8a9530/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/f6e46dd8b9cb9130cc37f627334e04549c8a9530/appengine/swarming/local_smoke_test.py [modify] https://crrev.com/f6e46dd8b9cb9130cc37f627334e04549c8a9530/appengine/swarming/server/task_request.py [modify] https://crrev.com/f6e46dd8b9cb9130cc37f627334e04549c8a9530/appengine/swarming/server/task_request_test.py
,
May 2 2018
,
May 3 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/d682c59d424c0bb48893e720450926d3a062feaf commit d682c59d424c0bb48893e720450926d3a062feaf Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Thu May 03 12:27:09 2018 [swarming] Enable TaskSlice fallback support - Remove the check that limited the number of TaskSlice attached to a TaskRequest to 1. - Update tests accordingly. Bug: 781021 Change-Id: I8bfe7a9fdbec756ec2b5d7a3ea2148e9838f7a19 Reviewed-on: https://chromium-review.googlesource.com/1038904 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/d682c59d424c0bb48893e720450926d3a062feaf/appengine/swarming/handlers_endpoints_test.py [modify] https://crrev.com/d682c59d424c0bb48893e720450926d3a062feaf/appengine/swarming/local_smoke_test.py [modify] https://crrev.com/d682c59d424c0bb48893e720450926d3a062feaf/appengine/swarming/server/task_request.py [modify] https://crrev.com/d682c59d424c0bb48893e720450926d3a062feaf/appengine/swarming/server/task_request_test.py [modify] https://crrev.com/d682c59d424c0bb48893e720450926d3a062feaf/appengine/swarming/server/task_scheduler_test.py [modify] https://crrev.com/d682c59d424c0bb48893e720450926d3a062feaf/appengine/swarming/server/task_to_run_test.py
,
May 3 2018
,
May 3 2018
,
May 3 2018
This is committed and released to prod. Issue 839173 , issue 839464 and issue 839467 are the follow up items.
,
May 24 2018
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/luci-py.git/+/d4649a781cb62cd7b06801f57d7a916cfd889225 commit d4649a781cb62cd7b06801f57d7a916cfd889225 Author: Marc-Antoine Ruel <maruel@chromium.org> Date: Thu May 24 00:29:31 2018 [swarming] fix expiration on the fly The 'task slice expiration on the fly' used the whole TaskRequest expiration instead of the individual TaskSlice expiration. Oops. Add test case to confirm all code paths work. Extract the local transaction function from _expire_task() to make the whole thing more readable, otherwise it's hard to know which variable is aliased or not. Fix incorrect docstring and rename variables to be clearer. Bug: 781021 Change-Id: Ia2d318c6414476c1abb0d368a9a8838a4a031ac1 Reviewed-on: https://chromium-review.googlesource.com/1069612 Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> [modify] https://crrev.com/d4649a781cb62cd7b06801f57d7a916cfd889225/appengine/swarming/server/task_scheduler.py [modify] https://crrev.com/d4649a781cb62cd7b06801f57d7a916cfd889225/appengine/swarming/server/task_scheduler_test.py
,
May 29 2018
|
|||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||
Comment 1 by mar...@chromium.org
, Jan 12 2018Status: Assigned (was: Available)