We have pain in dealing with pubsub callback for async pipeline, especially the code is too complex and copied around. Instead, we should abstract the shared logic, and make it simpler and easier to use and debug.
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09 commit 8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09 Author: Shuotao Gao <stgao@chromium.org> Date: Wed Feb 14 00:20:57 2018 [Findit] make buildbucket client not know how to create pubsub callback info. 1. Add a PubSubCallback class to represent the callback info. 2. Make caller of the buildbucket client create the PubSubCallback instance and pass it over as part of the TryJob instance. The original code had a bug of using the same notification id when multiple try jobs are scheduled at the same time, but this was not revealed because of no such usage yet. Bug: 811632 Change-Id: I0f85369870f26f44b66a9df1f0d29d3a4fde0a56 Reviewed-on: https://chromium-review.googlesource.com/915723 Commit-Queue: Shuotao Gao <stgao@chromium.org> Reviewed-by: Roberto Carrillo <robertocn@chromium.org> [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/handlers/test/periodic_bot_update_test.py [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/waterfall/flake/test/schedule_flake_try_job_pipeline_test.py [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/services/test/try_job_test.py [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/common/waterfall/buildbucket_client.py [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/services/try_job.py [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/common/waterfall/test/pubsub_callback_test.py [modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/common/waterfall/pubsub_callback.py
The second CL is https://chromium-review.googlesource.com/c/infra/infra/+/921161
The 3rd CL to delete some legacy code to rerun tryjobs is https://chromium-review.googlesource.com/c/infra/infra/+/926710
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/c0467347943dcaea49ff1c5371227b4b663491e6 commit c0467347943dcaea49ff1c5371227b4b663491e6 Author: Shuotao Gao <stgao@chromium.org> Date: Fri Feb 23 00:35:21 2018 [Findit] Redesign pubsub-based callback for async pipeline (2) The new flow works as this: 1. When scheduling a Buildbucket build, add the pipeline id in the user data. 2. Add a generic handler to retrieve the pipeline id, restore the Pipeline instance, and then schedule a task to run in the same task queue and target as the pipeline itself. In AsynchronousPipeline, make a few change: 1. Make it possible to save and get runtime parameters within the pipeline. This is mostly to associate the try job id with the pipeline id to avoid scheduling the same build twice. A follow up change could be to add client operation id supported by Buildbucket for another layer of protection. 2. Add the support to schedule a timeout callback and an explicit function to handle that case. Added handler to receive try-job notification, because this is needed to test https://chromium-review.googlesource.com/c/infra/infra/+/933293 before deployment with that change is made default version. 1. Add a try job specific handler to extract build json data. 2. Protect the handler with login:admin. Bug: 811632 Change-Id: I0b941eef4af9d9b4e37bd4c090ec0bb983cbcbca Reviewed-on: https://chromium-review.googlesource.com/917331 Commit-Queue: Shuotao Gao <stgao@chromium.org> Reviewed-by: Chan Li <chanli@chromium.org> [modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/test/pipelines_test.py [add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/handlers/test/pubsub_pipeline_callback_test.py [add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/handlers/try_job_pubsub_pipeline_callback.py [add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/handlers/pubsub_pipeline_callback.py [modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/app.yaml [modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/main.py [modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/pipelines.py [add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/handlers/test/try_job_pubsub_pipeline_callback_test.py
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/615d135d8a50ab1d47c32685b60df90b88a8c3dd commit 615d135d8a50ab1d47c32685b60df90b88a8c3dd Author: Shuotao Gao <stgao@chromium.org> Date: Mon Feb 26 21:51:48 2018 [Findit] Fix permission issue for PubSub push endpoint. For PubSub push, users.is_current_user_admin returns false, while it works for login:admin protection in app.yaml Bug: 811632 Change-Id: I0ce482fc9395494fba579c33a3ce2a5070db2ff9 Reviewed-on: https://chromium-review.googlesource.com/938154 Reviewed-by: Roberto Carrillo <robertocn@chromium.org> Reviewed-by: Chan Li <chanli@chromium.org> Commit-Queue: Shuotao Gao <stgao@chromium.org> [modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/gae_libs/handlers/test/pubsub_pipeline_callback_test.py [modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/gae_libs/handlers/pubsub_pipeline_callback.py [modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/makefile [modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/gae_libs/handlers/base_handler.py
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/a034e53c3bc683ecff9ee554aebb424370a76824 commit a034e53c3bc683ecff9ee554aebb424370a76824 Author: Shuotao Gao <stgao@chromium.org> Date: Mon Feb 26 23:15:29 2018 [Findit] Redesign pubsub-based callback for async pipeline (3) Convert the compile job pipeline to the new PubSub callback. Bug: 811632 Change-Id: Ic50a36101459d5f1ed6a85d55d95f7df6985c226 Reviewed-on: https://chromium-review.googlesource.com/933293 Commit-Queue: Shuotao Gao <stgao@chromium.org> Reviewed-by: Chan Li <chanli@chromium.org> [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/services/test/try_job_test.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/services/compile_failure/compile_try_job.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/common/waterfall/buildbucket_client.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/common/exceptions.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/pipelines/compile_failure/test/start_compile_try_job_pipeline_test.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/pipelines/compile_failure/run_compile_try_job_pipeline.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/pipelines/compile_failure/test/run_compile_try_job_pipeline_test.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/waterfall/test/monitor_try_job_pipeline_test.py [modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/services/try_job.py
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/85314011f396ecfdcd9732310de33c1d05c303a2 commit 85314011f396ecfdcd9732310de33c1d05c303a2 Author: Shuotao Gao <stgao@chromium.org> Date: Tue Feb 27 00:55:40 2018 In unittests, execute tasks ordered by ETA, queue name and task name. Change the way tasks are executed in unittests: 1. A task could spawn more tasks and also delete existing tasks, so we should execute one task at a time. 2. A task could be scheduled with a countdown delay, thus task execution should be ordered by their ETAs. To ensure consistency in unittests, if tasks have the same ETA, use queue-name and task-name for ordering. Bug: 811632 Change-Id: Ic87bf8129dd983ba4dc1ee86d0971b9c655a04bc Reviewed-on: https://chromium-review.googlesource.com/929875 Reviewed-by: Robbie Iannucci <iannucci@chromium.org> Reviewed-by: Roberto Carrillo <robertocn@chromium.org> Reviewed-by: Chan Li <chanli@chromium.org> Commit-Queue: Shuotao Gao <stgao@chromium.org> [modify] https://crrev.com/85314011f396ecfdcd9732310de33c1d05c303a2/appengine/findit/gae_libs/test/pipelines_test.py [modify] https://crrev.com/85314011f396ecfdcd9732310de33c1d05c303a2/appengine/findit/gae_libs/pipelines.py [modify] https://crrev.com/85314011f396ecfdcd9732310de33c1d05c303a2/appengine_module/testing_utils/testing.py
Comment 1 by st...@chromium.org
, Feb 13 2018