New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 811632 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: ----



Sign in to add a comment

Redesign pubsub callback for async pipeline

Project Member Reported by st...@chromium.org, Feb 13 2018

Issue description

We have pain in dealing with pubsub callback for async pipeline, especially the code is too complex and copied around.
Instead, we should abstract the shared logic, and make it simpler and easier to use and debug.
 

Comment 1 by st...@chromium.org, Feb 13 2018

Labels: -Pri-3 Pri-1
Project Member

Comment 2 by bugdroid1@chromium.org, Feb 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09

commit 8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09
Author: Shuotao Gao <stgao@chromium.org>
Date: Wed Feb 14 00:20:57 2018

[Findit] make buildbucket client not know how to create pubsub callback info.

1. Add a PubSubCallback class to represent the callback info.
2. Make caller of the buildbucket client create the PubSubCallback instance and
   pass it over as part of the TryJob instance.

The original code had a bug of using the same notification id when multiple try
jobs are scheduled at the same time, but this was not revealed because of no
such usage yet.

Bug:  811632 
Change-Id: I0f85369870f26f44b66a9df1f0d29d3a4fde0a56
Reviewed-on: https://chromium-review.googlesource.com/915723
Commit-Queue: Shuotao Gao <stgao@chromium.org>
Reviewed-by: Roberto Carrillo <robertocn@chromium.org>

[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/handlers/test/periodic_bot_update_test.py
[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/waterfall/flake/test/schedule_flake_try_job_pipeline_test.py
[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/services/test/try_job_test.py
[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/common/waterfall/buildbucket_client.py
[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/services/try_job.py
[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/common/waterfall/test/pubsub_callback_test.py
[modify] https://crrev.com/8b01bec0a5a7eb0620fe38b8d167db82fdb1eb09/appengine/findit/common/waterfall/pubsub_callback.py

Comment 4 by st...@chromium.org, Feb 21 2018

The 3rd CL to delete some legacy code to rerun tryjobs is https://chromium-review.googlesource.com/c/infra/infra/+/926710
Project Member

Comment 5 by bugdroid1@chromium.org, Feb 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/c0467347943dcaea49ff1c5371227b4b663491e6

commit c0467347943dcaea49ff1c5371227b4b663491e6
Author: Shuotao Gao <stgao@chromium.org>
Date: Fri Feb 23 00:35:21 2018

[Findit] Redesign pubsub-based callback for async pipeline (2)

The new flow works as this:
1. When scheduling a Buildbucket build, add the pipeline id in the user data.
2. Add a generic handler to retrieve the pipeline id, restore the Pipeline
   instance, and then schedule a task to run in the same task queue and target
   as the pipeline itself.

In AsynchronousPipeline, make a few change:
1. Make it possible to save and get runtime parameters within the pipeline. This
   is mostly to associate the try job id with the pipeline id to avoid scheduling the
   same build twice. A follow up change could be to add client operation id supported
   by Buildbucket for another layer of protection.
2. Add the support to schedule a timeout callback and an explicit function to
   handle that case.

Added handler to receive try-job notification, because this is needed to test https://chromium-review.googlesource.com/c/infra/infra/+/933293 before deployment with that change is made default version.
1. Add a try job specific handler to extract build json data.
2. Protect the handler with login:admin.

Bug:  811632 
Change-Id: I0b941eef4af9d9b4e37bd4c090ec0bb983cbcbca
Reviewed-on: https://chromium-review.googlesource.com/917331
Commit-Queue: Shuotao Gao <stgao@chromium.org>
Reviewed-by: Chan Li <chanli@chromium.org>

[modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/test/pipelines_test.py
[add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/handlers/test/pubsub_pipeline_callback_test.py
[add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/handlers/try_job_pubsub_pipeline_callback.py
[add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/handlers/pubsub_pipeline_callback.py
[modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/app.yaml
[modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/main.py
[modify] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/gae_libs/pipelines.py
[add] https://crrev.com/c0467347943dcaea49ff1c5371227b4b663491e6/appengine/findit/handlers/test/try_job_pubsub_pipeline_callback_test.py

Project Member

Comment 6 by bugdroid1@chromium.org, Feb 26 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/615d135d8a50ab1d47c32685b60df90b88a8c3dd

commit 615d135d8a50ab1d47c32685b60df90b88a8c3dd
Author: Shuotao Gao <stgao@chromium.org>
Date: Mon Feb 26 21:51:48 2018

[Findit] Fix permission issue for PubSub push endpoint.

For PubSub push, users.is_current_user_admin returns false, while it works
for login:admin protection in app.yaml

Bug:  811632 
Change-Id: I0ce482fc9395494fba579c33a3ce2a5070db2ff9
Reviewed-on: https://chromium-review.googlesource.com/938154
Reviewed-by: Roberto Carrillo <robertocn@chromium.org>
Reviewed-by: Chan Li <chanli@chromium.org>
Commit-Queue: Shuotao Gao <stgao@chromium.org>

[modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/gae_libs/handlers/test/pubsub_pipeline_callback_test.py
[modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/gae_libs/handlers/pubsub_pipeline_callback.py
[modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/makefile
[modify] https://crrev.com/615d135d8a50ab1d47c32685b60df90b88a8c3dd/appengine/findit/gae_libs/handlers/base_handler.py

Project Member

Comment 7 by bugdroid1@chromium.org, Feb 26 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/a034e53c3bc683ecff9ee554aebb424370a76824

commit a034e53c3bc683ecff9ee554aebb424370a76824
Author: Shuotao Gao <stgao@chromium.org>
Date: Mon Feb 26 23:15:29 2018

[Findit] Redesign pubsub-based callback for async pipeline (3)

Convert the compile job pipeline to the new PubSub callback.

Bug:  811632 
Change-Id: Ic50a36101459d5f1ed6a85d55d95f7df6985c226
Reviewed-on: https://chromium-review.googlesource.com/933293
Commit-Queue: Shuotao Gao <stgao@chromium.org>
Reviewed-by: Chan Li <chanli@chromium.org>

[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/services/test/try_job_test.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/services/compile_failure/compile_try_job.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/common/waterfall/buildbucket_client.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/common/exceptions.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/pipelines/compile_failure/test/start_compile_try_job_pipeline_test.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/pipelines/compile_failure/run_compile_try_job_pipeline.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/pipelines/compile_failure/test/run_compile_try_job_pipeline_test.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/waterfall/test/monitor_try_job_pipeline_test.py
[modify] https://crrev.com/a034e53c3bc683ecff9ee554aebb424370a76824/appengine/findit/services/try_job.py

Project Member

Comment 8 by bugdroid1@chromium.org, Feb 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/85314011f396ecfdcd9732310de33c1d05c303a2

commit 85314011f396ecfdcd9732310de33c1d05c303a2
Author: Shuotao Gao <stgao@chromium.org>
Date: Tue Feb 27 00:55:40 2018

In unittests, execute tasks ordered by ETA, queue name and task name.

Change the way tasks are executed in unittests:
1. A task could spawn more tasks and also delete existing tasks, so we should execute one task at a time.
2. A task could be scheduled with a countdown delay, thus task execution should be ordered by their ETAs.
   To ensure consistency in unittests, if tasks have the same ETA, use queue-name and task-name for ordering.

Bug:  811632 
Change-Id: Ic87bf8129dd983ba4dc1ee86d0971b9c655a04bc
Reviewed-on: https://chromium-review.googlesource.com/929875
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Roberto Carrillo <robertocn@chromium.org>
Reviewed-by: Chan Li <chanli@chromium.org>
Commit-Queue: Shuotao Gao <stgao@chromium.org>

[modify] https://crrev.com/85314011f396ecfdcd9732310de33c1d05c303a2/appengine/findit/gae_libs/test/pipelines_test.py
[modify] https://crrev.com/85314011f396ecfdcd9732310de33c1d05c303a2/appengine/findit/gae_libs/pipelines.py
[modify] https://crrev.com/85314011f396ecfdcd9732310de33c1d05c303a2/appengine_module/testing_utils/testing.py

Comment 9 by st...@chromium.org, Feb 27 2018

Status: Verified (was: Assigned)

Sign in to add a comment