Issue 751925

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Replace Tumble with task queues.

Project Member Reported by d...@chromium.org, Aug 3 2017

Issue description

Currently, LogDog uses the Tumble journaling state machine to manage its
archival. Tumble is generally overkill for this one-state task, but was
chosen because it seemed, at the time, likely that it would be used
everywhere in LUCI.

Almost two years later, LogDog is the only major production user of
Tumble. Since it barely uses Tumble's capabilities, and since Tumble
itself is rather opaque in its operations, this trade-off is not
worthwhile. Instead, we replace Tumble with task queues.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Aug 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/6553860f73d8b8e76215fa065c38021fbffe07de

commit 6553860f73d8b8e76215fa065c38021fbffe07de
Author: dnj <dnj@chromium.org>
Date: Thu Aug 03 17:48:48 2017

[tq] Move to top-level package.

Move the "tq" task dispatcher to a top-level package. This is useful
outside of scheduler, and will be used in LogDog for task dispatching.

BUG= chromium:751925 
TEST=None
R=vadimsh@chromium.org

Review-Url: https://codereview.chromium.org/2988413002

[rename] https://crrev.com/6553860f73d8b8e76215fa065c38021fbffe07de/appengine/tq/tq.go
[rename] https://crrev.com/6553860f73d8b8e76215fa065c38021fbffe07de/appengine/tq/tq_test.go
[modify] https://crrev.com/6553860f73d8b8e76215fa065c38021fbffe07de/scheduler/appengine/engine/cron/demo/main.go

Project Member

Comment 2 by bugdroid1@chromium.org, Aug 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/da9d004ddcf295086e66df450f6779839259be9c

commit da9d004ddcf295086e66df450f6779839259be9c
Author: dnj <dnj@chromium.org>
Date: Thu Aug 03 21:03:14 2017

[tq] Enable task deletion.

Enable named task deletion. This generalizes the task queue batching
function.

Add the concept of a name suffix to the task. This allows the user to
supply information without discarding sharding utility.

BUG= chromium:751925 
TEST=unit

Review-Url: https://codereview.chromium.org/2986373002

[modify] https://crrev.com/da9d004ddcf295086e66df450f6779839259be9c/appengine/tq/tq.go
[modify] https://crrev.com/da9d004ddcf295086e66df450f6779839259be9c/appengine/tq/tq_test.go
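
The name suffix described in the commit above can be illustrated with a small sketch. It assumes that a task's name is derived from a hash of a deduplication key, so that names spread evenly across queue shards, and that the suffix is appended after the hash. The function taskName and its parameters are hypothetical and are not taken from the "tq" package.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// taskName builds a task name from a deduplication key (hypothetical helper).
// The hex-encoded hash comes first so that names stay uniformly distributed,
// and the optional suffix carries user-supplied, human-readable information.
func taskName(dedupKey, suffix string) string {
	sum := sha256.Sum256([]byte(dedupKey))
	name := hex.EncodeToString(sum[:])
	if suffix != "" {
		name += "-" + suffix
	}
	return name
}

func main() {
	// Same key, different suffixes: the hashed part is identical.
	fmt.Println(taskName("project/stream-a", "expired"))
	fmt.Println(taskName("project/stream-a", "archive"))
}
```

Keeping the hash at the front means the suffix can be as descriptive as needed without clustering names under a common prefix.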

Project Member

Comment 3 by bugdroid1@chromium.org, Aug 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/18ae6bdbc6652e10610d55e4528d4bb3c624f813

commit 18ae6bdbc6652e10610d55e4528d4bb3c624f813
Author: dnj <dnj@chromium.org>
Date: Thu Aug 03 21:14:18 2017

[logdog] Replace Tumble with push queues.

Replace Tumble with push queues for archival.

Currently, LogDog uses the Tumble journaling state machine to manage its
archival. Tumble is generally overkill for this one-state task, but was
chosen because it seemed, at the time, likely that it would be used
everywhere in LUCI.

Almost two years later, LogDog is the only major production user of
Tumble. Since it barely uses Tumble's capabilities, and since Tumble
itself is rather opaque in its operations, this trade-off is not
worthwhile. Instead, we replace Tumble with task queues.

When a log stream is registered, an "expired" task will be enqueued to
handle it once the stream expires (if it never gets terminated). When
the stream is terminated, the expiration task is deleted, replaced with
a shorter-term archival task.

We leave Tumble and its mutation handling in place because, in
production, there is still a Tumble backlog to work through. This
should be emptied within a few days, and we can then finish the removal.

BUG= chromium:751925 
TEST=unit

Review-Url: https://codereview.chromium.org/2989333002

[add] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/common/gcloud/pubsub/publisher.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/common/gcloud/pubsub/topic.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/api/endpoints/coordinator/services/v1/pb.discovery.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/api/endpoints/coordinator/services/v1/service.pb.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/api/endpoints/coordinator/services/v1/tasks.pb.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/api/endpoints/coordinator/services/v1/tasks.proto
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/cmd/coordinator/backend/main.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/cmd/coordinator/backend/module-backend.yaml
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/cmd/coordinator/services/main.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/cmd/coordinator/services/module-services.yaml
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/cmd/coordinator/vmuser/app.yaml
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/cmd/coordinator/vmuser/queue.yaml
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/archivalPublisher.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/coordinatorTest/context.go
[add] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/coordinatorTest/taskqueue.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/endpoints/services/registerStream.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/endpoints/services/registerStream_test.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/endpoints/services/terminateStream.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/endpoints/services/terminateStream_test.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/service.go
[add] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/tasks/archival.go
[add] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/appengine/coordinator/tasks/routes.go
[modify] https://crrev.com/18ae6bdbc6652e10610d55e4528d4bb3c624f813/logdog/client/butler/output/logdog/output.go

Project Member

Comment 4 by bugdroid1@chromium.org, Aug 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/305871228d893108315ea3c7568248e99951223b

commit 305871228d893108315ea3c7568248e99951223b
Author: Dan Jacques <dnj@chromium.org>
Date: Sun Aug 27 18:27:31 2017

Roll luci-go and luci/gae.

infra/go/src/go.chromium.org/luci:
a955ad0c Rephrase documentation to emphasize building.
db9711ef Expand documentation for using RPC Explorer in helloworld
example.
0e592e42 Makes the mmutex lock file path based on $MMUTEX_LOCK_DIR env
variable
96c2d00f Update tests to account for surfaced ViewURL and changes in
https://chromium-review.googlesource.com/624616, which dedicate the
in-memory config to testing.
c76a2c49 [logdog] Remove redundant Offset value.
2392b614 [gaeauth] Marshal using JSON instead of "gob".
5eb49035 [luci_config] Add dedicated erroring interface.
a895992c Milo: Add stats for how old buildbot master entries are
bf35680f Add interface for fakelogs package.
8c9e5d18 Milo: Drop master json entries that are more than 1MB
compressed
01b93274 [logdog] Add Makefile rules for GKE services.
feb6cf92 Support CIPD packages in Swarming client
e393c202 [caching] Server requires process cache.
adf70fae [lru] Fix GetOrCreate, add Create.
d50a1614 isolate: move logging into doExpArchive and doArchive
c79515e0 [logdog] Move microservices to Alpine Linux.
7a434b55 Remove unneeded GetConfigSetURL.
8c346853 [logdog] Fix "rpcexplorer" dispatching.
f14c4c28 [logdog] Add update commands to Makefile help.
701e0c83 [logdog] Refactor coordinator/fetcher to be more user friendly.
70b8f1ab [milo] Update source_manifest.proto to milo.
59385f46 [gaemiddleware] Split standard environment.
fa52ad43 scheduler: Better document the syntax of cron job schedule.

infra/go/src/go.chromium.org/gae:
e15119d [proto-gae] Update default copyright header.
ae4570c [cloud] Implement more complete Flex support.

TBR=iannucci@chromium.org
BUG= chromium:751925 
TEST=None

Change-Id: I612114b83bd80240fdaee512ef7d41c709f89cbb
Reviewed-on: https://chromium-review.googlesource.com/636852
Commit-Queue: Daniel Jacques <dnj@chromium.org>
Reviewed-by: Daniel Jacques <dnj@chromium.org>

[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/appengine/sheriff-o-matic/frontend/main.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/libs/ephelper/middleware.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/appengine/sheriff-o-matic/backend/main.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/tricium/appengine/common/common.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/appengine/luci-migration/app/handlers.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/appengine/test-results/frontend/handlers.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/experimental/appengine/buildbucket-viewer/frontend/main.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/appengine/dashboard/frontend/dashboard.go
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/DEPS
[modify] https://crrev.com/305871228d893108315ea3c7568248e99951223b/go/src/infra/tricium/appengine/frontend/init.go

Project Member

Comment 5 by bugdroid1@chromium.org, Aug 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/1bbd1276b80b88de52e59beebb0537269637fd14

commit 1bbd1276b80b88de52e59beebb0537269637fd14
Author: Dan Jacques <dnj@chromium.org>
Date: Mon Aug 28 07:09:13 2017

[logdog] Set default gRPC receive/send size.

A gRPC roll sets the maximum gRPC received message size. Some BigTable
RPCs are exceeding this size. Explicitly set it to 16MB (default is 4MB)
so we don't bump up against the default size limit.

TBR=iannucci@chromium.org,hinoka@chromium.org
BUG= chromium:751925 
TEST=None

Change-Id: I6e5476fa58dbcfe2a0a041af1886148385cb9597
Reviewed-on: https://chromium-review.googlesource.com/637343
Reviewed-by: Daniel Jacques <dnj@chromium.org>
Commit-Queue: Daniel Jacques <dnj@chromium.org>

[modify] https://crrev.com/1bbd1276b80b88de52e59beebb0537269637fd14/logdog/appengine/coordinator/service.go
[modify] https://crrev.com/1bbd1276b80b88de52e59beebb0537269637fd14/logdog/common/storage/bigtable/storage.go
[modify] https://crrev.com/1bbd1276b80b88de52e59beebb0537269637fd14/logdog/server/service/service.go
[modify] https://crrev.com/1bbd1276b80b88de52e59beebb0537269637fd14/milo/logs/main.go
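
As a rough illustration of the change described above, the following sketch builds gRPC client dial options that raise the per-call message size limits to 16MB. How the commit actually wires these options into the BigTable and LogDog clients is not shown in this issue, so the helper dialOptions is an assumption; the grpc calls themselves are part of google.golang.org/grpc.

```go
package main

import (
	"fmt"

	"google.golang.org/grpc"
)

// maxMsgSize is 16MB, the limit named in the commit message. gRPC's default
// receive limit is 4MB.
const maxMsgSize = 16 * 1024 * 1024

// dialOptions returns client options that raise both the receive and send
// message size limits for every call made on the connection.
// This helper is hypothetical and only illustrates the relevant options.
func dialOptions() []grpc.DialOption {
	return []grpc.DialOption{
		grpc.WithDefaultCallOptions(
			grpc.MaxCallRecvMsgSize(maxMsgSize),
			grpc.MaxCallSendMsgSize(maxMsgSize),
		),
	}
}

func main() {
	opts := dialOptions()
	fmt.Printf("configured %d dial option(s), limit %d bytes\n", len(opts), maxMsgSize)
}
```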

Project Member

Comment 6 by bugdroid1@chromium.org, Aug 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/92702d20822ad9909642f81442359d558ffb6be8

commit 92702d20822ad9909642f81442359d558ffb6be8
Author: Dan Jacques <dnj@google.com>
Date: Tue Aug 29 00:21:47 2017

[logdog] Fix task queue deletion.

Task queue delete will always fail within a transaction. Delete the
task queue entry post-transaction on success.

BUG= chromium:751925 
TEST=unit

Change-Id: Id0a0c2da3ae3ff887665bc285536039384bee27d
Reviewed-on: https://chromium-review.googlesource.com/639790
Commit-Queue: Daniel Jacques <dnj@chromium.org>
Commit-Queue: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>

[modify] https://crrev.com/92702d20822ad9909642f81442359d558ffb6be8/logdog/appengine/coordinator/endpoints/services/terminateStream.go
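
The fix described above follows a common pattern: keep the non-transactional side effect out of the transaction body and run it only after the transaction succeeds. The sketch below uses a hypothetical in-memory store whose deleteTask refuses to run inside a transaction; it does not model rollback and is not the actual terminateStream.go code.

```go
package main

import (
	"errors"
	"fmt"
)

var errInTransaction = errors.New("task deletion is not allowed inside a transaction")

// store is a hypothetical stand-in for the datastore plus task queue.
type store struct {
	inTxn      bool
	terminated map[string]bool
	tasks      map[string]bool
}

func newStore() *store {
	return &store{terminated: map[string]bool{}, tasks: map[string]bool{}}
}

// runInTransaction runs fn with the transaction flag set (no rollback modeled).
func (s *store) runInTransaction(fn func() error) error {
	s.inTxn = true
	defer func() { s.inTxn = false }()
	return fn()
}

// deleteTask fails when called inside a transaction, mirroring the behavior
// described in the commit message.
func (s *store) deleteTask(name string) error {
	if s.inTxn {
		return errInTransaction
	}
	delete(s.tasks, name)
	return nil
}

// terminate updates stream state transactionally, then deletes the pending
// expiration task only after the transaction has succeeded.
func terminate(s *store, stream string) error {
	err := s.runInTransaction(func() error {
		s.terminated[stream] = true
		return nil
	})
	if err != nil {
		return err
	}
	return s.deleteTask(stream + "/expired")
}

func main() {
	s := newStore()
	s.tasks["stream-a/expired"] = true
	if err := terminate(s, "stream-a"); err != nil {
		fmt.Println("terminate failed:", err)
		return
	}
	fmt.Println("terminated:", s.terminated["stream-a"],
		"expired task pending:", s.tasks["stream-a/expired"])
}
```

One consequence of this ordering is that a failed post-transaction delete leaves a stale expiration task behind, so the task handler has to tolerate streams that are already terminated.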

Project Member

Comment 7 by bugdroid1@chromium.org, Sep 5 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/7e5647457576c203af078294efcde889a37cd87b

commit 7e5647457576c203af078294efcde889a37cd87b
Author: Dan Jacques <dnj@google.com>
Date: Tue Sep 05 22:22:03 2017

[logdog] Remove archival task queue.

Remove archival task queue. The system consumed too many resources to be
worth considering as a replacement for Tumble.

BUG= chromium:751925 
TEST=unit

Change-Id: If1752dc1c66d91555518f064a21abcb14586ba4f
Reviewed-on: https://chromium-review.googlesource.com/651556
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Daniel Jacques <dnj@chromium.org>

[modify] https://crrev.com/7e5647457576c203af078294efcde889a37cd87b/logdog/appengine/cmd/coordinator/backend/main.go
[modify] https://crrev.com/7e5647457576c203af078294efcde889a37cd87b/logdog/appengine/cmd/coordinator/vmuser/queue.yaml
[delete] https://crrev.com/d44b628d629b7e319c848110482e4aeb192b0435/logdog/appengine/coordinator/tasks/archival.go
[delete] https://crrev.com/d44b628d629b7e319c848110482e4aeb192b0435/logdog/appengine/coordinator/tasks/routes.go

Comment 8 by d...@chromium.org, Sep 5 2017

Status: Fixed (was: Started)
