
Issue 649454


Issue metadata

Status: Fixed
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




pRPC client not retrying on transient AppEngine-level errors.

Project Member Reported by d...@chromium.org, Sep 22 2016

Issue description

The pRPC client retries only when the gRPC code header in the response indicates a transient failure. Unfortunately, AppEngine can return a transient failure directly, without consulting the app (and therefore without reaching the pRPC server), so the response carries no gRPC code header at all.

This log was from an RPC request that occurred during service deployment: https://luci-milo.appspot.com/swarming/task/316c389e5e305010/steps/recipe_bootstrap/0/stdout

The pRPC client should be resilient against this sort of failure.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Sep 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/25b6d8dd95fb75b45fb7454c76cc0a4d7fa3e0fc

commit 25b6d8dd95fb75b45fb7454c76cc0a4d7fa3e0fc
Author: dnj <dnj@chromium.org>
Date: Mon Sep 26 16:45:54 2016

pRPC: Retry on certain transient HTTP failures.

AppEngine can directly return certain retriable HTTP failures before
hitting the actual pRPC service. Modify the pRPC client to retry on
these failures by default.

BUG= chromium:649454 
TEST=None
R=iannucci@chromium.org, nodir@chromium.org

Review-Url: https://codereview.chromium.org/2360403002

[modify] https://crrev.com/25b6d8dd95fb75b45fb7454c76cc0a4d7fa3e0fc/grpc/prpc/client.go
[modify] https://crrev.com/25b6d8dd95fb75b45fb7454c76cc0a4d7fa3e0fc/grpc/prpc/client_test.go

Comment 2 by d...@chromium.org, Sep 26 2016

Status: Fixed (was: Untriaged)
