Gmail chat (hangouts) fails to load
Issue description

Version: 55.0.2861.0 (Official Build) canary (64-bit)
OS: macOS 10.11.6

What steps will reproduce the problem?
(1) Load Gmail

What is the expected output?
Expect the Hangouts chat sub-panel to load.

What do you see instead?
It reports the "Something's not right" error. The console indicates:

initial.js:344 POST https://notifications.google.com/u/0/coc
(anonymous) @ initial.js:344
_.Nd @ initial.js:57
_.rj @ initial.js:343
_.tj @ initial.js:342
sv.get @ rs=AA2YrTueJhp97u7qaVXdTYR9mIPF_MIE1g:136
(anonymous) @ rs=AA2YrTueJhp97u7qaVXdTYR9mIPF_MIE1g:138
/mail/u/0/#label/khronos-webgl/1572e37f6afc1818:1 XMLHttpRequest cannot load https://notifications.google.com/u/0/coc. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://mail.google.com' is therefore not allowed access. The response had HTTP status code 500.

If I launch Canary with a fresh --user-data-dir it works. This broke since yesterday -- last night in fact.

I don't know whether this is caused by a recent bad CL, corrupted file in the user-data-dir, cookie jar, bad experiment, etc. Would appreciate pointers on how to diagnose this. I'll save off my user profile, but am going to have to start with a fresh one in order to continue getting work done.
Enabled experiments: 6a89113b-2bdd0794 68ebfce2-ca7d8d80 16e0dd70-3f4a17df 90757ebb-870290a7 31101bd6-f23d1dea b3888d8d-459fc675 241fff6c-ca7d8d80 1e528f0f-15305a2 70281f79-f23d1dea 6345b824-3f4a17df e197bfc9-ca7d8d80 8364a5c2-ca7d8d80 7c1bc906-f55a7974 1c752ce9-6632fcb6 ba3f87da-5b060836 f049a919-3f4a17df 76b48ab8-a2567007 31362330-3f4a17df c70841c8-a2567007 f15c1c09-ca7d8d80 5274eb09-3f4a17df f11437-f23d1dea 48d08aab-3f4a17df bcc907f7-65bced95 a4566d9e-3d47f4f4 2e109477-bcf405c8 165e16d1-3f4a17df 9e5c75f1-7491430a 8f62251a-4757d2a7 6b121ae7-5e15bfb2 5139837c-3f4a17df 7f8176d9-f23d1dea f79cb77b-3f4a17df fb448877-f23d1dea 23a898eb-ca7d8d80 74df3f1-3f4a17df 7382e39a-3f4a17df 4ea303a6-f23d1dea 7a3692af-bc6856d9 f2e050c6-bc6856d9 fe9bec35-80f9a33e 9736de91-3f4a17df dbffab5d-ca7d8d80 de03e059-ca7d8d80 ca314179-ca7d8d80 c5073fab-f23d1dea 867c4c68-3f4a17df b2f0086-35a81a97 adda5502-ca7d8d80 d747916f-e510b2e9 76923fa8-9cf3205c 3ac60855-3ec2a267 f296190c-d0fe63d6 4442aae2-e1cc0f14 ed1d377-e1cc0f14 75f0f0a0-6bdfffe7 e2b18481-cdc3d902 e7e71889-e1cc0f14 fe05be5f-97e7f871 61b920c1-48c6f5be 46567c16-3f4a17df 52954db7-f23d1dea
,
Sep 16 2016
,
Sep 16 2016
Testing with ToT, I got a different CORS error message and saw "Something's wrong" displayed for a moment:

XMLHttpRequest cannot load https://accounts.google.com/ServiceLogin?pass.... Redirect from 'https://accounts.google.com/ServiceLogin?pass...' to 'https://notifications.google.com/accounts/SetOSID?authuser=0&continue=https…' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'null' is therefore not allowed access.

However, the Hangouts panel loaded. I signed out from the Hangouts panel and signed back in. It then worked without any console error message.
,
Sep 16 2016
I used Linux, not Mac.
,
Sep 16 2016
Seeing this on Windows 10 with the latest canary (55.0.2861.0) as well. It worked fine until the previous canary (55.0.2860.0). I consistently get the 'Something's not right. We're having trouble connecting to Google. We'll keep trying...' message. Attached is the screenshot of the console error. Trying a bisect for this.
,
Sep 16 2016
I could reproduce the same message as the one in the opening post. I compared network activity between stable and canary. On stable, I see a GET request to coc before the POST request, while on canary I see only the POST, which fails the CORS check. The response lacks CORS headers, so it's failing. I don't see a ServiceWorker launched on canary. Is something wrong with SW?
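To make the "response lacks CORS headers, so it's failing" point concrete, here is a minimal sketch of the origin check a browser applies to a cross-origin XHR response. It is a simplification written for illustration, not Chrome's implementation; the function name `cors_allows` and the sample header dictionaries are hypothetical.

```python
from typing import Mapping


def cors_allows(origin: str,
                response_headers: Mapping[str, str],
                with_credentials: bool = False) -> bool:
    """Simplified Access-Control-Allow-Origin check (illustrative only)."""
    # Header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allow = headers.get("access-control-allow-origin")
    if allow is None:
        # No header at all: the case reported in this bug.
        return False
    if with_credentials:
        # Credentialed requests need an exact origin match plus an
        # explicit Access-Control-Allow-Credentials: true.
        if allow != origin:
            return False
        return headers.get("access-control-allow-credentials") == "true"
    return allow == "*" or allow == origin


# Hypothetical 500 response without CORS headers, as seen for /u/0/coc.
error_response = {"Content-Type": "text/html"}
print(cors_allows("https://mail.google.com", error_response))  # False

# Hypothetical response that echoes the requesting origin.
ok_response = {"Access-Control-Allow-Origin": "https://mail.google.com"}
print(cors_allows("https://mail.google.com", ok_response))  # True
```

An error page served with status 500 typically carries no CORS headers, so the check fails regardless of what triggered the 500. That is consistent with the CORS message being a symptom here.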
,
Sep 16 2016
s/canary/ToT/
,
Sep 16 2016
Hmm, but regarding the console error message complaining about Origin 'null', I see it also on stable release. So, this is not the problem, I guess.
,
Sep 16 2016
Considering the manual changelog:

Good build: 55.0.2860.0
Bad build: 55.0.2861.0

Changelog:
==========
https://chromium.googlesource.com/chromium/src/+log/55.0.2860.0..55.0.2861.0?pretty=fuller&n=10000

There are 3 service worker related changes in the above regression range, and a few other worker related changes by horo@ and nhiroki@:
1. https://codereview.chromium.org/2339083002 [horo's]
2. https://codereview.chromium.org/2334003004 [horo's]
3. https://codereview.chromium.org/2337253005 [nhiroki's]

Cc'ing horo and nhiroki as well for more insight on this. Appreciate your help!
,
Sep 16 2016
I received reports of this issue from other teams this morning, and they were seeing it on Linux with the Chrome stable channel.
,
Sep 16 2016
If it's happening on the Stable channel then it's probably a change in Gmail and not Chrome. I filed b/31545680 about this.
,
Sep 16 2016
I got more info from the person who reported the bug this morning. According to him, the message was "Something went wrong while displaying this webpage". I'm not sure whether this is the same bug or a different one.
,
Sep 17 2016
Amit@, can you please try a narrow bisect using the script? Thank you!
,
Sep 17 2016
Per b/31545680 , it's not clear this is a bug in Chrome. Let's wait to see what the Gmail team has to say before investing a lot of time chasing a bug that might not have a bisect range at all.
,
Sep 19 2016
Tried a bisect of this using the script. The test steps and observations are below.

Steps followed for the bisect using the script:
1. Opened corp (@google) and non-corp (@gmail) IDs in two tabs.
2. On both tabs alternately, opened 6/7 chat windows and refreshed the Gmail tabs 5/6 times.
3. Repeated step 2 for 1-2 cycles if the issue was not encountered the first time.
4. Observed that the error message is displayed on corp and non-corp Gmail accounts randomly.

The issue reproduced on Chromium builds when both tabs were kept open for 5-6 mins.

Bisect result:
Last good build: 55.0.2860.0
First bad build: 55.0.2861.0

Changelog:
==========
https://chromium.googlesource.com/chromium/src/+log/10c71cf7cd97b4d893cd1fff74a1bd086fc46405..929cbb9f92b5570867c3842c80778243db81a013

alexclarke@: Could you please take a look and help with further investigation? Please land a fix or revert the CL if the change is related, as we have a Dev release scheduled for tomorrow. Thank you!

Note: The issue is also reproducible on Windows 10, Mac OS 10.11.6 and Linux Ubuntu 14.04 using Chrome version 55.0.2865.0.
,
Sep 19 2016
On a z620 I can't reproduce the "Something's not right" error. I note that with a debug build I get the error below (it's not there for a release build, and reverting https://codereview.chromium.org/2320403003 doesn't make it go away), and the chat window is really slow to load, although it does load.

[11186:11186:0919/111223:INFO:CONSOLE(0)] "XMLHttpRequest cannot load https://accounts.google.com/ServiceLogin?passive=1209600&osid=1&continue=https://notifications.google.com/u/0/coc&followup=https://notifications.google.com/u/0/coc&authuser=0. Redirect from 'https://accounts.google.com/ServiceLogin?passive=1209600&osid=1&continue=https://notifications.google.com/u/0/coc&followup=https://notifications.google.com/u/0/coc&authuser=0' to 'https://notifications.google.com/accounts/SetOSID?authuser=0&continue=https%3A%2F%2Fnotifications.google.com%2Fu%2F0%2Fcoc%3Fpli%3D1&osidt=ALWU2cvJb3qgol7CsdaTPPB3vTfRNYCdgJw-DDNxntZZCNZeUEzceEufLmIEH8-Cl57IDGmc_f_ADha-lRzMN-mZ5g6UVeX4vIMe-0yT736WM23E1fKBPBKv6XyWcSPzJlBfXFyCXV8s' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'null' is therefore not allowed access.", source: https://mail.google.com/mail/u/0/#inbox (0)
,
Sep 19 2016
Correction: the XMLHttpRequest failure message is there in release too, so it's probably a red herring. I'll try some more to reproduce the "Something's not right" error. It sounds like it only happens flakily.
,
Sep 19 2016
I've been trying to reproduce this with little success (I've tried with my corp account and with a personal gmail account) on Mac and Linux and so far I managed to get a "Something's not right" error with hangouts once out of hundreds of attempts. I tried opening a bunch of chat windows and reloading the page - doesn't seem to make any difference (it always seems to work). Is there a better repro? Are we really sure this is a Chrome bug?
,
Sep 19 2016
I also have this issue on Windows Pro 64 bit - Canary Version 55.0.2865.0 canary (64-bit). It affects most Google services:
- Gmail
- Inbox
- Google+
- Google Calendar
- etc.

They eventually become unresponsive. All of the above work OK on Version 54.0.2840.27 beta-m (64-bit).
,
Sep 19 2016
Any particular repro steps, or just leave them open in a tab? Any idea how long you typically need before they become unresponsive?
,
Sep 19 2016
Re c#15: thank you, Amit, for providing the narrow bisect.
,
Sep 19 2016
I've spent most of the day trying to reproduce this. Can somebody please confirm the bisect? Thanks!
,
Sep 19 2016
Sure, let me work on it. Thank you!
,
Sep 19 2016
The problem of Hangouts failing to load is still present on:
Google Chrome 55.0.2865.0 (Official Build) canary (64-bit)
Revision 9ea2a739531edffad0859fdfd0a53f1f039026e4-refs/heads/master@{#419399}

Additionally, Gmail eventually becomes unresponsive while performing operations like archiving email threads.

Thanks to ajha@ for finding the suspect CL. I'm going to attempt a revert.
,
Sep 19 2016
For some reason, I am not seeing the same suspect CL (per c#15) consistently on each bisect script run. Could it be related to this change (https://buganizer.corp.google.com/issues/31545680#comment17)? Thank you!
,
Sep 19 2016
I filed b/31545680 so that the Gmail / Hangouts teams would start investigating this from their side. Note: I seem to see this frequently when closing another tab which results in the Gmail tab being brought to the foreground. Maybe it's related to tab visibility changes too?
,
Sep 20 2016
Regarding c#9,
> 1. https://codereview.chromium.org/2339083002 [horo's]
> 2. https://codereview.chromium.org/2334003004 [horo's]
> 3. https://codereview.chromium.org/2337253005 [nhiroki's]
I think these CLs don't affect loading (cc falken@ jfyi).
,
Sep 20 2016
,
Sep 20 2016
I don't think the CORS error is the root cause of the Hangouts problem. I'm seeing the Hangouts problem in canary channel Chrome. Dev channel Chrome, 55.0.2859.0, also has the CORS error on startup but doesn't have any other problems. This is on Windows 7.
,
Sep 20 2016
Just FYI: We've delayed tomorrow's M55 dev release to later this week due to this bug. Please try to resolve this ASAP. Thank you.
,
Sep 20 2016
,
Sep 20 2016
Note: https://codereview.chromium.org/2320403003 was reverted today in https://codereview.chromium.org/2353473003/ . Tomorrow's Canary should contain the revert, in order to confirm whether that was the cause. (@dbloch: you are probably right that the CORS issue is a red herring. Was difficult to tell.)
,
Sep 20 2016
As an additional data point, I could reproduce this locally on Windows 7 and confirmed https://codereview.chromium.org/2320403003 was the first bad revision (via per-revision bisect-builds.py).

Steps (100% reproducible in my environment):
1. Open corp Gmail in five tabs.
2. Wait a few seconds for the tabs to load (at this time Hangouts is shown without error messages).
3. Wait for ~5 mins.
4. See whether the Hangouts panel in any of the (background) tabs turned into "Something's not right".

Command:
$ python bisect-builds.py -o -b 418557 -g 418555 --user-data-dir="C:\Users\<username>\AppData\Local\Google\Chrome\User Data\Default" --use-local-cache --archive win64 --verify-range
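For readers unfamiliar with per-revision bisecting, the idea can be sketched as a binary search over revision numbers. This is a minimal illustration, not the logic of bisect-builds.py itself; the helper names and the revision numbers in the demo are hypothetical. Because the repro in this thread was flaky, the sketch also wraps the probe so that a revision counts as bad if any of several attempts reproduces the error.

```python
from typing import Callable


def first_bad_revision(good: int, bad: int,
                       is_bad: Callable[[int], bool]) -> int:
    """Return the first bad revision in (good, bad].

    Assumes `good` is known good, `bad` is known bad, and every
    revision after the first bad one is also bad.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return bad


def reproduces_in_any_attempt(probe: Callable[[int], bool],
                              attempts: int) -> Callable[[int], bool]:
    """Wrap a flaky probe: bad if any of `attempts` runs reproduces."""
    def is_bad(revision: int) -> bool:
        return any(probe(revision) for _ in range(attempts))
    return is_bad


# Hypothetical probe: pretend the regression landed at revision 109.
def hypothetical_probe(revision: int) -> bool:
    return revision >= 109


print(first_bad_revision(100, 116,
                         reproduces_in_any_attempt(hypothetical_probe, 3)))
# prints 109
```

With a flaky repro, a single false "good" verdict sends the search into the wrong half, which may explain why different bisect runs pointed at different CLs earlier in this thread. A reliable repro, like the steps above, avoids that.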
,
Sep 20 2016
Thanks for the confirmation. I'm curious, though: I spent hours trying to reproduce it yesterday with very limited success (1 time out of hundreds). Is the failure only associated with a Gmail experiment?
,
Sep 20 2016
Thanks hiroshige@ for the updated steps. I was able to reproduce the issue as per C#15 and C#33 on the reported version, 55.0.2861.0, on Windows 10, Mac OS 10.11.6 and Linux Ubuntu 14.04. I can confirm that the revert of the suspected CL (https://codereview.chromium.org/2353473003/) has worked, and this is working fine on the latest M-55 (55.0.2866.0) on Windows 10, Mac OS 10.11.6 and Linux Ubuntu 14.04 as per the test steps in C#33 and C#15. Therefore adding the verified label. Note: >#34: I was unable to reproduce the same with Calendar.
,
Sep 20 2016
Re #33: thanks for the repro steps, they work reliably for me.
,
Sep 20 2016
I understand what went wrong now. My patch made the TimeDomain only know about the very next delayed task for each queue. We had thought this was safe. Unfortunately, it broke code in ThrottlingHelper::PumpThrottledTasks: after calling TimeDomain::ClearExpiredWakeups, there was a good probability that TimeDomain::NextScheduledRunTime would return the wrong answer, because it didn't know about the subsequent wakeups. I note in passing that we got unlucky here: the throttling code is about to get refactored, and this patch (https://codereview.chromium.org/2258133002/), if it had landed, would have fixed it.
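The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Chromium scheduler: the two classes, their method names, the queue name and the run times are hypothetical stand-ins that only mirror the names TimeDomain::ClearExpiredWakeups and TimeDomain::NextScheduledRunTime from the comment.

```python
import heapq
from typing import Dict, List, Optional


class FullTimeDomain:
    """Toy model: remembers every delayed wakeup."""

    def __init__(self) -> None:
        self._wakeups: List[float] = []

    def schedule(self, queue: str, run_time: float) -> None:
        heapq.heappush(self._wakeups, run_time)

    def clear_expired_wakeups(self, now: float) -> None:
        while self._wakeups and self._wakeups[0] <= now:
            heapq.heappop(self._wakeups)

    def next_scheduled_run_time(self) -> Optional[float]:
        return self._wakeups[0] if self._wakeups else None


class NextOnlyTimeDomain:
    """Toy model: remembers only the earliest wakeup per queue."""

    def __init__(self) -> None:
        self._next: Dict[str, float] = {}

    def schedule(self, queue: str, run_time: float) -> None:
        current = self._next.get(queue)
        if current is None or run_time < current:
            self._next[queue] = run_time

    def clear_expired_wakeups(self, now: float) -> None:
        # Dropping the expired entry loses all knowledge of this queue's
        # later tasks, because they were never recorded.
        self._next = {q: t for q, t in self._next.items() if t > now}

    def next_scheduled_run_time(self) -> Optional[float]:
        return min(self._next.values(), default=None)


def demo(domain) -> Optional[float]:
    # One throttled queue with delayed tasks due at t=10, 20 and 30.
    for run_time in (10.0, 20.0, 30.0):
        domain.schedule("timer_queue", run_time)
    domain.clear_expired_wakeups(now=10.0)
    return domain.next_scheduled_run_time()


print(demo(FullTimeDomain()))      # 20.0
print(demo(NextOnlyTimeDomain()))  # None
```

In the toy model, a caller that asks for the next run time right after clearing expired wakeups gets no answer from the next-only variant, so it would never schedule another pump for the tasks due at 20 and 30. That matches the symptom of background tabs stalling after a few minutes, though the real code paths are more involved.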
,
Sep 20 2016
,
Sep 20 2016
Looks good to me in the latest Canary.
,
Sep 20 2016
Latest canary is working for me too. alexclarke@, it's good to know you know the root cause now. Downgrading this to P2 as the emergency is resolved. Please make sure this issue is resolved before re-landing the associated code.
,
Sep 20 2016
,
Sep 20 2016
,
Sep 21 2016
,
Sep 22 2016
Can we close this now?
,
Sep 22 2016
Yes, assuming you'll ensure it doesn't regress again when re-landing the code.
,
Sep 22 2016
Will do.
Comment 1 by kbr@chromium.org
, Sep 16 2016