New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 697805 link

Starred by 23 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue 809225



Sign in to add a comment

30s freeze at boot on Dell Chromebook 13

Reported by romain.f...@sentryo.net, Mar 2 2017

Issue description

Chrome Version: 56.0.2924.110 (Official Build) (64-bit)
Chrome OS Version: 9000.91.0 (Official Build) stable-channel lulu
Chrome OS Platform: Dell Chromebook 13 (lulu)

Since the update to M56 in the stable channel, my Dell Chromebook 13 freezes for 30 seconds at boot. User-visible symptoms are a frozen login screen, which unfreezes after 30 seconds. After that the system appears to work fine. Powerwashing doesn't fix the issue.

This SATA error in dmesg is probably related:

[    4.549635] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[    5.281530] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0
[    5.317468] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0
[   34.729912] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[   34.729936] ata1.00: failed command: READ FPDMA QUEUED
[   34.729957] ata1.00: cmd 60/08:00:30:72:20/00:00:00:00:00/40 tag 0 ncq 4096 in
                        res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[   34.729985] ata1.00: status: { DRDY }
[   34.729997] ata1.00: failed command: READ FPDMA QUEUED
[   34.730015] ata1.00: cmd 60/10:08:d0:28:27/00:00:00:00:00/40 tag 1 ncq 8192 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   34.730043] ata1.00: status: { DRDY }
[   34.730062] ata1: hard resetting link
[   34.994812] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   34.997994] ata1.00: configured for UDMA/133
[   34.998015] ata1.00: device reported invalid CHS sector 0
[   34.998053] ata1: EH complete
[   35.014484] SELinux: initialized (dev loop6, type squashfs), uses xattr

Other reports of the same issue: https://www.reddit.com/r/chromeos/comments/5swocg/chrome_os_56_on_dell_chromebook_13_slow_to/

Based on the number of identical reports from other users, this is probably not a hardware issue with my unit.
 
Same problem with ChromeOS 57 now in the stable channel (57.0.2987.123).
i have had the 30 second delay problem since you automatically installed 56. I called and you did a power wash and also tried another version of 56. No help. I am now on 57 and the problem persists. I talked to dell and after diagnosis, they believe its a software problem, not hardware. Many google forum users have the same problem
Version 58.0.3029.51 beta (64-bit)
Platform 9334.33.0 (Official Build) beta-channel lulu
ARC Version 3872362
Firmware Google_Lulu.6301.136.57
Dell 13 
Had this issue for some time on both stable and beta.
I can confirm this issue on my Dell 7310 since Chrome OS 56 DEV and all releases since. Restoring back to an earlier version of Chrome OS 54 temporarily fixes the problem, until next automatic update.

As I have been affected by this issue for 6 months now, I have gathered quite a lot of info on this matter, and instead of repeating myself, noted my findings including screenshots of logs in this google document https://goo.gl/TdnMJM

Feel free to contact me for additional troubleshooting and system logs at bjarne.oldrup@gmail.com

Comment 5 by a...@serwatuk.com, Apr 11 2017

Also having this problem. Started with update to Chrome OS Version 56.

Version 58.0.3029.51 beta (64-bit)
Platform 9334.33.0 (Official Build) beta-channel lulu
ARC Version 3872362
Firmware Google_Lulu.6301.136.57
I've been experiencing the 30 sec boot delay on my Dell Chromebook 13 (i3, 4GB) since version 56.

Version 58.0.3029.51 beta (64-bit)
Platform 9334.33.0 (Official Build) beta-channel lulu
ARC Version 3872362
Firmware Google_Lulu.6301.136.57
Just updated to a newer version and the problem still persists: 

Version 58.0.3029.68 beta (64-bit)
Platform 9334.42.0 (Official Build) beta-channel lulu
ARC Version 3900769
Firmware Google_Lulu.6301.136.57

Comment 8 by mor...@gmail.com, Apr 14 2017

I've been experiencing the 30 sec boot delay on my Dell Chromebook 13 (i3, 4GB) since version 56.
Version 58.0.3029.51 beta (64-bit) Platform 9334.33.0 (Official Build) beta-channel lulu ARC Version 3872362 Firmware Google_Lulu.6301.136.57
I also experience the same delay in logging in on my Dell Chromebook 13 (i3 4gb). 
Could the issue be that on first login after rebooting, the device is taking longer than normal to connect to the wifi? The reason i suspect this is because the wifi icon doesn't move and remains in the disconnected state throughout that period. 

I am on 

Version 57.0.2987.137 (64-bit)
Platform 9202.60.0 (Official Build) stable-channel lulu
ARC Version 3852170
Firmware Google_Lulu.6301.136.57
#CBC-RS/TC-watchlist

Comment 11 by ctwq...@gmail.com, Apr 14 2017

Version: Version 58.0.3029.68 (Official Build) beta (64-bit)
Platform: 9334.42.0 (Official Build) beta-channel lulu
Arc: 3900769
Firmware: Google_Lulu.6301.136.57
Channel: Currently on beta
Machine: Dell 7310, i5, 8 GB Ram, 

I've had this issue for months, probably since December/January. I believe it started with version 56. It takes roughly 30 seconds to boot.
I also experience this issue on Beta channel with my Dell CB 13 (Celeron, 4GB).

Version 58.0.3029.68 beta (64-bit)
Platform 9334.42.0 (Official Build) beta-channel lulu
ARC Version 3900769
Firmware Google_Lulu.6301.136.57


Comment 13 by tkd...@gmail.com, Apr 15 2017

Version 57.0.2987.146 (64-bit)
Platform 9202.64.0 (Official Build) stable-channel lulu
ARC Version 3880239
Firmware Google_Lulu.6301.136.57

Same issue for at least a couple of months on my i3 (8GB). Also just updated today, and the issue remains.
I'm also seeing the 30 second login delay on my Dell Chromebook 7310 .(Celeron, 4GB) This delay started after the version 56 update and is still present.


Version 57.0.2987.146 (64-bit)
Platform 9202.64.0 (Official Build) stable-channel lulu
ARC Version 3880239
Firmware Google_Lulu.6301.136.57
same here with CB13 - i3 - 8Gb Ram - 32Gb. Powerwashed, full replace with linux than revert back to clean Chrome OS but issue is here from 3 months now.

Versione 58.0.3029.68 beta (64-bit)
Piattaforma 9334.42.0 (Official Build) beta-channel lulu
Versione ARC 3900769
Firmware Google_Lulu.6301.136.57

Comment 16 by a...@serwatuk.com, Apr 20 2017

Interestingly, today when I powered on the Chromebook it did NOT experience the 30 second delay in any way. It did yesterday and there's been no updates since then.

Link to the device log and about system page info
https://drive.google.com/open?id=0B0vYGtV0mN4XVWNFdFl4cmNmbWc
Just experienced this on my Dell CB13, something like a 20s delay.

I noticed the wifi icon was trying to connect the whole time. This is a slow process that can take 15-30s, I wonder if it's related.

Version 57.0.2987.146 (64-bit)
Platform 9202.64.0 (Official Build) stable-channel lulu
ARC Version 3880239
Firmware Google_Lulu.6301.136.57
Check for and apply updates
to serwatuk, what version are you on and are you still not experiencing the delay?

Comment 19 by a...@serwatuk.com, Apr 25 2017

Comes and goes. Some times I had it, some times I didn't. Cold boot or
restart didn't seem to make a difference.

I was on beta until last night. Had to change back to stable (long story).
But I didn't see the delay this morning. I'll keep my eye out for it now
that I'm back on stable.
So what needs to happen to upgrade this from a Priority 3?
I assume google needs to see many more users with the same problem. It looks like OS 58 beta users are still experiencing the problem. I've been trying to find people who have other brand chromebooks on the web to see if the problem is isolated to dell; but haven't found any yet. Does anyone know if there is a wifi card in the dell 7310 that could be the problem? I have another dell 7310 celeron that does not have the boot delay which my i3 has; so i wonder if the problem occurs only with dell i3's.
I just upgraded to OS 58 and still have the 30 second delay. I'm sending my chromebook back to dell to see if its a hardware problem. I'll report findings.
I've had this boot delay since version 56 with my Dell Chromebook 13 (i5). I've tried both stable and beta channels but no help. Please prioritise this issue.

Comment 24 Deleted

Also see exactly the same issue the majority of boots, here's my latest dmesg log:

[    6.598840] cfg80211:   (57240000 KHz - 65880000 KHz @ 2160000 KHz), (N/A, 4000 mBm)
[   36.715442] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[   36.715466] ata1.00: failed command: READ FPDMA QUEUED
[   36.715487] ata1.00: cmd 60/40:00:60:e0:30/00:00:00:00:00/40 tag 0 ncq 32768 in
                        res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[   36.715516] ata1.00: status: { DRDY }
[   36.715527] ata1.00: failed command: READ FPDMA QUEUED
[   36.715546] ata1.00: cmd 60/80:08:a0:e0:30/00:00:00:00:00/40 tag 1 ncq 65536 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   36.715574] ata1.00: status: { DRDY }
[   36.715593] ata1: hard resetting link
[   36.980298] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   36.981044] ata1.00: configured for UDMA/133
[   36.981086] ata1.00: device reported invalid CHS sector 0
[   36.981121] ata1: EH complete






I have a Dell 13 that has been running the stable channel since I bought it about 6 months ago. Currently on Version 58.0.3029.89 (Official Build) (64-bit)
Only in the last few updates has it started to lag at login somewhat like OP describes.

What I'm having is a 3-4 second delay after reaching the login and typing my credentials. Up until that point the system is lightning fast and the same as always.

I would really like to see Google fix this as this machine has been very stable otherwise.
Cc: rjahagir@chromium.org helenzhang@chromium.org ka...@chromium.org sontis@chromium.org
Components: OS>Kernel
Labels: -Pri-3 M-58 Pri-2
Status: Untriaged (was: Unconfirmed)
Confirmed with version 58.0.3029.106/9334.65.0 stable-channel lulu
Firmware Google_Lulu.6301.136.57  
and was able to reproduce 4 out of 6 times.

Logs attached: https://pantheon.corp.google.com/storage/browser/chromiumos-test-logs/bugfiles/cr/697805/
I'm still experiencing the 30 sec boot delay on my Dell Chromebook 13 (i3, 4GB). Have been since version 56.

Version 58.0.3029.89 beta (64-bit)
Platform 9334.58.0 (Official Build) beta-channel lulu
ARC Version 3935470
Firmware Google_Lulu.6301.136.57
Problem was fixed for me in beta 58.0.3029.89 but since updating to beta 59.0.3071.47, the problem has returned.  I tried resetting all the chrome://flags but the issue is still there.  I've also tried using the new stable version 58.0.3029.112 and clearing flags but the issue is still there.

It would appear the problem was fixed in beta 58.0.3029.89 and has now been lost!
Same issue on my Dell CB13.

Version 58.0.3029.112 (64-bit)
Platform 9334.69.0 (Official Build) stable-channel lulu
ARC Version 3989055
Firmware Google_Lulu.6301.136.57

[   36.712579] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[   36.712603] ata1.00: failed command: READ FPDMA QUEUED
[   36.712624] ata1.00: cmd 60/08:00:78:74:18/00:00:00:00:00/40 tag 0 ncq 4096 in
                        res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[   36.712653] ata1.00: status: { DRDY }
[   36.712672] ata1: hard resetting link
[   36.977484] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Have the issue on mine:

Version 58.0.3029.112 (64-bit)
Platform 9334.69.0 (Official Build) stable-channel lulu
ARC Version 3989055
Firmware Google_Lulu.6301.136.57

[    6.264802] tpm_tis tpm_tis: command 0x21 (size 14) returned code 0x0
[    6.330590] wlan0: authenticate with 60:e3:27:a6:32:7c
[    6.335444] wlan0: send auth to 60:e3:27:a6:32:7c (try 1/3)
[    6.336273] wlan0: authenticated
[    6.337127] wlan0: associate with 60:e3:27:a6:32:7c (try 1/3)
[    6.347306] wlan0: RX AssocResp from 60:e3:27:a6:32:7c (capab=0x411 status=0 aid=3)
[    6.348843] wlan0: associated
[    6.348881] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[    6.348939] cfg80211: Calling CRDA for country: US
[    6.949564] iwlwifi 0000:01:00.0: No association and the time event is over already...
[   36.737000] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[   36.737024] ata1.00: failed command: READ FPDMA QUEUED
[   36.737045] ata1.00: cmd 60/08:00:10:60:88/00:00:01:00:00/40 tag 0 ncq 4096 in
                        res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[   36.737073] ata1.00: status: { DRDY }
[   36.737093] ata1: hard resetting link
[   37.002122] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   37.002851] ata1.00: configured for UDMA/133
[   37.002909] ata1: EH complete
[   37.046132] tpm_tis tpm_tis: command 0x65 (size 18) returned code 0x0
To add to my comment above, I see the delay on the dev channel as well as on stable. During the delay, system is not responsiv: no typing, no mouse...
Okay, hope I'm not speaking too soon - but as of my last update this issue appears to have been resolved!

Version 59.0.3071.57 (64-bit)
Platform 9460.42.0 (Official Build) stable-channel lulu
ARC Version 4010211
Firmware Google_Lulu.6301.136.57

Dell Chromebook 13 (i3, 4GB)
Just updated today and am still getting the 30 second pause which i've had for a while now!
This is more like booting my windows notebook!

Version 58.0.3029.140 (64-bit)
Platform 9334.72.0 (Official Build) stable-channel lulu
ARC Version 4015103
Firmware Google_Lulu.6301.136.57
Updated to the latest 58 stable available to me, and issues was still present. Switched to 59 beta, and initially it seemed better, but after a powerwash, the issue was back. Have you tried several cold boots James?

Taking that back. The first and second cold boot after 59 beta install was slow, but now I cold booted 6-7 times without delay. Keeping fingers crossed, and will test with 59 stable once thats available to me.

Would be great if others would test 59 stable/beta.
Issue seems solved on my Dell 7310 with Chrome OS 59 beta. Been shutting down and booting the last hour in all ways imaginable (using power key, the menu, logging out and closing display, even power washed once more)

9460.42.0 (Official Build) beta-channel lulu
Firmware Google_Lulu.6301.136.57
I've done multiple cold boots on 59 beta and so far it's working perfectly - no freeze/slow down. Haven't tried a powerwash yet though.
I can also confirm the problem has been resolved in beta 59.0.3071.57, however it is still present in current stable build (58.0.3029.140).
I'm now in stable 59.0.3071.91 and I still experience random slow boots. They are less frequent now but they are still there.
Slow boots on Dev channel too.

Version:
61.0.3129.0
9647.0.0 (Official Build) dev-channel lulu Firmware


Dell said replacing main logic board solved the problem so I sent mine under warranty. When I got it back, the delay was gone; but that was because installing the board reverted the OS back to version 53. As soon as I updated to OS 58, the delay returned. Now I am on stable 59.0.3071.91 and the delay is gone and has been for over a week. I told this to dell and suggested they not tell users that the logic board is the fix. I also said they should have updated to the latest OS and they would have discovered that the problem still existed.
Still having same issues as before with 59.0.3071.134 stable
This had been resolved for me under 59 but now seems to be back as of:

Version 60.0.3112.80 (Official Build) beta (64-bit)

Dell Chromebook 13 - 7310 (i3, 4GB)
This issue has been happening for the past several months for me. The only way to fix it is to powerwash your chromebook and reset it to factory settings. Unfortunately, connecting to the internet prompts it to update upon restart and brings back the slow boot issues.

Still happening on a Dell Chromebook 13 -7310 (Celeron, 4GB)

Version 61.0.3163.70 (Official Build) beta (64-bit)
Cc: abodenha@chromium.org
Albert, is this a known issue?
It's a known issue in Chromebook Central and a few other places I think.

See this CBC thread for some reports: 
Chrome OS 56 stable and very slow boot time on Dell Chromebook 13 ( https://goo.gl/aqaeKr )

It's hard to determine the cause or cure, Dell has even replaced some mobo's.

My Dell Chromebook 13 (7310) / Lulu with 8GB / i5-5300U has never had a problem.

~Dennis (not Albert :))
dell replaced my dell 13 7310 motherboard and it didnt help. i have an older 7310 which never had the delay problem

I am in Beta 63, and this is still an issue.

It cuts one of the most valuable features of chromebooks: quick boot.

I am available to do testings.
Using i3 version
Plataform
10032.59.0 (Official Build) beta-channel lulu
Firmware
Google_Lulu.6301.136.57
Channel
beta
Versión de ARC
4472141
Blink
537.36 (@)
V8
6.3.292.44
My Dell Chromebook 13 (i5) was just updated to 63.0.3239.116 in stable channel and slow boots are very frequent now. Please prioritise this. I'm also available to carry out tests if needed.

Version: 63.0.3239.116 (Official Build) (64-bit)
Platform: 10032.75.0 (Official Build) stable-channel lulu
Firmware: Google_Lulu.6301.136.57
Channel: Currently on stable
ARC Version: 4505339
Blink: 537.36 (@)
V8: 6.3.292.48
Cc: kirtika@chromium.org snanda@chromium.org
Labels: -Pri-2 Pri-1
snanda@ and kirtika@ this seems to be biting a subset of Lulu users for a while now. There's some speculation that it might be network related.  Can anyone on your team take a look?

Comment 53 by a...@serwatuk.com, Jan 7 2018

7310 here, having the problem and available to help troubleshoot/test if you need it.
Also having this problem on and off since version 56 with my 7310 i3.
Currently on beta channel (for the Android apps support).

You might want to look at this thread over at Chromebook Central regarding the same issue:
https://productforums.google.com/forum/#!topic/chromebook-central/ES_ILOHxy1c

Cheers

Jesper

Platform 10176.22.0 (Official Build) beta-channel lulu
Firmware Google_Lulu.6301.136.57
Channel Currently on beta

Comment 55 by cbre...@bscsd.org, Jan 11 2018

Same issue here after updating chrome on several 7310's.
the issue is in 64 and 64 beta.
I've had this issue since early reports last year when I received os version 56 on stable.

I keep hoping with each update it will be fixed, but unfortunately not.

Google Chrome	63.0.3239.140 (Official Build) (64-bit)
Revision	0
Platform	10032.86.0 (Official Build) stable-channel lulu
Firmware Version	Google_Lulu.6301.136.57



Owner: gwendal@chromium.org
Gwendal, could you please look into the ATA errors being reported here?
Blockedon: 809225
#27 From the log:
2017-05-09T22:44:48.676606+00:00 ERR kernel: [   40.746453] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
2017-05-09T22:44:48.676629+00:00 ERR kernel: [   40.746478] ata1.00: failed command: READ FPDMA QUEUED
2017-05-09T22:44:48.676635+00:00 ERR kernel: [   40.746500] ata1.00: cmd 60/00:00:c0:86:48/02:00:02:00:00/40 tag 0 ncq 262144 in
2017-05-09T22:44:48.676658+00:00 ERR kernel: [   40.746500]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2017-05-09T22:44:48.676661+00:00 ERR kernel: [   40.746530] ata1.00: status: { DRDY }
2017-05-09T22:44:48.676664+00:00 ERR kernel: [   40.746541] ata1.00: failed command: READ FPDMA QUEUED
2017-05-09T22:44:48.676668+00:00 ERR kernel: [   40.746560] ata1.00: cmd 60/00:08:c0:88:48/02:00:02:00:00/40 tag 1 ncq 262144 in
2017-05-09T22:44:48.676671+00:00 ERR kernel: [   40.746560]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
2017-05-09T22:44:48.676674+00:00 ERR kernel: [   40.746588] ata1.00: status: { DRDY }
2017-05-09T22:44:48.676678+00:00 INFO kernel: [   40.746607] ata1: hard resetting link
2017-05-09T22:44:48.941606+00:00 INFO kernel: [   41.011387] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2017-05-09T22:44:48.941634+00:00 INFO kernel: [   41.012101] ata1.00: configured for UDMA/133
2017-05-09T22:44:48.941637+00:00 WARNING kernel: [   41.012144] ata1.00: device reported invalid CHS sector 0
2017-05-09T22:44:48.941640+00:00 INFO kernel: [   41.012183] ata1: EH complete

There is no link error, but the response to the read command was lost or never sent. That would point a SSD error, although a proper SSD would return a more meaningful error (media error).
[btw, ignore "device reported invalid CHS sector 0", the tf (taskfile) is invalid as it has not been returned by the device.]
Looking in the smart data, there are not reported media errors. 

At 10.7 (kernel time), when the io was lost (the timeout is 30s), chrome/arc were starting, a lot of activity (shill, wifi...). I wonder if we are losing an interrupt there. 


For reference, lstnr search query:
psd-CHROMEOS_RELEASE_BOARD:lulu AND ( description:slow OR description:freeze )

report a lot of issues.
example: Report ID: 84895565156, 85019084963, 84632340811

Error found in the lab too: 14/26 machines tested have timeout. 
[command used: 
for i in $( atest host list -b board:lulu -s ready -N) ; do
  ssh root@${i}.cros cat /var/log/messages | grep -qe "(timeout)" && echo $i; done

For instance, a simple test triggers it:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos4-row6-rack3-host5/126835-cleanup

Cc: mka@chromium.org
Although only present since release-R61-9765.B, https://chromium-review.googlesource.com/#/c/chromiumos/platform2/+/568592/ is enabling power management for the SSD when we use to not.
Same issue even on 63
Cc: tbroch@chromium.org
mka, gwendal, tbroch: what do we want to do with c#60 observation?

Comment 63 by gwendal@google.com, Feb 20 2018

Looking in the lab, problem seems to be limited to LITEON ssds: [LITEON IT LST-xx or LITEON CS1-SPxx]

 for i in $(./atest host list -b storage:ssd -s ready -N) ; do ssh root@${i}.cros cat /var/log/messages\* | grep -qe "Emask 0x4 (timeout)" && ( echo -n "$i: " ; ssh root@$i.cros cat /sys/block/sda/device/model ) ; done

[of 688 machines were listed, 9 had a timeout issues, all were using LITEON ssds]

To reproduce the issue, interleaving a flow a management commands (PIO) with NCQ write (not too many, otherwise the PIO command will not be scheduled) reproduces the problem:


FILENAME=/dev/sda5 FILESIZE=1073741824 VERIFY_ONLY=0 fio --output-format=json "/tmp/read_ncq" & ( while true ; do echo -n "b: "; date ; smartctl -x /dev/sda > /dev/null ; echo -n "a: "; date; done )

Each smartctl runs ends triggering a timeout:

b: Tue Feb 20 09:58:57 PST 2018
a: Tue Feb 20 09:59:27 PST 2018
b: Tue Feb 20 09:59:27 PST 2018
a: Tue Feb 20 09:59:58 PST 2018
b: Tue Feb 20 09:59:58 PST 2018
a: Tue Feb 20 10:00:29 PST 2018
b: Tue Feb 20 10:00:29 PST 2018
a: Tue Feb 20 10:01:00 PST 2018
b: Tue Feb 20 10:01:00 PST 2018
a: Tue Feb 20 10:01:31 PST 2018
b: Tue Feb 20 10:01:31 PST 2018
a: Tue Feb 20 10:02:02 PST 2018
b: Tue Feb 20 10:02:02 PST 2018
a: Tue Feb 20 10:02:33 PST 2018
b: Tue Feb 20 10:02:33 PST 2018
a: Tue Feb 20 10:03:04 PST 2018
b: Tue Feb 20 10:03:04 PST 2018

in vlm:

2018-02-20T09:59:27.504781-08:00 ERR kernel: [ 3253.564182] ata1.00: cmd ca/00:08:d0:1a:20/00:00:00:00:00/e0 tag 0 dma 4096 out
2018-02-20T09:59:27.504784-08:00 ERR kernel: [ 3253.564182]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2018-02-20T09:59:58.512809-08:00 ERR kernel: [ 3284.560950] ata1.00: failed command: WRITE DMA
2018-02-20T09:59:58.512812-08:00 ERR kernel: [ 3284.560973] ata1.00: cmd ca/00:08:e8:c9:22/00:00:00:00:00/e0 tag 0 dma 4096 out
2018-02-20T09:59:58.512814-08:00 ERR kernel: [ 3284.560973]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2018-02-20T10:00:29.520772-08:00 ERR kernel: [ 3315.557700] ata1.00: failed command: WRITE DMA
2018-02-20T10:00:29.520775-08:00 ERR kernel: [ 3315.557722] ata1.00: cmd ca/00:08:98:b7:05/00:00:00:00:00/e0 tag 0 dma 4096 out
2018-02-20T10:00:29.520778-08:00 ERR kernel: [ 3315.557722]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2018-02-20T10:01:00.496784-08:00 ERR kernel: [ 3346.522547] ata1.00: failed command: WRITE DMA
2018-02-20T10:01:00.496787-08:00 ERR kernel: [ 3346.522568] ata1.00: cmd ca/00:08:60:3b:1c/00:00:00:00:00/e0 tag 0 dma 4096 out
2018-02-20T10:01:00.496789-08:00 ERR kernel: [ 3346.522568]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2018-02-20T10:01:31.504803-08:00 ERR kernel: [ 3377.519345] ata1.00: failed command: WRITE DMA
2018-02-20T10:01:31.504806-08:00 ERR kernel: [ 3377.519366] ata1.00: cmd ca/00:08:c0:80:24/00:00:00:00:00/e0 tag 0 dma 4096 out
2018-02-20T10:01:31.504808-08:00 ERR kernel: [ 3377.519366]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2018-02-20T10:02:02.512809-08:00 ERR kernel: [ 3408.516142] ata1.00: failed command: WRITE DMA
2018-02-20T10:02:02.512812-08:00 ERR kernel: [ 3408.516165] ata1.00: cmd ca/00:08:c8:ce:0b/00:00:00:00:00/e0 tag 0 dma 4096 out
2018-02-20T10:02:02.512815-08:00 ERR kernel: [ 3408.516165]          res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

fio job used:
[seq_read]
filename=${FILENAME}
size=${FILESIZE}

time_based
runtime=5m

ioengine=libaio
iodepth=8
rate_iops=1000
direct=1

readwrite=randwrite
bs=4k


An obvious workaround is to disable NCQ for these SSDs, but that will hurt performance. Checking if this could be kernel related.


Comment 64 by gwendal@google.com, Feb 20 2018

Confirmed the problem is limited to LITEON as it also happens with 4.4 kernels. (does not happen on Sandisk)
Happens even when device/queue_depth is set to 1.

Enabling block, scsi and ata for kernel tracing, It looks like the regular DMA READ/WRITE issued after a ATA PIO SMART command is not getting processed.
qd_1_vlm
10.3 KB View Download
qd_1_trace
3.7 MB View Download

Comment 65 by gwendal@google.com, Feb 21 2018

4.14 is also failing. Contacting  LITE-ON. 
For what it's worth... I too have had the slow boot issue with my Dell 7310 for a year or more and have patiently waited release after release hoping that it would be fixed.
Version 63.0.3239.140 (Official Build) (64-bit)
Just updated to 64.0.3282.167 (Official Build) and I can confirm that the problem still exists.
I found a workaround. Retrieving less internal SSD data at boot (smartctl -a --brief instead of smartctl -x) avoids the timeout issue.

See https://chromium-review.googlesource.com/#/c/chromiumos/platform2/+/936789
I just replaced my SSD drive and it helped - the boot delay is gone!
Project Member

Comment 70 by bugdroid1@chromium.org, Feb 28 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/d2eb549f29013fa79c2586b53ddd4baa15e828de

commit d2eb549f29013fa79c2586b53ddd4baa15e828de
Author: Gwendal Grignou <gwendal@chromium.org>
Date: Wed Feb 28 23:21:10 2018

installer: Do not collect all smart information on LiteON SSD

After sendind command to retrieve advanced SMART logs, the next IO timeout - it
is not processed by the ssd - delaying login by up to 30s.

BUG= chromium:697805 
TEST=On lulu, using
fio <read traffic> & ( while true ; do echo -n "b: "; date ; smartctl -x
/dev/sda > /dev/null ; echo -n "a: "; date; done )
triggers IO timeout, while
fio <read traffic> & ( while true ; do echo -n "b: "; date ; smartctl -a -f brief
/dev/sda > /dev/null ; echo -n "a: "; date; done )
does not.

Change-Id: I707ca440a5f37eed23baf232b01c872bd13ad34c
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/936789
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>

[modify] https://crrev.com/d2eb549f29013fa79c2586b53ddd4baa15e828de/installer/share/storage-info-common.sh
[modify] https://crrev.com/d2eb549f29013fa79c2586b53ddd4baa15e828de/installer/test/storage_info_unit_test

Should this be merged to 65?
Labels: Merge-Request-65
Project Member

Comment 73 by sheriffbot@chromium.org, Mar 1 2018

Labels: -Merge-Request-65 Merge-Review-65 Hotlist-Merge-Review
This bug requires manual review: We are only 4 days from stable.
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Hotlist-Merge-Review -Merge-Review-65 Merge-Approved-65
Project Member

Comment 75 by bugdroid1@chromium.org, Mar 1 2018

Labels: merge-merged-release-R65-10323.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/4e8ac4d786dd2b5a477ec6c2eb1115098f08f4b5

commit 4e8ac4d786dd2b5a477ec6c2eb1115098f08f4b5
Author: Gwendal Grignou <gwendal@chromium.org>
Date: Thu Mar 01 23:46:27 2018

installer: Do not collect all smart information on LiteON SSD

After sendind command to retrieve advanced SMART logs, the next IO timeout - it
is not processed by the ssd - delaying login by up to 30s.

BUG= chromium:697805 
TEST=On lulu, using
fio <read traffic> & ( while true ; do echo -n "b: "; date ; smartctl -x
/dev/sda > /dev/null ; echo -n "a: "; date; done )
triggers IO timeout, while
fio <read traffic> & ( while true ; do echo -n "b: "; date ; smartctl -a -f brief
/dev/sda > /dev/null ; echo -n "a: "; date; done )
does not.

Change-Id: I707ca440a5f37eed23baf232b01c872bd13ad34c
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/936789
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
(cherry picked from commit d2eb549f29013fa79c2586b53ddd4baa15e828de)
Reviewed-on: https://chromium-review.googlesource.com/944827

[modify] https://crrev.com/4e8ac4d786dd2b5a477ec6c2eb1115098f08f4b5/installer/share/storage-info-common.sh
[modify] https://crrev.com/4e8ac4d786dd2b5a477ec6c2eb1115098f08f4b5/installer/test/storage_info_unit_test

Project Member

Comment 76 by sheriffbot@chromium.org, Mar 5 2018

Cc: bhthompson@google.com
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Approved-65
Status: Fixed (was: Untriaged)
Will add more SSD in the list if necessary. 
Got the update to 65 today and the problem seems solved. 
Labels: Hotlist-Enterprise

Sign in to add a comment