
Issue 863615

Starred by 3 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Jul 20
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug




Termina 10869.0.0: initial cicerone -> garcon connection unreliable

Project Member Reported by smbar...@chromium.org, Jul 13

Issue description

With the update to 10869.0.0, certain setups seem to be having issues during VM/container startup after reboot. Garcon informs cicerone that the container is ready, but when connecting back to garcon, the connection fails.

Repro case unknown. It seems to be a problem on VM/containers upgraded from 10739.0.0 or older.
 
journalctl
22.7 KB View Download
messages
50.3 KB View Download
Unfortunately I see nothing wrong in those logs (aside from the failures for cicerone to talk to garcon). Even if there is some kind of transient failure, gRPC should automatically reconnect. The gRPC server is definitely running in garcon; garcon logs that it is.
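The reconnect behavior expected here can be sketched as a client-side retry with backoff: a connection refused during startup should not be fatal, because the listener may simply not be up yet. This is a minimal stdlib sketch of that pattern, not cicerone's actual gRPC code; the function name and parameters are illustrative.

```python
import socket
import time

def connect_with_retry(host, port, attempts=5, delay=0.2):
    """Try to reach a listener, backing off between attempts.

    Illustrates the reconnect behavior a gRPC channel is expected
    to provide: transient refusals are retried rather than treated
    as a permanent failure. (Hypothetical helper, not cicerone's API.)
    """
    last_err = None
    for i in range(attempts):
        try:
            # create_connection raises OSError (e.g. ECONNREFUSED)
            # while the server side is not yet listening.
            return socket.create_connection((host, port), timeout=1)
        except OSError as err:
            last_err = err
            time.sleep(delay * (2 ** i))  # exponential backoff
    raise last_err
```

If cicerone's connect-back to garcon raced with garcon's server startup, a retry like this would paper over the window; the logs above suggest the failure persists beyond any such window, which is what makes it puzzling.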

Is there some way for me to downgrade the termina image to 10739.0.0 so I can upgrade it again to see if I can repro it? (without having to revert my whole source tree to 10739.0.0 that is)
I'm unable to get termina 10739.0.0 to actually start up the VM completely, so I can't reproduce it going that route.
I was able to get 10739.0.0 to start up with a fresh VM image, but it won't load ones downgraded from a newer version. I tried that a few times (loading 10739.0.0 and then updating to ToT) and still couldn't reproduce this.
I think the downgrade won't work since we moved from LXD 2.21 to LXD 3.0.0.

I'm suspecting something on the CrOS side actually. My first VM start after a clean boot is failing reliably, but once I do get it started it will stop/start reliably.
Status: WontFix (was: Untriaged)
After deleting the VM disk and setting up Linux again, everything seems fine.

Not reproducible.
I have a VM that is now in a state that will not start.

There is data on there (it's been a few days since I backed up) that I really don't want to lose, so I really hope I don't have to delete it and start over.

I can do a vmc stop and vmc start.

But lxc list shows the container to be in stopped state.

Trying to manually run lxc start produces this log message:

lxc penguin 20180720034409.480 WARN     lxc_conf - conf.c:lxc_map_ids:2863 - newgidmap is lacking necessary privileges
Status: Unconfirmed (was: WontFix)
Tried lxc start penguin --verbose --debug and ended up with this error:

Error: Missing source '/run/sshd/penguin/ssh_host_key' for disk 'ssh_host_key'
Status: WontFix (was: Unconfirmed)
Hmm ... rebooted the machine ... and now I can't reproduce the problem anymore.
Closing this again as WontFix.
