shill: rate limit/remote info message "Suppressed autoconnect to service 0 (no endpoints)" |
||||
Issue descriptionLooking at crbug.com/912801 , where airport kiosk eMMC seems to have run out of spare blocks, I notice a lot of info in net.log: [INFO:service.cc(313)] Suppressed autoconnect to service X (no endpoints) There are 11682 over a 24h period, 8 per minutes on average. 2018-11-17T15:04:47.017327-06:00 INFO shill[898]: [INFO:wifi.cc(320)] Scan on wlan0 from ScanTimerHandler 2018-11-17T15:04:49.318610-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 0 (no endpoints) 2018-11-17T15:04:49.322046-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 1 (no endpoints) 2018-11-17T15:04:49.322194-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 2 (no endpoints) 2018-11-17T15:04:49.322312-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 3 (no endpoints) 2018-11-17T15:04:49.324179-06:00 INFO shill[898]: [INFO:wifi.cc(387)] ScanDone 2018-11-17T15:04:49.327264-06:00 INFO shill[898]: [INFO:service.cc(291)] wifi service 264 constructed. 2018-11-17T15:04:49.328268-06:00 INFO shill[898]: [INFO:wifi_service.cc(184)] Constructed WiFi service 264 name: [SSID=16] 2018-11-17T15:04:49.328994-06:00 INFO shill[898]: [INFO:wifi_service.cc(798)] Rep ep updated for 264. [SSID=16], bssid: e2:55:7d:00:00:15, sig: -68, sec: rsn, freq: 2412 2018-11-17T15:04:49.335773-06:00 INFO shill[898]: [INFO:service.cc(291)] wifi service 265 constructed. 2018-11-17T15:04:49.335880-06:00 INFO shill[898]: [INFO:wifi_service.cc(184)] Constructed WiFi service 265 name: [SSID=12] 2018-11-17T15:04:49.336142-06:00 INFO shill[898]: [INFO:wifi_service.cc(798)] Rep ep updated for 265. [SSID=12], bssid: e2:55:7d:00:00:0e, sig: -69, sec: rsn, freq: 2412 2018-11-17T15:04:49.339644-06:00 INFO shill[898]: [INFO:wifi_service.cc(798)] Rep ep updated for 191. [SSID=31], bssid: 94:b4:0f:00:00:31, sig: -71, sec: 802_1x, freq: 2462 2018-11-17T15:04:49.347109-06:00 INFO shill[898]: [INFO:service.cc(297)] Service 262 destroyed. 2018-11-17T15:04:49.354196-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 0 (no endpoints) 2018-11-17T15:04:49.354299-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 1 (no endpoints) 2018-11-17T15:04:49.354356-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 2 (no endpoints) 2018-11-17T15:04:49.354418-06:00 INFO shill[898]: [INFO:service.cc(313)] Suppressed autoconnect to service 3 (no endpoints)
,
Dec 30
shill can be rather spammy in some situations, so this isn't super surprising. Do you actually expect this is causing lifetime issues for the eMMC?
,
Jan 2
,
Jan 2
Looking a little closer: those basically look like normal scan reports, which get triggered when we're not connected to any Wifi networks. The steady-state scan timer behavior when disconnected is to scan once every minute, and a few messages about new/changed network info for each scan is not surprising. And in particular, these "suppressed" connections are noting that there are saved networks that were not found in these scans. It doesn't seem like this is a request to change the scanning behavior. Once per minute doesn't seem excessive, when we aren't happy with the current Wifi state (disconnected). So the most I could recommend here is that we downgrade the priority of the "Suppressed autoconnect" messages so they don't show up at the default INFO level. It's not clear to me that we'd want to downgrade the other messages (e.g., BSS updates, as nearby networks come and go), so we wouldn't avoid all potential scanning-based logspam. I *don't* see where rate limiting would be very helpful. I don't believe a sane rate limit would extend across the 1 minute scan intervals, and the repeats you see within a second or two are distinct pieces of information. I'll repeat/rephrase my question from #2 though: are there guidelines for data writes for respecting flash lifetime? Let's say I approximate 11,682 messages per day as 1MB/day -- is that too much? What if we eliminated half of those (512KB/day) -- is that still too much?
,
Jan 7
#4: See https://bugs.chromium.org/p/chromium/issues/detail?id=912801#c30, these logs are not enough to explain the eMMC being worn out. If you think the log frequency is acceptable/actionable, let's close the bugs.
,
Jan 9
It does look like we've suppressed some of the "suppressed" messages before ;) https://gerrit.chromium.org/gerrit/38974 I'm semi-ambivalent about the "no endpoints" reason code logging; it's mostly self-evident, if you have enough context (we log when endpoints are discovered and removed). But it still could be useful in some cases, as it helps clarify services we could have connected to, if we could find them in scans. But the frequency seems fine to me, so given #5...let's close.
,
Jan 9
On a related note, there are some low hanging fruits. Dropping the log headers (i.e. [INFO:wifi.cc(320)]) can reduce the log size -- we do need to go through existing log messages to ensure there is no ambiguity after dropping headers.
,
Jan 9
I would not prefer dropping the prefixes/headers. They're helpful for a lot of things, like turning up verbosity (e.g., ff_debug --level XX) and still telling the difference between ERROR and VERBOSE<N>. They also help to (implicitly) inform about the module that produced the log. Without that, there's a lot of logs we'd need to modify to be more specific I think. And FWIW, syslog also includes log level (e.g., INFO/ERR/NOTICE). So those types of headers are not exactly abnormal practice.
,
Jan 9
Re #8: We're not completely losing the level. INFO/WARNING/ERROR are actually repeated twice in net.log. The only thing we would lose is verbosity level as they mapped to DEBUG. But verbosity level isn't quite useful as, arguably, we haven't been very consistent about level. Consider these DEBUG shill[4822]: [VERBOSE2:device_info.cc(692)] (device_info) AddLinkMsgHandler(index=1, flags=0x10049, change=0), new_device=0 INFO shill[4822]: [INFO:wifi.cc(377)] ScanDone becoming these DEBUG shill[4822]: (device_info) AddLinkMsgHandler(index=1, flags=0x10049, change=0), new_device=0 INFO shill[4822]: ScanDone The filenames and line numbers are also noisy. Rather than relying on them for precise locations of messages, the messages themselves should carry enough information. |
||||
►
Sign in to add a comment |
||||
Comment 1 by vapier@google.com
, Dec 29