Chrome Cache API exception |
||||
Issue descriptionWe got the following feedback from one of the enterprise partners: When trying to write to the caches api, they are getting the following error: "Uncaught (in promise) DOMException: Entry was not found." This is not happening always. Is there some way to remove the simple cache backend and create a new empty one when we get this error? Full log can be found at https://gist.github.com/csexton/a44912524e7afbc9dc1db7c753358b09 Some relevant logs below: [8685:8865:0815/104506.864619:INFO:remote_commands_service.cc(161)] Remote commands fetched. [8685:8865:0815/104506.899042:ERROR:simple_version_upgrade.cc(164)] File structure does not match the disk cache backend. [8685:8865:0815/104506.899291:ERROR:simple_backend_impl.cc(630)] Simple Cache Backend: wrong file structure on disk: /home/chronos/u-1134631a3449ab75f309eabbd2a6c84545e065f4/Storage/ext/bcdedldcbikpjmjbllhobfaohpajojli/def/Service Worker/CacheStorage/9a2b2fcfe89d2acf3d92909929566b9a51c251b1/5aa019f5-27e0-418e-9a62-2bb1a8d207c2 [8685:8866:0815/104506.899685:ERROR:disk_cache.cc(132)] Unable to create cache [8685:8685:0815/104506.951788:ERROR:multi_user_window_manager_stub.cc(60)] Not implemented reached in virtual void chrome::MultiUserWindowManagerStub::RemoveObserver(chrome::MultiUserWindowManager::Observer *) [8685:8685:0815/104507.580941:INFO:upload_job_impl.cc(399)] URL fetch completed.
,
Aug 16 2017
+cbentzel That error is from simple disk cache backend failing to recognize the existing file. The cache API is a Web API, not platform app API.
,
Aug 16 2017
Is this the cache web API? What is the context of this issue - what are they actually doing?
,
Aug 16 2017
,
Aug 17 2017
I think it's CacheStorage, going by the directory? Maybe it needs to have delete-and-retry logic the way AppCache and ServiceWorker do (unless it does already?), or to just pass force=true to CreateCacheBackend if it doesn't put anything other than the cache in the directory to have CreateCacheBackend take care of that. The debug output itself suggests that the index file in that directory doesn't look like one of ours --- too small, wrong magic, or some FS trouble. Its size and the first 8 bytes may give some idea of what's wrong, but it's unlikely to be too insightful. Also, is this on ChromeOS, by any chance?
,
Aug 17 2017
Yes, it is on Chrome OS.
,
Aug 17 2017
I believe this issue originated with us, sure looks like our logs there :)
This is on ChromeOS, seeing it one 2 of the 4 test machines.
The test machines:
- Have different hardware
- Running different ChromeOS versions (M59 and M60)
- Fresh installs from USB recovery as of last week
- One in debug mode with remote devtools enabled
One interesting note: Caches API does work in a sandboxed `<webview>` tag, just not in the chrome app itself.
Below is the method that is failing with the "Uncaught (in promise) DOMException: Entry was not found."
loadOrFetchDetails(body, url) {
let self = this;
return new Promise(function(resolve, reject) {
caches.open(CACHE_NAME).then(function(cache) {
cache.match(url).then(function(matchedResponse) {
if (matchedResponse) {
matchedResponse.json().then(function(jsonData) {
let details = jsonData.details;
resolve(details, true);
});
}
else {
let request = new Request(url, {
method: "POST",
headers: new Headers({"Content-Type": "application/json"}),
body: JSON.stringify(body)
});
fetch(request).then(function(fetchResponse) {
cache.put(url, fetchResponse.clone());
return fetchResponse;
})
.then(function(response) {
response.json().then(function(jsonData) {
let details = jsonData.details;
resolve(details);
});
})
.catch(function(fetchError) {
reject("Error fetching details");
});
}
});
});
});
}
Happy to supply more details, I can easily reproduce here.
,
Aug 17 2017
This might give some insight into https://bugs.chromium.org/p/chromium/issues/detail?id=620424 then --- which notes that backend error rate on creation is much higher on ChromeOS.
,
Aug 17 2017
Thanks for providing the detail, though I am like 50 layers below all this JS stuff. A couple of more questions: 1) Any chance you could hexdump the first 8 bytes of the "index" file in the directory in question? What size is it? 2) Are you doing any sort of syncing of data that may be copying the cache directories around?
,
Aug 17 2017
1) The index file
Looks like "index" and "index-dir/the-real-index" are both 0 bytes:
localhost 5aa019f5-27e0-418e-9a62-2bb1a8d207c2 # ls -al ./**
-rw-r--r--. 1 chronos chronos 0 Aug 9 13:33 ./d70841621db5d4b7_0
-rw-r--r--. 1 chronos chronos 0 Aug 9 13:33 ./fa5e53aa798c61ca_0
-rw-r--r--. 1 chronos chronos 0 Aug 9 13:33 ./index
./index-dir:
-rw-r--r--. 1 chronos chronos 0 Aug 9 13:33 the-real-index
2) Data syncing
We are not doing anything with syncing or interacting with the filesystem directly at all, everything is using the ChromeOS JS APIs.
,
Aug 17 2017
That's pretty unexpected. We can make the _0 files 0-length on purposes in some circumstances, but the other ones definitely shouldn't be, and it's unclear how one would even get to that state short a machine crash, since they're written out at pretty different times. Best I can say for getting out of this state is that caches.delete(CACHE_NAME) might just work and might "unwedge" it, though that's a pretty horrid workaround, but if it does, it would be interesting if trouble re-occurs.
,
Aug 17 2017
0 sized file feels like that we are FS corruption again. Could you attach /var/log/messages? It would be interesting to see whether there are signs of FS corruption there.
,
Aug 17 2017
Attached, it was only 17 lines long when I cat'd it. I reproduced the error twice, and noticed each time it added the following message:
2017-08-17T14:29:01.778706-04:00 DEBUG kernel: [ 9336.979035] SELinux: initialized (dev proc, type proc), uses genfs_contexts
,
Aug 17 2017
Re #13: Looks like the message file just get rolled. Can you attach /var/log/messages.1 ?
,
Aug 17 2017
Sure thing, attached the previous log file.
,
Aug 17 2017
There are FS corruption logs after cryptohome mount. Most likely the user profile stored in the cryptohome is somehow corrupted, which probably explain the 0 sized file. Usually the corruptions are caused by device shuts down uncleaning (power loss, kernel crash etc). 2017-08-17T10:58:54.468322-04:00 INFO cryptohomed[1231]: An asynchronous mount request with sequence id: 1 finished; doing PKCS11 init... ... 2017-08-17T10:58:54.483468-04:00 DEBUG kernel: [ 7.682993] Valid eCryptfs headers not found in file header region or xattr region, inode 261890 ...
,
Aug 17 2017
Is there a suggested resolution for ChromeOS? Are we talking about a full ChromeOS recovery? I can certainly do that, but since we could reproduce I wanted to keep it in this state for any testing.
,
Aug 17 2017
I'll leave it to xiyuan@ to provide ChromeOS-specific expertise. WRT to underlying thing: I do suspect we may have an actual robustness bug in CacheStorage layer wrt to how it deals with FS corruption (it's a cache, so we should be dropping it rather than just erroring out), but creating some size 0 files to reproduce your state on disk shouldn't be a problem.
,
Aug 17 2017
Re #17: For this particular bug, the corruption might be limited to just the cache file. We can try to following to remedy the situation: 1. Try caches.delete(CACHE_NAME) as in #11 when the exception happens to see if that could get the app out of the state. This would be ideal if the app can handle the situation itself; 2. Try uninstall then install the app. This nukes all the data associated with the app and could bring the app back to a working state; 3. Try remove and the user back. This nukes the whole user profile and re-create the user's cryptohome; 4. If none of the above works, then full recovery is the only way out;
,
Aug 17 2017
Awesome, option #1 worked! I was able to run `caches.delete(CACHE_NAME)` in the console and the app went back to caching content for me. I may add a startup guard to my app that writes a simple value to the caches and reads it out to ensure the filesystem is not corrupt. One final question about this (and the related issue where it was failing 0.03% the time), have you see this occur once the cache files have been established and are being used? Or is this only when creating the new files? I wondering if it would be necessary to have a health monitor job periodically check the caches system or if a check during startup is all we need.
,
Aug 18 2017
We don't have a root cause for that bug (just an observation from statistics) so I can't be sure in general; for the cause you hit a startup check is sufficient, though. |
||||
►
Sign in to add a comment |
||||
Comment 1 by sduraisamy@chromium.org
, Aug 16 2017