Bug in compression::GetUncompressedSize and compression::GzipUncompress |
|||||||
Issue descriptionIncomplete or truncated gzipped files do not have their size in the last 4 bytes, as compression::GetUncompressedSize() assumes. Instead, these files might have an arbitrary value, which could be quite high. In such a case, when the value taken from GetUncompressedSize() is used for e.g. std::string::resize(), as GzipUncompress does, the process is liable to run out of memory and crash. To convince oneself of the validity of this issue, trying compressing a short string with Z_FLUSH_SYNC, then close the file without compressing anything with Z_FINISH (which would produce the compression footer/trailer). On such a file, GetUncompressedSize would return 0xFFFF0000 as the size, and GzipUncompress would crash.
,
Jul 5
Can you clarify where you were seeing this? I didn't realize we ever called this on untrusted content.
,
Jul 5
I have triggered this by writing (unrelated) unit tests that created truncated files. Writing to disk is not guaranteed to succeed; applications can crash while writing, external drives disconnected, remote (network) drives become unreachable, data can be corrupted. All .gz files are untrusted content, IMHO.
,
Jul 6
+Jay, who was probably the most recent person to dig deep into the zlib code.
,
Jul 10
To summarize some of my thoughts: * Files are inherently untrusted sources, because they can be truncated or corrupted. * The bug described can cause a crash (due to OOM) from multiple places. Some FYIs will be sent to owners of affected code, for them to decide if this is a major issue for them now, before this can be properly fixed. * This is a problem both through GetUncompressedSize itself (it is public) as well as through the other compression::* functions which rely on it. * If this were up to me, I would probably break this down into two functions - (1) one that would perform a CRC check and provide a guaranteed size, at the cost of reading the entire file, and (2) another version that would, like the current implementation, only read the last few bytes, but would be given an explicit ceiling by the caller, and an error would be given if the size appears to be above the reasonable ceiling; the reasoning for providing the ceiling to the function is that it forces the caller of the function to be aware of the danger.
,
Jul 10
I don't know much about zlib, but your 4th points makes sense to me. We should always be using that code in a sandboxed environment so this shouldn't be much of a security concern, but it would of course be way better not to crash. I am now no longer working on Chrome so I won't be able to help.
,
Jul 10
About the sandbox - what if this code were used from the browser process? Is there anything to prevent it? Or did I misunderstand?
,
Jul 10
Thanks for adding your thoughts, Jay! Elad, if you're up for taking the time to implement your suggested changes, I'm happy to review them. I won't have a chance to implement this myself. FYI, Alexei and I are listed as owners for the library mostly because nobody else was, rather than because we have ever actively maintained it. I think we'd both be happy if someone else wanted to become more of a proper owner for the code! But in the meantime, mostly individual changes to the code get made as they're needed, by the people who find themselves needing them.
,
Jul 19
,
Jul 19
,
Jul 19
Alerting people who have last touched affected code (outside of unit tests): * components/metrics/log_decoder.cc (paulmiller for introducing the code, tobiasjs for usage) * components/variations/variations_seed_store.cc (veranika) * content/browser/webui/web_ui_url_loader_factory.cc (jam) * content/child/blink_platform_impl.cc (jbroman) * extensions/renderer/resource_bundle_source_map.cc (rdevlin.cronin, isherman) * extensions/renderer/string_source_map.cc (rdevlin.cronin, isherman) * third_party/blink/renderer/platform/media/resource_bundle_helper.cc (beccahughes)
,
Jul 19
The blink_platform_impl.cc use is exclusively used for data that was shipped with Chrome in our resource pak. It's trusted data by necessity, and it's not clear that there's anything we can do to recover gracefully if the data is corrupt. (e.g. running without a UA stylesheet would be bad).
,
Jul 19
About #12 - yes, I see that you CHECK(), too, so I guess this is not relevant for you.
,
Jul 19
Extensions usage is similar to #12 in that it's only for resources compiled with Chromium.
,
Jul 19
isherman@ seems to be a better owner for components/variations/variations_seed_store.cc
,
Jul 19
resource_bundle_helper.cc is in the same position as #12 and we use CHECK too.
,
Jul 20
web_ui_url_loader_factory also uses it with trusted data that we ship.
,
Jul 21
1. Do we perhaps want to fail more gracefully if those files become corrupt, or is crashing due to OOM acceptable here? 2. It's good if none of the current uses of compression::GzipUncompress are an issue, but one should probably still fix it, because there's no guarantee that there won't be unsafe uses of this unsafe function later. |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by eladalon@chromium.org
, Jul 5