New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 792255 link

Starred by 1 user

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Sign in to add a comment

[backend_impl.cc(1843)] Check failed: data_->header.num_bytes >= 0 (-4 vs. 0)

Project Member Reported by chongz@chromium.org, Dec 5 2017

Issue description

OS: Windows release build

What steps will reproduce the problem?
(1) On Win release, patch https://chromium-review.googlesource.com/c/chromium/src/+/803912/3
(2) Build 'browser_tests'
(3) Run multiple times 'out\Default\browser_tests.exe --gtest_filter=ChromeNetworkServiceRestartBrowserTest.StoragePartitionGetNetworkContext'

What is the expected result?
The test always pass.

What happens instead?
The test will hit a DCHECK at a high chance (>50%)

The patch basically does the following:
1. Load URL "/echo";
2. Terminate Network Service process;
3. Restart Network Service process;
4. Load URL "/echo" again.
The DCHECK seems to happen in 4.

---

Example Log:
https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin10_chromium_x64_rel_ng%2F39758%2F%2B%2Frecipes%2Fsteps%2Fbrowser_tests__with_patch__on_Windows-10-14393%2F0%2Flogs%2FChromeNetworkServiceRestartBrowserTest.StoragePartitionGetNetworkContext%2F0
```
[6000:4604:1201/212204.602:FATAL:backend_impl.cc(1843)] Check failed: data_->header.num_bytes >= 0 (-4 vs. 0)
Backtrace:
	base::debug::StackTrace::StackTrace [0x00007FF644444384+36]
	logging::LogMessage::~LogMessage [0x00007FF6443993CE+94]
	disk_cache::BackendImpl::SubstractStorageSize [0x00007FF644682E21+129]
	disk_cache::BackendImpl::ModifyStorageSize [0x00007FF644682D5B+43]
	disk_cache::EntryImpl::DeleteEntryData [0x00007FF64474E78E+910]
	disk_cache::EntryImpl::~EntryImpl [0x00007FF64475125F+143]
	disk_cache::EntryImpl::`scalar deleting destructor' [0x00007FF644750F60+16]
	disk_cache::BackendImpl::MatchEntry [0x00007FF6446811AD+717]
	disk_cache::BackendImpl::OpenEntryImpl [0x00007FF64467F408+152]
	disk_cache::BackendImpl::SyncOpenEntry [0x00007FF64467F307+119]
	disk_cache::BackendIO::ExecuteBackendOperation [0x00007FF6447400E6+342]
	base::debug::TaskAnnotator::RunTask [0x00007FF644440C40+288]
	base::internal::IncomingTaskQueue::RunTask [0x00007FF64446071B+123]
	base::MessageLoop::RunTask [0x00007FF6443CD159+665]
	base::MessageLoop::DeferOrRunPendingTask [0x00007FF6443CD5B3+195]
	base::MessageLoop::DoWork [0x00007FF6443CD80D+557]
	base::MessagePumpForIO::DoRunLoop [0x00007FF64446434D+173]
	base::MessagePumpWin::Run [0x00007FF644463475+101]
	base::MessageLoop::Run [0x00007FF6443CC9A9+201]
	base::RunLoop::Run [0x00007FF6443A8C38+248]
	base::Thread::Run [0x00007FF6443EA715+197]
	base::Thread::ThreadMain [0x00007FF6443EABF7+727]
	base::PlatformThread::GetCurrentThreadPriority [0x00007FF6443C1629+569]
	BaseThreadInitThunk [0x00007FFD99038364+20]
	RtlUserThreadStart [0x00007FFD9BAF5E91+33]
Received fatal exception EXCEPTION_BREAKPOINT
Backtrace:
	base::debug::BreakDebugger [0x00007FF64444E46D+13]
	?Run@?$Invoker@U?$BindState@P6AXPEBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@base@@1@Z$$V@internal@base@@$$A6AXPEBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@3@1@Z@interna [0x00007FF644F5B281+65]
	logging::LogMessage::~LogMessage [0x00007FF6443997FF+1167]
	disk_cache::BackendImpl::SubstractStorageSize [0x00007FF644682E21+129]
	disk_cache::BackendImpl::ModifyStorageSize [0x00007FF644682D5B+43]
	disk_cache::EntryImpl::DeleteEntryData [0x00007FF64474E78E+910]
	disk_cache::EntryImpl::~EntryImpl [0x00007FF64475125F+143]
	disk_cache::EntryImpl::`scalar deleting destructor' [0x00007FF644750F60+16]
	disk_cache::BackendImpl::MatchEntry [0x00007FF6446811AD+717]
	disk_cache::BackendImpl::OpenEntryImpl [0x00007FF64467F408+152]
	disk_cache::BackendImpl::SyncOpenEntry [0x00007FF64467F307+119]
	disk_cache::BackendIO::ExecuteBackendOperation [0x00007FF6447400E6+342]
	base::debug::TaskAnnotator::RunTask [0x00007FF644440C40+288]
	base::internal::IncomingTaskQueue::RunTask [0x00007FF64446071B+123]
	base::MessageLoop::RunTask [0x00007FF6443CD159+665]
	base::MessageLoop::DeferOrRunPendingTask [0x00007FF6443CD5B3+195]
	base::MessageLoop::DoWork [0x00007FF6443CD80D+557]
	base::MessagePumpForIO::DoRunLoop [0x00007FF64446434D+173]
	base::MessagePumpWin::Run [0x00007FF644463475+101]
	base::MessageLoop::Run [0x00007FF6443CC9A9+201]
	base::RunLoop::Run [0x00007FF6443A8C38+248]
	base::Thread::Run [0x00007FF6443EA715+197]
	base::Thread::ThreadMain [0x00007FF6443EABF7+727]
	base::PlatformThread::GetCurrentThreadPriority [0x00007FF6443C1629+569]
	BaseThreadInitThunk [0x00007FFD99038364+20]
	RtlUserThreadStart [0x00007FFD9BAF5E91+33]
../../chrome/browser/chrome_network_service_restart_browsertest.cc(94): error:       Expected: net::OK
      Which is: 0
To be equal to: LoadBasicRequest(partition->GetNetworkContext())
      Which is: -2
```



jam@'s comment:
```
+Maks & Josh, owners of net/disk_cache

Looks like this cl is triggering a possible bug in the disk cache code.

 	browser_tests.exe!disk_cache::BackendImpl::SubstractStorageSize(int bytes) Line 1844	C++	Symbols loaded.
 	browser_tests.exe!disk_cache::BackendImpl::ModifyStorageSize(int old_size, int new_size) Line 945	C++	Symbols loaded.
>	browser_tests.exe!disk_cache::EntryImpl::DeleteEntryData(bool everything) Line 518	C++	Symbols loaded.
 	browser_tests.exe!disk_cache::EntryImpl::~EntryImpl() Line 983	C++	Symbols loaded.
The problem is that DeleteEntryData() calls backend_->ModifyStorageSize(entry_.Data()->key_len, 0);

key_len is 27. However in BackendImpl::SubstractStorageSize data_->header.num_bytes was 23, so then after 27 is subtracted from it it is negative and the DCHECK fires. I'm not familiar with this code, so deferring to experts
```



morlovich@'s comment:
```
So what I think it comes down to is that the cache size is lazily updated, and some deferred updates here:

https://cs.chromium.org/chromium/src/net/disk_cache/blockfile/entry_impl.cc?rcl=41ff695f9fb5bf551f25da79f4b5ef79bdcd1081&l=964

get interrupted (by the process kill, I presume?) --- the off-by-4 mentioned in the earlier comments basically requires this loop to have run with [0] and not [1]; for some reason that happens consistently to me (which is a bit disconcerting --- I don't like things I can't explain). I don't know if all this deferring is worth the effort, since the scary "flush the index" calls it avoids don't do anything[1], but even if I change this particular thing, blockfile cache and crashes mean a potentially bad time next time, basically as a design flaw for the system. 

[1] they would do something on Android, but blockfile cache isn't even built there.
```

See the patch mentioned in reproduce steps (1) for more discussions.
 
We also probably shouldn't be [D]CHECKing if on-disk files were corrupted.  Disks do all sorts of strange things.  Memory, too, I suppose, though that's a little more in our control.
Agreed, but corrupting the field has effects other than the DCHECK alone...

Sign in to add a comment