
Issue 866396

Issue metadata

Status: Started
Owner:
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug



Collect and report native code resident set size (RSS) across processes on Android

Project Member Reported by lizeb@chromium.org, Jul 23

Issue description

Code resident set size is a murky metric that we don't really report currently. Here are several ways we try to approximate it:

1. Calling mincore() on the range of native code
2. Reporting SUM(PSS(code) for all chrome processes)


Issues with these:
==================

1. mincore()
------------

mincore() will report that a page is "in core" if accessing it would not cause a major page fault. This means that unless there is memory pressure, the native library prefetch that we do will make it report that everything is in core. Similarly, the numbers will be affected by other processes running on the system (for instance WebView, when using Monochrome). This is actually visible on the LibraryLoader.PercentageOfResidentCodeBeforePrefetch.ColdStartup UMA histogram.
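
For illustration, a minimal sketch of the mincore() approach. CountPagesInCore is a hypothetical helper written for this note, not Chromium's implementation; it assumes the Linux/bionic signature of mincore() (unsigned char* vector) and a page-aligned start address.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <vector>

// Hypothetical helper: counts the pages in [start, start + length) that
// mincore() reports as "in core". |start| must be page-aligned.
// Returns -1 if mincore() fails.
long CountPagesInCore(void* start, size_t length) {
  const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  const size_t num_pages = (length + page_size - 1) / page_size;
  std::vector<unsigned char> vec(num_pages);
  if (mincore(start, length, vec.data()) != 0)
    return -1;
  long resident = 0;
  for (unsigned char v : vec)
    resident += v & 1;  // Only the least significant bit is defined.
  return resident;
}
```

For a file-backed mapping such as the native library, a page counts as "in core" as soon as it is in the page cache, whether or not this process ever touched it. That is why prefetch and other processes inflate the number.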

2. SUM(PSS)
-----------

PSS only sums up to RSS when taking all the processes sharing a given page into account. Since we can only look at Chrome's processes, the resulting number is an underestimate in the presence of other processes sharing the same pages, such as WebView with Monochrome. In addition, collecting the numbers requires parsing /proc/self/smaps, which is quite costly.
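
As a rough sketch of what the per-process part of SUM(PSS) involves: the function below walks smaps-formatted text and sums the "Pss:" field of executable mappings of one library. SumExecutablePssKb and its library_suffix parameter are made up for this note, and the parser only looks at the first path token, so paths containing spaces are not handled.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: sums Pss (in kB) over executable mappings whose path
// ends with |library_suffix|, reading text in /proc/<pid>/smaps format.
uint64_t SumExecutablePssKb(std::istream& smaps,
                            const std::string& library_suffix) {
  uint64_t total_kb = 0;
  bool in_matching_mapping = false;
  std::string line;
  while (std::getline(smaps, line)) {
    std::istringstream fields(line);
    std::string first;
    if (!(fields >> first))
      continue;
    if (first.back() != ':') {
      // Mapping header: "start-end perms offset dev inode [path]".
      std::string perms, offset, dev, inode, path;
      fields >> perms >> offset >> dev >> inode;
      fields >> path;  // Stays empty for anonymous mappings.
      const bool executable = perms.size() >= 3 && perms[2] == 'x';
      const bool name_matches =
          path.size() >= library_suffix.size() &&
          path.compare(path.size() - library_suffix.size(),
                       library_suffix.size(), library_suffix) == 0;
      in_matching_mapping = executable && name_matches;
    } else if (in_matching_mapping && first == "Pss:") {
      uint64_t kb = 0;
      if (fields >> kb)
        total_kb += kb;
    }
  }
  return total_kb;
}
```

The kernel has to walk the page tables of every mapping to produce this text, and the reader then has to parse all of it, which is where the cost comes from.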


To sum it up, mincore() will overestimate memory usage, and PSS will underestimate it.


Proposal
========

We know the virtual address range of executable code in each process, using the anchor symbols declared in anchor_functions.h. These are guaranteed to be correct (except for component builds) since they are generated via a linker script.

Parsing /proc/self/pagemap in the matching range tells us whether a page is mapped in a given process. Taking the union of all the mapped page sets in all Chrome processes gives us a picture of code RSS which is unaffected by prefetch and WebView.

/proc/self/pagemap doesn't necessarily contain the page frame numbers depending on the kernel, but the only relevant info in this case is the offset relative to the start of the executable code range, which we know in each process, so we don't need the physical addresses to compute the overall footprint.
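
A minimal sketch of the pagemap read, under a few assumptions: the function names are invented for this note, the start/end addresses stand in for the anchor symbols, and a short read is simply treated as failure. Each virtual page has one 64-bit entry in the file, and bit 63 says whether the page is present in RAM, which is all we need here.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <vector>

// Bit 63 of a pagemap entry: page is present in RAM.
constexpr uint64_t kPagePresentBit = uint64_t{1} << 63;

// Hypothetical helper: turns raw pagemap entries into per-page flags.
// Index i is the page offset from the start of the code range.
std::vector<bool> PresentPagesFromEntries(
    const std::vector<uint64_t>& entries) {
  std::vector<bool> present(entries.size());
  for (size_t i = 0; i < entries.size(); ++i)
    present[i] = (entries[i] & kPagePresentBit) != 0;
  return present;
}

// Hypothetical helper: reads the pagemap entries covering [start, end).
// Uses open(), one pread() and close(). Returns false on any failure.
bool ReadPagemapEntries(uintptr_t start, uintptr_t end,
                        std::vector<uint64_t>* entries) {
  const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  const uintptr_t first_page = start / page_size;
  const uintptr_t end_page = (end + page_size - 1) / page_size;  // Exclusive.
  entries->resize(end_page - first_page);
  const int fd = open("/proc/self/pagemap", O_RDONLY);
  if (fd < 0)
    return false;
  const size_t bytes = entries->size() * sizeof(uint64_t);
  const ssize_t n = pread(fd, entries->data(), bytes,
                          static_cast<off_t>(first_page * sizeof(uint64_t)));
  close(fd);
  return n == static_cast<ssize_t>(bytes);
}
```

Since only bit 63 is inspected, the result is the same whether or not the kernel zeroes out the PFN bits.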

The rough sketch is then:
- In each process, open and parse /proc/self/pagemap and collect the set of mapped code pages
- Send these over IPC/Mojo to the browser process, merge the sets, then compute the metric.
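
The merge step on the browser side could look roughly like this. It is a sketch only: MergeMappedPages and CodeRssBytes are made-up names, the IPC transport is left out, and each process is assumed to send one flag per page, indexed by offset from the start of the code range.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: union of the per-process "page is mapped" sets.
std::vector<bool> MergeMappedPages(
    const std::vector<std::vector<bool>>& per_process) {
  std::vector<bool> merged;
  for (const auto& pages : per_process) {
    if (pages.size() > merged.size())
      merged.resize(pages.size());
    for (size_t i = 0; i < pages.size(); ++i) {
      if (pages[i])
        merged[i] = true;
    }
  }
  return merged;
}

// Hypothetical helper: code RSS is the number of pages mapped in at least
// one process, times the page size.
size_t CodeRssBytes(const std::vector<bool>& merged, size_t page_size) {
  return static_cast<size_t>(
             std::count(merged.begin(), merged.end(), true)) *
         page_size;
}
```

A page mapped by both the browser and a renderer is counted once, which is the point of taking the union rather than summing per-process counts.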

In terms of cost, this would require, per process:
- 3 syscalls (open(), read(), close()) provided that we can allocate 8 bytes per page (~80kB per process)
- 1 bit per page for the bitfield
- 1 IPC to the browser process
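
To make the arithmetic explicit, here is a small sketch of the per-process cost. The inputs are assumptions picked to reproduce the ~80kB figure (a code range of about 40 MiB and 4 KiB pages); the issue does not state the actual code size. EstimateCost and PagemapCost are names invented for this note.

```cpp
#include <cstddef>
#include <cstdint>

struct PagemapCost {
  size_t num_pages;          // Pages in the code range.
  size_t read_buffer_bytes;  // 8 bytes (one pagemap entry) per page.
  size_t bitfield_bytes;     // 1 bit per page, rounded up to whole bytes.
};

// Hypothetical helper: cost of one pagemap scan over |code_bytes| of code.
inline PagemapCost EstimateCost(size_t code_bytes, size_t page_size) {
  const size_t num_pages = (code_bytes + page_size - 1) / page_size;
  return {num_pages, num_pages * sizeof(uint64_t), (num_pages + 7) / 8};
}
```

With those assumed inputs, EstimateCost(40 * 1024 * 1024, 4096) gives 10240 pages, an 81920-byte read buffer and a 1280-byte bitfield to send over IPC.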

This relies on the assumption that we can read /proc/self/pagemap, which needs to be checked.

 
Owner: mattcary@chromium.org
Status: Started (was: Available)
Cc: ssid@chromium.org
I'm pulling together a quick implementation and will benchmark so we understand the costs.

According to ssid@, smaps parsing as is done by a detailed memory dump is ~200ms, and mincore (using ProcessMemoryDump::CountResidentBytes) is ~2ms.

From a conversation with mattcary@:

Parsing /proc/self/pagemap does *not* work on Android Go with O MR1, as it uses the mediatek 3.18 kernel. It should work on MSM 4.4 (snapdragon) kernel.


Story time:
===========

Once upon a time, /proc/self/pagemap disclosed the Page Frame Number (PFN) of each page, revealing the physical address behind a virtual address. This was used to exploit the "rowhammer" vulnerability, so access was restricted to processes with CAP_SYS_ADMIN.
In 4.0 and 4.1 this was done by denying open() outright; in 4.2 the code was changed to allow the open but zero out the PFN unless the calling process has CAP_SYS_ADMIN.

Even though rowhammer was not (at the time at least) exploitable on ARM, the fix was likely backported to 3.18 as a security patch, but only in its simple form: denying access to /proc/self/pagemap.

Indeed, see https://android.googlesource.com/kernel/mediatek/+/android-3.18/fs/proc/task_mmu.c#1393:

static int pagemap_open(struct inode *inode, struct file *file)
{
	/* do not disclose physical addresses: attack vector */
	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;
	pr_warn_once("Bits 55-60 of /proc/PID/pagemap entries are about "
			"to stop being page-shift some time soon. See the "
			"linux/Documentation/vm/pagemap.txt for details.\n");
	return 0;
}


vs in MSM 4.4 (and upstream) https://android.googlesource.com/kernel/msm/+/android-4.4/fs/proc/task_mmu.c#1390:

static int pagemap_open(struct inode *inode, struct file *file)
{
	struct mm_struct *mm;
	mm = proc_mem_open(inode, PTRACE_MODE_READ);
	if (IS_ERR(mm))
		return PTR_ERR(mm);
	file->private_data = mm;
	return 0;
}


Conclusion: doesn't work on some devices at least, should work on recent-ish MSM ones, need to check with a more recent MTK kernel.
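
Given the two kernel behaviours above, a runtime probe is one way to decide whether the metric can be collected on a given device. This is only a sketch: ProbePagemap and the PagemapAccess enum are invented for this note, and the path parameter exists purely so the function can be exercised against other files.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>

enum class PagemapAccess { kReadable, kPermissionDenied, kOtherError };

// Hypothetical helper: checks whether the pagemap file can be opened.
// On kernels with the CAP_SYS_ADMIN check in pagemap_open(), open() fails
// with EPERM for an unprivileged process.
PagemapAccess ProbePagemap(const char* path = "/proc/self/pagemap") {
  const int fd = open(path, O_RDONLY);
  if (fd >= 0) {
    close(fd);
    return PagemapAccess::kReadable;
  }
  return (errno == EPERM || errno == EACCES)
             ? PagemapAccess::kPermissionDenied
             : PagemapAccess::kOtherError;
}
```

A process that gets kPermissionDenied would skip reporting rather than send an empty page set, so that devices with the restrictive kernel do not skew the histogram towards zero.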

Cc: digit@chromium.org
For benchmarking, this 'security' patch should be easy to revert. I remember digit@ mentioning that rebuilding and reflashing the kernel is a relatively easy process (go/androidkernel?)
Update: this will be a useful stat to add, even if it will only be valid on new devices. Native library size will be a long-term concern for Chrome's footprint, so it will be helpful to have something to track it in the future.

We have a new testing device on order, and once I can test my CL on that device I can move forward with launching these stats.
