New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 650299 link

Starred by 1 user

Issue metadata

Status: Duplicate
Owner: ----
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

moblab: rate limit apache log file - limit messages in result

Project Member Reported by gwendal@chromium.org, Sep 26 2016

Issue description

Running CTS test suite over the WE, moblab ([2620:0:1000:1305:2e60:cff:fe6d:99be], currently 100.107.3.223]) frontend does not work anymore:

<html><head>
<title>500 Internal Server Error</title>

apache2/error.log:
TemplateDoesNotExist: 500.html

Looking at mysql log file, it looks like the database went down because there was a space issue:

Version: '5.5.32-MariaDB'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Source distribution
160923 12:45:49 [ERROR] mysqld: Table './wmdb/msg_log' is marked as crashed and should be repaired
160923 12:45:49 [Warning] Checking table:   './wmdb/msg_log'
160924 11:26:55  InnoDB: Error: Write to file ./rawdb/import_tko_machines.ibd failed at offset 0 0.
InnoDB: 65536 bytes should have been written, only 0 were written.
InnoDB: Operating system error number 28.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 28 means 'No space left on device'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/operating-system-error-codes.html
160924 11:26:55  InnoDB: TRUNCATE TABLE rawdb/import_tko_machines failed to create a new tablespace
160924 11:26:55  InnoDB: TRUNCATE TABLE wmdb/import_tko_machines failed to create a new tablespace
160924 11:27:52  InnoDB: Assertion failure in thread 139792667289344 in file os0file.c line 4355
InnoDB: Failing assertion: ret
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
160924 11:27:52 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 5.5.32-MariaDB
key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=7
max_threads=153
thread_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 136596 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7f240aa1a580
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f2404469eb0 thread_stack 0x48000
??:0(??)[0x7f24087b3f4f]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f23d80b4378): is an invalid pointer
Connection ID (thread ID): 28522
Status: NOT_KILLED



Looking at disk usage, indeed, /var is full:
df
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/root        2015824   1860116    155708  93% /
devtmpfs         1986292         0   1986292   0% /dev
tmp              1988648       232   1988416   1% /tmp
run              1988648       372   1988276   1% /run
shmfs            1988648      4580   1984068   1% /dev/shm
/dev/sda1      118657308 108655212   3951588  97% /var
/dev/sda8          11760        24     11412   1% /usr/share/oem
media            1988648         0   1988648   0% /media
none             1988648         0   1988648   0% /sys/fs/cgroup

Looking at /var, the culprit is apache log:
du --max-depth=2 /var | sort -rn
2546820 /var
1996884 /var/log
1870892 /var/log/apache2
455912  /var/spool
455896  /var/spool/crash
93844   /var/lib
93028   /var/lib/mysql
14888   /var/log/devserver
...

ls -l /var/log/apache2/
total 1870888
-rw-r--r-- 1 apache apache 1898477079 Sep 26 09:12 access_log
-rw-r--r-- 1 apache apache   17297923 Sep 26 09:11 error_log

access_log should be rate limited to avoid this situation.

Note that some users experienced similar issue (error 28) resulting in spurious disk qual failures. I was expecting their results partition was too small, but it may be due to this issue as well.

 
screen-exchange
2.8 KB View Download
mysqld.err
32.0 KB Download
Summary: moblab: rate limit apache log file - limit messages in result (was: moblab: rate limit apache log file)
Actually, the result directory is full:
103104896       /mnt/stateful_partition/dev_image/moblab-storage/results

ssh guado du --max-depth=2 /mnt/stateful_partition/dev_image/moblab-storage | sort -rn > /tmp/moblab-storage.du.txt

Some test result weights ~600MB, so a 128GB won't be able to hold more than 200 of them: a single arc-cts may fill the result directory.

Looking at such a big file:
du --max-depth=2 results/216-moblab | sort -n
8       results/216-moblab/192.168.231.102/host_keyvals
40      results/216-moblab/192.168.231.102/ssp_logs
256     results/216-moblab/192.168.231.102/debug
195008  results/216-moblab/192.168.231.102/sysinfo
507640  results/216-moblab/192.168.231.102/cheets_CTS.android.libcorelegacy22
703068  results/216-moblab/192.168.231.102
703072  results/216-moblab

Looking in more details:
localhost cheets_CTS.android.libcorelegacy22 # du --max-depth=2 . | sort -n
4       ./cheets_CTSHelper/profiling
4       ./cheets_CTSHelper/results
4       ./profiling
60      ./cheets_CTSHelper/debug
200     ./debug
236     ./results/android-cts
240     ./results
808     ./cheets_CTSHelper/sysinfo
888     ./cheets_CTSHelper
194892  ./sysinfo/reboot_current
311080  ./sysinfo/var
506216  ./sysinfo
507640  .

Reveal that messages in various sysinfo directories is too big:
find . -size +1M -exec ls -l {} \+
-rw-r--r-- 1 moblab moblab 103281918 Sep 24 10:21 ./reboot_current/var/log/messages
-rw-r--r-- 1 moblab moblab  10859711 Sep 24 10:21 ./reboot_current/var/log/net.log
-rw-r--r-- 1 moblab moblab   4012751 Sep 23 18:56 ./reboot_current/var/log/screenshots/login-screen-0.png
-rw-r--r-- 1 moblab moblab   4524030 Sep 24 10:21 ./reboot_current/var/log/secure
-rw-r--r-- 1 moblab moblab  47399790 Sep 24 02:15 ./reboot_current/var/log/ui/ui.20160924-013142
-rw-r--r-- 1 moblab moblab 103429440 Sep 24 10:19 ./var/log/messages
-rw-r--r-- 1 moblab moblab  10893108 Sep 24 10:18 ./var/log/net.log
-rw-r--r-- 1 moblab moblab   4012751 Sep 23 18:56 ./var/log/screenshots/login-screen-0.png
-rw-r--r-- 1 moblab moblab   4550294 Sep 24 10:19 ./var/log/secure
-rw-r--r-- 1 moblab moblab  47399790 Sep 24 02:15 ./var/log/ui/ui.20160924-013142
-rw-r--r-- 1 moblab moblab 103304101 Sep 24 10:21 ./var/log_diff/messages
-rw-r--r-- 1 moblab moblab  10859711 Sep 24 10:21 ./var/log_diff/net.log
-rw-r--r-- 1 moblab moblab   4524030 Sep 24 10:21 ./var/log_diff/secure

There is > 300MB worth of duplicate messages here.

moblab-storage.du.txt
9.6 KB View Download

Comment 2 by krk@chromium.org, Dec 7 2016

Did you have a 1 TB USB storage attached to the moblab? We're now making it a requirement for CTS

Comment 3 by krk@chromium.org, Jan 11 2017

Ping
Mergedinto: 701467
Status: Duplicate (was: Untriaged)
The apache log file size has been limited, however external storage is still recommended generally but required for CTS

Sign in to add a comment