moblab: rate limit apache log file - limit messages in result |
||
Issue descriptionRunning CTS test suite over the WE, moblab ([2620:0:1000:1305:2e60:cff:fe6d:99be], currently 100.107.3.223]) frontend does not work anymore: <html><head> <title>500 Internal Server Error</title> apache2/error.log: TemplateDoesNotExist: 500.html Looking at mysql log file, it looks like the database went down because there was a space issue: Version: '5.5.32-MariaDB' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Source distribution 160923 12:45:49 [ERROR] mysqld: Table './wmdb/msg_log' is marked as crashed and should be repaired 160923 12:45:49 [Warning] Checking table: './wmdb/msg_log' 160924 11:26:55 InnoDB: Error: Write to file ./rawdb/import_tko_machines.ibd failed at offset 0 0. InnoDB: 65536 bytes should have been written, only 0 were written. InnoDB: Operating system error number 28. InnoDB: Check that your OS and file system support files of this size. InnoDB: Check also that the disk is not full or a disk quota exceeded. InnoDB: Error number 28 means 'No space left on device'. InnoDB: Some operating system error numbers are described at InnoDB: http://dev.mysql.com/doc/refman/5.5/en/operating-system-error-codes.html 160924 11:26:55 InnoDB: TRUNCATE TABLE rawdb/import_tko_machines failed to create a new tablespace 160924 11:26:55 InnoDB: TRUNCATE TABLE wmdb/import_tko_machines failed to create a new tablespace 160924 11:27:52 InnoDB: Assertion failure in thread 139792667289344 in file os0file.c line 4355 InnoDB: Failing assertion: ret InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 160924 11:27:52 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see http://kb.askmonty.org/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 5.5.32-MariaDB key_buffer_size=16777216 read_buffer_size=262144 max_used_connections=7 max_threads=153 thread_count=3 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 136596 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0x7f240aa1a580 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f2404469eb0 thread_stack 0x48000 ??:0(??)[0x7f24087b3f4f] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f23d80b4378): is an invalid pointer Connection ID (thread ID): 28522 Status: NOT_KILLED Looking at disk usage, indeed, /var is full: df Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 2015824 1860116 155708 93% / devtmpfs 1986292 0 1986292 0% /dev tmp 1988648 232 1988416 1% /tmp run 1988648 372 1988276 1% /run shmfs 1988648 4580 1984068 1% /dev/shm /dev/sda1 118657308 108655212 3951588 97% /var /dev/sda8 11760 24 11412 1% /usr/share/oem media 1988648 0 1988648 0% /media none 1988648 0 1988648 0% /sys/fs/cgroup Looking at /var, the culprit is apache log: du --max-depth=2 /var | sort -rn 2546820 /var 1996884 /var/log 1870892 /var/log/apache2 455912 /var/spool 455896 /var/spool/crash 93844 /var/lib 93028 /var/lib/mysql 14888 /var/log/devserver ... ls -l /var/log/apache2/ total 1870888 -rw-r--r-- 1 apache apache 1898477079 Sep 26 09:12 access_log -rw-r--r-- 1 apache apache 17297923 Sep 26 09:11 error_log access_log should be rate limited to avoid this situation. Note that some users experienced similar issue (error 28) resulting in spurious disk qual failures. I was expecting their results partition was too small, but it may be due to this issue as well.
,
Dec 7 2016
Did you have a 1 TB USB storage attached to the moblab? We're now making it a requirement for CTS
,
Jan 11 2017
Ping
,
Dec 18 2017
The apache log file size has been limited, however external storage is still recommended generally but required for CTS |
||
►
Sign in to add a comment |
||
Comment 1 by gwendal@chromium.org
, Sep 26 2016Actually, the result directory is full: 103104896 /mnt/stateful_partition/dev_image/moblab-storage/results ssh guado du --max-depth=2 /mnt/stateful_partition/dev_image/moblab-storage | sort -rn > /tmp/moblab-storage.du.txt Some test result weights ~600MB, so a 128GB won't be able to hold more than 200 of them: a single arc-cts may fill the result directory. Looking at such a big file: du --max-depth=2 results/216-moblab | sort -n 8 results/216-moblab/192.168.231.102/host_keyvals 40 results/216-moblab/192.168.231.102/ssp_logs 256 results/216-moblab/192.168.231.102/debug 195008 results/216-moblab/192.168.231.102/sysinfo 507640 results/216-moblab/192.168.231.102/cheets_CTS.android.libcorelegacy22 703068 results/216-moblab/192.168.231.102 703072 results/216-moblab Looking in more details: localhost cheets_CTS.android.libcorelegacy22 # du --max-depth=2 . | sort -n 4 ./cheets_CTSHelper/profiling 4 ./cheets_CTSHelper/results 4 ./profiling 60 ./cheets_CTSHelper/debug 200 ./debug 236 ./results/android-cts 240 ./results 808 ./cheets_CTSHelper/sysinfo 888 ./cheets_CTSHelper 194892 ./sysinfo/reboot_current 311080 ./sysinfo/var 506216 ./sysinfo 507640 . Reveal that messages in various sysinfo directories is too big: find . -size +1M -exec ls -l {} \+ -rw-r--r-- 1 moblab moblab 103281918 Sep 24 10:21 ./reboot_current/var/log/messages -rw-r--r-- 1 moblab moblab 10859711 Sep 24 10:21 ./reboot_current/var/log/net.log -rw-r--r-- 1 moblab moblab 4012751 Sep 23 18:56 ./reboot_current/var/log/screenshots/login-screen-0.png -rw-r--r-- 1 moblab moblab 4524030 Sep 24 10:21 ./reboot_current/var/log/secure -rw-r--r-- 1 moblab moblab 47399790 Sep 24 02:15 ./reboot_current/var/log/ui/ui.20160924-013142 -rw-r--r-- 1 moblab moblab 103429440 Sep 24 10:19 ./var/log/messages -rw-r--r-- 1 moblab moblab 10893108 Sep 24 10:18 ./var/log/net.log -rw-r--r-- 1 moblab moblab 4012751 Sep 23 18:56 ./var/log/screenshots/login-screen-0.png -rw-r--r-- 1 moblab moblab 4550294 Sep 24 10:19 ./var/log/secure -rw-r--r-- 1 moblab moblab 47399790 Sep 24 02:15 ./var/log/ui/ui.20160924-013142 -rw-r--r-- 1 moblab moblab 103304101 Sep 24 10:21 ./var/log_diff/messages -rw-r--r-- 1 moblab moblab 10859711 Sep 24 10:21 ./var/log_diff/net.log -rw-r--r-- 1 moblab moblab 4524030 Sep 24 10:21 ./var/log_diff/secure There is > 300MB worth of duplicate messages here.9.6 KB
9.6 KB View Download