Ticket #905 (new maintenance)
TN site down due to redis not running
Reported by: | chris | Owned by: | chris |
---|---|---|---|
Priority: | blocker | Milestone: | Maintenance |
Component: | Live server | Keywords: | |
Cc: | ade, sam, paul | Estimated Number of Hours: | 0.0 |
Add Hours to Ticket: | 0 | Billable?: | yes |
Total Hours: | 0.55 |
Description
I'm working on this...
Change History
comment:2 Changed 9 months ago by chris
- Add Hours to Ticket changed from 0.0 to 0.4
- Total Hours changed from 0.0 to 0.4
The site is back up.
Here is what I did (restarting MySQL and Redis):
chris@puffin:~$ sudo -i puffin:~# /etc/init.d/mysql status /usr/bin/mysqladmin: connect to server at 'localhost' failed error: 'Too many connections' [info] . puffin:~# /etc/init.d/mysql stop [FAIL] Stopping MariaDB database server: mysqld failed! puffin:~# /etc/init.d/mysql stop [FAIL] Stopping MariaDB database server: mysqld failed! puffin:~# /etc/init.d/mysql stop [FAIL] Stopping MariaDB database server: mysqld failed! puffin:~# ps -lA | grep mysql 4 S 0 7292 1 0 80 0 - 2329 - ? 00:00:00 mysqld_safe 4 S 105 8232 7292 7 80 0 - 1243829 - ? 14:42:58 mysqld 0 S 0 32619 32618 0 80 0 - 10336 - ? 00:00:00 mysqldump puffin:~# /etc/init.d/mysql stop [FAIL] Stopping MariaDB database server: mysqld failed! puffin:~# /etc/init.d/mysql restart [FAIL] Stopping MariaDB database server: mysqld failed! [ ok ] Starting MariaDB database server: mysqld already running. puffin:~# /etc/init.d/mysql status /usr/bin/mysqladmin: connect to server at 'localhost' failed error: 'Too many connections' [info] . puffin:~# vi /etc/mysql/my.cnf puffin:~# /etc/init.d/redis-server status redis-server is not running puffin:~# /etc/init.d/redis-server start Starting redis-server: touch: cannot touch `/var/run/redis/redis.pid': No such file or directory puffin:~# mkdir -p /var/run/redis puffin:~# /etc/init.d/redis-server start Starting redis-server: redis-server. puffin:~# /etc/init.d/mysql restart [FAIL] Stopping MariaDB database server: mysqld failed! [ ok ] Starting MariaDB database server: mysqld already running. puffin:~# /etc/init.d/redis-server status redis-server is running puffin:~# /etc/init.d/redis-server status redis-server is running puffin:~# /etc/init.d/mysql restart [FAIL] Stopping MariaDB database server: mysqld failed! [ ok ] Starting MariaDB database server: mysqld already running. puffin:~# killall mysqld puffin:~# killall mysqld_safe puffin:~# killall mysqld^C puffin:~# ps -lA | grep mysql 4 S 0 7292 1 0 80 0 - 2330 - ? 00:00:00 mysqld_safe 4 R 105 32510 7292 10 80 0 - 681761 - ? 00:00:00 mysqld puffin:~# ps -lA | grep mysql 4 S 0 7292 1 0 80 0 - 2330 - ? 00:00:00 mysqld_safe 4 S 105 32510 7292 11 80 0 - 694055 - ? 00:00:01 mysqld puffin:~# killall -9 mysqld_safe puffin:~# killall -9 mysqld puffin:~# ps -lA | grep mysql puffin:~# /etc/init.d/mysql start [ ok ] Starting MariaDB database server: mysqld . . . .. [info] Checking for corrupt, not cleanly closed and upgrade needing tables..
It looks like redis stopped, I'm not sure when, the Munin stats stop about a week ago.
Also at 1am there was an issue with MySQL, these are some extracts from the /var/log/syslog.1:
Feb 25 01:03:57 puffin mysqld: 160225 1:03:57 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction Feb 25 01:03:57 puffin mysqld: 160225 1:03:57 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction ... Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: a long semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140248992335616 has waited at buf0buf.c line 4099 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f478 '&buf_pool->mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: a long semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140244815951616 has waited at buf0buf.ic line 376 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8dabb070d0 '&block->mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: a long semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140244814354176 has waited at buf0buf.c line 1836 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: a long semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140244810893056 has waited at buf0buf.c line 4082 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: a long semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140246175950592 has waited at buf0lru.c line 1106 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f478 '&buf_pool->mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140248992335616 has waited at buf0buf.c line 4099 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f478 '&buf_pool->mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140244815951616 has waited at buf0buf.ic line 376 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8dabb070d0 '&block->mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140244814354176 has waited at buf0buf.c line 1836 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140244810893056 has waited at buf0buf.c line 4082 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: semaphore wait: Feb 25 01:15:23 puffin mysqld: --Thread 140246175950592 has waited at buf0lru.c line 1106 for 241.00 seconds the semaphore: Feb 25 01:15:23 puffin mysqld: Mutex at 0x7f8e0d09f478 '&buf_pool->mutex', lock var 1 Feb 25 01:15:23 puffin mysqld: waiters flag 1 ... Feb 25 01:15:23 puffin mysqld: InnoDB: Warning: Writer thread is waiting this semaphore: Feb 25 01:15:23 puffin mysqld: --Thread 140248993400576 has waited at buf0lru.c line 1141 for 240.00 seconds the semaphore: Feb 25 01:15:23 puffin rsyslogd-2177: imuxsock begins to drop messages from pid 8233 due to rate-limiting Feb 25 01:15:32 puffin rsyslogd-2177: imuxsock lost 585 messages from pid 8233 due to rate-limiting Feb 25 01:15:32 puffin mysqld: Feb 25 01:15:32 puffin mysqld: ===================================== Feb 25 01:15:32 puffin mysqld: 160225 1:15:32 INNODB MONITOR OUTPUT Feb 25 01:15:32 puffin mysqld: ===================================== Feb 25 01:15:32 puffin mysqld: Per second averages calculated from the last 60 seconds Feb 25 01:15:32 puffin mysqld: ----------------- Feb 25 01:15:32 puffin mysqld: BACKGROUND THREAD Feb 25 01:15:32 puffin mysqld: ----------------- Feb 25 01:15:32 puffin mysqld: srv_master_thread loops: 787246 1_second, 786418 sleeps, 75744 10_second, 35237 background, 35237 flush Feb 25 01:15:32 puffin mysqld: srv_master_thread log flush and writes: 674339 Feb 25 01:15:32 puffin mysqld: ---------- Feb 25 01:15:32 puffin mysqld: SEMAPHORES Feb 25 01:15:32 puffin mysqld: ---------- Feb 25 01:15:32 puffin mysqld: OS WAIT ARRAY INFO: reservation count 199312, signal count 191399 Feb 25 01:15:32 puffin mysqld: --Thread 140248992335616 has waited at buf0buf.c line 4099 for 250.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8e0d09f478 '&buf_pool->mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140244815951616 has waited at buf0buf.ic line 376 for 250.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8dabb070d0 '&block->mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140244814354176 has waited at buf0buf.c line 1836 for 250.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140244810893056 has waited at buf0buf.c line 4082 for 250.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140246175950592 has waited at buf0lru.c line 1106 for 250.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8e0d09f478 '&buf_pool->mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140246184343296 has waited at buf0lru.c line 1058 for 249.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140248993400576 has waited at buf0lru.c line 1141 for 249.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: Mutex at 0x7f8e0d09f4f8 '&buf_pool->LRU_list_mutex', lock var 1 Feb 25 01:15:32 puffin mysqld: waiters flag 1 Feb 25 01:15:32 puffin mysqld: --Thread 140244813821696 has waited at btr0cur.c line 671 for 246.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: S-lock on RW-latch at 0x7f8da009d588 '&new_index->lock' Feb 25 01:15:32 puffin mysqld: a writer (thread id 140248993400576) has reserved it in mode exclusive Feb 25 01:15:32 puffin mysqld: number of readers 0, waiters flag 1, lock_word: 0 Feb 25 01:15:32 puffin mysqld: Last time read locked in file btr0cur.c line 671 Feb 25 01:15:32 puffin mysqld: Last time write locked in file /home/buildbot/buildbot/build/mariadb-5.5.48/storage/xtradb/btr/btr0cur.c line 664 Feb 25 01:15:32 puffin mysqld: --Thread 140244812756736 has waited at btr0cur.c line 671 for 242.00 seconds the semaphore: Feb 25 01:15:32 puffin mysqld: S-lock on RW-latch at 0x7f8da009d588 '&new_index->lock' Feb 25 01:15:32 puffin mysqld: a writer (thread id 140248993400576) has reserved it in mode exclusive Feb 25 01:15:32 puffin mysqld: number of readers 0, waiters flag 1, lock_word: 0 ... Feb 25 01:15:32 puffin mysqld: a writer (thread id 140248993400576) has reserved it in mode exclusive Feb 25 01:15:32 puffin rsyslogd-2177: imuxsock begins to drop messages from pid 8233 due to rate-limiting Feb 25 01:15:46 puffin kernel: [687843.174807] Firewall: *UDP_IN Blocked* IN=eth0 OUT= MAC=00:16:3e:19:68:02:00:00:24:d1:8a:9a:08:00 SRC=74.91.21.2 DST=81.95.52.103 LEN=443 TOS=0x00 PREC=0x00 TTL=50 ID=58757 DF PROTO=UDP SPT=5543 DPT=5060 LEN=423 Feb 25 01:15:53 puffin rsyslogd-2177: imuxsock lost 297 messages from pid 8233 due to rate-limiting Feb 25 01:15:53 puffin mysqld: InnoDB: ###### Diagnostic info printed to the standard error stream
I must admit I don't know what caused this.
comment:3 Changed 9 months ago by chris
- Add Hours to Ticket changed from 0.0 to 0.15
- Total Hours changed from 0.4 to 0.55
Here is some info on what can cause a MySQL DEADLOCK:
Since this is the first time I'm aware that this this has happened and this server is due to be de-commissioned by the end of April due to the switch to WordPress I suggest we don't spend much time trying to find and fix the cause unless it happens again.
comment:4 Changed 9 months ago by paul
It'll probably be difficult to find the query responsible for the deadlock, if this doesn't happen very often. If this problem starts to happen to more frequently maybe we could turn on the mysql query log and try to associate the deadlock error with a particular mysql query; we can then put try / catch around it or something. On Thu, Feb 25, 2016 at 10:59 AM, Transition Technology Trac < trac@tech.transitionnetwork.org> wrote: > #905: TN site down due to redis not running > -------------------------------------+------------------------------------- > Reporter: chris | Owner: chris > Type: maintenance | Status: new > Priority: blocker | Milestone: > Component: Live server | Maintenance > Keywords: | Resolution: > Add Hours to Ticket: 0.15 | Estimated Number of Hours: 0.0 > Total Hours: 0.4 | Billable?: 1 > -------------------------------------+------------------------------------- > Changes (by chris): > > * hours: 0.0 => 0.15 > * totalhours: 0.4 => 0.55 > > > Comment: > > Here is some info on what can cause a MySQL DEADLOCK: > > * https://stackoverflow.com/questions/7782120/which-query-is-causing- > deadlock-found-when-trying-to-get-lock-try-restarting-tr > > Since this is the first time I'm aware that this this has happened and > this server is due to be [ticket:846#comment:86 de-commissioned by the end > of April] due to the [ticket:904 switch to WordPress] I suggest we don't > spend much time trying to find and fix the cause unless it happens again. > > -- > Ticket URL: <https://tech.transitionnetwork.org/trac/ticket/905#comment:3> > Transition Technology <https://tech.transitionnetwork.org/trac> > Support and issues tracking for the Transition Network Web Project. > -- Paul Booker Drupal Support for Websites and Linux Servers Website: http://www.paulbooker.co.uk Tel: +44 01922 861636