These are the recent summaries of results from the /usr/local/bin/50x-errors script, which is run every day just before the nginx logs are rotated:
Date: Wed, 19 Jun 2013 14:53:27 +0100
Subject: 7 502, 173 503 and 0 504 errors from puffin.webarch.net
Date: Thu, 20 Jun 2013 06:25:29 +0100
Subject: 12 502, 1570 503 and 0 504 errors from puffin.webarch.net
Date: Fri, 21 Jun 2013 06:25:20 +0100
Subject: 464 502, 65 503 and 1 504 errors from puffin.webarch.net
Date: Sat, 22 Jun 2013 06:25:24 +0100
Subject: 0 502, 149 503 and 1 504 errors from puffin.webarch.net
Date: Sun, 23 Jun 2013 06:25:47 +0100
Subject: 1 502, 212 503 and 0 504 errors from puffin.webarch.net
Date: Mon, 24 Jun 2013 06:25:16 +0100
Subject: 2 502, 103 503 and 1 504 errors from puffin.webarch.net
It's worth noting that since the downtime on 20th June (which appears under 21st June in the stats above, as they are generated at 6:25am) there have been very few 502 or 504 errors. The 503 errors over the last few days have been generated while the site is in high load mode, which has been happening quite often; see the spikes on the graphs at ticket:555#comment:54.
The thresholds in second.sh were initially multiplied by four (see ticket:555#comment:43) and then by five (see ticket:555#comment:52), and the RAM was doubled from 4GB to 8GB after the 20th June downtime.
I think it would be worth doing a bit more work on the 50x-errors script: if it counted, sorted and listed the user agents that trigger the 503 errors, that would be a good check that the error is only being served to bots. A sketch of the sort of thing I mean is below.
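Something along these lines could be added to the script. This is a minimal sketch that assumes the standard nginx combined log format (splitting on double quotes puts the status code and bytes in the third field and the user agent in the sixth) and assumes the log lives at /var/log/nginx/access.log; both the path and the format would need checking against the live config:

{{{
#!/bin/sh
# Count, sort and list the user agents behind 503 responses.
# Assumes the default nginx combined log format: field 3 (when
# splitting on double quotes) is " status bytes " and field 6 is
# the user agent string. The log path is an assumption.
awk -F'"' '$3 ~ /^ 503 / {print $6}' /var/log/nginx/access.log \
  | sort | uniq -c | sort -rn | head -20
}}}

If the top entries are all crawlers then we can be fairly confident the 503s are only being served to bots rather than to real visitors.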
Apart from that, this ticket is probably ready to be closed.