Ticket #542 (closed maintenance: fixed)

Opened 4 years ago

Last modified 3 years ago

Parrot RAM

Reported by: chris Owned by: chris
Priority: minor Milestone: Maintenance
Component: Parrot server Keywords:
Cc: ed Estimated Number of Hours: 0.0
Add Hours to Ticket: 0 Billable?: yes
Total Hours: 0.5

Description

I'm concerned that wiki:ParrotServer might not be able to cope with load spikes because it only has 1GB of RAM. These are the Munin graphs we need to keep an eye on:

If the server runs out of RAM it will basically stop responding.

Attachments

multips_memory-week.png (32.6 KB) - added by chris 4 years ago.
multips memory by week
memory-week.png (56.7 KB) - added by chris 4 years ago.
memory week
parrot-spike-memory-day.png (46.8 KB) - added by chris 4 years ago.
Parrot Memory Usage Spike (Memory usage)
parrot-spike-multips_memory-day.png (22.0 KB) - added by chris 4 years ago.
Parrot Memory Usage Spike (Process RSS)

Change History

comment:1 Changed 4 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 0.25
  • Priority changed from major to blocker
  • Total Hours changed from 0.0 to 0.25

Parrot ran out of RAM last night and stopped responding. I have just rebooted it. This is from /var/log/messages:

May  3 01:09:15 parrot kernel: [221374.764085] mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
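
To spot further events like this without reading the whole log by hand, something like the following could be used. It is only a rough sketch: the regular expression is based solely on the single line quoted above, and the function name find_oom_events is made up for illustration. Note that the process named in the message is the one whose allocation triggered the oom-killer, not necessarily the one that was killed.

```python
import re

# Pattern modelled on the one kernel message quoted in this ticket;
# other kernel versions may format the line differently (assumption).
OOM_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"\[\s*(?P<uptime>[\d.]+)\]\s+(?P<proc>\S+) invoked oom-killer"
)


def find_oom_events(lines):
    """Return (timestamp, host, triggering process) for each oom-killer line."""
    events = []
    for line in lines:
        match = OOM_RE.match(line)
        if match:
            events.append(
                (match.group("when"), match.group("host"), match.group("proc"))
            )
    return events


sample = [
    "May  3 01:09:15 parrot kernel: [221374.764085] mysqld invoked oom-killer: "
    "gfp_mask=0x201da, order=0, oom_adj=0",
]
print(find_oom_events(sample))
```

On the server itself the lines would come from /var/log/messages, for example by passing an open file object to find_oom_events.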

What are we going to do? Two options spring to mind:

  1. Reduce the number of Apache processes allowed, to reduce memory usage. This would result in people getting "Service Unavailable" errors at peak times.
  2. Add some additional RAM. I'd suggest 1GB to start with; this might make all the difference.

comment:2 Changed 4 years ago by chris

Just before the server ran out of RAM it hit the max number of clients for the In Transition 2.0 movie site:

[Fri May 03 01:17:26 2013] [warn] MaxClientsVhost reached for movie.parrot.webarch.net, refusing client.
[Fri May 03 01:17:35 2013] [warn] MaxClientsVhost reached for movie.parrot.webarch.net, refusing client.
[Fri May 03 01:17:36 2013] [warn] MaxClientsVhost reached for movie.parrot.webarch.net, refusing client.
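
To see how often each site hits its limit, the warnings can be tallied per virtual host. This is a minimal sketch that assumes the error log lines look like the three quoted above; count_refusals is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the warning text exactly as it appears in the lines above.
VHOST_RE = re.compile(
    r"MaxClientsVhost reached for (?P<vhost>\S+), refusing client"
)


def count_refusals(lines):
    """Count 'MaxClientsVhost reached' warnings per virtual host."""
    counts = Counter()
    for line in lines:
        match = VHOST_RE.search(line)
        if match:
            counts[match.group("vhost")] += 1
    return counts


sample = [
    "[Fri May 03 01:17:26 2013] [warn] MaxClientsVhost reached for movie.parrot.webarch.net, refusing client.",
    "[Fri May 03 01:17:35 2013] [warn] MaxClientsVhost reached for movie.parrot.webarch.net, refusing client.",
    "[Fri May 03 01:17:36 2013] [warn] MaxClientsVhost reached for movie.parrot.webarch.net, refusing client.",
]
print(count_refusals(sample))
```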

The server is currently set to allow a maximum of 50 apache processes per site. This is clearly too many for 1GB of RAM but it's also too low for peak traffic spikes.
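
As a rough way to reason about the limit, the number of Apache processes that fit in RAM can be estimated as below. This is only a back-of-the-envelope sketch: the 300 MB reserved for MySQL and the OS and the 40 MB per Apache process are made-up illustrative figures, not measurements from Parrot; the real per-process figure should be read off the Munin "Process RSS" graphs. Also note that the limit of 50 is per site, so the total across all sites could be higher still.

```python
def max_processes(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many worker processes fit in RAM without swapping.

    total_ram_mb:   RAM in the machine
    reserved_mb:    RAM set aside for everything else (assumed value)
    per_process_mb: average resident size of one worker (assumed value)
    """
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0 or per_process_mb <= 0:
        return 0
    return usable_mb // per_process_mb


# Hypothetical figures, for illustration only.
RESERVED_MB = 300
PER_PROCESS_MB = 40

for ram_mb in (1024, 2048, 3072):
    print(ram_mb, "MB ->", max_processes(ram_mb, RESERVED_MB, PER_PROCESS_MB))
```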

Another thing to seriously consider is adding a Varnish reverse proxy in front of Apache. This would be fairly quick to set up and it would need more RAM, but it would dramatically reduce the number of Apache processes at peak times. We have another virtual server on this physical server running WordPress, Apache and Varnish with 2GB of RAM, where 75% of port 80 traffic is handled directly by Varnish. It also speeds up the site a lot.

comment:3 Changed 4 years ago by ed

  • Milestone set to Maintenance

Changed 4 years ago by chris

multips memory by week

Changed 4 years ago by chris

memory week

comment:4 Changed 4 years ago by chris

  • Add Hours to Ticket 0 deleted

I have just added a couple of images of the Munin stats to this ticket, from https://penguin.transitionnetwork.org/munin/transitionnetwork.org/parrot.transitionnetwork.org/

The two outages shown on this graph (the white gaps) were caused by disk failures in the server; the disks have since been replaced.

The big step up in this graph represents when the server's RAM was increased from 1GB to 2GB.

/trac/attachment/ticket/542/memory-week.png

The situation with the RAM on wiki:ParrotServer is due to be discussed in a call on Tuesday 14th May, see ticket:537#comment:9

Changed 4 years ago by chris

Parrot Memory Usage Spike (Memory usage)

Changed 4 years ago by chris

Parrot Memory Usage Spike (Process RSS)

comment:5 Changed 4 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 0.25
  • Priority changed from blocker to minor
  • Total Hours changed from 0.25 to 0.5

Parrot is doing OK on 2GB of RAM, but it swaps when there is a spike in traffic. See this spike, for example, which resulted in 52 Apache processes:

/trac/attachment/ticket/542/parrot-spike-memory-day.png
/trac/attachment/ticket/542/parrot-spike-multips_memory-day.png
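
To put a number on the swapping outside of Munin, the SwapTotal and SwapFree fields in /proc/meminfo can be compared. The following is a minimal sketch; the sample values are invented for illustration and are not taken from Parrot, and swap_used_kb is a hypothetical helper name.

```python
def swap_used_kb(meminfo_text):
    """Return swap in use (kB) given the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            # The first token after the colon is the numeric value.
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]


# Hypothetical sample, not real data from the server.
sample = """MemTotal:        2061148 kB
SwapTotal:       1048572 kB
SwapFree:         917500 kB
"""
print(swap_used_kb(sample), "kB of swap in use")
```

On the server the text would come from reading /proc/meminfo, for example with open("/proc/meminfo").read().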

Latest Munin stats are here:

https://penguin.transitionnetwork.org/munin/transitionnetwork.org/parrot.transitionnetwork.org/

Adding Varnish to the mix should help a lot, but it is possible that for load spikes like this, an additional 1GB might be needed.

I think this ticket is probably worth leaving open for now, but I have reduced the Priority to Minor.

comment:6 Changed 4 years ago by ed

Any more GBs and it's not going to be a viable cost! Let's bear in mind the Varnish work in the maintenance monthly budget.

comment:7 Changed 3 years ago by chris

  • Status changed from new to closed
  • Resolution set to fixed

wiki:ParrotServer is still swapping somewhat with 3GB of RAM, but this is not critical compared to when the server only had 1GB of RAM and this ticket was opened, so I think this ticket can be closed.
