Ticket #585 (closed maintenance: fixed)

Opened 3 years ago

Last modified 3 years ago

TTech Meeting 5th September 2013

Reported by: chris Owned by: ed
Priority: major Milestone: Maintenance
Component: Live server Keywords:
Cc: jim, chris Estimated Number of Hours: 0.0
Add Hours to Ticket: 0 Billable?: yes
Total Hours: 5.45

Description

This ticket has been created for the TTech Skype meeting due to take place today.

Attachments

sql.report.log (230.4 KB) - added by jim 3 years ago.
Processed slow query log - puffin 5 sep 2013

Change History

comment:1 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.5
  • Total Hours changed from 0.0 to 1.5

Some notes for the meeting.

Spam and Bots

I have changed the webalizer stats to use the same username and password as trac, as I couldn't remember the password I set. You can access them here:

According to these stats (and they don't, by any means, include everything as I couldn't work out how to get nginx to log everything) for August we:

  • Served 69GB of data
  • Had 1.4m hits
  • Had 217k visits

By response code:

  • 1.1m 200 - OK
  • 145k 301/302 - redirects
  • 105k 404 - not found
  • 5k 502 - Bad Gateway
  • 6k 503 - Service Unavailable
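
To put the response codes in proportion, this short sketch expresses each count as a share of the 1.4m total hits. The counts are the rounded figures from the list above, so the percentages are approximate, and the dictionary layout and function name are just an illustration.

```python
# Rounded counts from the webalizer stats quoted above.
total_hits = 1_400_000
by_status = {
    "200": 1_100_000,
    "301/302": 145_000,
    "404": 105_000,
    "502": 5_000,
    "503": 6_000,
}

def share_of_hits(count, total=total_hits):
    """Return count as a percentage of total hits."""
    return 100.0 * count / total

for status, count in by_status.items():
    print(f"{status:>7}: {share_of_hits(count):5.1f}%")

# Combined gateway/availability errors (502 + 503).
errors_5xx = by_status["502"] + by_status["503"]
print(f"502+503: {share_of_hits(errors_5xx):5.1f}%")
```

On these rounded figures, 404s are about 7.5% of hits and the 502/503 errors together are under 1%.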

The biggest single source of bandwidth by URL is Rob's RSS feed: 34.5GB of data and 150k hits. This represents half the total site data transfer. The RSS file is 0.4MB but will hopefully be mostly served gzipped at 0.1MB.
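
As a rough sanity check on the RSS figures, the following sketch works out the feed's share of total transfer and the average transfer per hit. It assumes decimal units (1 GB = 1000 MB) and treats the rounded figures quoted in these notes as exact; the variable names are illustrative.

```python
# Figures quoted in the notes (rounded, so results are approximate).
total_gb = 69.0      # total data served in August
rss_gb = 34.5        # data served for the RSS feed URL
rss_hits = 150_000   # hits on the RSS feed URL

# Assumption: decimal units, 1 GB = 1000 MB.
share = rss_gb / total_gb
avg_mb_per_hit = rss_gb * 1000 / rss_hits

print(f"RSS share of transfer: {share:.0%}")
print(f"Average transfer per RSS hit: {avg_mb_per_hit:.2f} MB")
```

The average comes out at roughly 0.23MB per hit, which sits between the 0.1MB gzipped and 0.4MB uncompressed sizes. That would be consistent with a mix of compressed and uncompressed responses, though the stats alone don't confirm it.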

More data is transferred for /user/register than the front page, 5.1GB compared to 3.6GB.

The most popular user agents are bots, by hits:

  • 110k Googlebot
  • 78k Bingbot
  • 74k AhrefsBot
  • 43k Drupal
  • 43k Pingdom
  • 36k Baiduspider
  • 26k FeedBurner
  • 26k magpie-crawler
  • 17k Wget
  • 15k Apple-PubSub
  • 12k msnbot
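
A tally like the one above could also be reproduced straight from the nginx access log. This is a minimal sketch, assuming the log uses nginx's "combined" format (user agent as the last double-quoted field); the marker list, the function names and the sample log lines are all made up for illustration and are not taken from the server.

```python
import re
from collections import Counter

# Assumption: nginx "combined" log format, where the user agent is the
# final double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Illustrative substrings to group on, based on the names in the list above.
BOT_MARKERS = ["Googlebot", "Bingbot", "AhrefsBot", "Pingdom",
               "Baiduspider", "FeedBurner", "Wget"]

def classify(user_agent):
    """Return the first matching bot marker (case-insensitive), or 'other'."""
    lowered = user_agent.lower()
    for marker in BOT_MARKERS:
        if marker.lower() in lowered:
            return marker
    return "other"

def tally_user_agents(lines):
    """Count hits per bot marker across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[classify(match.group(1))] += 1
    return counts

# Hypothetical sample lines, not real log data.
sample = [
    '1.2.3.4 - - [05/Sep/2013:10:00:00 +0100] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '1.2.3.5 - - [05/Sep/2013:10:00:01 +0100] "GET /feed HTTP/1.1" 200 1024 "-" "Wget/1.13.4 (linux-gnu)"',
    '1.2.3.6 - - [05/Sep/2013:10:00:02 +0100] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
print(tally_user_agents(sample).most_common())
```

For a real log, `tally_user_agents(open(path))` would stream the file line by line, where `path` is wherever nginx writes its access log on the server.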

Drupal settings

MySQL has been running for 1 day and 8 hours and these figures are based on that.

According to mysqltuner.pl this is flagged up:

  • Joins performed without indexes: 4657
  • Adjust your join queries to always utilize indexes
  • join_buffer_size (> 128.0M, or always use indexes with joins)

And according to tuning-primer.sh:

  • Current join_buffer_size = 128.00 M
  • You have had 4657 queries where a join could not use an index properly
  • join_buffer_size >= 4 M This is not advised
  • You should enable "log-queries-not-using-indexes" Then look for non indexed joins in the slow query log.

And:

  • The slow query log is enabled.
  • Current long_query_time = 5.000000 sec.
  • You have 57509 out of 8073570 that take longer than 5.000000 sec. to complete
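
To put the slow query count in context, this sketch converts it into a fraction of all queries and an hourly rate. It assumes both counters cover the 1 day and 8 hours of MySQL uptime mentioned earlier; the variable names are illustrative.

```python
# Figures from the tuning-primer.sh output and the uptime noted earlier.
slow_queries = 57_509
total_queries = 8_073_570
uptime_hours = 24 + 8  # "1 day and 8 hours"

slow_fraction = slow_queries / total_queries
slow_per_hour = slow_queries / uptime_hours
queries_per_second = total_queries / (uptime_hours * 3600)

print(f"Slow queries: {slow_fraction:.2%} of all queries")
print(f"Slow queries per hour: {slow_per_hour:.0f}")
print(f"Average queries per second: {queries_per_second:.1f}")
```

On these figures, roughly 0.7% of queries take longer than 5 seconds, which is about 1,800 slow queries an hour against an average of around 70 queries per second.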

And:

  • Of 137391 temp tables, 25% were created on disk

BOA

Would it be worth considering a dedicated MySQL server?

Hardware

The physical server the virtual machines are running on has stopped responding a couple of times in the last month. This appears to be due to faulty RAM: 3 chips were recorded as having faults in the BIOS log, and these were all replaced on Tuesday evening.

There doesn't appear to be an issue with memory or CPU.

Two immediate options which could be considered:

  • Moving the file system to our new ZFS Sheffield file server
  • Moving the site to hardware in Iceland

The ZFS file server uses RAM and SSD caches, and one big advantage is that we could change the way backups are done by using file system snapshots.

Other options which would require hardware purchase:

  • Dedicated server with SSD disks

comment:2 Changed 3 years ago by chris

The main ticket the load spikes are documented on is ticket:555

comment:3 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.0
  • Total Hours changed from 1.5 to 2.5

TODO for Chris:

  • Look at CSF for blocking spammers
  • Look at caching of RSS feeds, can they be served directly by Nginx?
  • Change MySQL long_query_time from 5 to 10 seconds and see what is logged
  • Install New Relic on 13th Sept

comment:4 follow-up: ↓ 5 Changed 3 years ago by jim

Actually, thinking about it please DO NOT change the slow query time...

Using https://github.com/LeeKemp/mysql-slow-query-log-parser on the 4.8MB slow log from today gets the attached. It is very readable and provides some good leads, which I've already tried to deal with.

I'd rather we kept the 5s threshold so we can compare against the changes I've made, now that we have this tool.

Changed 3 years ago by jim

Processed slow query log - puffin 5 sep 2013

comment:5 in reply to: ↑ 4 Changed 3 years ago by chris

Replying to jim:

Actually, thinking about it please DO NOT change the slow query time...

No problem. I have created a ticket just for MySQL tuning, ticket:587; let's take this there...

comment:6 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 0.1
  • Total Hours changed from 2.5 to 2.6

Follow up tickets regarding matters discussed in this meeting:

comment:7 Changed 3 years ago by jim

Plus #590 Drupal performance improvements

comment:8 Changed 3 years ago by jim

  • Add Hours to Ticket changed from 0.0 to 0.15
  • Total Hours changed from 2.6 to 2.75

I've now emailed Ben, Sarah and Amber to inform them of these tickets to help keep them in the loop when Ed's away.

comment:9 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.0
  • Total Hours changed from 2.75 to 3.75

I'm adding this comment to record the Skype chat held on 11th September 2013.

I wonder if we should have one TTech Meetings ticket where we can all record our time for meetings?

I have created a ticket for the ZFS migration that was agreed today and is due to take place late one evening next week.

comment:10 Changed 3 years ago by jim

  • Add Hours to Ticket changed from 0.0 to 0.8
  • Total Hours changed from 3.75 to 4.55

Adding my time too.

comment:11 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 0.45
  • Total Hours changed from 4.55 to 5.0

On 26th September Jim and I had a phone meeting about New Relic (ticket:586), the general state of the load spike issue (ticket:555) and the ZFS migration (ticket:593).

I'll write some notes up on those tickets.

comment:12 Changed 3 years ago by jim

  • Add Hours to Ticket changed from 0.0 to 0.45
  • Total Hours changed from 5.0 to 5.45

Adding my time too...

comment:13 Changed 3 years ago by chris

  • Status changed from new to closed
  • Resolution set to fixed

Closing this ticket.
