Ticket #670 (closed maintenance: fixed)
Roll back performance customisations and use stock BOA settings where possible
Reported by: | jim | Owned by: | jim |
---|---|---|---|
Priority: | minor | Milestone: | Maintenance |
Component: | Live server | Keywords: | |
Cc: | ed, chris, jim, sam, planetlarg | Estimated Number of Hours: | 0.0 |
Add Hours to Ticket: | 0 | Billable?: | yes |
Total Hours: | 6.41 |
Description
Issue
Given so much has changed since the initial issues on the server, I now strongly recommend reverting all settings changes that do not add features back to stock BOA settings after the next BOA release.
These would include all MySQL, PHP, FPM, Redis and other settings that have been changed for performance reasons, or to combat the situation where there were hardware/IO issues with the underlying server. I'm most interested in FPM and MySQL settings.
The next version of BOA will include some improvements we need (see 629: Upgrades to BOA) which should handle load on our server with a lot of CPU cores. With this in place we'll be able to revert more easily to stock settings.
Rationale
I'm not talking about rolling back changes that provide us with features or mission-critical capabilities, just the changes to the subsystems I list above for performance reasons.
It's my belief that these enhancements no longer match the needs of the server since the changes to filesystem and underlying hardware fixes have been completed. They also represent an ongoing risk around updates and future planning -- plus it's possible they might mean Puffin's web services need more memory than they otherwise would, costing TN more than it should need to spend on hardware.
Proposed solution
1. Await the next version of BOA, and set up the enhanced load settings per the documentation.
2. Revert all other changes to conf files for MySQL, PHP, FPM, Redis that do not add a feature or are not mission-critical.
3. Review /root/.barracuda.cnf and turn off any overrides and customisations we don't now need as a result of 2).
4. Run the BOA BOND.sh script to do the tuning of the server appropriate to the memory requirements. This will tune for the current levels on first pass.
5. Review Munin and site performance. If we need to make any tweaks then we can do a minimal set as required -- keeping an eye on memory usage.
6. Once a few days have gone by I would hope that the overall memory use will be lower, OR with more cached data. At this point we can either re-run the barracuda installer with the _RESERVED_RAM set to 1-4 Gb, or simply reduce the memory available to Puffin.
7. Repeat from 4, using BOND.sh to optimise for the new memory footprint.
Clearly, it's possible no memory savings can be made, or just 1Gb or so is sensible. Either way, rolling back the changes made for a system that has changed immensely is worth attempting to compare current (tweaked) performance to the stock system. Since current settings are all documented and can be backed up, we should be able to test this with no risk and the ability to roll back as needed.
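The "can be backed up" part could be done with something like the following. This is only a sketch: the helper name backup_conf and the dated .bak suffix are assumptions for illustration, not part of BOA.

```shell
#!/bin/sh
# Hypothetical helper (not part of BOA): copy each config file to
# <file>.<YYYY-MM-DD>.bak so it can be diffed or restored after an upgrade.
backup_conf() {
    stamp=$(date +%Y-%m-%d)
    for f in "$@"; do
        if [ -f "$f" ]; then
            # -p keeps ownership, mode and timestamps where possible
            cp -p "$f" "$f.$stamp.bak" && echo "backed up $f"
        else
            echo "skipped $f (not found)" >&2
        fi
    done
}

# Example call, to be run as root on the server:
# backup_conf /root/.barracuda.cnf
```

Keeping the backup next to the original makes a later diff against the post-upgrade file straightforward.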
Next steps
- Chris and Ed to give their thoughts.
- Ed to green-light before we proceed in too much detail or take any action.
- Chris and Jim to establish the changes and outcomes.
- Chris, Jim and whoever to do the new optimisation process.
Attachments
Change History
comment:1 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.2
- Total Hours changed from 0.0 to 0.2
comment:3 Changed 3 years ago by chris
On Sat 11-Jan-2014 at 08:51:30PM -0000, Transiton Technology Trac wrote:
Given so much has changed since the initial issues on the server, I now
strongly recommend reverting all settings changes that do not add features
back to stock BOA settings after the next BOA release.
I'm happy to give this a try, the main thing we need to watch for is the number of php-fpm processes.
comment:4 Changed 3 years ago by chris
Can you post copies of all the files that will be clobbered on the next BOA upgrade to Trac so we have them available for reference. The main thing that I expect will change is that there will be a dramatic shift of memory allocation away from MySQL and to php-fpm.
comment:5 Changed 3 years ago by ed
- Cc sam added
- Owner changed from ed to jim
- Status changed from new to assigned
I'm fine with this if it is about how we are needing less specialisations for BOA - particularly around the handover, and watching JK's epic Sherlock impersonation over the weekend on #610 and this will make it more standard, and therefore more handover-able.
Jim and Chris to work together *very very* closely and document the arse of it please.
Adding Sam cc
Changed 3 years ago by chris
- Attachment nginx.conf.txt added
/var/aegir/config/server_master/nginx.conf
comment:6 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 2.16
- Total Hours changed from 0.2 to 2.36
I have attached all the files that I think have been changed from the default BOA settings, but it is possible that I might have missed one or two, Jim can you check this list?
They have been posted here so that when the next BOA upgrade clobbers all these files we will, if needs be, be able to revert the clobbering.
I have also done some updating of wiki:PuffinServer but more is needed.
/etc/csf/csf.allow
A backup of this file has also been created on the server for future diffing:
- /etc/csf/csf.allow.2013-01-13.bak
In this file we have allowed some specific IP addresses, two for munin (though at the moment only the penguin one is needed):
tcp:in:d=4949:s=81.95.52.102 # munin.webarch.net
tcp:in:d=4949:s=81.95.52.111 # penguin.webarch.net
And we have allowed the Webarchitects monitoring server, this enables email alerts to be sent to the Webarchitects sysadmins when a service which is being monitored goes down:
81.95.52.66 # webarch monitoring server - Manually allowed - Wed Aug 7 10:56:54 2013
See ticket:544 for more on this.
/etc/csf/csf.blocklists
A backup of this file has also been created on the server for future diffing:
- /etc/csf/csf.blocklists.2013-01-13.bak
In this file we have enabled various blacklists, specifically:
# Spamhaus Don't Route Or Peer List (DROP)
# Details: http://www.spamhaus.org/drop/
SPAMDROP|86400|0|http://www.spamhaus.org/drop/drop.lasso

# Spamhaus Extended DROP List (EDROP)
# Details: http://www.spamhaus.org/drop/
SPAMEDROP|86400|0|http://www.spamhaus.org/drop/edrop.lasso

# DShield.org Recommended Block List
# Details: http://dshield.org
DSHIELD|86400|0|http://feeds.dshield.org/block.txt

# BOGON list
# Details: http://www.team-cymru.org/Services/Bogons/
BOGON|86400|0|http://www.cymru.com/Documents/bogon-bn-agg.txt

# Project Honey Pot Directory of Dictionary Attacker IPs
# Details: http://www.projecthoneypot.org
HONEYPOT|86400|0|http://www.projecthoneypot.org/list_of_ips.php?t=d&rss=1

# BruteForceBlocker IP List
# Details: http://danger.rulez.sk/index.php/bruteforceblocker/
BFB|86400|0|http://danger.rulez.sk/projects/bruteforceblocker/blist.php

# OpenBL.org 30 day List
# Details: http://www.openbl.org
OPENBL|86400|0|http://www.us.openbl.org/lists/base_30days.txt

# Autoshun Shun List
# Details: http://www.autoshun.org/
AUTOSHUN|86400|0|http://www.autoshun.org/files/shunlist.csv
The enabling of these blacklists was done on ticket:589.
/etc/csf/csf.conf
A backup of this file has also been created on the server for future diffing:
- /etc/csf/csf.conf.2013-01-13.bak
This file has various amendments. The following list is based on a diff with the oldest backup. The changes listed are either ones that were made before we set _CUSTOM_CONFIG_CSF=YES in /root/.barracuda.cnf and look significant, or ones which have clearly been made manually, which means that not all of these settings will be clobbered by the next BOA upgrade. This is how the diff was done:
cd /etc/csf
diff csf.conf-pre-BOA-2.0.4-121215-1555 csf.conf | vim -
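Since csf.conf is mostly comments, it can help to compare only the active settings. The following is a minimal sketch of one way to do that; the function names active_settings and conf_diff are made up for this example and are not the method used above.

```shell
#!/bin/sh
# Print a config file without comment-only and blank lines.
active_settings() {
    grep -v -e '^[ 	]*#' -e '^[ 	]*$' "$1"
}

# Diff the active settings of two versions of a config file.
# Exit status follows diff: 0 = identical, 1 = different, 2 = trouble.
conf_diff() {
    old_tmp=$(mktemp) && new_tmp=$(mktemp) || return 2
    # grep exits non-zero when nothing is left, which is not an error here
    active_settings "$1" > "$old_tmp" || true
    active_settings "$2" > "$new_tmp" || true
    diff "$old_tmp" "$new_tmp" && status=0 || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return $status
}

# Example call, using the file names from the diff above:
# conf_diff csf.conf-pre-BOA-2.0.4-121215-1555 csf.conf
```

The output is ordinary diff output, so it can still be piped to vim or a pager in the same way.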
To allow Mosh connections:
# Allow incoming UDP ports
UDP_IN = "20,21,53,123,161,33434:33523,60000:60040"

# Allow outgoing UDP ports
# To allow outgoing traceroute add 33434:33523 to this list
UDP_OUT = "20,21,53,113,123,161,33434:33523,60000:60040"
See ticket:673.
Enable email alerts to be sent to me for monitoring:
LF_ALERT_TO = "chris@webarchitects.co.uk"
X_ARF_TO = "chris@webarchitects.co.uk"
Switch off testing and auto updates:
TESTING = "0"
AUTO_UPDATES = "0"
TCP ports:
TCP_IN = "20,21,22,37,53,80,443,2401,5280,9418,30000:50000"
TCP_OUT = "20,21,22,25,37,53,80,110,143,443,465,587,873,993,995,1129,2401,3306,5280,9418,11371,27017,30000:50000"
Disallow pings:
ICMP_IN = "0"
Ensure that the server isn't vulnerable to a DOS which exploits the behaviour of csf IP blocking:
DENY_IP_LIMIT = "100"
Port flood settings:
SYNFLOOD = "1"
CONNLIMIT = "22;19,80;19,443;19,53;5"
PORTFLOOD = "22;tcp;9;29,1433;tcp;1;900"
Logging:
DROP_OUT_LOGGING = "1"
LOGFLOOD_ALERT = "1"
We are not running an IMAP or POP3 server and we are not using Apache:
LF_POP3D = "0"
LF_IMAPD = "0"
LF_HTACCESS = "0"
LF_MODSEC = "0"
LT_EMAIL_ALERT = "0"
Distributed attack settings:
LF_DISTATTACK = "1"
LF_DISTATTACK_UNIQ = "3"
LF_DISTFTP = "5"
LF_DISTFTP_UNIQ = "5"
LF_DISTFTP_PERM = "900"
Process time tracking:
PT_LIMIT = "0"
User process tracking:
PT_USERPROC = "0"
PT_USERMEM = "0"
PT_USERTIME = "0"
PT_USERKILL_ALERT = "0"
Forkbomb:
PT_FORKBOMB = "250"
Port scan tracking:
PS_INTERVAL = "120"
PS_LIMIT = "19"
User ID tracking:
UID_INTERVAL = "0"
UID_LIMIT = "10"
UID_PORTS = "0:65535,ICMP"
We only have CSF on one server:
CLUSTER_BLOCK = "0"
/etc/mysql/my.cnf
A backup of this file has also been created on the server for future diffing:
- /etc/mysql/my.cnf.2013-01-13.bak
All the changes to /etc/mysql/my.cnf should be linked from ticket:587, reading through the file these are the ones that stand out:
[mysqld]
tmpdir = /run/shm/mysql
join_buffer_size = 256M
key_buffer_size = 256M
max_connections = 40
max_user_connections = 40
query_cache_limit = 2M
query_cache_size = 768M
query_cache_min_res_unit = 1K
sort_buffer_size = 512K
bulk_insert_buffer_size = 256K
table_open_cache = 6144
table_definition_cache = 6144
table_cache = 20480
tmp_table_size = 2048M
max_heap_table_size = 4096M
max_tmp_tables = 32768
open_files_limit = 196608
innodb_buffer_pool_size = 1536M
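As a rough cross-check, the three global caches in the settings above (key_buffer_size, query_cache_size and innodb_buffer_pool_size) add up to 256M + 768M + 1536M = 2560M, i.e. 2.5G. The sketch below does that sum from a my.cnf-style file. The function name sum_settings_mb is hypothetical, and the choice of which settings to add is an assumption: per-connection buffers such as join_buffer_size and sort_buffer_size are deliberately left out, so this is not a full model of MySQL memory use.

```shell
#!/bin/sh
# Sum selected "name = <n>K|M|G" settings from my.cnf-style input, in MiB.
# Input is read from stdin; the setting names to add are the arguments.
sum_settings_mb() {
    names=$(printf '%s|' "$@")
    awk -v names="${names%|}" '
        BEGIN {
            n = split(names, want, "|")
            for (i = 1; i <= n; i++) keep[want[i]] = 1
        }
        {
            gsub(/[ \t]/, "")                 # "key = 256M" -> "key=256M"
            split($0, kv, "=")
            if (!(kv[1] in keep)) next        # also skips comments and [sections]
            value = kv[2] + 0                 # numeric prefix, e.g. 256
            unit = substr(kv[2], length(kv[2]))
            if (unit == "K") value /= 1024
            else if (unit == "G") value *= 1024
            else if (unit ~ /[0-9]/) value /= 1048576   # plain bytes
            total += value
        }
        END { print total + 0 }
    '
}

# The three cache sizes quoted above:
printf '%s\n' 'key_buffer_size = 256M' 'query_cache_size = 768M' \
    'innodb_buffer_pool_size = 1536M' |
    sum_settings_mb key_buffer_size query_cache_size innodb_buffer_pool_size
# prints 2560
```

On the server the same function could be fed the real file, for example with `sum_settings_mb key_buffer_size query_cache_size innodb_buffer_pool_size < /etc/mysql/my.cnf`.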
/var/aegir/config/server_master/nginx.conf
A backup of this file has also been created on the server for future diffing:
- /var/aegir/config/server_master/nginx.conf.2013-01-13.bak
The changes made to this file are to enable Munin graphs based on Nginx and php-fpm status, and they are documented here: wiki:PuffinServer#nginxconfigchanges
location /nginx_status {
  stub_status on;
  access_log off;
  allow 127.0.0.1;
  allow 81.95.52.103;
  deny all;
}
location ~ ^/(status|ping)$ {
  fastcgi_pass 127.0.0.1:9090;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_intercept_errors on;
  include fastcgi_params;
  access_log off;
  allow 127.0.0.1;
  deny all;
}
/opt/local/etc/php53-fpm.conf
A backup of this file has also been created on the server for future diffing:
- /opt/local/etc/php53-fpm.conf.2013-01-13.bak
The changes to this file, (see wiki:PuffinServer#php-fpmconfigchanges) relate to enabling the Munin graphs:
pm.status_path = /status
ping.path = /ping
And to reducing the number of php-fpm processes:
pm.start_servers = 4
pm.max_spare_servers = 4
/var/xdrago/second.sh
The changes in this file, (see wiki:PuffinServer#xdragoshellscriptchanges) are to increase the suicide thresholds:
CTL_ONEX_SPIDER_LOAD=2716
CTL_FIVX_SPIDER_LOAD=2716
CTL_ONEX_LOAD=10108
CTL_FIVX_LOAD=6216
CTL_ONEX_LOAD_CRIT=13216
CTL_FIVX_LOAD_CRIT=10885
See ticket:555 for background info.
comment:7 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.35
- Total Hours changed from 2.36 to 2.71
Memory: MySQL vs php-fpm
I suspect that the key change that reverting the default BOA configs will result in is a dramatic shift of memory allocation away from MySQL to php-fpm.
See the changes from the default setting documented above:
This week (not an unusual week as far as I'm aware) the site has been running with an average of 1.22 php-fpm processes and has spiked to a max of around 12 active processes:
Each process takes around 90MB of RAM (though some of this might be shared between processes?):
Times when the default BOA settings have set the number of php-fpm processes a lot higher (it's currently set to 5) can be seen in this graph:
At those times the higher minimum number of php-fpm processes resulted in a higher overall memory usage by php-fpm:
Since, most of the time, there is no need for a lot of php-fpm processes the minimum number of processes has been dramatically reduced and the memory that this has saved has been allocated to MySQL via large increases in the cache settings.
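To put rough numbers on this, the figures quoted above (around 90MB per process, a current setting of 5 and a peak of around 12 active processes) can be multiplied out. This is a back-of-envelope sketch rather than a measurement; because some of the 90MB may be shared between processes, the results are upper bounds. The function name fpm_mem_mb is made up for the example.

```shell
#!/bin/sh
# Upper-bound estimate of php-fpm memory in MB: processes x per-process size.
# Shared memory between processes is ignored, so real usage may be lower.
fpm_mem_mb() {
    procs=$1
    per_proc_mb=${2:-90}   # ~90MB per process, the figure quoted above
    echo $((procs * per_proc_mb))
}

fpm_mem_mb 5     # current setting: prints 450
fpm_mem_mb 12    # peak seen this week: prints 1080
```

On these assumptions, keeping the pool at the current setting rather than at the weekly peak frees up to roughly 630MB for other uses.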
comment:8 Changed 3 years ago by chris
Oops, all the filenames used for the Munin graphs above should have 2014 in them not 2013...
comment:9 follow-up: ↓ 10 Changed 3 years ago by jim
Good info, thanks Chris... Question: according to the chart, memory per FPM process was at its lowest in March when we commissioned the server and the settings were default -- what's changed between then and now do you think?
I wonder if FPM's usage is shared as you say, or other settings/caches/buffers have an impact on it.
The purpose of this ticket is a sanity check and to establish a) if we need our current customisations, b) if they can be improved, c) if the lessons can be learned and passed back to the BOA project.
comment:10 in reply to: ↑ 9 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 2.71 to 2.96
Replying to jim:
Question: according to the chart, memory per FPM process was at its lowest in March when we commissioned the server and the settings were default -- what's changed between then and now do you think?
I don't know, looking at the Timeline (note annoying use of US date format)
/trac/timeline?from=06%2F01%2F13&daysback=30&authors=&milestone=on&ticket=on&changeset=on&wiki=on&update=Update it could be the upgrade to BOA 2.0.9?
comment:11 Changed 3 years ago by chris
- Cc planetlarg added
- Add Hours to Ticket changed from 0.0 to 0.25
- Component changed from Unassigned to Live server
- Total Hours changed from 2.96 to 3.21
Nick added as a CC.
This graph illustrates the additional memory we have allocated to MySQL and the tweaks we have made varying the RAM allocated to the query cache between 1GB and 0.5GB:
/etc/redis/redis.conf
A backup of this file has also been created on the server for future diffing:
- /etc/redis/redis.conf.2014-01-15.bak
Looking at a diff with the oldest backup in /etc/redis/:
diff redis.conf-pre-BOA-2.0.5-130108-1232 redis.conf | vim -
281c281
< maxmemory 512MB
---
> maxmemory 1024MB
We have doubled the memory available to Redis to 1GB.
maxmemory 1024MB
comment:12 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.85
- Total Hours changed from 3.21 to 4.06
/root/.barracuda.cnf
I have backed up the /root/.barracuda.cnf file to /root/.barracuda.cnf.2014-01-17.bak and removed the _NEWRELIC_KEY variable from /root/.barracuda.cnf as it's no longer needed and attached it here:
In order to reverse all the customisation that we have done I think these are the things we need to look at in /root/.barracuda.cnf before the next BOA update, ticket:629
_XTRAS_LIST="PDS CSF CHV"
The options for this are listed here: http://drupalcode.org/project/barracuda.git/blob/HEAD:/docs/NOTES.txt#l2
Xtras included with "ALL" wildcard:
CGP --- Collectd Graph Panel
CHV --- Chive DB Manager
CSF --- Firewall
CSS --- Compass Tools
FTP --- Pure-FTPd server with forced FTPS
PDS --- Fast DNS Cache Server (pdnsd)
WMN --- Webmin Control Panel

Xtras which need to be listed explicitly:
BDD --- SQL Buddy DB Manager
BND --- Bind9 DNS Server
BZR --- Bazaar
FMG --- FFmpeg support
GIT --- Latest Git from sources
SR1 --- Apache Solr 1 with Jetty 7
SR3 --- Apache Solr 3 with Jetty 8
SR4 --- Apache Solr 4 with Jetty 8 or 9
Is anyone using the Chive DB Manager? It is available here:
Chive is a web interface to MySQL, see http://www.chive-project.com/
If Chive would be useful to people I can add some documentation about it to wiki:PuffinServer.
I'm not sure why we have an FTP server running when we don't have FTP set in _XTRAS_LIST? See the note at the end of ticket:674#comment:4
_AUTOPILOT=YES
We really need to change this to NO as I think this is the cause of the problems with the Debian upgrade, see ticket:535#comment:23
_PHP_FPM_WORKERS=AUTO
If we end up with a lot of unneeded PHP-FPM processes, see ticket:670#Memory:MySQLvsphp-fpm we might want to set this to 4 or so.
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
We are using the default version of PHP, see http://drupalcode.org/project/barracuda.git/blob/HEAD:/BARRACUDA.sh.txt the other options are 5.5, 5.4, and 5.2.
_LOAD_LIMIT_ONE=8664
_LOAD_LIMIT_TWO=5328
See wiki:PuffinServer#LoadSpikes for notes on these thresholds.
#_CUSTOM_CONFIG_SQL=NO
_CUSTOM_CONFIG_SQL=YES
We have a customised /etc/mysql/my.cnf, see ticket:670#etcmysqlmy.cnf
#_CUSTOM_CONFIG_PHP_5_3=NO
_CUSTOM_CONFIG_PHP_5_3=YES
We have a customised /opt/local/etc/php53-fpm.conf see ticket:670#optlocaletcphp53-fpm.conf
_SYSTEM_UPGRADE_ONLY=YES
Should this be set to NO? As it is, it will cause the skipping of Aegir Master Instance upgrades, see http://drupalcode.org/project/barracuda.git/blob/HEAD:/BARRACUDA.sh.txt#l346
_SQUEEZE_TO_WHEEZY=YES
This can be changed to NO since we are on Wheezy now.
Anything else I have missed?
comment:13 follow-up: ↓ 14 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.05
- Total Hours changed from 4.06 to 4.11
Chris, my time is running out so I'm going to leave this in your capable hands -- though if there are specific questions or tasks I'll answer/do them when they come up.
All the above looks good... The only thing we've added to the BOA setup (though it's not a change from stock as BOA supports this) within the Aegir/Drupal? world is the /data/conf/override.global.inc file to do some Session 443 and developer tweaks.
So I'm presently happy with this if you are.
comment:14 in reply to: ↑ 13 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 4.11 to 4.36
Replying to jim:
So I'm presently happy with this if you are.
I'm happy that we have documented all the tweaks that have been made and I'm happy to see what the default settings would be, but I expect after the next BOA upgrade we will need to redo these changes:
- Edit the firewall scripts to stop some things being blocked and block other things ticket:670#etccsfcsf.allow
- Edit the database config to give it more RAM, it is currently using 2.5G see ticket:670#etcmysqlmy.cnf
- Edit the PHP-FPM config to reduce the number of processes and amount of RAM it has ticket:670#Memory:MySQLvsphp-fpm see ticket:670#optlocaletcphp53-fpm.conf
But I'd be happy to find that the above tweaks were not needed.
comment:15 Changed 3 years ago by chris
Reviewing the changes we need to make to /root/.barracuda.cnf prior to tonight's upgrade, see ticket:707.
I think we can change this:
#_XTRAS_LIST="PDS CSF CHV"
_XTRAS_LIST="PDS CSF"
As we don't need Chive do we?
CHV --- Chive DB Manager
Note that:
### Note that removing any item from this
### list once it is already installed, will
### NOT uninstall anything.
So Chive would need to be uninstalled manually. On the other hand, the upgrade will result in:
- Use Two-Factor-like Authentication logic for Chive DB Manager access.
Which will make it a lot more secure as someone will need to ping their IP address from the server before they can use Chive, but since people with ssh access can use the MySQL command line I'm still not sure Chive is needed?
I'll include it for now as the removal process would be additional work.
The full list of options is here: http://drupalcode.org/project/barracuda.git/blob/HEAD:/BARRACUDA.sh.txt
This has been changed to NO:
_CUSTOM_CONFIG_PHP_5_3=NO
#_CUSTOM_CONFIG_PHP_5_3=YES
I haven't changed this:
_CUSTOM_CONFIG_CSF=YES
As I really think it would be a waste of time to redo all the tweaks to the firewall, see ticket:670#comment:6
This has been changed:
#_SYSTEM_UPGRADE_ONLY=YES
_SYSTEM_UPGRADE_ONLY=NO
This has been changed:
#_BUILD_FROM_SRC=YES
_BUILD_FROM_SRC=NO
This has been changed:
_CUSTOM_CONFIG_SQL=NO
#_CUSTOM_CONFIG_SQL=YES
But I expect we will want to change this back to YES and use the existing MySQL config as a lot of time has been invested in it.
This has been changed as we are using Wheezy:
#_SQUEEZE_TO_WHEEZY=YES
_SQUEEZE_TO_WHEEZY=NO
This is the resulting updated file:
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
### NOTE: the group of settings displayed bellow will *not* be overriden
### on upgrade by the Barracuda script nor by this configuration file.
### They can be defined only on initial Barracuda install.
###
_HTTP_WILDCARD=YES
_MY_OWNIP="81.95.52.103"
#_MY_OWNIP=""
_MY_HOSTN="puffin.webarch.net"
#_MY_HOSTN=""
_MY_FRONT="master.puffin.webarch.net"
_THIS_DB_HOST=localhost
#_THIS_DB_HOST=FQDN
_SMTP_RELAY_TEST=YES
_SMTP_RELAY_HOST=""
_LOCAL_NETWORK_IP=""
_LOCAL_NETWORK_HN=""
###
### NOTE: the group of settings displayed bellow
### will *override* all listed settings in the Barracuda script,
### both on initial install and upgrade.
###
_MY_EMAIL="chris@webarchitects.co.uk"
_XTRAS_LIST="PDS CSF CHV"
_AUTOPILOT=NO
_DEBUG_MODE=NO
_DB_SERVER=MariaDB
_SSH_PORT=22
_LOCAL_DEBIAN_MIRROR="ftp.debian.org"
_LOCAL_UBUNTU_MIRROR="archive.ubuntu.com"
_FORCE_GIT_MIRROR=""
_DNS_SETUP_TEST=YES
_NGINX_EXTRA_CONF=""
_NGINX_WORKERS=AUTO
_PHP_FPM_WORKERS=AUTO
#_BUILD_FROM_SRC=YES
_BUILD_FROM_SRC=NO
_PHP_MODERN_ONLY=YES
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
#_LOAD_LIMIT_ONE=1444
#_LOAD_LIMIT_TWO=888
_LOAD_LIMIT_ONE=8664
_LOAD_LIMIT_TWO=5328
_CUSTOM_CONFIG_CSF=YES
_CUSTOM_CONFIG_SQL=NO
#_CUSTOM_CONFIG_SQL=YES
_CUSTOM_CONFIG_REDIS=NO
_CUSTOM_CONFIG_PHP_5_2=NO
_CUSTOM_CONFIG_PHP_5_3=NO
#_CUSTOM_CONFIG_PHP_5_3=YES
_SPEED_VALID_MAX=3600
_NGINX_DOS_LIMIT=300
#_SYSTEM_UPGRADE_ONLY=YES
_SYSTEM_UPGRADE_ONLY=NO
_USE_MEMCACHED=NO
_NEWRELIC_KEY=
_USE_STOCK=NO
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
_EXTRA_PACKAGES=
_PHP_EXTRA_CONF=""
_STRONG_PASSWORDS=NO
_DB_BINARY_LOG=NO
_DB_ENGINE=InnoDB
_NGINX_LDAP=NO
_PHP_GEOS=NO
_PHP_MONGODB=NO
_AEGIR_UPGRADE_ONLY=NO
### Squeeze to Wheezy upgrade config
### See /trac/ticket/535
#_SQUEEZE_TO_WHEEZY=YES
_SQUEEZE_TO_WHEEZY=NO
_NGINX_FORWARD_SECRECY=YES
_NGINX_SPDY=YES
#_BUILD_FROM_SRC=NO
_NGINX_NAXSI=NO
_PHP_ZEND_OPCACHE=YES
_PERMISSIONS_FIX=YES
_MODULES_FIX=YES
_MODULES_SKIP=""
_SSL_FROM_SOURCES=NO
_SSH_FROM_SOURCES=NO
_RESERVED_RAM=0
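Because the file keeps the old values as commented-out lines next to the new ones, it is easy to misread which value is in effect. A small helper along these lines could confirm the effective value of a setting. It is a sketch only: cnf_get is a hypothetical name, and it assumes the file is read as plain KEY=value assignments where the last uncommented one wins.

```shell
#!/bin/sh
# Print the effective value of KEY in a KEY=value file.
# Commented-out lines are ignored and the last active assignment wins
# (an assumption about how the file is read, as if sourced by a shell).
cnf_get() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file" | tail -n 1 | tr -d '"'
}

# Example calls, to be run as root on the server:
# cnf_get _CUSTOM_CONFIG_SQL /root/.barracuda.cnf
# cnf_get _SYSTEM_UPGRADE_ONLY /root/.barracuda.cnf
```

A setting that is missing or only present as a comment produces empty output.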
In the description of this ticket Jim suggests:
Run the BOA BOND.sh script to do the tuning of the server appropriate to the memory requirements.
There isn't a copy of this on the server, so:
cd /usr/local/bin
lynx -dump -source https://raw.githubusercontent.com/omega8cc/boa/master/aegir/tools/BOND.sh.txt > BOND.sh
chmod 750 BOND.sh
Try running it:
Tuner [Mon Mar 31 14:03:52 BST 2014] ==> INFO: This script is ran as a root user
Tuner [Mon Mar 31 14:03:52 BST 2014] ==> ERROR: This script should be used only when the same version of BARRACUDA was used before
Tuner [Mon Mar 31 14:03:52 BST 2014] ==> Your system has to be configured/upgraded by BARRACUDA version BOA-2.2.0 first
Tuner [Mon Mar 31 14:03:52 BST 2014] ==> Bye
So this is something to do after the upgrade.
The time spent on this comment has been recorded on ticket:707#comment:5
comment:16 follow-up: ↓ 17 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.05
- Total Hours changed from 4.36 to 4.41
Hi Chris, a few answers:
- We do need Chive, it's very useful to have in the background where it costs us nothing, so please leave it enabled.
- CSF changes are necessary so yes, keep those customisations.
- The MySQL changes may or may not be needed post update, and their settings should be managed with a view to reducing the overall memory allocation for the entire VM... 8Gb is a lot, I would wager we can get by with 4-6Gb given the enhancements provided by the Zend Opcache and BOA 2.2.0... So for me these settings are certainly candidates for 'back to stock' unless there's a proven reason not to -- what do you think?
comment:17 in reply to: ↑ 16 ; follow-up: ↓ 19 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 4.41 to 4.66
Replying to jim:
- We do need Chive, it's very useful to have in the background where it costs us nothing, so please leave it enabled.
OK.
- CSF changes are necessary so yes, keep those customisations.
OK.
- The MySQL changes may or may not be needed post update, and their settings should be managed with a view to reducing the overall memory allocation for the entire VM... 8Gb is a lot, I would wager we can get by with 4-6Gb given the enhancements provided by the Zend Opcache and BOA 2.2.0... So for me these settings are certainly candidates for 'back to stock' unless there's a proven reason not to -- what do you think?
MySQL is using 2.5GB of RAM at the moment, see:
It has 768M of data in the query cache:
The dumped database size is 221M.
I would expect to see the performance reduce if the amount of RAM available to MySQL is reduced, but I'm happy to test this assumption if needs be.
I'm not sure 8GB of RAM for the server is a lot given the size and complexity of the site and traffic it gets. For reference these are the latest bandwidth stats from Xen:
puffin / monthly

   month        rx      |     tx      |    total    |   avg. rate
------------------------+-------------+-------------+---------------
  Apr '13     68.61 GiB |   14.06 GiB |   82.66 GiB |  267.52 kbit/s
  May '13     65.49 GiB |   22.61 GiB |   88.10 GiB |  275.92 kbit/s
  Jun '13     68.12 GiB |   16.18 GiB |   84.31 GiB |  272.85 kbit/s
  Jul '13    113.14 GiB |   21.98 GiB |  135.12 GiB |  423.18 kbit/s
  Aug '13    124.42 GiB |   17.20 GiB |  141.62 GiB |  443.56 kbit/s
  Sep '13    139.33 GiB |   13.78 GiB |  153.10 GiB |  495.49 kbit/s
  Oct '13    143.35 GiB |   13.97 GiB |  157.32 GiB |  492.72 kbit/s
  Nov '13    121.11 GiB |   12.47 GiB |  133.57 GiB |  432.29 kbit/s
  Dec '13    112.36 GiB |   10.83 GiB |  123.19 GiB |  385.82 kbit/s
  Jan '14    133.04 GiB |   15.02 GiB |  148.06 GiB |  463.72 kbit/s
  Feb '14    110.55 GiB |   10.57 GiB |  121.13 GiB |  420.01 kbit/s
  Mar '14    113.76 GiB |   10.79 GiB |  124.54 GiB |  395.56 kbit/s
------------------------+-------------+-------------+---------------
estimated    115.36 GiB |   10.94 GiB |  126.30 GiB |
But I guess we could try reducing it by 1 or 2GB, wiki:PenguinServer really could do with more, so it could be moved there:
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/penguin.transitionnetwork.org/multips_memory.html
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/penguin.transitionnetwork.org/memory.html
If the RAM is reduced the time it would be noticed the most would be when there are traffic spikes I expect.
According to the Piwik stats the biggest traffic spike this year was on 14th Feb with 3.2k visitors and 5.5k page views (note this excludes bots, people with JS disabled and people with Do Not Track headers set).
comment:18 Changed 3 years ago by chris
Last night the server was updated to the latest BOA and this morning the server went down, see ticket:707#comment:23 and my first impression is that we are now back to the load spike suicide situation wiki:PuffinServer#LoadSpikes
comment:19 in reply to: ↑ 17 Changed 3 years ago by chris
Replying to chris:
I would expect to see the performance reduce if the amount of RAM available to MySQL is reduced
The amount of RAM available to MySQL has been halved by the BOA upgrade, for more detail see ticket:587#comment:11 but I'm not sure how to measure if this has made things faster or slower, it looks like the slow query log is no longer generated -- there are no stats anymore:
comment:20 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.5
- Total Hours changed from 4.66 to 5.16
This is the /root/.barracuda.cnf after the upgrade to BOA 2.2.2:
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
### NOTE: the group of settings displayed bellow will *not* be overriden
### on upgrade by the Barracuda script nor by this configuration file.
### They can be defined only on initial Barracuda install.
###
_HTTP_WILDCARD=YES
_MY_OWNIP="81.95.52.103"
#_MY_OWNIP=""
_MY_HOSTN="puffin.webarch.net"
#_MY_HOSTN=""
_MY_FRONT="master.puffin.webarch.net"
_THIS_DB_HOST=localhost
#_THIS_DB_HOST=FQDN
_SMTP_RELAY_TEST=YES
_SMTP_RELAY_HOST=""
_LOCAL_NETWORK_IP=""
_LOCAL_NETWORK_HN=""
###
### NOTE: the group of settings displayed bellow
### will *override* all listed settings in the Barracuda script,
### both on initial install and upgrade.
###
_MY_EMAIL="chris@webarchitects.co.uk"
_XTRAS_LIST="PDS CSF CHV"
_AUTOPILOT=NO
_DEBUG_MODE=NO
_DB_SERVER=MariaDB
_SSH_PORT=22
_LOCAL_DEBIAN_MIRROR="ftp.debian.org"
_LOCAL_UBUNTU_MIRROR="archive.ubuntu.com"
_FORCE_GIT_MIRROR=""
_DNS_SETUP_TEST=YES
_NGINX_EXTRA_CONF=""
_NGINX_WORKERS=AUTO
_PHP_FPM_WORKERS=AUTO
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
_CUSTOM_CONFIG_CSF=YES
_CUSTOM_CONFIG_SQL=NO
#_CUSTOM_CONFIG_SQL=YES
_CUSTOM_CONFIG_REDIS=NO
_CUSTOM_CONFIG_PHP_5_2=NO
_CUSTOM_CONFIG_PHP_5_3=NO
#_CUSTOM_CONFIG_PHP_5_3=YES
_SPEED_VALID_MAX=3600
_NGINX_DOS_LIMIT=300
#_SYSTEM_UPGRADE_ONLY=YES
_SYSTEM_UPGRADE_ONLY=NO
_NEWRELIC_KEY=
_USE_STOCK=NO
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
_EXTRA_PACKAGES=
_PHP_EXTRA_CONF=""
_STRONG_PASSWORDS=YES
_DB_BINARY_LOG=NO
_DB_ENGINE=InnoDB
_NGINX_LDAP=NO
_PHP_GEOS=NO
_PHP_MONGODB=NO
_AEGIR_UPGRADE_ONLY=NO
### Squeeze to Wheezy upgrade config
### See /trac/ticket/535
#_SQUEEZE_TO_WHEEZY=YES
_SQUEEZE_TO_WHEEZY=NO
_NGINX_FORWARD_SECRECY=YES
_NGINX_SPDY=YES
_NGINX_NAXSI=NO
_PERMISSIONS_FIX=YES
_MODULES_FIX=YES
_MODULES_SKIP=""
_SSL_FROM_SOURCES=NO
_SSH_FROM_SOURCES=NO
_RESERVED_RAM=0
_PHP_MULTI_INSTALL="5.3"
_CUSTOM_CONFIG_LSHELL=NO
_CUSTOM_CONFIG_PHP55=NO
_CUSTOM_CONFIG_PHP54=NO
_CUSTOM_CONFIG_PHP53=NO
_CUSTOM_CONFIG_PHP52=NO
_CPU_SPIDER_RATIO=3
_CPU_MAX_RATIO=6
_CPU_CRIT_RATIO=9
_PHP_FPM_DENY=""
_REDIS_LISTEN_MODE=PORT
_STRICT_BIN_PERMISSIONS=YES
Jim has suggested on the Ttech list:
switch Redis port to 'socket' which is recommended from 'port'.
So this line has been changed:
_REDIS_LISTEN_MODE=SOCKET
Time recorded on this ticket includes time spent looking at Munin stats, responding to the email on the Ttech list and a phone call with Ed.
comment:21 Changed 3 years ago by chris
To get these graphs working:
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/phpfpm_connections.html
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/phpfpm_status.html
It was necessary to customise the stock BOA settings, see ticket:707#comment:32 and wiki:PuffinServer#nginxconfigchanges
comment:22 follow-up: ↓ 23 Changed 2 years ago by chris
- Add Hours to Ticket changed from 0.0 to 1.0
- Total Hours changed from 5.16 to 6.16
Since we have "Rolled back performance customisations and use stock BOA settings where possible" the server has been having load spikes. Some of these have been so big that they would have triggered a server suicide had the previous xdrago shell scripts been in place. These are the spikes from the last week, taken from the subject lines of the server alert emails:
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 3.29
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 4.80
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 3.35
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 4.38
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 3.46
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 3.33
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 4.54
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 19.42
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 60.34
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 29.69
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 11.17
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 4.22
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 3.05
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 5.95
May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 3.37
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 4.11
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 6.55
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 9.45
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.89
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 4.75
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.80
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 4.29
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.21
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.03
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 20.50
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 11.40
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 4.29
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 5.24
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 7.05
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 14.50
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 8.59
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 30.44
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 11.62
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 4.30
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.35
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 14.88
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 5.62
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 25.06
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 8.90
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.69
May 26 lfd on puffin.webarch.net: High 1 minute load average alert - 3.60
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 3.76
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.73
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 3.77
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.69
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 3.66
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 4.50
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.94
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.69
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 27.10
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 9.60
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 3.78
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.64
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 4.73
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 8.57
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 43.91
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 20.11
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 8.14
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 3.28
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.13
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 5.25
May 27 lfd on puffin.webarch.net: High 1 minute load average alert - 4.33
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 3.03
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 6.34
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 3.38
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 3.46
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 3.09
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 6.03
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 7.16
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 12.05
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 4.72
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 3.12
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 76.28
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 101.63
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 72.41
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 10.77
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 4.20
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 6.52
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 6.32
May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 8.05
May 28 lfd on puffin.webarch.net: High 1 minute load
average alert - 3.17 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 6.97 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 5.34 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 4.22 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 4.58 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 66.02 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 92.50 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 35.54 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 13.53 May 28 lfd on puffin.webarch.net: High 1 minute load average alert - 5.29 May 29 lfd on puffin.webarch.net: High 1 minute load average alert - 3.62 May 29 lfd on puffin.webarch.net: High 1 minute load average alert - 6.81 May 29 lfd on puffin.webarch.net: High 1 minute load average alert - 3.67 May 29 lfd on puffin.webarch.net: High 1 minute load average alert - 3.37 May 29 lfd on puffin.webarch.net: High 1 minute load average alert - 3.19 May 29 lfd on puffin.webarch.net: High 1 minute load average alert - 3.83 May 30 lfd on puffin.webarch.net: High 1 minute load average alert - 3.20 May 30 lfd on puffin.webarch.net: High 1 minute load average alert - 3.24 May 30 lfd on puffin.webarch.net: High 1 minute load average alert - 4.14 May 30 lfd on puffin.webarch.net: High 1 minute load average alert - 3.27 May 30 lfd on puffin.webarch.net: High 1 minute load average alert - 7.16 May 31 lfd on puffin.webarch.net: High 1 minute load average alert - 3.55 May 31 lfd on puffin.webarch.net: High 1 minute load average alert - 5.68 May 31 lfd on puffin.webarch.net: High 1 minute load average alert - 4.02 May 31 lfd on puffin.webarch.net: High 1 minute load average alert - 3.39 May 31 lfd on puffin.webarch.net: High 1 minute load average alert - 3.01 May 31 lfd on puffin.webarch.net: High 1 minute load average alert - 4.10 May 31 lfd on puffin.webarch.net: High 1 minute 
load average alert - 3.20 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 3.19 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 6.67 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 5.98 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 4.01 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 3.66 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 3.92 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 6.25 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 4.72 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 3.82 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 11.84 Jun 01 lfd on puffin.webarch.net: High 1 minute load average alert - 5.10 Jun 02 lfd on puffin.webarch.net: High 1 minute load average alert - 3.15 Jun 02 lfd on puffin.webarch.net: High 1 minute load average alert - 3.36 Jun 02 lfd on puffin.webarch.net: High 1 minute load average alert - 4.68 Jun 02 lfd on puffin.webarch.net: High 1 minute load average alert - 3.24 Jun 02 lfd on puffin.webarch.net: High 1 minute load average alert - 6.86 Jun 02 lfd on puffin.webarch.net: High 1 minute load average alert - 3.49
Loads below 14 are nothing to worry about, as the server has 14 CPUs; these are the concerning ones:
- Sun, 25 May 2014 09:54:00 - 60.34
- Mon, 26 May 2014 06:25:56 - 20.50
- Mon, 26 May 2014 15:58:24 - 30.44
- Mon, 26 May 2014 18:45:39 - 25.06
- Tue, 27 May 2014 06:25:47 - 27.10
- Tue, 27 May 2014 08:47:04 - 43.91
- Wed, 28 May 2014 11:05:00 - 101.63
- Wed, 28 May 2014 13:52:26 - 92.50
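As a rough illustration of picking out alerts that exceed the CPU count, here is a minimal Python sketch. It assumes the lfd subject format quoted in this ticket; the names `ALERT_RE`, `CPU_COUNT` and `concerning_alerts` are hypothetical and not part of lfd or BOA. A strict threshold of 14 returns more alerts than the hand-picked list above (for example 19.42 and 29.69 on May 25), so the threshold can be raised to narrow the result.

```python
import re

# Assumed subject format, based on the lfd alert lines quoted in this ticket.
ALERT_RE = re.compile(
    r"(?P<day>[A-Z][a-z]{2} \d{2}) lfd on (?P<host>\S+): "
    r"High 1 minute load average alert - (?P<load>\d+(?:\.\d+)?)"
)

CPU_COUNT = 14  # from the ticket: the server has 14 CPUs


def concerning_alerts(text, threshold=CPU_COUNT):
    """Return (day, load) pairs whose 1 minute load exceeds the threshold."""
    return [
        (m.group("day"), float(m.group("load")))
        for m in ALERT_RE.finditer(text)
        if float(m.group("load")) > threshold
    ]


# A few alert subjects copied from the May 25 log above.
sample = (
    "May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 4.54 "
    "May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 19.42 "
    "May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 60.34 "
    "May 25 lfd on puffin.webarch.net: High 1 minute load average alert - 29.69"
)

for day, load in concerning_alerts(sample):
    print(day, load)
```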
I suspect that if we reinstate the memory allocation that was removed from MySQL by the "Roll back performance customisations and use stock BOA settings where possible" policy -- MySQL memory was reduced by 50%, see ticket:587#comment:11 and ticket:707#comment:39 -- then there is a chance that these spikes would be dramatically reduced. Ed / Jim -- are you willing to give this a try?
In trac:ticket/707#comment:5 it was suggested that:
After the upgrade has been done this should be run: /usr/local/bin/BOND.sh
This hadn't been done, so:
/usr/local/bin/BOND.sh
Tuner [Mon Jun 2 11:38:14 BST 2014] ==> INFO: This script is ran as a root user
Tuner [Mon Jun 2 11:38:14 BST 2014] ==> ERROR: This script should be used only when the same version of BARRACUDA was used before
Tuner [Mon Jun 2 11:38:14 BST 2014] ==> Your system has to be configured/upgraded by BARRACUDA version BOA-2.2.0 first
Tuner [Mon Jun 2 11:38:14 BST 2014] ==> Bye
Not sure what that means exactly...
The old load spike suicide documentation has been archived to wiki:PuffinServerBoaLoadSpikes.
comment:23 in reply to: ↑ 22 Changed 2 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 6.16 to 6.41
Replying to chris:
these are the concerning ones:
- Sun, 25 May 2014 09:54:00 - 60.34
- Mon, 26 May 2014 06:25:56 - 20.50
- Mon, 26 May 2014 15:58:24 - 30.44
- Mon, 26 May 2014 18:45:39 - 25.06
- Tue, 27 May 2014 06:25:47 - 27.10
- Tue, 27 May 2014 08:47:04 - 43.91
- Wed, 28 May 2014 11:05:00 - 101.63
- Wed, 28 May 2014 13:52:26 - 92.50
It's worth noting that these coincide with increases in 50x errors. These are the subject lines from the wiki:ErrorCodeCheck email for the same period:
May 25 - 6211 403, 3013 404, 0 502, 4 503 and 0 504 errors from puffin.webarch.net
May 26 - 6267 403, 4386 404, 0 502, 16 503 and 0 504 errors from puffin.webarch.net
May 27 - 5825 403, 3593 404, 0 502, 19 503 and 0 504 errors from puffin.webarch.net
May 28 - 5544 403, 3296 404, 0 502, 20 503 and 0 504 errors from puffin.webarch.net
May 29 - 5866 403, 2685 404, 0 502, 100 503 and 7 504 errors from puffin.webarch.net
May 30 - 5155 403, 2619 404, 0 502, 0 503 and 0 504 errors from puffin.webarch.net
May 31 - 5148 403, 2487 404, 0 502, 0 503 and 0 504 errors from puffin.webarch.net
Jun 01 - 5197 403, 2380 404, 0 502, 0 503 and 0 504 errors from puffin.webarch.net
Jun 02 - 4953 403, 2503 404, 0 502, 0 503 and 0 504 errors from puffin.webarch.net
The 403s are mostly blocked bots and the 404s are mostly links to the old wiki pages; it's the 503s and 504s which are probably related to the spikes.
Note that the script that greps for the errors in the Nginx logs runs with logrotate, so the error numbers above for 29th May relate to the load spike on 28th May.
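To line the error counts up with the load spikes, the one-day offset can be applied mechanically. The following is a minimal Python sketch, not the wiki:ErrorCodeCheck script itself: it assumes the subject-line format quoted above and assumes that every email reports the previous day's log. The names `SUBJECT_RE`, `COUNT_RE` and `gateway_errors_by_day` are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed subject format, based on the ErrorCodeCheck lines quoted above.
SUBJECT_RE = re.compile(
    r"(?P<date>[A-Z][a-z]{2} \d{2}) - (?P<counts>.*?) errors from (?P<host>\S+)"
)
COUNT_RE = re.compile(r"(\d+) (\d{3})\b")  # "<count> <status code>"


def gateway_errors_by_day(text, year=2014):
    """Map the day the errors occurred to its combined 502/503/504 count.

    Assumption: each email covers the log rotated that day, so the
    errors happened on the day before the date in the subject.
    """
    result = {}
    for m in SUBJECT_RE.finditer(text):
        counts = {code: int(n) for n, code in COUNT_RE.findall(m.group("counts"))}
        reported = datetime.strptime(f"{m.group('date')} {year}", "%b %d %Y")
        occurred = reported - timedelta(days=1)
        result[occurred.strftime("%b %d")] = sum(
            counts.get(code, 0) for code in ("502", "503", "504")
        )
    return result


# Three subject lines copied from the list above.
subjects = (
    "May 28 - 5544 403, 3296 404, 0 502, 20 503 and 0 504 errors from puffin.webarch.net "
    "May 29 - 5866 403, 2685 404, 0 502, 100 503 and 7 504 errors from puffin.webarch.net "
    "May 30 - 5155 403, 2619 404, 0 502, 0 503 and 0 504 errors from puffin.webarch.net"
)

for day, total in gateway_errors_by_day(subjects).items():
    print(day, total)
```

With the offset applied, the 107 gateway errors reported on May 29 land on May 28, the day of the 101.63 and 92.50 spikes.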
The number of people visiting the site, as recorded by PiwikServer, wasn't significantly higher than usual.
comment:24 follow-up: ↓ 25 Changed 2 years ago by ed
Re: removing the standard settings to a customised set up: Sam is going to arrange a Ttech skype to discuss Paul's success with Aegir publishing where we will take stock of how it was, how to do it next etc. I suggest that we talk about this then - and come up with a clear proposal. How about that?
comment:25 in reply to: ↑ 24 Changed 2 years ago by chris
Replying to ed:
Re: removing the standard settings to a customised set up
All I'm basically suggesting is that we change the MySQL settings so that the number of connections isn't maxed out all the time (see https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/mysql_connections.html - there is no slack any more) and so that the query cache is bigger, see these comments regarding the memory use being reduced by 50% with the default MySQL settings: ticket:587#comment:11 and ticket:707#comment:39.
Would you like me to list all lines in my.cnf that I'm suggesting are changed?
I suggest that we talk about this then - and come up with a clear proposal. How about that?
OK.
comment:26 Changed 2 years ago by ed
No point showing me lines of code, Chris; the best place for this is in the BOA/Aegir meet, I reckon, so that everyone there hears and discusses it.
comment:27 in reply to: ↑ description Changed 11 months ago by chris
- Status changed from assigned to closed
- Resolution set to fixed
Replying to jim:
Given so much has changed since the initial issues on the server, I now strongly recommend reverting all settings changes that do not add features back to stock BOA settings after the next BOA release.
With hindsight this was a terrible suggestion, we should have ditched BOA many years ago -- commenting out all the BOA root cron jobs appears to have solved all the problems we have had over the years with load spikes, see wiki:PuffinServer#LoadSpikes -- so closing this ticket.