Ticket #707 (closed maintenance: fixed)
Upgrade to BOA-2.2.2
Reported by: | chris | Owned by: | chris |
---|---|---|---|
Priority: | critical | Milestone: | Maintenance |
Component: | Live server | Keywords: | |
Cc: | ed, jim | Estimated Number of Hours: | 0.0 |
Add Hours to Ticket: | 0 | Billable?: | yes |
Total Hours: | 10.87 |
Description
I have created a new ticket for this as I have found having one ticket (see ticket:629) for all BOA upgrades makes it really hard to review past upgrades.
Upgrades from BOA-2.0.7 to BOA-2.1.1 did have their own tickets, see wiki:PuffinServer#Upgradetickets and unless there is a convincing reason not to have one ticket per upgrade I'd rather do it like this.
Jim has pointed out on the Ttech list that:
the v2.2.0 changelog is up as of a few days ago:
http://drupalcode.org/project/barracuda.git/blob/HEAD:/CHANGELOG.txt
The Changelog starts:
- Stable BOA-2.2.0 Release - Full Edition
- Date: TBD
- Includes Aegir 2.x-boa-custom version.
- Release Notes:
There are many important changes and improvements in this release you should be aware of *before* running your BOA system upgrade.
Even if you are on a hosted BOA system with upgrades managed for you, it is very important to read at least this extensive release notes.
And if you are more curious, read also the big changelog further below, which covers only a small number of over 530 commits since BOA-2.1.3
I have yet to read the rest of the Changelog.
There is also a task to copy the proposed changes to the BOA configuration in ticket:629 over to this ticket.
Should people other than chris and ed be CC'd on this ticket?
Change History
comment:3 Changed 3 years ago by chris
Email from wiki:PuffinServer:
There is new BOA-2.2.0 Stable Edition available.
Please review the changelog and upgrade as soon as possible
to receive all security updates and new features.
Changelog: http://bit.ly/newboa
I'll do the upgrade tonight.
comment:4 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 0.0 to 0.25
Reading through the Changelog, these are the issues which affect us:
Custom php.ini protection has changed and will not honor old settings
If you have custom settings in any of your php.ini files protected with
old variable in the /root/.barracuda.cnf, make a backup of your ini files
before running this upgrade. While these files will not get overwritten,
they will no longer be used, because we have introduced new, standardized
directory structure to properly support multi-PHP-versions systems.
Respective php.ini files are now located in /opt/phpXX/etc/phpXX.ini
for FPM and /opt/phpXX/lib/php.ini for CLI, where XX is 55, 54, 53 or 52,
depending on the versions listed via _PHP_MULTI_INSTALL variable in the
/root/.barracuda.cnf file. Also the variables used to protect ini files
from being overwritten have changed to _CUSTOM_CONFIG_PHPXX.
If you need any non-standard settings in any of active ini files, don't
overwrite them with the old files, but rather carefully review and apply
only the differences you need.
All PHP FPM workers in 5.5, 5.4 and 5.3 now use the 'ondemand' mode
This change will help to better manage memory use, especially on systems with
multiple PHP versions running in parallel. This will also free resources
and allocate them dynamically only when requests are coming and only to
the active FPM pools. Note that the 'ondemand' mode doesn't affect Zend
OPcache, because it is managed by the parent process(es) which stay(s) active.
The net result is that on a vanilla BOA install, without non-hostmaster sites
running, the complete stack consumes just ~200 MB of RAM (in total, so with
MariaDB, Redis and Nginx etc. included) with all three PHP-FPM versions
running in parallel: 5.5, 5.4 and 5.3:
But I don't think these will require any action on our part; they just address things we were fixing manually. Our documentation at wiki:PuffinServer will need updating after the upgrade.
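As a rough aid for the php.ini review recommended in the release notes above, here is a minimal sketch that prints the new FPM and CLI ini locations for a given PHP version and diffs a backed-up ini file against the active one. The paths follow the /opt/phpXX layout quoted from the changelog; the function names and the backup location in the usage note are hypothetical, not part of BOA.

```shell
#!/bin/sh
# Print the FPM and CLI php.ini paths described in the BOA-2.2.0 release notes.
# Accepts a version as "5.3" or "53" (55, 54, 53 or 52 are the listed values).
php_ini_paths() {
    ver=$(printf '%s' "$1" | tr -d '.')
    case "$ver" in
        52|53|54|55) ;;
        *) echo "unsupported PHP version: $1" >&2; return 1 ;;
    esac
    echo "/opt/php${ver}/etc/php${ver}.ini"   # used by PHP-FPM
    echo "/opt/php${ver}/lib/php.ini"         # used by PHP-CLI
}

# Hypothetical helper: show the differences between a backed-up ini file
# and the active one. Nothing is overwritten; diff output is for review only.
compare_ini() {
    old="$1"
    new="$2"
    if [ -f "$old" ] && [ -f "$new" ]; then
        diff -u "$old" "$new" || true
    else
        echo "skipping: $old or $new not found" >&2
    fi
}

php_ini_paths 5.3
```

For example, `compare_ini /root/php-ini-backup/php53.ini /opt/php53/etc/php53.ini` would list only the differences to review, where /root/php-ini-backup is an assumed backup directory.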
comment:5 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 1.0
- Total Hours changed from 0.25 to 1.25
Reviewing the discussion on ticket:629 these are the issues we need to be aware of when doing the BOA upgrade tonight:
php-fpm status
See ticket:629#comment:32, this addresses:
Will change to:
And /etc/munin/plugin-conf.d/munin-node will need updating to:
[phpfpm*]
env.url http://127.0.0.1/php-status
barracuda config
Reviewing the discussion on ticket:670, this is the current /root/.barracuda.cnf:
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
### NOTE: the group of settings displayed bellow will *not* be overriden
### on upgrade by the Barracuda script nor by this configuration file.
### They can be defined only on initial Barracuda install.
###
_HTTP_WILDCARD=YES
_MY_OWNIP="81.95.52.103"
#_MY_OWNIP=""
_MY_HOSTN="puffin.webarch.net"
#_MY_HOSTN=""
_MY_FRONT="master.puffin.webarch.net"
_THIS_DB_HOST=localhost
#_THIS_DB_HOST=FQDN
_SMTP_RELAY_TEST=YES
_SMTP_RELAY_HOST=""
_LOCAL_NETWORK_IP=""
_LOCAL_NETWORK_HN=""
###
### NOTE: the group of settings displayed bellow
### will *override* all listed settings in the Barracuda script,
### both on initial install and upgrade.
###
_MY_EMAIL="chris@webarchitects.co.uk"
_XTRAS_LIST="PDS CSF CHV"
_AUTOPILOT=NO
_DEBUG_MODE=NO
_DB_SERVER=MariaDB
_SSH_PORT=22
_LOCAL_DEBIAN_MIRROR="ftp.debian.org"
_LOCAL_UBUNTU_MIRROR="archive.ubuntu.com"
_FORCE_GIT_MIRROR=""
_DNS_SETUP_TEST=YES
_NGINX_EXTRA_CONF=""
_NGINX_WORKERS=AUTO
_PHP_FPM_WORKERS=AUTO
_BUILD_FROM_SRC=YES
_PHP_MODERN_ONLY=YES
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
#_LOAD_LIMIT_ONE=1444
#_LOAD_LIMIT_TWO=888
_LOAD_LIMIT_ONE=8664
_LOAD_LIMIT_TWO=5328
_CUSTOM_CONFIG_CSF=YES
#_CUSTOM_CONFIG_SQL=NO
_CUSTOM_CONFIG_SQL=YES
_CUSTOM_CONFIG_REDIS=NO
_CUSTOM_CONFIG_PHP_5_2=NO
#_CUSTOM_CONFIG_PHP_5_3=NO
_CUSTOM_CONFIG_PHP_5_3=YES
_SPEED_VALID_MAX=3600
_NGINX_DOS_LIMIT=300
_SYSTEM_UPGRADE_ONLY=YES
_USE_MEMCACHED=NO
_NEWRELIC_KEY=
_USE_STOCK=NO
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
_EXTRA_PACKAGES=
_PHP_EXTRA_CONF=""
_STRONG_PASSWORDS=NO
_DB_BINARY_LOG=NO
_DB_ENGINE=InnoDB
_NGINX_LDAP=NO
_PHP_GEOS=NO
_PHP_MONGODB=NO
_AEGIR_UPGRADE_ONLY=NO
### Squeeze to Wheezy upgrade config
### See /trac/ticket/535
_SQUEEZE_TO_WHEEZY=YES
_NGINX_FORWARD_SECRECY=YES
_NGINX_SPDY=YES
#_BUILD_FROM_SRC=NO
_NGINX_NAXSI=NO
_PHP_ZEND_OPCACHE=YES
_PERMISSIONS_FIX=YES
_MODULES_FIX=YES
_MODULES_SKIP=""
_SSL_FROM_SOURCES=NO
_SSH_FROM_SOURCES=NO
_RESERVED_RAM=0
See ticket:670#comment:15 for the notes about the changes to this file. This is what it has now been updated to:
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
### NOTE: the group of settings displayed bellow will *not* be overriden
### on upgrade by the Barracuda script nor by this configuration file.
### They can be defined only on initial Barracuda install.
###
_HTTP_WILDCARD=YES
_MY_OWNIP="81.95.52.103"
#_MY_OWNIP=""
_MY_HOSTN="puffin.webarch.net"
#_MY_HOSTN=""
_MY_FRONT="master.puffin.webarch.net"
_THIS_DB_HOST=localhost
#_THIS_DB_HOST=FQDN
_SMTP_RELAY_TEST=YES
_SMTP_RELAY_HOST=""
_LOCAL_NETWORK_IP=""
_LOCAL_NETWORK_HN=""
###
### NOTE: the group of settings displayed bellow
### will *override* all listed settings in the Barracuda script,
### both on initial install and upgrade.
###
_MY_EMAIL="chris@webarchitects.co.uk"
_XTRAS_LIST="PDS CSF CHV"
_AUTOPILOT=NO
_DEBUG_MODE=NO
_DB_SERVER=MariaDB
_SSH_PORT=22
_LOCAL_DEBIAN_MIRROR="ftp.debian.org"
_LOCAL_UBUNTU_MIRROR="archive.ubuntu.com"
_FORCE_GIT_MIRROR=""
_DNS_SETUP_TEST=YES
_NGINX_EXTRA_CONF=""
_NGINX_WORKERS=AUTO
_PHP_FPM_WORKERS=AUTO
#_BUILD_FROM_SRC=YES
_BUILD_FROM_SRC=NO
_PHP_MODERN_ONLY=YES
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
#_LOAD_LIMIT_ONE=1444
#_LOAD_LIMIT_TWO=888
_LOAD_LIMIT_ONE=8664
_LOAD_LIMIT_TWO=5328
_CUSTOM_CONFIG_CSF=YES
_CUSTOM_CONFIG_SQL=NO
#_CUSTOM_CONFIG_SQL=YES
_CUSTOM_CONFIG_REDIS=NO
_CUSTOM_CONFIG_PHP_5_2=NO
_CUSTOM_CONFIG_PHP_5_3=NO
#_CUSTOM_CONFIG_PHP_5_3=YES
_SPEED_VALID_MAX=3600
_NGINX_DOS_LIMIT=300
#_SYSTEM_UPGRADE_ONLY=YES
_SYSTEM_UPGRADE_ONLY=NO
_USE_MEMCACHED=NO
_NEWRELIC_KEY=
_USE_STOCK=NO
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
_EXTRA_PACKAGES=
_PHP_EXTRA_CONF=""
_STRONG_PASSWORDS=NO
_DB_BINARY_LOG=NO
_DB_ENGINE=InnoDB
_NGINX_LDAP=NO
_PHP_GEOS=NO
_PHP_MONGODB=NO
_AEGIR_UPGRADE_ONLY=NO
### Squeeze to Wheezy upgrade config
### See /trac/ticket/535
#_SQUEEZE_TO_WHEEZY=YES
_SQUEEZE_TO_WHEEZY=NO
_NGINX_FORWARD_SECRECY=YES
_NGINX_SPDY=YES
#_BUILD_FROM_SRC=NO
_NGINX_NAXSI=NO
_PHP_ZEND_OPCACHE=YES
_PERMISSIONS_FIX=YES
_MODULES_FIX=YES
_MODULES_SKIP=""
_SSL_FROM_SOURCES=NO
_SSH_FROM_SOURCES=NO
_RESERVED_RAM=0
After the upgrade has been done this should be run: /usr/local/bin/BOND.sh
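Since the two copies of /root/.barracuda.cnf above differ in only a handful of active settings, a small comparison makes the changes easier to review than reading both dumps. This is a minimal sketch, assuming each setting sits on its own line in the file on disk and that saved copies of the old and new file exist; the function names and file names are hypothetical.

```shell
#!/bin/sh
# List active (uncommented) _NAME=value settings from a config file, sorted.
active_settings() {
    grep -E '^_[A-Z0-9_]+=' "$1" | sort
}

# Show settings whose active value differs between two config files.
# Lines starting with "<" come from the first file, ">" from the second.
cnf_changes() {
    a=$(mktemp) && b=$(mktemp) || return 1
    active_settings "$1" > "$a"
    active_settings "$2" > "$b"
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}
```

Usage would be along the lines of `cnf_changes barracuda.cnf.before barracuda.cnf.after`. Commented-out alternatives such as `#_BUILD_FROM_SRC=YES` are ignored, so only the values Barracuda would actually read are compared.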
comment:6 follow-up: ↓ 7 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.05
- Total Hours changed from 1.25 to 1.3
I notice this tweet by @omega8cc:
We are working on some Known Issues affecting systems upgraded to BOA-2.2.0 release: http://bit.ly/1rXl2ND #Drupal #Aegir
Which points to this section of the change log:
# Known Issues on systems upgraded to BOA-2.2.0 release (work in progress)
==> Updated on Mon Mar 31 19:37:24 SGT 2014.
- Compass Tools don't use correct paths to Ruby 2.1.1
- Chive Authentication via SSH session doesn't work on some older instances.
- PHP: Disabled 'create_function' may break some contrib modules or code.
- The drush @foo.com generate-makefile command may not work on some systems.
So I know you're keen to get PHP and NginX updated ASAP, but I think it'd pay to wait until later this week to do the update -- more issues/tweaks will almost certainly crop up.
FWIW I tend to do my system BOA update between 1 and 2 weeks after the release, as in the past I've only had to do it again a few days later... And I like to spend <1h per month dicking around with my server ideally!
Your call, obvs, but it might generate extra faff by going 'early'.
comment:7 in reply to: ↑ 6 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 1.3 to 1.4
Replying to jim:
- PHP: Disabled 'create_function' may break some contrib modules or code.
Is the above an issue for us?
It would be nice to get the update done in this financial year...
comment:8 follow-up: ↓ 9 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.15
- Total Hours changed from 1.4 to 1.55
It could be... Without testing I don't know. Unfortunately we're in the PHP5.2-based world of Drupal 6, so the risk is much higher...
This search for create_function within drupalcontrib.org brings back 91 results including references to Views Bulk Operations and Pathologic on the first 2 pages, both of which we use.
So I'd rate this risk as 'high' on this one, unfortunately...
comment:9 in reply to: ↑ 8 Changed 3 years ago by chris
Replying to jim:
It could be... Without testing I don't know. Unfortunately we're in the PHP5.2-based world of Drupal 6, so the risk is much higher...
That includes 5.3?
So I'd rate this risk as 'high' on this one, unfortunately...
OK, let's leave it till next month sometime.
comment:10 Changed 3 years ago by chris
FWIW they have just tweeted:
We have fixed 3 Known Issues in BOA-2.2.0 http://t.co/LjUlFkl8q7 #Aegir #Drupal
comment:11 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.15
- Total Hours changed from 1.55 to 1.7
The outstanding issues are not ones which affect us AFAIK:
==> Updated on Mon Mar 31 12:39:35 EDT 2014
@=> Issues hot-fixed in stable (run 'barracuda up-stable system' to apply):
* Compass Tools don't use correct paths to Ruby 2.1.1
* Chive Authentication via SSH session doesn't work on some older instances.
* PHP: Disabled 'create_function' may break some contrib modules or code.
@=> Issues waiting for a fix:
* The 'git pull' command is broken in limited shell.
* The drush @foo.com generate-makefile command may not work on some systems.
comment:12 Changed 3 years ago by chris
- Summary changed from Upgrade to BOA-2.2.0 to Upgrade to BOA-2.2.1
BOA-2.2.1 is now out:
We are happy to release BOA-2.2.1 Full Edition, which includes only bug fixes to address a few issues discovered after recent major BOA-2.2.0 Release.
### Stable BOA-2.2.1 Release - Full Edition
### Date: Tue Apr 1 10:28:45 SGT 2014
### Includes Aegir 2.x-boa-custom version.
# Release Notes:
This is a bug-fix only release to address issues discovered after recent
major BOA-2.2.0 Release.
# Fixes in this release:
- Chive Authentication via SSH session doesn't work on some older instances.
- Compass Tools don't use correct paths to Ruby 2.1.1
- Cron for sites doesn't work on old instances without Nginx wildcard vhost.
- FTPS (FTP over SSL) connections may experience TLS problems.
- PHP: Disabled 'assert' may cause warnings on features revert.
- PHP: Disabled 'create_function' may break some contrib modules or code.
- The 'git pull' command is broken in limited shell.
- The 'rsync' command is broken in limited shell.
- The 'drush dl foo' command can't be run outside of site directory.
You can read the full changelog as always at: http://bit.ly/newboa
comment:13 Changed 3 years ago by chris
It's been noted on ticket:604 that the site is always slow first thing in the morning and that this is probably due to the Redis cache being "reset" at midnight each night:
According to the BOA maintainers this "feature" should be fixed in the new BOA version.
The BOA-2.2.1 CHANGELOG.txt contains:
433 * Redis: Integration module (the modern variant) upgrade to 7.x-2.x-o8-2.6-A
434 * Redis: Use modern version with enabled fast lock and aggressive flush mode.
And:
546 * Redis: Auto-Restart if socket is missing only when socket mode is enabled.
547 * Redis: Exclude cache_form bin or it will break modules like ajax_comments.
548 * Redis: Force clean restart daily, with long enough sleep time.
549 * Redis: Restore pwd protection.
550 * Redis: The cache_metatag bin needs aggressive flush mode -- see #2062379
comment:14 Changed 3 years ago by jim
FWIW I did the update last night barracuda up-stable followed by octopus up-stable all on Babylon and it all went very well. Took about 1/2 an hour.
comment:15 follow-up: ↓ 17 Changed 3 years ago by jim
And the chart Chris posted 2 comments up is actually more showing Drupal clearing its page caches every 12 hours, rather than the 3-4am system tasks. The latter is represented by a small dip in stored data, but the big drops are the 12-hourly Drupal 'system' cron tasks...
These were once an hour, now 12 hourly as part of work on 590 (part M) in the 'cleanup' Elysia cron job, see https://tech.transitionnetwork.org/trac/ticket/590#comment:37
This remains a limitation with Drupal 6's caching infrastructure, though one I think the Redis module maintainers appear to have attempted to mitigate: https://drupal.org/node/1875584 <-- hopefully this comes along with 2.2.1...
We can open a ticket to follow up this aspect another time if necessary.
comment:16 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 1.7 to 1.8
comment:17 in reply to: ↑ 15 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.15
- Total Hours changed from 1.8 to 1.95
Replying to jim:
the chart Chris posted 2 comments up is actually more showing Drupal clearing its page caches every 12 hours
Ah, I had missed that it is every 12 hours, but it is also the case that Redis is killed and restarted at around ten past midnight each night, this can be seen in the /var/log/redis/redis-server.log:
[59475] 07 Apr 00:10:04.156 # User requested shutdown...
[59475] 07 Apr 00:10:05.699 # Redis is now ready to exit, bye bye...
[20917] 07 Apr 00:10:06.777 # Server started, Redis version 2.6.16
This is caused by the /var/xdrago/mysql_backup.sh script which contains:
/etc/init.d/redis-server stop
killall -9 redis-server
rm -f /var/run/redis.pid
rm -f /var/lib/redis/*
/etc/init.d/redis-server start
echo "Redis server restarted"
And this is set to run via this root crontab:
08 0 * * * bash /var/xdrago/mysql_backup.sh >/dev/null 2>&1
This isn't how Redis is designed to work:
Redis is designed to be a very long running process in your server.
But as Jim has pointed out, the effect of Drupal clearing its cache seems to be the main cause of the Redis cache being emptied.
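To check whether the nightly restart is still happening after the upgrade, the start times can be pulled out of the Redis log. This is a minimal sketch that assumes the log line format quoted above ("[PID] DD Mon HH:MM:SS.mmm # Server started, Redis version X"); the function name is made up for illustration.

```shell
#!/bin/sh
# Print the date and time of every Redis start recorded in a log file.
# Fields 2-4 of a matching line are the day, month and time.
redis_starts() {
    grep 'Server started, Redis version' "$1" | awk '{ print $2, $3, $4 }'
}
```

Running `redis_starts /var/log/redis/redis-server.log` should print one line per restart, so a start at around ten past midnight every night would point at the mysql_backup.sh cron job rather than at Drupal's cache clearing.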
Replying to jim:
FWIW I did the update last night barracuda up-stable followed by octopus up-stable all on Babylon and it all went very well.
Do you think it would be safe to update Puffin to the latest BOA, or should we wait some more?
comment:18 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 1.95 to 2.2
- Summary changed from Upgrade to BOA-2.2.1 to Upgrade to BOA-2.2.2
I'm tempted to do the upgrade tonight... or should we wait some more... Jim?
New version of BOA, from the CHANGELOG.txt, note the heartbleed issues are being addressed on ticket:692#comment:18.
### Stable BOA-2.2.2 Release - Barracuda Edition
### Date: Tue Apr 8 07:24:18 PDT 2014
### Includes Aegir 2.x-boa-custom version.
# Release Notes:
This is a bug-fix only release to address issues discovered after recent
major BOA-2.2.0 Release and subsequent BOA-2.2.1 release.
The most important problem fixed in this Release is related to known OpenSSL
security issue, which has been fixed in OpenSSL 1.0.1g
To learn more please visit: http://heartbleed.com
@=> Note for those on self-hosted BOA (skip this if you are on a hosted Aegir)
We recommend that you enable _SSL_FROM_SOURCES=YES option in your system
/root/.barracuda.cnf file, to always build latest OpenSSL from sources.
Note that it will also trigger OpenSSH and cURL install from sources, plus
subsequent PHP rebuild to include latest SSL libraries.
This Release doesn't include any updates to the Octopus installer, so there
is no point in running full upgrade. It is enough to run the barracuda only,
system upgrade in the "silent mode" with:
$ screen
$ barracuda up-stable system
The system will send you an e-mail with results when the upgrade is complete,
but there will be no upgrade progress displayed in the console. You can watch
it, if you prefer, with command (DATE/TIME are placeholders for real values):
$ tail -f /var/backups/reports/up/barracuda/DATE/barracuda-up-DATE-TIME.log
# System upgrades in this release:
* Nginx 1.5.13
* OpenSSL 1.0.1g (if installed from sources)
* PHP 5.4.27
* PHP 5.5.11
# Fixes in this release:
* Chive Authentication via SSH session may break Nginx due to race conditions.
* Drush specific dt() wrapper is required in Provision for custom platforms.
* Fix Compass Tools support for Omega (gems dependencies via bundle install).
* Fix default shell for system level cron tasks.
* Fix for csf firewall compatibility test.
* Force better health check on protected vhosts on live SSH-auth update.
* Issue #2229555 - On fresh boa install link missing durring install.
* Issue #2229715 - Tasks queue doesn't work on the Master Instance.
* Issue #2231093 - Add new line before 'UseDNS no' in the sshd_config file.
* Issue #294 - New Relic ext not installed even if _NEWRELIC_KEY is not empty.
* Nginx: Backup and re-create default wildcard SSL cert/key with rsa:4096
* Nginx: Generate 4096 bit long DH parameters when _NGINX_FORWARD_SECRECY=YES
* PHP: Better default workers limits for the ondemand mode.
* PHP: max_input_time should be set to 180 and not 60, by default.
* PHP: Zend OPcache directive opcache.enable=1 must be set in all ini files.
* The 'scp' command is broken in limited shell.
* Too broad whitelisting breaks commands in limited shell with 'tmp' keyword.
* Too restrictive open_basedir defaults break access to valid PEAR paths.
* Too restrictive open_basedir defaults break access to valid Tika paths.
* Use rsa:4096 by default in self-signed certs for Nginx and FTPS.
I don't think we should follow this recommendation; I think we are better off using the Debian packages for OpenSSL and OpenSSH:
We recommend that you enable _SSL_FROM_SOURCES=YES option in your system
/root/.barracuda.cnf file, to always build latest OpenSSL from sources.
Note that it will also trigger OpenSSH and cURL install from sources, plus
subsequent PHP rebuild to include latest SSL libraries.
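For reference, the Heartbleed bug affects upstream OpenSSL 1.0.1 through 1.0.1f and is fixed in 1.0.1g, as the release notes say. The sketch below tests an upstream version string against that range; the function name is hypothetical. It is only meaningful for a build from upstream sources: distribution packages such as Debian's usually backport security fixes without changing the upstream version number, so a patched Debian package can still report a version inside this range.

```shell
#!/bin/sh
# Return 0 if an upstream OpenSSL version string falls in the range
# affected by Heartbleed (1.0.1 up to and including 1.0.1f), 1 otherwise.
# Only applies to unpatched upstream builds, not to distro packages.
in_heartbleed_range() {
    case "$1" in
        1.0.1|1.0.1[a-f]) return 0 ;;
        *) return 1 ;;
    esac
}
```

For a source build the version string could be taken from the second field of `openssl version`; for the Debian packages the package changelog is the place to check instead.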
comment:19 Changed 3 years ago by chris
Going to do the BOA upgrade now.
comment:20 follow-up: ↓ 24 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 1.64
- Total Hours changed from 2.2 to 3.84
I have just changed this setting from NO, due to recent events.
_STRONG_PASSWORDS=YES
Here is the /root/.barracuda.cnf:
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
### NOTE: the group of settings displayed bellow will *not* be overriden
### on upgrade by the Barracuda script nor by this configuration file.
### They can be defined only on initial Barracuda install.
###
_HTTP_WILDCARD=YES
_MY_OWNIP="81.95.52.103"
#_MY_OWNIP=""
_MY_HOSTN="puffin.webarch.net"
#_MY_HOSTN=""
_MY_FRONT="master.puffin.webarch.net"
_THIS_DB_HOST=localhost
#_THIS_DB_HOST=FQDN
_SMTP_RELAY_TEST=YES
_SMTP_RELAY_HOST=""
_LOCAL_NETWORK_IP=""
_LOCAL_NETWORK_HN=""
###
### NOTE: the group of settings displayed bellow
### will *override* all listed settings in the Barracuda script,
### both on initial install and upgrade.
###
_MY_EMAIL="chris@webarchitects.co.uk"
_XTRAS_LIST="PDS CSF CHV"
_AUTOPILOT=NO
_DEBUG_MODE=NO
_DB_SERVER=MariaDB
_SSH_PORT=22
_LOCAL_DEBIAN_MIRROR="ftp.debian.org"
_LOCAL_UBUNTU_MIRROR="archive.ubuntu.com"
_FORCE_GIT_MIRROR=""
_DNS_SETUP_TEST=YES
_NGINX_EXTRA_CONF=""
_NGINX_WORKERS=AUTO
_PHP_FPM_WORKERS=AUTO
#_BUILD_FROM_SRC=YES
_BUILD_FROM_SRC=NO
_PHP_MODERN_ONLY=YES
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
#_LOAD_LIMIT_ONE=1444
#_LOAD_LIMIT_TWO=888
_LOAD_LIMIT_ONE=8664
_LOAD_LIMIT_TWO=5328
_CUSTOM_CONFIG_CSF=YES
_CUSTOM_CONFIG_SQL=NO
#_CUSTOM_CONFIG_SQL=YES
_CUSTOM_CONFIG_REDIS=NO
_CUSTOM_CONFIG_PHP_5_2=NO
_CUSTOM_CONFIG_PHP_5_3=NO
#_CUSTOM_CONFIG_PHP_5_3=YES
_SPEED_VALID_MAX=3600
_NGINX_DOS_LIMIT=300
#_SYSTEM_UPGRADE_ONLY=YES
_SYSTEM_UPGRADE_ONLY=NO
_USE_MEMCACHED=NO
_NEWRELIC_KEY=
_USE_STOCK=NO
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
_EXTRA_PACKAGES=
_PHP_EXTRA_CONF=""
_STRONG_PASSWORDS=YES
_DB_BINARY_LOG=NO
_DB_ENGINE=InnoDB
_NGINX_LDAP=NO
_PHP_GEOS=NO
_PHP_MONGODB=NO
_AEGIR_UPGRADE_ONLY=NO
### Squeeze to Wheezy upgrade config
### See /trac/ticket/535
#_SQUEEZE_TO_WHEEZY=YES
_SQUEEZE_TO_WHEEZY=NO
_NGINX_FORWARD_SECRECY=YES
_NGINX_SPDY=YES
#_BUILD_FROM_SRC=NO
_NGINX_NAXSI=NO
_PHP_ZEND_OPCACHE=YES
_PERMISSIONS_FIX=YES
_MODULES_FIX=YES
_MODULES_SKIP=""
_SSL_FROM_SOURCES=NO
_SSH_FROM_SOURCES=NO
_RESERVED_RAM=0
Following the notes, wiki:PuffinServer#UpgradingBOA
sudo -i
screen
cd
wget -q -U iCab http://files.aegir.cc/BOA.sh.txt
bash BOA.sh.txt
BOA Meta Installer setup completed
Please check INSTALL.txt and UPGRADE.txt at http://bit.ly/boa-docs for how-to
Bye
barracuda up-stable
Another BOA installer is running probably - /var/run/boa_run.pid exists
ls -lah /var/run/boa_run.pid
-rw-r--r-- 1 root root 0 Mar 31 14:03 /var/run/boa_run.pid
rm /var/run/boa_run.pid
barracuda up-stable
Barracuda [Fri Apr 11 21:55:47 BST 2014] ==> BOA Skynet welcomes you aboard!
Barracuda [Fri Apr 11 21:55:51 BST 2014] ==> INFO: UPGRADE
Barracuda [Fri Apr 11 21:55:51 BST 2014] ==> INFO: Reading your /root/.barracuda.cnf config file
Barracuda [Fri Apr 11 21:55:52 BST 2014] ==> NOTE! Please review all config options displayed below
Barracuda [Fri Apr 11 21:55:52 BST 2014] ==> NOTE! It will *override* all settings in the Barracuda script
Barracuda [Fri Apr 11 21:55:53 BST 2014] ==> Legacy PHP-CLI 5.2 is not used on this system
Barracuda [Fri Apr 11 21:55:53 BST 2014] ==> Legacy PHP-FPM 5.2 is not used on this system
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
### NOTE: the group of settings displayed bellow will *not* be overriden
### on upgrade by the Barracuda script nor by this configuration file.
### They can be defined only on initial Barracuda install.
###
_HTTP_WILDCARD=YES
_MY_OWNIP="81.95.52.103"
#_MY_OWNIP=""
_MY_HOSTN="puffin.webarch.net"
#_MY_HOSTN=""
_MY_FRONT="master.puffin.webarch.net"
_THIS_DB_HOST=localhost
#_THIS_DB_HOST=FQDN
_SMTP_RELAY_TEST=YES
_SMTP_RELAY_HOST=""
_LOCAL_NETWORK_IP=""
_LOCAL_NETWORK_HN=""
###
### NOTE: the group of settings displayed bellow
### will *override* all listed settings in the Barracuda script,
### both on initial install and upgrade.
###
_MY_EMAIL="chris@webarchitects.co.uk"
_XTRAS_LIST="PDS CSF CHV"
_AUTOPILOT=NO
_DEBUG_MODE=NO
_DB_SERVER=MariaDB
_SSH_PORT=22
_LOCAL_DEBIAN_MIRROR="ftp.debian.org"
_LOCAL_UBUNTU_MIRROR="archive.ubuntu.com"
_FORCE_GIT_MIRROR=""
_DNS_SETUP_TEST=YES
_NGINX_EXTRA_CONF=""
_NGINX_WORKERS=AUTO
_PHP_FPM_WORKERS=AUTO
_PHP_FPM_VERSION=5.3
_PHP_CLI_VERSION=5.3
_CUSTOM_CONFIG_CSF=YES
_CUSTOM_CONFIG_SQL=NO
#_CUSTOM_CONFIG_SQL=YES
_CUSTOM_CONFIG_REDIS=NO
_CUSTOM_CONFIG_PHP_5_2=NO
_CUSTOM_CONFIG_PHP_5_3=NO
#_CUSTOM_CONFIG_PHP_5_3=YES
_SPEED_VALID_MAX=3600
_NGINX_DOS_LIMIT=300
#_SYSTEM_UPGRADE_ONLY=YES
_SYSTEM_UPGRADE_ONLY=NO
_NEWRELIC_KEY=
_USE_STOCK=NO
###
### Configuration created on 121215-1545
### with Barracuda version BOA-2.0.4
###
_EXTRA_PACKAGES=
_PHP_EXTRA_CONF=""
_STRONG_PASSWORDS=YES
_DB_BINARY_LOG=NO
_DB_ENGINE=InnoDB
_NGINX_LDAP=NO
_PHP_GEOS=NO
_PHP_MONGODB=NO
_AEGIR_UPGRADE_ONLY=NO
### Squeeze to Wheezy upgrade config
### See /trac/ticket/535
#_SQUEEZE_TO_WHEEZY=YES
_SQUEEZE_TO_WHEEZY=NO
_NGINX_FORWARD_SECRECY=YES
_NGINX_SPDY=YES
_NGINX_NAXSI=NO
_PERMISSIONS_FIX=YES
_MODULES_FIX=YES
_MODULES_SKIP=""
_SSL_FROM_SOURCES=NO
_SSH_FROM_SOURCES=NO
_RESERVED_RAM=0
_PHP_MULTI_INSTALL="5.3"
_CUSTOM_CONFIG_LSHELL=NO
_CUSTOM_CONFIG_PHP55=NO
_CUSTOM_CONFIG_PHP54=NO
_CUSTOM_CONFIG_PHP53=NO
_CUSTOM_CONFIG_PHP52=NO
_CPU_SPIDER_RATIO=3
_CPU_MAX_RATIO=6
_CPU_CRIT_RATIO=9
_PHP_FPM_DENY=""
_REDIS_LISTEN_MODE=PORT
_STRICT_BIN_PERMISSIONS=YES
Do you want to proceed with the upgrade? [Y/n] Y
Barracuda [Fri Apr 11 21:56:48 BST 2014] ==> INFO: Checking your system version...
Barracuda [Fri Apr 11 21:56:49 BST 2014] ==> Aegir on Debian/wheezy - Skynet Agent v.BOA-2.2.2
Barracuda [Fri Apr 11 21:56:49 BST 2014] ==> INFO: Updating packages sources list...
Barracuda [Fri Apr 11 21:56:49 BST 2014] ==> INFO: We will use Debian mirror ftp.debian.org
Barracuda [Fri Apr 11 21:57:03 BST 2014] ==> INFO: Downloading little helpers...
Barracuda [Fri Apr 11 21:57:04 BST 2014] ==> INFO: Checking BARRACUDA version...
Barracuda [Fri Apr 11 21:57:04 BST 2014] ==> INFO: BARRACUDA version test: OK
Barracuda [Fri Apr 11 21:57:05 BST 2014] ==> UPGRADE START -> checkpoint:
* Your e-mail address appears to be chris@webarchitects.co.uk - is that correct?
* Your server hostname is puffin.webarch.net.
* Your Aegir control panel is/will be available at https://master.puffin.webarch.net.
Do you want to proceed with the upgrade? [Y/n] Y
Barracuda [Fri Apr 11 21:57:45 BST 2014] ==> INFO: Cleaning up temp files in /var/opt/
Barracuda [Fri Apr 11 21:57:45 BST 2014] ==> INFO: Installing extra Drush versions
Barracuda [Fri Apr 11 21:57:45 BST 2014] ==> INFO: Drush mini-4-14-03-2014 installation complete
Barracuda [Fri Apr 11 21:57:46 BST 2014] ==> INFO: Drush mini-6-01-04-2014 installation complete
Barracuda [Fri Apr 11 21:57:52 BST 2014] ==> INFO: Running aptitude update...
Barracuda [Fri Apr 11 21:58:39 BST 2014] ==> INFO: Upgrading required libraries and tools
Barracuda [Fri Apr 11 21:58:39 BST 2014] ==> NOTE! This step may take a few minutes, please wait...
Barracuda [Fri Apr 11 21:59:39 BST 2014] ==> INFO: Testing Nginx version...
Barracuda [Fri Apr 11 21:59:39 BST 2014] ==> INFO: Installed Nginx version nginx/1.5.7, upgrade required
Barracuda [Fri Apr 11 21:59:40 BST 2014] ==> INFO: Upgrading Nginx...
Barracuda [Fri Apr 11 22:00:54 BST 2014] ==> INFO: Running aptitude full-upgrade, please wait...
Barracuda [Fri Apr 11 22:01:54 BST 2014] ==> INFO: Testing Nginx version...
Barracuda [Fri Apr 11 22:01:54 BST 2014] ==> INFO: Installed Nginx version nginx/1.5.13, OK
Barracuda [Fri Apr 11 22:01:54 BST 2014] ==> INFO: Installing MySecureShell 1.32...
Barracuda [Fri Apr 11 22:02:22 BST 2014] ==> INFO: Installing /usr/bin/wkhtmltopdf x86_64 version...
Barracuda [Fri Apr 11 22:02:28 BST 2014] ==> INFO: Installing /usr/bin/wkhtmltoimage x86_64 version...
Barracuda [Fri Apr 11 22:02:34 BST 2014] ==> INFO: Fix #1 for libs in Debian wheezy
Barracuda [Fri Apr 11 22:02:35 BST 2014] ==> INFO: Checking SMTP connections...
Barracuda [Fri Apr 11 22:02:35 BST 2014] ==> INFO: Installing VnStat monitor...
Barracuda [Fri Apr 11 22:02:44 BST 2014] ==> INFO: Upgrading a few more tools...
Barracuda [Fri Apr 11 22:02:46 BST 2014] ==> INFO: Checking if PHP upgrade is available
Barracuda [Fri Apr 11 22:02:53 BST 2014] ==> INFO: PHP EXTRA is --with-ldap --with-gmp
Barracuda [Fri Apr 11 22:02:53 BST 2014] ==> INFO: PHP 5.3.28 will be installed now
Barracuda [Fri Apr 11 22:02:53 BST 2014] ==> INFO: Installing PHP-FPM 5.3.28
Barracuda [Fri Apr 11 22:02:53 BST 2014] ==> NOTE! This step may take longer than 8 minutes, please wait...
Barracuda [Fri Apr 11 22:03:03 BST 2014] ==> INFO: Installing PHP-FPM 5.3.28 part 1/3
Barracuda [Fri Apr 11 22:03:04 BST 2014] ==> INFO: Installing PHP-FPM 5.3.28 part 2/3
Barracuda [Fri Apr 11 22:04:59 BST 2014] ==> INFO: Installing PHP-FPM 5.3.28 part 3/3
Barracuda [Fri Apr 11 22:17:39 BST 2014] ==> INFO: Installing Zend OPcache for PHP-FPM 5.3.28...
Barracuda [Fri Apr 11 22:18:02 BST 2014] ==> INFO: Installing PhpRedis for PHP-FPM 5.3.28...
Barracuda [Fri Apr 11 22:18:23 BST 2014] ==> INFO: Installing UploadProgress for PHP-FPM 5.3.28...
Barracuda [Fri Apr 11 22:18:34 BST 2014] ==> INFO: Installing JSMin for PHP-FPM 5.3.28...
Barracuda [Fri Apr 11 22:18:46 BST 2014] ==> INFO: Installing Imagick for PHP-FPM 5.3.28...
Barracuda [Fri Apr 11 22:19:09 BST 2014] ==> INFO: Installing MailParse for PHP-FPM 5.3.28...
Barracuda [Fri Apr 11 22:19:23 BST 2014] ==> INFO: Installing IonCube x86_64 version for PHP-FPM...
Barracuda [Fri Apr 11 22:19:27 BST 2014] ==> INFO: Upgrading Limited Shell to version 0.9.16.5-om8...
Barracuda [Fri Apr 11 22:19:30 BST 2014] ==> INFO: Installed Redis version 2.6.16, upgrade required
Barracuda [Fri Apr 11 22:19:30 BST 2014] ==> INFO: Installing Redis update for Debian/wheezy...
Barracuda [Fri Apr 11 22:20:41 BST 2014] ==> INFO: Generating random password for Redis server
Barracuda [Fri Apr 11 22:20:42 BST 2014] ==> INFO: Updating MariaDB and PHP configuration
Barracuda [Fri Apr 11 22:20:43 BST 2014] ==> INFO: Running MySQLTuner check on all databases...
Barracuda [Fri Apr 11 22:20:43 BST 2014] ==> NOTE! This step may take a LONG time, please wait...
Barracuda [Fri Apr 11 22:20:47 BST 2014] ==> INFO: OS and services upgrade completed
Barracuda [Fri Apr 11 22:20:47 BST 2014] ==> INFO: Restarting MariaDB server, please wait...
Barracuda [Fri Apr 11 22:21:05 BST 2014] ==> INFO: Upgrading MariaDB tables if necessary, please wait a minute...
Do you want to upgrade Aegir Master Instance? [Y/n] Y
Barracuda [Fri Apr 11 22:24:01 BST 2014] ==> INFO: Running Aegir Master Instance upgrade
Barracuda [Fri Apr 11 22:24:02 BST 2014] ==> INFO: Syncing provision backend db_passwd...
Barracuda [Fri Apr 11 22:24:04 BST 2014] ==> INFO: Running hosting-dispatch (1/3)...
Barracuda [Fri Apr 11 22:24:17 BST 2014] ==> INFO: Running hosting-dispatch (2/3)...
Barracuda [Fri Apr 11 22:24:24 BST 2014] ==> INFO: Running hosting-dispatch (3/3)...
Barracuda [Fri Apr 11 22:24:24 BST 2014] ==> INFO: Syncing hostmaster frontend db_passwd...
Barracuda [Fri Apr 11 22:24:25 BST 2014] ==> INFO: Testing previous install...
Barracuda [Fri Apr 11 22:24:25 BST 2014] ==> INFO: Test OK, we can proceed with Hostmaster upgrade
Barracuda [Fri Apr 11 22:24:25 BST 2014] ==> INFO: Moving old directories
Barracuda [Fri Apr 11 22:24:25 BST 2014] ==> INFO: Downloading drush...
Barracuda [Fri Apr 11 22:24:26 BST 2014] ==> INFO: Drush seems to be functioning properly
Barracuda [Fri Apr 11 22:24:26 BST 2014] ==> INFO: Installing provision backend in /var/aegir/.drush
Barracuda [Fri Apr 11 22:24:26 BST 2014] ==> INFO: Downloading Drush and Provision extensions...
Barracuda [Fri Apr 11 22:24:26 BST 2014] ==> INFO: Running hostmaster-migrate, please wait...
Barracuda [Fri Apr 11 22:24:55 BST 2014] ==> INFO: Syncing hostmaster frontend db_passwd...
Barracuda [Fri Apr 11 22:25:33 BST 2014] ==> INFO: Aegir Master Instance upgrade completed
Barracuda [Fri Apr 11 22:25:37 BST 2014] ==> INFO: Upgrading Chive MariaDB Manager...
Barracuda [Fri Apr 11 22:25:42 BST 2014] ==> INFO: Restarting Redis, PHP-FPM and Nginx
Barracuda [Fri Apr 11 22:25:51 BST 2014] ==> INFO: Restarting MariaDB server
Barracuda [Fri Apr 11 22:26:01 BST 2014] ==> INFO: New secure random password for MariaDB generated and updated
Barracuda [Fri Apr 11 22:26:01 BST 2014] ==> INFO: New entry added to /var/log/barracuda_log.txt
Barracuda [Fri Apr 11 22:26:01 BST 2014] ==> INFO: Cleaning up system swap, it may take a moment, please wait...
Barracuda [Fri Apr 11 22:26:40 BST 2014] ==> CARD: Now charging your credit card for this auto-upgrade magic...
Barracuda [Fri Apr 11 22:26:46 BST 2014] ==> JOKE: Just kidding! Enjoy your Aegir Hosting System :)
Barracuda [Fri Apr 11 22:26:46 BST 2014] ==> Final post-upgrade cleaning, please wait a moment...
Barracuda [Fri Apr 11 22:33:40 BST 2014] ==> BYE! BARRACUDA upgrade completed
Bye
While the update was running I was sent this email:
From: root@puffin.webarch.net Date: Fri, 11 Apr 2014 21:57:27 +0100 (BST) To: chris@webarchitects.co.uk Subject: lfd on puffin.webarch.net: System Integrity checking detected a modified system file Time: Fri Apr 11 21:57:27 2014 +0000 The following list of files have FAILED the md5sum comparison test. This means that the file has been changed in some way. This could be a result of an OS update or application upgrade. If the change is unexpected it should be investigated: /usr/bin/7z: FAILED /usr/bin/7za: FAILED /usr/bin/Magick-config: FAILED /usr/bin/MagickCore-config: FAILED /usr/bin/MagickWand-config: FAILED /usr/bin/Wand-config: FAILED /usr/bin/add-patch: FAILED /usr/bin/anytopnm: FAILED /usr/bin/apt-key: FAILED /usr/bin/aptitude-fast: FAILED /usr/bin/autoconf2.13: FAILED /usr/bin/autoconf2.50: FAILED /usr/bin/autoheader2.13: FAILED /usr/bin/autopoint: FAILED /usr/bin/autoreconf2.13: FAILED /usr/bin/autoupdate2.13: FAILED /usr/bin/bashbug: FAILED /usr/bin/batch: FAILED /usr/bin/bison.yacc: FAILED /usr/bin/c89: FAILED /usr/bin/c89-gcc: FAILED /usr/bin/c99: FAILED /usr/bin/c99-gcc: FAILED /usr/bin/catchsegv: FAILED /usr/bin/checkbashisms: FAILED /usr/bin/compile_et: FAILED /usr/bin/conkeror: FAILED /usr/bin/crypt: FAILED /usr/bin/curl-config: FAILED /usr/bin/dcmd: FAILED /usr/bin/debconf-updatepo: FAILED /usr/bin/debsign: FAILED /usr/bin/dehtmldiff: FAILED /usr/bin/dpkg-maintscript-helper: FAILED /usr/bin/dscextract: FAILED /usr/bin/dumphint: FAILED /usr/bin/dvipdf: FAILED /usr/bin/edit-patch: FAILED /usr/bin/eps2eps: FAILED /usr/bin/fakeroot: FAILED /usr/bin/fakeroot-sysv: FAILED /usr/bin/fakeroot-tcp: FAILED /usr/bin/font2c: FAILED /usr/bin/freetype-config: FAILED /usr/bin/gcore: FAILED /usr/bin/gdbtui: FAILED /usr/bin/getbuildlog: FAILED /usr/bin/gettext.sh: FAILED /usr/bin/gettextize: FAILED /usr/bin/glib-gettextize: FAILED /usr/bin/gpg-error-config: FAILED /usr/bin/gpg-zip: FAILED /usr/bin/gsbj: FAILED /usr/bin/gsdj: FAILED /usr/bin/gsdj500: FAILED 
/usr/bin/gslj: FAILED /usr/bin/gslp: FAILED /usr/bin/gsnd: FAILED /usr/bin/ifnames2.13: FAILED /usr/bin/igawk: FAILED /usr/bin/install-info: FAILED /usr/bin/krb5-config: FAILED /usr/bin/lessfile: FAILED /usr/bin/lesspipe: FAILED /usr/bin/lft: FAILED /usr/bin/lft.db: FAILED /usr/bin/lftpget: FAILED /usr/bin/libgcrypt-config: FAILED /usr/bin/libmcrypt-config: FAILED /usr/bin/libpng-config: FAILED /usr/bin/libpng12-config: FAILED /usr/bin/libtool: FAILED /usr/bin/libtoolize: FAILED /usr/bin/libwmf-config: FAILED /usr/bin/lorder: FAILED /usr/bin/lsinitramfs: FAILED /usr/bin/lspgpot: FAILED /usr/bin/mkfontdir: FAILED /usr/bin/msql2mysql: FAILED /usr/bin/mysql_config: FAILED /usr/bin/mysql_install_db: FAILED /usr/bin/mysql_secure_installation: FAILED /usr/bin/mysqlaccess: FAILED /usr/bin/mysqlbug: FAILED /usr/bin/ncurses5-config: FAILED /usr/bin/ncursesw5-config: FAILED /usr/bin/neqn: FAILED /usr/bin/net-snmp-config: FAILED /usr/bin/nroff: FAILED /usr/bin/on_ac_power: FAILED /usr/bin/pamstretch-gen: FAILED /usr/bin/pcre-config: FAILED /usr/bin/pdf2dsc: FAILED /usr/bin/pdf2ps: FAILED /usr/bin/pdfopt: FAILED /usr/bin/perldoc: FAILED /usr/bin/pf2afm: FAILED /usr/bin/pfbtopfa: FAILED /usr/bin/pnminterp-gen: FAILED /usr/bin/pnmmargin: FAILED /usr/bin/po2debconf: FAILED /usr/bin/pphs: FAILED /usr/bin/ppmtomap: FAILED /usr/bin/printafm: FAILED /usr/bin/ps2ascii: FAILED /usr/bin/ps2epsi: FAILED /usr/bin/ps2pdf: FAILED /usr/bin/ps2pdf12: FAILED /usr/bin/ps2pdf13: FAILED /usr/bin/ps2pdf14: FAILED /usr/bin/ps2pdfwr: FAILED /usr/bin/ps2ps: FAILED /usr/bin/ps2ps2: FAILED /usr/bin/ps2txt: FAILED /usr/bin/rgrep: FAILED /usr/bin/routef: FAILED /usr/bin/routel: FAILED /usr/bin/savelog: FAILED /usr/bin/sensible-browser: FAILED /usr/bin/sensible-editor: FAILED /usr/bin/sensible-pager: FAILED /usr/bin/sftp-kill: FAILED /usr/bin/sftp-user: FAILED /usr/bin/shtool: FAILED /usr/bin/shtoolize: FAILED /usr/bin/smbtar: FAILED /usr/bin/ssh-argv0: FAILED /usr/bin/ssh-copy-id: FAILED 
/usr/bin/ssl-cert-check: FAILED /usr/bin/traceproto: FAILED /usr/bin/traceproto.db: FAILED /usr/bin/traceroute-nanog: FAILED /usr/bin/update-mime-database: FAILED /usr/bin/updatedb: FAILED /usr/bin/updatedb.findutils: FAILED /usr/bin/valgrind: FAILED /usr/bin/vimtutor: FAILED /usr/bin/wftopfa: FAILED /usr/bin/which: FAILED /usr/bin/x-www-browser: FAILED /usr/bin/xdg-desktop-icon: FAILED /usr/bin/xdg-desktop-menu: FAILED /usr/bin/xdg-email: FAILED /usr/bin/xdg-icon-resource: FAILED /usr/bin/xdg-mime: FAILED /usr/bin/xdg-open: FAILED /usr/bin/xdg-screensaver: FAILED /usr/bin/xdg-settings: FAILED /usr/bin/xlsview: FAILED /usr/bin/xml2-config: FAILED /usr/bin/xpdf: FAILED /usr/bin/xslt-config: FAILED /usr/bin/yacc: FAILED /usr/bin/zipgrep: FAILED /usr/bin/zxpdf: FAILED /usr/sbin/add-shell: FAILED /usr/sbin/csf: FAILED /usr/sbin/invoke-rc.d: FAILED /usr/sbin/locale-gen: FAILED /usr/sbin/mkinitramfs: FAILED /usr/sbin/ntpdate-debian: FAILED /usr/sbin/paperconfig: FAILED /usr/sbin/remove-shell: FAILED /usr/sbin/service: FAILED /usr/sbin/sync-available: FAILED /usr/sbin/t1libconfig: FAILED /usr/sbin/tcptraceroute: FAILED /usr/sbin/tcptraceroute.db: FAILED /usr/sbin/tzconfig: FAILED /usr/sbin/update-ca-certificates: FAILED /usr/sbin/update-fonts-alias: FAILED /usr/sbin/update-fonts-dir: FAILED /usr/sbin/update-fonts-scale: FAILED /usr/sbin/update-gsfontmap: FAILED /usr/sbin/update-icon-caches: FAILED /usr/sbin/update-icon-caches.gtk2: FAILED /usr/sbin/update-initramfs: FAILED /bin/bzcmp: FAILED /bin/bzdiff: FAILED /bin/bzegrep: FAILED /bin/bzexe: FAILED /bin/bzfgrep: FAILED /bin/bzgrep: FAILED /bin/bzless: FAILED /bin/bzmore: FAILED /bin/lessfile: FAILED /bin/lesspipe: FAILED /bin/sh: FAILED /bin/which: FAILED /sbin/fsck.nfs: FAILED /sbin/initctl: FAILED /sbin/installkernel: FAILED /sbin/on_ac_power: FAILED /sbin/resolvconf: FAILED /sbin/shadowconfig: FAILED /usr/local/bin/barracuda: FAILED /usr/local/bin/boa: FAILED /usr/local/bin/octopus: FAILED /usr/local/bin/syncpass: 
FAILED /usr/local/bin/tuning-primer.sh: FAILED /etc/init.d/README: FAILED /etc/init.d/atd: FAILED /etc/init.d/auditd: FAILED /etc/init.d/bootlogd: FAILED /etc/init.d/bootlogs: FAILED /etc/init.d/bootmisc.sh: FAILED /etc/init.d/checkfs.sh: FAILED /etc/init.d/checkroot-bootclean.sh: FAILED /etc/init.d/checkroot.sh: FAILED /etc/init.d/chrony: FAILED /etc/init.d/cron: FAILED /etc/init.d/dbus: FAILED /etc/init.d/fancontrol: FAILED /etc/init.d/halt: FAILED /etc/init.d/hdparm: FAILED /etc/init.d/hostname.sh: FAILED /etc/init.d/hwclock.sh: FAILED /etc/init.d/ipvsadm: FAILED /etc/init.d/killprocs: FAILED /etc/init.d/kmod: FAILED /etc/init.d/lm-sensors: FAILED /etc/init.d/lvm2: FAILED /etc/init.d/motd: FAILED /etc/init.d/mountall-bootclean.sh: FAILED /etc/init.d/mountall.sh: FAILED /etc/init.d/mountdevsubfs.sh: FAILED /etc/init.d/mountkernfs.sh: FAILED /etc/init.d/mountnfs-bootclean.sh: FAILED /etc/init.d/mountnfs.sh: FAILED /etc/init.d/mtab.sh: FAILED /etc/init.d/networking: FAILED /etc/init.d/nginx: FAILED /etc/init.d/ntp: FAILED /etc/init.d/pdnsd: FAILED /etc/init.d/php5-fpm: FAILED /etc/init.d/php53-fpm: FAILED /etc/init.d/postfix: FAILED /etc/init.d/procps: FAILED /etc/init.d/rc: FAILED /etc/init.d/rc.local: FAILED /etc/init.d/rcS: FAILED /etc/init.d/reboot: FAILED /etc/init.d/redis-server: FAILED /etc/init.d/resolvconf: FAILED /etc/init.d/rmnologin: FAILED /etc/init.d/rsync: FAILED /etc/init.d/rsyslog: FAILED /etc/init.d/saned: FAILED /etc/init.d/screen-cleanup: FAILED /etc/init.d/sendsigs: FAILED /etc/init.d/single: FAILED /etc/init.d/skeleton: FAILED /etc/init.d/ssh: FAILED /etc/init.d/stop-bootlogd: FAILED /etc/init.d/stop-bootlogd-single: FAILED /etc/init.d/sudo: FAILED /etc/init.d/sysstat: FAILED /etc/init.d/udev: FAILED /etc/init.d/udev-mtab: FAILED /etc/init.d/umountfs: FAILED /etc/init.d/umountnfs.sh: FAILED /etc/init.d/umountroot: FAILED /etc/init.d/unattended-upgrades: FAILED /etc/init.d/urandom: FAILED /etc/init.d/vnstat: FAILED /etc/init.d/x11-common: FAILED
Why all the above files were changed should be investigated.
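As a starting point for that investigation, here is a minimal sketch that pulls the FAILED paths out of a saved copy of the lfd mail so each one can be checked against the package database. The helper name and the mail file name are made up for illustration. `dpkg -S` reports the owning package for files installed by Debian packages; files it cannot find were presumably put there some other way, for example by BOA's own builds.

```shell
# Hypothetical helper: read an lfd integrity mail on stdin and print each
# path that was reported as FAILED, one per line, sorted and de-duplicated.
failed_paths() {
    grep -o '/[^ :]*: FAILED' | sed 's/: FAILED$//' | LC_ALL=C sort -u
}

# Example (lfd-mail.txt is an assumed file name for the saved mail body):
#   failed_paths < lfd-mail.txt | while read -r p; do dpkg -S "$p"; done
```

Entries marked "FAILED open or read" are still listed, since the path precedes the word FAILED in both forms.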
The upgrade also removed the Gandi.net SSL certs and replaced them with self-signed ones:
www.transitionnetwork.org uses an invalid security certificate.
The certificate is not trusted because it is self-signed.
The certificate is only valid for *.puffin.webarch.net
So following the steps from ticket:466#comment:25 to fix this:
cd /etc/ssl/private/
mv nginx-wild-ssl.crt nginx-wild-ssl.crt.old
mv nginx-wild-ssl.key nginx-wild-ssl.key.old
mv pure-ftpd.pem pure-ftpd.pem.old
ln -s ../transitionnetwork.org/transitionnetwork.org.key nginx-wild-ssl.key
ln -s ../transitionnetwork.org/transitionnetwork.org.crt nginx-wild-ssl.crt
ln -s ../transitionnetwork.org/transitionnetwork.org.pem pure-ftpd.pem
/etc/init.d/nginx restart
Stopping Nginx Server...:.
Starting Nginx Server...:nginx: [emerg] SSL_CTX_use_PrivateKey_file("/etc/ssl/private/nginx-wild-ssl.key") failed (SSL: error:0B080074:x509 certificate routines:X509_check_private_key:key values mismatch)
rm nginx-wild-ssl.crt
ln -s ../transitionnetwork.org/transitionnetwork.org.chained.pem nginx-wild-ssl.crt
/etc/init.d/nginx start
Starting Nginx Server...: failed!
/etc/init.d/nginx status
Nginx Server... found running with processes: 16141 16140 16139 16138 16137 16136 16135 16134 16133 16132 16131 16129 16127 16125 16124 16122 16120 16119 16117 16116 16114 16108 16105 16103 16102 16101 16099 16098 16096 16095 16093 ... (warning).
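The "key values mismatch" error means nginx decided the certificate and the private key do not belong together. A minimal sketch for checking a pair before restarting nginx, assuming RSA keys and the `openssl` command line tool; the function name is hypothetical. For a chained file, `openssl x509` only reads the first certificate, which is also the one nginx expects to be the server certificate.

```shell
# Hypothetical helper: succeed only if the certificate and the RSA private
# key share the same modulus. Usage: cert_matches_key CERT KEY
cert_matches_key() (
    # Both commands print a single "Modulus=<hex>" line.
    cert_mod=$(openssl x509 -noout -modulus -in "$1") || exit 2
    key_mod=$(openssl rsa -noout -modulus -in "$2") || exit 2
    [ "$cert_mod" = "$key_mod" ]
)

# Example:
#   cert_matches_key nginx-wild-ssl.crt nginx-wild-ssl.key && echo "pair OK"
```

An exit status of 1 means the files were readable but do not match; 2 means openssl could not read one of them.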
We still have the wrong cert.
/etc/init.d/nginx stop
ps -lA | grep -i nginx
1 S 0 17720 1 1 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17722 17720 1 80 0 - 18620 - ? 00:00:00 nginx
5 S 33 17723 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17725 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17726 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17728 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17729 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17730 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17732 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17734 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17739 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17742 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17743 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17745 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17747 17720 0 80 0 - 18654 - ? 00:00:00 nginx
5 S 33 17748 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17750 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17751 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17753 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17755 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17756 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17758 17720 0 80 0 - 18622 - ? 00:00:00 nginx
5 S 33 17759 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17760 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17761 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17762 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17763 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17764 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17765 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17766 17720 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 17767 17720 0 80 0 - 18559 - ? 00:00:00 nginx
So basically the BOA self rolled nginx doesn't have working init scripts?!
killall -9 nginx
ps -lA | grep -i nginx
5 S 0 18335 1 1 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18337 18335 0 80 0 - 18622 - ? 00:00:00 nginx
5 S 33 18339 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18340 18335 1 80 0 - 18635 - ? 00:00:00 nginx
5 S 33 18341 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18343 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18344 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18346 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18348 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18354 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18356 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18358 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18359 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18361 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18363 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18365 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18367 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18368 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18370 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18372 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18373 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18374 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18375 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18376 18335 1 80 0 - 18635 - ? 00:00:00 nginx
5 S 33 18377 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18378 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18379 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18380 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18381 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18382 18335 0 80 0 - 18559 - ? 00:00:00 nginx
5 S 33 18383 18335 0 80 0 - 18559 - ? 00:00:00 nginx
I'm going to reboot the server; this is also needed just in case some processes have been running since before the OpenSSL update.
reboot
The system is going down for reboot NOW!ch.net (pts/0) (Fri Apr 11 22:45:05 2
uptime
22:45:25 up 79 days, 21:33, 2 users, load average: 10.95, 2.63, 1.28
uptime
22:47:08 up 79 days, 21:35, 1 user, load average: 45.11, 17.28, 6.68
uptime
22:47:53 up 79 days, 21:36, 1 user, load average: 42.35, 20.55, 8.29
uptime
22:48:08 up 79 days, 21:36, 1 user, load average: 41.35, 21.41, 8.77
uptime
22:48:37 up 79 days, 21:37, 1 user, load average: 38.32, 22.59, 9.56
uptime
22:50:11 up 79 days, 21:38, 1 user, load average: 33.89, 25.46, 11.86
Wow, that took ages...
Looking at the console from xen it was down to the firewall -- there is such a huge number of iptables rules generated by csf/lfd that it takes 5 mins to unload or load them, it seems.
Another email about the things that have been updated:
From: root@puffin.webarch.net
Date: Fri, 11 Apr 2014 22:33:42 +0100 (BST)
To: chris@webarchitects.co.uk
Subject: lfd on puffin.webarch.net: System Integrity checking detected a modified system file
Time: Fri Apr 11 22:33:42 2014 +0100
The following list of files have FAILED the md5sum comparison test. This means that the file has been changed in some way. This could be a result of an OS update or application upgrade. If the change is unexpected it should be investigated:
/usr/bin/drush: FAILED
/usr/bin/drush4: FAILED
/usr/bin/drush5: FAILED open or read
/usr/bin/drush6: FAILED
/usr/bin/MySecureShell: FAILED
/usr/bin/nginx: FAILED
/usr/bin/php-cli: FAILED
/usr/bin/redis-benchmark: FAILED
/usr/bin/redis-check-aof: FAILED
/usr/bin/redis-check-dump: FAILED
/usr/bin/redis-cli: FAILED
/usr/bin/redis-server: FAILED
/usr/bin/sftp-admin: FAILED
/usr/bin/sftp-state: FAILED
/usr/bin/sftp-who: FAILED
/usr/bin/vnstat: FAILED
/usr/sbin/nginx: FAILED
/usr/sbin/nginx.old: FAILED
/usr/sbin/vnstatd: FAILED
/bin/sh: FAILED
/usr/local/bin/php: FAILED open or read
/usr/local/bin/redis-benchmark: FAILED open or read
/usr/local/bin/redis-check-aof: FAILED open or read
/usr/local/bin/redis-check-dump: FAILED open or read
/usr/local/bin/redis-cli: FAILED open or read
/usr/local/bin/redis-server: FAILED open or read
/etc/init.d/clean-boa-env: FAILED
/etc/init.d/nginx: FAILED
/etc/init.d/php53-fpm: FAILED
/etc/init.d/redis-server: FAILED
It's back up:
uptime
22:57:57 up 7 min, 1 user, load average: 1.70, 0.59, 0.24
We still have the self-signed cert.
Now to try grepping to work out which nginx file contains the cert path.
cd /etc/nginx
grep -r ssl .
./nginx.conf.default: # listen 443 ssl;
./nginx.conf.default: # ssl_certificate cert.pem;
./nginx.conf.default: # ssl_certificate_key cert.key;
./nginx.conf.default: # ssl_session_cache shared:SSL:1m;
./nginx.conf.default: # ssl_session_timeout 5m;
./nginx.conf.default: # ssl_ciphers HIGH:!aNULL:!MD5;
./nginx.conf.default: # ssl_prefer_server_ciphers on;
./sites-available/default.dpkg-dist:# ssl on;
./sites-available/default.dpkg-dist:# ssl_certificate cert.pem;
./sites-available/default.dpkg-dist:# ssl_certificate_key cert.key;
./sites-available/default.dpkg-dist:# ssl_session_timeout 5m;
./sites-available/default.dpkg-dist:# ssl_protocols SSLv3 TLSv1;
./sites-available/default.dpkg-dist:# ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv3:+EXP;
./sites-available/default.dpkg-dist:# ssl_prefer_server_ciphers on;
So it's none of the files in /etc/nginx so it must be included from somewhere else:
grep -r include *
nginx.conf: include /etc/nginx/mime.types;
nginx.conf: include /etc/nginx/conf.d/*.conf;
nginx.conf: include /etc/nginx/sites-enabled/*;
nginx.conf.default: include mime.types;
nginx.conf.default: # include fastcgi_params;
sites-available/default.dpkg-dist: # include /etc/nginx/naxsi.rules
sites-available/default.dpkg-dist: # include fastcgi_params;
So:
grep -r ssl /etc/nginx/conf.d/*
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
grep -ri ssl /etc/nginx/sites-enabled/*
grep: /etc/nginx/sites-enabled/*: No such file or directory
So perhaps this isn't the nginx config at all? WHERE THE FUCK IS IT?
updatedb
locate *.crt
/data/disk/tn/config/server_master/ssl.d/transitionnetwork.org/openssl.crt
/data/disk/tn/config/ssl.d/transitionnetwork.org/bak/openssl.crt
/data/disk/tn/config/ssl.d/transitionnetwork.org/openssl.crt
Perhaps it's these...
cd /data/disk/tn/config/ssl.d/transitionnetwork.org/
ls -lah
openssl.crt -> /etc/ssl/transitionnetwork.org/transitionnetwork.org.chained.pem
openssl.key -> /etc/ssl/transitionnetwork.org/transitionnetwork.org.key
Nope...
These look like the possible nginx config files:
locate nginx | grep tn | grep -v backup /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/hosting.feature.nginx.inc /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/hosting_nginx.info /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/hosting_nginx.module /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/hosting_nginx.service.inc /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/ssl /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/ssl/hosting.feature.nginx_ssl.inc /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/ssl/hosting_nginx_ssl.info /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/ssl/hosting_nginx_ssl.module /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/nginx/ssl/hosting_nginx_ssl.service.inc /data/disk/tn/aegir/distro/008/profiles/hostmaster/web_server/nginx /data/disk/tn/aegir/distro/008/profiles/hostmaster/web_server/nginx/ssl /data/disk/tn/aegir/distro/008/profiles/hostmaster/web_server/nginx/ssl/hosting_nginx_ssl.drush.inc /data/disk/tn/config/includes/nginx_advanced_include.conf /data/disk/tn/config/includes/nginx_legacy_include.conf /data/disk/tn/config/includes/nginx_modern_include.conf /data/disk/tn/config/includes/nginx_octopus_include.conf /data/disk/tn/config/includes/nginx_simple_include.conf /data/disk/tn/config/nginx.conf /data/disk/tn/config/server_master/nginx /data/disk/tn/config/server_master/nginx.conf /data/disk/tn/config/server_master/nginx/platform.d /data/disk/tn/config/server_master/nginx/post.d /data/disk/tn/config/server_master/nginx/post.d/nginx_force_include* /data/disk/tn/config/server_master/nginx/pre.d 
/data/disk/tn/config/server_master/nginx/vhost.d /data/disk/tn/config/server_master/nginx/vhost.d/iirs-test.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/news.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/pb-stage-20130212.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/pb-stage-20140403.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/space.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/stg2.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/stg3.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/stg4.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/stg.transitionnetwork.org /data/disk/tn/config/server_master/nginx/vhost.d/tn.puffin.webarch.net /data/disk/tn/config/server_master/nginx/vhost.d/www.transitionnetwork.org /data/disk/tn/config/tn.nginx.conf /data/disk/tn/.drush/provision_cdn/Provision/Service/cdn/nginx.php /data/disk/tn/.drush/provision/http/nginx /data/disk/tn/.drush/provision/http/nginx/nginx_service.inc /data/disk/tn/.drush/provision/http/nginx_ssl /data/disk/tn/.drush/provision/http/nginx_ssl/nginx_ssl_service.inc /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx.conf /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx_legacy_include.conf /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx_modern_include.conf /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx_octopus_include.conf /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx.php /data/disk/tn/.drush/provision/http/Provision/Service/http/nginx/ssl.php /data/disk/tn/static/transition-network-d6-p009/sites/news.transitionnetwork.org/nginx_cache_hour.info /data/disk/tn/static/transition-network-d6-p009/sites/www.transitionnetwork.org/nginx_cache_quarter.info 
/data/disk/tn/static/transition-network-d6-s008/sites/pb-stage-20130212.transitionnetwork.org/nginx_cache_quarter.info /data/disk/tn/static/transition-network-d6-s008/sites/stg2.transitionnetwork.org/nginx_cache_quarter.info /data/disk/tn/static/transition-network-d6-s008/sites/stg.transitionnetwork.org/nginx_cache_quarter.info /data/disk/tn/static/transition-network-d6-s011/sites/pb-stage-20140403.transitionnetwork.org/nginx_cache_quarter.info /var/aegir/config/server_master/nginx/platform.d/tn.conf
So, checking these places:
grep -ri ssl /data/disk/tn/aegir/distro/008/profiles/hostmaster/modules/hosting/web_server/* | grep crt
grep -ri ssl /data/disk/tn/aegir/distro/008/profiles/hostmaster/web_server/* | grep crt
grep -ri ssl /data/disk/tn/config/tn.nginx.conf
grep -ri ssl /data/disk/tn/config/includes/*
grep -ri ssl /data/disk/tn/config/nginx.conf
grep -ri ssl /data/disk/tn/config/server_master/*
grep -ri ssl /data/disk/tn/config/tn.nginx.conf
grep -ri ssl /data/disk/tn/.drush/provision/http/nginx
grep -ri ssl /data/disk/tn/.drush/provision/http/Provision/Service/http/*
grep -ri ssl /data/disk/tn/static/transition-network-d6-p009/sites/* | grep crt
grep -ri ssl /var/aegir/config/server_master/nginx/platform.d/tn.conf
No joy...
date
Fri Apr 11 23:24:04 BST 2014
I just don't have a clue where the config files that need fixing are. This is very frustrating; the site is *down*.
comment:21 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.26
- Total Hours changed from 3.84 to 4.1
Starting from the beginning...
In /etc/init.d/nginx we have:
NGINX_CONF_FILE="/etc/nginx/nginx.conf"
That file includes:
include /etc/nginx/mime.types;
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
ls /etc/nginx/conf.d/*.conf
/etc/nginx/conf.d/aegir.conf@
grep ssl /etc/nginx/conf.d/aegir.conf
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
grep include /etc/nginx/conf.d/aegir.conf
include /var/aegir/config/server_master/nginx/pre.d/*;
include /var/aegir/config/server_master/nginx/platform.d/*;
include /var/aegir/config/server_master/nginx/vhost.d/*;
include /var/aegir/config/server_master/nginx/post.d/*;
grep -ir ssl /var/aegir/config/server_master/nginx/pre.d/*
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf:### /var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: listen *:443 ssl spdy;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl on;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl_certificate /etc/ssl/private/nginx-wild-ssl.crt;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl_certificate_key /etc/ssl/private/nginx-wild-ssl.key;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl_session_timeout 5m;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl_protocols SSLv3 TLSv1 TLSv1.1 TLSv1.2;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl_ciphers EECDH+ECDSA+AESGCM:EECDH+aRSA+AESGCM:EECDH+ECDSA+SHA384:EECDH+ECDSA+SHA256:EECDH+aRSA+SHA384:EECDH+aRSA+SHA256:EECDH+aRSA+RC4:EECDH:EDH+aRSA:RC4:!aNULL:!eNULL:!LOW:!3DES:!MD5:!EXP:!PSK:!SRP:!DSS:+RC4:RC4;
/var/aegir/config/server_master/nginx/pre.d/nginx_wild_ssl.conf: ssl_prefer_server_ciphers on;
BINGO!
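The manual walk above, from the file named in the init script through each include line, could be sketched as a small recursive helper. This is an illustration only: the function name is hypothetical, and it assumes include paths are absolute (relative includes such as mime.types are resolved against the current directory and usually just skipped).

```shell
# Hypothetical helper: print every file reachable from an nginx config
# through uncommented "include" directives, in the order they are found.
nginx_config_files() (
    conf=$1
    [ -f "$conf" ] || exit 0
    printf '%s\n' "$conf"
    # Extract the argument of each "include ...;" line (comments are skipped
    # because the line must start with optional whitespace then "include").
    sed -n 's/^[[:space:]]*include[[:space:]]\{1,\}\([^;]*\);.*$/\1/p' "$conf" |
    while IFS= read -r pattern; do
        # Left unquoted on purpose so globs such as conf.d/*.conf expand.
        for f in $pattern; do
            if [ -f "$f" ]; then
                nginx_config_files "$f"
            fi
        done
    done
)

# Example:
#   nginx_config_files /etc/nginx/nginx.conf | xargs grep -n ssl_certificate
```

Run against the layout described above, this would visit /etc/nginx/conf.d/aegir.conf and then the files under /var/aegir/config/server_master/nginx/, which is where the certificate paths turned out to be set.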
That file was edited:
#ssl_certificate /etc/ssl/private/nginx-wild-ssl.crt;
#ssl_certificate_key /etc/ssl/private/nginx-wild-ssl.key;
ssl_certificate /etc/ssl/transitionnetwork.org/transitionnetwork.org.chained.pem;
ssl_certificate_key /etc/ssl/transitionnetwork.org/transitionnetwork.org.key;
But still:
www.transitionnetwork.org uses an invalid security certificate.
The certificate is not trusted because it is self-signed.
The certificate is only valid for *.puffin.webarch.net
So, copying the files over from wiki:PenguinServer again:
rsync -av penguin:tn/ /root/tn/
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
receiving incremental file list
./
transitionnetwork.org.chained.pem
transitionnetwork.org.crt
transitionnetwork.org.csr
transitionnetwork.org.key
sent 90 bytes received 9797 bytes 19774.00 bytes/sec
total size is 9499 speedup is 0.96
And:
cd /etc/ssl/transitionnetwork.org
mv transitionnetwork.org.* old/
mv /root/tn/transitionnetwork.org.* .
And it's fixed!
So the issue was that the right certs were replaced with self-signed ones by BOA...?
What a waste of time that was.
comment:22 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.2
- Total Hours changed from 4.1 to 4.3
So now testing other stuff and looking around...
As expected, the default MySQL settings have dramatically reduced the RAM available for the database:
These graphs, and lots of others, have been broken:
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/nginx_vhost_traffic.html
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/phpfpm_average.html
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/phpfpm_processes.html
But I'm tired and it can wait till tomorrow -- it looks like a permissions issue:
munin-run phpfpm_average
php_average.value /etc/munin/plugins/phpfpm_average: line 40: /bin/ps: Permission denied
/etc/munin/plugins/phpfpm_average: line 40: /bin/grep: Permission denied
/etc/munin/plugins/phpfpm_average: line 40: /bin/grep: Permission denied
/etc/munin/plugins/phpfpm_average: line 40: /bin/grep: Permission denied
/etc/munin/plugins/phpfpm_average: line 40: /usr/bin/awk: Permission denied
munin-run phpfpm_connections
Can't exec "/etc/munin/plugins/phpfpm_connections": Permission denied at /usr/share/perl5/Munin/Node/Service.pm line 263.
# FATAL: Failed to exec.
munin-run multips_memory
/usr/share/munin/plugins/plugin.sh: line 14: /bin/sed: Permission denied
/etc/munin/plugins/multips_memory: line 140: /bin/ps: Permission denied
/etc/munin/plugins/multips_memory: line 144: /usr/bin/gawk: Permission denied
/usr/share/munin/plugins/plugin.sh: line 14: /bin/sed: Permission denied
/etc/munin/plugins/multips_memory: line 140: /bin/ps: Permission denied
/etc/munin/plugins/multips_memory: line 144: /usr/bin/gawk: Permission denied
/usr/share/munin/plugins/plugin.sh: line 14: /bin/sed: Permission denied
/etc/munin/plugins/multips_memory: line 140: /bin/ps: Permission denied
/etc/munin/plugins/multips_memory: line 144: /usr/bin/gawk: Permission denied
/usr/share/munin/plugins/plugin.sh: line 14: /bin/sed: Permission denied
/etc/munin/plugins/multips_memory: line 140: /bin/ps: Permission denied
/etc/munin/plugins/multips_memory: line 144: /usr/bin/gawk: Permission denied
/usr/share/munin/plugins/plugin.sh: line 14: /bin/sed: Permission denied
/etc/munin/plugins/multips_memory: line 140: /bin/ps: Permission denied
/etc/munin/plugins/multips_memory: line 144: /usr/bin/gawk: Permission denied
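To confirm the permissions theory, a rough sketch that lists which of the binaries named in those errors have lost the execute bit for "other" users. The helper name is hypothetical and it assumes GNU `stat`; it does not account for ACLs or group membership.

```shell
# Hypothetical helper: print "mode path" for each given path that lacks the
# execute bit for "other". Symlinks are followed (-L).
not_world_executable() (
    for path in "$@"; do
        # GNU stat: %A prints the mode string, e.g. -rwxr-x---
        mode=$(stat -L -c '%A' "$path") || continue
        case $mode in
            *[xt]) ;;                          # others may execute
            *) printf '%s %s\n' "$mode" "$path" ;;
        esac
    done
)

# Example:
#   not_world_executable /bin/ps /bin/grep /bin/sed /usr/bin/awk /usr/bin/gawk
```

Any path it prints is one the munin user could only run via ownership or group membership, which would be consistent with the Permission denied lines above.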
More broken shit than a BOA upgrade usually causes...
comment:23 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.5
- Total Hours changed from 4.3 to 4.8
It looks like wiki:PuffinServer committed suicide, I got this email:
Date: Sat, 12 Apr 2014 09:11:03 +0100
To: chris@webarchitects.co.uk
Subject: ** PROBLEM Service Alert: puffin/SSH is CRITICAL **
***** Nagios *****
Notification Type: PROBLEM
Service: SSH
Host: puffin
Address: puffin.webarch.net
State: CRITICAL
Date/Time: Sat Apr 12 09:11:03 BST 2014
I couldn't connect via ssh and was about to reboot it at a xen level when I did get in and it looks that with the default BOA settings we are back in load spike suicide land:
uptime
09:24:23 up 10:34, 1 user, load average: 65.71, 120.18, 85.84
uptime
09:29:37 up 10:39, 1 user, load average: 0.52, 42.71, 61.54
I'll look at the logs in a while to see what happened, but the BOA default is to clobber lots of key logs so I might not find a lot of info.
Since the aim is to run with the BOA defaults, ticket:670, I'll start by doing the minimum needed to get the munin graphs working again and stop the log clobbering so we can get a better picture of what is happening when the server commits suicide again.
comment:24 in reply to: ↑ 20 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 4.8 to 5.05
Replying to chris:
Wow, that took ages...
Looking at the console from xen it was down to the firewall -- there is such a huge number of iptables rules generated by csf/lfd that it takes 5 mins to unload or load them, it seems.
Last night I set an iptables --list running in screen, the file it generated:
ls /root/iptables.2014-04-12 -lah -rw-r--r-- 1 root root 247K Apr 12 00:28 /root/iptables.2014-04-12 cat /root/iptables.2014-04-12 | wc -l 3693
I was expecting it to be bigger.
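To see which chains hold most of the rules, the count per chain can be summarised. This is only a sketch: it assumes the rules are dumped with iptables-save rather than the iptables --list used above, and count_rules_per_chain is a made-up helper name, not something shipped with csf or BOA.

```shell
# Hypothetical helper: count rules per chain, largest first.
# Reads iptables-save style text on stdin, where each rule line
# starts with "-A CHAIN"; all other lines are ignored.
count_rules_per_chain() {
    awk '$1 == "-A" { n[$2]++ } END { for (c in n) print n[c], c }' | sort -rn
}

# Example (needs root):
#   iptables-save | count_rules_per_chain | head
```

If iptables --list is used instead, adding -n avoids reverse DNS lookups, which may account for some of the time such a listing takes.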
comment:25 Changed 3 years ago by chris
Posting this to record 15 mins spent rereading comments and fixing typos and spelling mistakes
comment:26 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 5.05 to 5.3
Oops, time missed off last comment.
comment:27 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.3
- Total Hours changed from 5.3 to 5.6
Since users are no longer allowed access to most command line functions because BOA chmodded lots of programs, I think adjusting munin tasks that were run by the munin user to now run as root is probably the easiest way to address this, however this could also have negative security implications.
Testing with two graphs to begin with:
[multips]
env.names nginx php_fpm mysqld redis-server munin-node
user root
[multips_memory]
env.names nginx php-fpm mysqld redis-server munin-node
user root
This should fix this graph:
As it is working again on the command line:
root@puffin:/etc/munin/plugins# munin-run multips_memory nginx.value 631918592 php_fpm.value 76709888 mysqld.value 1502449664 redis_server.value U munin_node.value 10309632
comment:28 follow-up: ↓ 32 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.75
- Total Hours changed from 5.6 to 6.35
Fixing the Munin plugins by making them run as root rather than user munin or nobody or other non-root users with fewer permissions... *sigh*
These ones are not just a matter of a perms fix:
munin-run nginx_request request.value U munin-run nginx_status total.value U reading.value U writing.value U waiting.value U munin-run phpfpm_connections accepted.value U munin-run phpfpm_connections accepted.value U munin-run phpfpm_status idle.value U active.value U total.value U munin-run redis_127.0.0.1_6379 Could not connect to Redis at 127.0.0.1:6379: Connection refused multigraph redis_commands commands.value hits.value misses.value multigraph redis_dbs expires.value
For everything else the graphs are starting to be drawn again:
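A quick way to spot plugins that still report U (unknown) values like the ones above is to filter the munin-run output. A minimal sketch, assuming the usual "field.value VALUE" output format; unknown_fields is a hypothetical helper name.

```shell
# Hypothetical helper: print the field names whose value is "U"
# (unknown) in munin-run output read from stdin.
unknown_fields() {
    awk '$2 == "U" { sub(/\.value$/, "", $1); print $1 }'
}

# Example:
#   cd /etc/munin/plugins
#   for p in *; do munin-run "$p" 2>/dev/null | unknown_fields | sed "s/^/$p: /"; done
```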
comment:29 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.52
- Total Hours changed from 6.35 to 6.87
There was another load spike suicide this afternoon, that's two in the 24 hours since the upgrade to BOA 2.2.2. I'll look at the logs later and record my findings on ticket:670.
Fixing the broken Munin graphs...
The /var/aegir/config/server_master/nginx.conf file now contains:
server {
  listen *:80;
  server_name 127.0.0.1;
  location /nginx_status {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
  }
}
So trying to work out the URL to get the stats...
lynx -dump http://localhost/nginx_status 404 Not Found __________________________________________________________________ nginx lynx -dump http://puffin.webarch.net/nginx_status 404 Not Found __________________________________________________________________ nginx lynx -dump http://127.0.0.1/nginx_status Active connections: 11 server accepts handled requests 9354 9354 13400 Reading: 0 Writing: 1 Waiting: 10
So /etc/munin/plugin-conf.d/munin-node was updated to:
[nginx_request]
env.url http://127.0.0.1/nginx_status
user root
[nginx_status]
env.url http://127.0.0.1/nginx_status
user root
And testing:
nginx_status total.value 23 reading.value 0 writing.value 1 waiting.value 21 munin-run nginx_request request.value 13915
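For checking the status page by hand, the stub_status text can be turned into the same fields the plugin reports. This is a sketch only: parse_stub_status is a hypothetical helper, and mapping "Active connections" to total.value is an assumption about the plugin, not something checked against its source.

```shell
# Hypothetical helper: convert nginx stub_status text on stdin into
# munin-style fields. Matching on the first word (not line start) so
# that indented output, e.g. from "lynx -dump", still parses.
parse_stub_status() {
    awk '
        $1 == "Active" && $2 == "connections:" { print "total.value " $3 }
        $1 == "Reading:" {
            print "reading.value " $2
            print "writing.value " $4
            print "waiting.value " $6
        }
    '
}

# Example:
#   lynx -dump http://127.0.0.1/nginx_status | parse_stub_status
```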
The docs at wiki:PuffinServer#nginxconfigchanges will need updating, we once again have Munin Nginx graphs:
comment:30 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.75
- Total Hours changed from 6.87 to 7.62
The old php config file, /opt/local/etc/php53-fpm.conf still contains:
pm.status_path = /status
ping.path = /ping
However that status isn't available at these URLs:
lynx -dump http://127.0.0.1/status 404 Not Found __________________________________________________________________ nginx lynx -dump http://localhost/status 404 Not Found __________________________________________________________________ nginx lynx -dump http://puffin.webarch.net/status 404 Not Found __________________________________________________________________ nginx
Also the new URLs which are supposed to work don't:
lynx -dump http://127.0.0.1/fpm-status 404 Not Found __________________________________________________________________ nginx lynx -dump http://localhost/fpm-status 404 Not Found __________________________________________________________________ nginx lynx -dump http://puffin.webarch.net/fpm-status 404 Not Found __________________________________________________________________ nginx
So we need to find the new php config file to see if the status is enabled.
It might be one of these files:
updatedb
locate php | grep conf$
/etc/php5/fpm/php-fpm.conf
/etc/php5/fpm/pool.d/www.conf
/opt/etc/php-fpm.conf
/opt/local/etc/php53-fpm.conf
/opt/php52/etc/php52-fpm.conf
/opt/php53/etc/pear.conf
/opt/php53/etc/php53-fpm.conf
/opt/php53/etc/pool.d/www53.conf
/opt/php54/etc/php54-fpm.conf
/opt/php54/etc/pool.d/www54.conf
/opt/php55/etc/php55-fpm.conf
/opt/php55/etc/pool.d/www55.conf
The only one with a status line is /opt/local/etc/php53-fpm.conf
So trying to track down the php-fpm config file which is actually being used...
The /opt/php53/etc/php53-fpm.conf file includes /opt/php53/etc/pool.d/*.conf and /opt/php53/etc/pool.d/www53.conf includes /opt/etc/fpm/fpm-pool-common.conf and that file contains:
pm.status_path = /fpm-status
ping.path = /fpm-ping
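Following the include chain by hand is error prone, so here is a rough sketch of doing it with a script. It assumes the includes are written as "include = /absolute/path" lines (globs allowed) and that there are no circular includes; fpm_config_chain is a made-up name.

```shell
# Hypothetical helper: print a php-fpm config file followed by every
# file it pulls in through "include = ..." lines, recursively.
# Assumes absolute include paths and no include loops.
fpm_config_chain() {
    local conf=$1 pattern file
    [ -r "$conf" ] || return 0
    printf '%s\n' "$conf"
    # Keep only the right-hand side of each uncommented include line.
    sed -n 's/^[[:space:]]*include[[:space:]]*=[[:space:]]*//p' "$conf" |
    while IFS= read -r pattern; do
        # Unquoted on purpose so globs such as pool.d/*.conf expand.
        for file in $pattern; do
            fpm_config_chain "$file"
        done
    done
}

# Example:
#   fpm_config_chain /opt/php53/etc/php53-fpm.conf | xargs grep -H status_path
```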
Looking at /etc/init.d/php53-fpm and /etc/init.d/php5-fpm to try to work out where the php-fpm config files are to be found...
/etc/init.d/php53-fpm contains:
php_fpm_CONF=/opt/php53/etc/php53-fpm.conf
And /etc/init.d/php5-fpm contains:
DAEMON_ARGS="--fpm-config /etc/php5/fpm/php-fpm.conf"
Before the last upgrade the init script was /etc/init.d/php53-fpm, see wiki:PuffinServer#php-fpm
The problem is with Nginx, I tried editing /var/aegir/config/server_master/nginx.conf to add the code we had before:
location ~ ^/(status|ping)$ {
  fastcgi_pass 127.0.0.1:9090;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_intercept_errors on;
  include fastcgi_params;
  access_log off;
  allow 127.0.0.1;
  deny all;
}
But that didn't fix it. I think I'm too tired to solve this mystery tonight.
Looking in the logs, we have lots of entries like this in /var/log/php/error_log_53
[12-Apr-2014 18:00:35 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:35 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:35 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:41 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:41 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:41 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:42 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:00:42 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0 [12-Apr-2014 18:07:47 UTC] PHP Warning: Zend OPcache can't be temporary enabled (it may be only disabled till the end of request) in Unknown on line 0
In /var/log/php/fpm-www53-slow.log there are lots of entries like this, the time matches the last load spike suicide:
[12-Apr-2014 15:20:57] [pool www53] pid 56669 script_filename = /data/disk/tn/static/transition-network-d6-p009/index.php [0x00007ff0205f72b0] _drupal_bootstrap() /data/disk/tn/static/transition-network-d6-p009/includes/bootstrap.inc:1480 [0x00007ff0205f7150] _drupal_bootstrap() /data/disk/tn/static/transition-network-d6-p009/includes/bootstrap.inc:1447 [0x00007ff0205f7050] drupal_bootstrap() /data/disk/tn/static/transition-network-d6-p009/index.php:15 [12-Apr-2014 15:21:13] [pool www53] pid 56670 script_filename = /data/disk/tn/static/transition-network-d6-p009/index.php [0x00007ff0205f7760] is_readable() /data/conf/global.inc:476 [0x00007ff0205f7628] +++ dump failed [12-Apr-2014 15:21:18] [pool www53] pid 56681 script_filename = /data/disk/tn/static/transition-network-d6-p009/index.php [0x00007ff0205f7760] connect() /data/conf/global.inc:371 [0x00007ff0205f7628] +++ dump failed [12-Apr-2014 15:21:23] [pool www53] pid 56547 script_filename = /data/disk/tn/static/transition-network-d6-p009/index.php [0x000000000348da30] is_readable() /data/conf/global.inc:427 [0x000000000348d8f8] +++ dump failed [12-Apr-2014 15:21:54] [pool www53] pid 56695 script_filename = /data/disk/tn/static/transition-network-d6-p009/index.php [0x00007ff0205f7760] connect() /data/conf/global.inc:371 [0x00007ff0205f7628] +++ dump failed
And in /var/log/php/php53-fpm-error.log there are lots of lines like this which coincide with the last load spike suicide:
[12-Apr-2014 15:21:54] ERROR: failed to ptrace(PEEKDATA) pid 56695: Input/output error (5) [12-Apr-2014 15:21:58] WARNING: [pool www53] child 56655, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (194.782226 sec), terminating [12-Apr-2014 15:21:58] WARNING: [pool www53] child 56654, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (197.135421 sec), terminating [12-Apr-2014 15:21:58] WARNING: [pool www53] child 56653, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (199.566956 sec), terminating [12-Apr-2014 15:22:01] WARNING: [pool www53] child 56655 exited on signal 15 (SIGTERM) after 201.299257 seconds from start [12-Apr-2014 15:22:04] WARNING: [pool www53] child 56653 exited on signal 15 (SIGTERM) after 209.593928 seconds from start [12-Apr-2014 15:22:10] WARNING: [pool www53] child 56654 exited on signal 15 (SIGTERM) after 211.753491 seconds from start [12-Apr-2014 15:22:18] WARNING: [pool www53] child 56661, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (193.133020 sec), terminating [12-Apr-2014 15:22:18] WARNING: [pool www53] child 56657, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (189.783626 sec), terminating [12-Apr-2014 15:22:18] WARNING: [pool www53] child 56590, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (187.447955 sec), terminating [12-Apr-2014 15:22:22] WARNING: [pool www53] child 56657 exited on signal 15 (SIGTERM) after 219.567805 seconds from start [12-Apr-2014 15:22:25] WARNING: [pool www53] child 56590 exited on signal 15 (SIGTERM) after 365.818528 seconds from start [12-Apr-2014 15:22:28] WARNING: [pool www53] child 56661 exited on signal 
15 (SIGTERM) after 208.535602 seconds from start [12-Apr-2014 15:22:38] WARNING: [pool www53] child 56617, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (196.484809 sec), terminating [12-Apr-2014 15:22:45] WARNING: [pool www53] child 56617 exited on signal 15 (SIGTERM) after 352.524615 seconds from start [12-Apr-2014 15:22:58] WARNING: [pool www53] child 56670, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (196.228568 sec), terminating [12-Apr-2014 15:22:58] WARNING: [pool www53] child 56669, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (198.184145 sec), terminating [12-Apr-2014 15:23:05] WARNING: [pool www53] child 56669 exited on signal 15 (SIGTERM) after 224.906323 seconds from start [12-Apr-2014 15:23:08] WARNING: [pool www53] child 56670 exited on signal 15 (SIGTERM) after 227.669041 seconds from start [12-Apr-2014 15:23:19] WARNING: [pool www53] child 56681, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (192.413551 sec), terminating [12-Apr-2014 15:23:19] WARNING: [pool www53] child 56547, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "GET /index.php") execution timed out (194.007008 sec), terminating [12-Apr-2014 15:23:21] WARNING: [pool www53] child 56681 exited on signal 15 (SIGTERM) after 200.979892 seconds from start [12-Apr-2014 15:23:24] WARNING: [pool www53] child 56547 exited on signal 15 (SIGTERM) after 472.666956 seconds from start [12-Apr-2014 15:23:39] WARNING: [pool www53] child 56695, script '/data/disk/tn/static/transition-network-d6-p009/index.php' (request: "HEAD /index.php") execution timed out (186.279207 sec), terminating [12-Apr-2014 15:23:41] WARNING: [pool www53] child 56695 exited on signal 15 (SIGTERM) after 216.840863 seconds from 
start [12-Apr-2014 15:27:30] ERROR: unable to bind listening socket for address '127.0.0.1:9090': Address already in use (98) [12-Apr-2014 15:27:30] ERROR: FPM initialization failed
So it's good that the logs are not being clobbered any more but there do appear to be some things not quite right...
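To get a feel for how many requests were killed and for which script, the "execution timed out" warnings can be counted. A minimal sketch, assuming one log entry per line in the format shown above; count_fpm_timeouts is a hypothetical helper.

```shell
# Hypothetical helper: count "execution timed out" warnings per script
# in a php-fpm error log read from stdin, most frequent first.
# The script path is the text between the first pair of single quotes.
count_fpm_timeouts() {
    awk -F"'" '/execution timed out/ { n[$2]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Example:
#   count_fpm_timeouts < /var/log/php/php53-fpm-error.log
```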
I'll do some more on this tomorrow evening...
comment:31 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.35
- Total Hours changed from 7.62 to 7.97
Redis isn't running:
ps -lA | grep -i redis
And it won't start:
/etc/init.d/redis-server start Starting redis-server: touch: cannot touch `/var/run/redis/redis.pid': No such file or directory
This could explain why the server wasn't coping with load spikes.
Make the directory for the pid file and try to start it:
mkdir /var/run/redis/ chown redis:redis /var/run/redis/ /etc/init.d/redis-server start Starting redis-server: failed
The start failed because it had been automatically started by BOA I expect, it is running now:
ps -lA | grep -i redis 1 S 106 52733 1 0 80 0 - 13575 - ? 00:00:00 redis-server
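If /var/run is a tmpfs that is emptied on reboot, which is common on Debian but not confirmed for this server, the directory will need recreating each time. A sketch of a check that could run before the init script; ensure_pid_dir is a made-up helper, not part of BOA or the Redis package.

```shell
# Hypothetical helper: make sure a pid file directory exists and,
# when an owner is given, that it is owned by that user:group.
ensure_pid_dir() {
    local dir=$1 owner=${2:-}
    [ -d "$dir" ] || mkdir -p "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# Example (as root):
#   ensure_pid_dir /var/run/redis redis:redis && /etc/init.d/redis-server start
```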
The log is being clobbered:
rotate [52733] 14 Apr 10:16:33.980 # Server started, Redis version 2.8.8 [52733] 14 Apr 10:16:33.981 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
The Redis config file, /etc/redis/redis.conf now contains a password, so this has been added to /etc/munin/plugin-conf.d/munin-node:
[redis_*]
env.password XXX
user root
It has been tested on the command line:
cd /etc/munin/plugins munin-run redis_127.0.0.1_6379 multigraph redis_clients clients.value 1 multigraph redis_blocked_clients blocked.value 0 multigraph redis_memory memory.value 37383008 multigraph redis_fragmentation frag.value 1.09 multigraph redis_total_connections connections.value 883 multigraph redis_expired_keys expired.value 8 multigraph redis_evicted_keys evicted.value 0 multigraph redis_pubsub_channels channels.value 0 multigraph redis_commands commands.value 34345 hits.value 17193 misses.value 5496 multigraph redis_dbs db0keys.value 3893 db0expires.value 919
Munin has been restarted:
/etc/init.d/munin-node restart [ ok ] Stopping Munin-Node: done. [ ok ] Starting Munin-Node: done.
So now we should soon start to get Redis munin graphs again:
comment:32 in reply to: ↑ 28 ; follow-up: ↓ 33 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 1.0
- Total Hours changed from 7.97 to 8.97
Replying to chris:
These ones are not just a matter of a perms fix:
munin-run nginx_request request.value U munin-run nginx_status total.value U reading.value U writing.value U waiting.value U
The odd thing here is that this works on the command line:
munin-run nginx_request request.value 249155 munin-run nginx_status total.value 30 reading.value 0 writing.value 4 waiting.value 26
But we don't have graphs here:
There is nothing in the log files, /var/log/munin/, but since opening this comment they have started to reappear -- the munin-node restart done to fix the redis logs must have also fixed these graphs?
These are still not working:
munin-run phpfpm_connections accepted.value U munin-run phpfpm_status idle.value U active.value U total.value U
So, the plugins are written in perl:
cd /etc/munin/plugins perl -wc phpfpm_connections phpfpm_connections syntax OK perl -wc phpfpm_status phpfpm_status syntax OK
The problem is that the status URL is a 404:
lynx -dump http://127.0.0.1/fpm-status 404 Not Found __________________________________________________________________ nginx lynx -dump http://127.0.0.1/status 404 Not Found __________________________________________________________________ nginx
Previously this needed adding to /var/aegir/config/server_master/nginx.conf:
location ~ ^/(status|ping)$ {
  fastcgi_pass 127.0.0.1:9090;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_intercept_errors on;
  include fastcgi_params;
  access_log off;
  allow 127.0.0.1;
  deny all;
}
See wiki:PuffinServer#nginxconfigchanges but when you try to connect using those details:
lynx -dump http://127.0.0.1:9090/status Looking up 127.0.0.1:9090 Making HTTP connection to 127.0.0.1:9090 Sending HTTP request. HTTP request sent; waiting for response. Retrying as HTTP0 request. Looking up 127.0.0.1:9090 Making HTTP connection to 127.0.0.1:9090 Sending HTTP request. HTTP request sent; waiting for response. Alert!: Unexpected network read error; connection aborted. Can't Access `http://127.0.0.1:9090/status' Alert!: Unable to access document. lynx: Can't access startfile
This does appear to be the right port, it is set to 9090 in /opt/php53/etc/pool.d/www53.conf:
listen = 127.0.0.1:9090
And that file includes /opt/etc/fpm/fpm-pool-common.conf which contains:
pm.status_path = /fpm-status
ping.path = /fpm-ping
It is running on this port:
netstat -tulpn | grep 9090 tcp 0 0 127.0.0.1:9090 0.0.0.0:* LISTEN 6852/php53-fpm.conf
And the binary:
ls -l /proc/6852/exe lrwxrwxrwx 1 root root 0 Apr 14 00:02 /proc/6852/exe -> /opt/php53/sbin/php-fpm*
And that is the binary referenced in /etc/init.d/php53-fpm:
php_fpm_BIN=/opt/php53/sbin/php-fpm
php_fpm_CONF=/opt/php53/etc/php53-fpm.conf
And /opt/php53/etc/php53-fpm.conf includes /opt/php53/etc/pool.d/*.conf and that includes /opt/etc/fpm/fpm-pool-common.conf.
Still none the wiser why we can't get the php-fpm graphs working:
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/phpfpm_status.html
- https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/phpfpm_connections.html
More work is needed on this :-(
comment:33 in reply to: ↑ 32 ; follow-up: ↓ 34 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 8.97 to 9.22
Replying to chris:
Still none the wiser why we can't get the php-fpm graphs working:
Adding this to /var/aegir/config/server_master/nginx.conf:
location ~ ^/fpm-(status|ping)$ {
  fastcgi_pass 127.0.0.1:9090;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_intercept_errors on;
  include fastcgi_params;
  access_log off;
  allow 127.0.0.1;
  allow 81.95.52.103;
  deny all;
}
Has resulted in the Munin graphs starting to be generated again.
However by editing /var/aegir/config/server_master/nginx.conf the "use stock BOA settings where possible" directive, ticket:670, has been breached and this change might need doing after each BOA upgrade.
The documentation, wiki:PuffinServer#nginxconfigchanges has been updated.
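Since a BOA upgrade may overwrite nginx.conf, a check like the following could be run after each upgrade to see whether the block is still there. This is a sketch: has_fpm_status_location is a hypothetical helper and it matches the location line literally, so any change in spacing would make it report the block as missing.

```shell
# Hypothetical helper: succeed if the given nginx config file still
# contains the fpm-status location line, matched as a fixed string.
has_fpm_status_location() {
    grep -qF 'location ~ ^/fpm-(status|ping)$' "$1"
}

# Example:
#   has_fpm_status_location /var/aegir/config/server_master/nginx.conf \
#       || echo "fpm-status location missing, re-add it"
```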
comment:34 in reply to: ↑ 33 ; follow-ups: ↓ 35 ↓ 36 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.5
- Total Hours changed from 9.22 to 9.72
Replying to chris:
Adding this to /var/aegir/config/server_master/nginx.conf:
location ~ ^/fpm-(status|ping)$ {
  fastcgi_pass 127.0.0.1:9090;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_intercept_errors on;
  include fastcgi_params;
  access_log off;
  allow 127.0.0.1;
  allow 81.95.52.103;
  deny all;
}
There was an issue about this here: https://drupal.org/node/2167459
We can open a new one with the extra changes if needed. It's not clear from the above which lines were added, Chris? If you can provide a summary of what needed to be changed, I'd be happy to add a ticket in the Barracuda D.o queue tonight.
(Also adding my time for various comments & emails.)
comment:35 in reply to: ↑ 34 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 9.72 to 9.82
Replying to jim:
Replying to chris:
Adding this to /var/aegir/config/server_master/nginx.conf:
location ~ ^/fpm-(status|ping)$ {
  fastcgi_pass 127.0.0.1:9090;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_intercept_errors on;
  include fastcgi_params;
  access_log off;
  allow 127.0.0.1;
  allow 81.95.52.103;
  deny all;
}
It's not clear from the above which lines were added, Chris?
All the lines above were added.
comment:36 in reply to: ↑ 34 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.2
- Total Hours changed from 9.82 to 10.02
Replying to jim:
There was an issue about this here: https://drupal.org/node/2167459
According to the diffs linked from that issue, changes were made to nginx_modern_include.conf and nginx_octopus_include.conf, these are the copies of these files on the server:
locate nginx_modern_include.conf | grep -v backups | grep -v \.drush /data/disk/tn/config/includes/nginx_modern_include.conf /var/aegir/config/includes/nginx_modern_include.conf
locate nginx_octopus_include.conf | grep -v backups | grep -v \.drush | grep -v root /data/disk/tn/config/includes/nginx_octopus_include.conf /var/aegir/config/includes/nginx_octopus_include.conf
These files do have the changes:
- /var/aegir/config/includes/nginx_modern_include.conf
- /var/aegir/config/includes/nginx_octopus_include.conf
But these don't:
- /data/disk/tn/config/includes/nginx_modern_include.conf
- /data/disk/tn/config/includes/nginx_octopus_include.conf
Should they be manually edited or is there a BOA way to update them?
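For spotting which copies are out of step, something like this could be used. It is only a sketch: compare_includes is a made-up helper, and a difference does not necessarily mean a missing fix, as the per-instance files may legitimately differ from the master copies.

```shell
# Hypothetical helper: for each named file, report whether the copy in
# the second directory is byte-identical to the one in the first.
# A file that is missing from either directory is reported as "differs".
compare_includes() {
    local ref_dir=$1 other_dir=$2 name
    shift 2
    for name in "$@"; do
        if cmp -s "$ref_dir/$name" "$other_dir/$name"; then
            printf 'same %s\n' "$name"
        else
            printf 'differs %s\n' "$name"
        fi
    done
}

# Example:
#   compare_includes /var/aegir/config/includes /data/disk/tn/config/includes \
#       nginx_modern_include.conf nginx_octopus_include.conf
```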
comment:37 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.2
- Total Hours changed from 10.02 to 10.22
The two load spike suicides on Saturday didn't trigger any alerts from CSF so this config in /etc/csf/csf.conf:
# Check the PT_LOAD_AVG minute Load Average (can be set to 1 5 or 15 and
# defaults to 5 if set otherwise) on the server every PT_LOAD seconds. If the
# load average is greater than or equal to PT_LOAD_LEVEL then an email alert is
# sent. lfd then does not report subsequent high load until PT_LOAD_SKIP
# seconds has passed to prevent email floods.
#
# Set PT_LOAD to "0" to disable this feature
PT_LOAD = "30"
PT_LOAD_AVG = "5"
PT_LOAD_LEVEL = "6"
PT_LOAD_SKIP = "3600"
Has been updated to:
PT_LOAD = "10"
PT_LOAD_AVG = "1"
PT_LOAD_LEVEL = "3"
PT_LOAD_SKIP = "60"
Also this was changed, though I don't know if it'll work:
#PT_APACHESTATUS = "http://127.0.0.1/server-status"
PT_APACHESTATUS = "http://127.0.0.1/nginx_status"
Restarting:
csf -r lfd will restart csf within the next 5 seconds *WARNING* PT_LOAD_SKIP sanity check. PT_LOAD_SKIP = 60. Recommended range: 1800-86400 (Default: 3600)
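To illustrate what the threshold settings mean, here is a sketch of the comparison described in the csf.conf comment. It is not lfd's actual code; load_at_or_above is a hypothetical helper that takes a line in /proc/loadavg format, the averaging period (1, 5 or 15, anything else is treated as 5) and the level.

```shell
# Hypothetical helper: succeed when the chosen load average in a
# /proc/loadavg style line ("1min 5min 15min ...") is >= the level.
# Usage: load_at_or_above "LINE" MINUTES LEVEL
load_at_or_above() {
    local line=$1 minutes=$2 level=$3
    printf '%s\n' "$line" | awk -v m="$minutes" -v lvl="$level" '
        BEGIN { col[1] = 1; col[5] = 2; col[15] = 3 }
        {
            f = (m in col) ? col[m] : 2   # fall back to the 5 minute average
            exit !($f + 0 >= lvl + 0)
        }
    '
}

# Example, using the new PT_LOAD_AVG and PT_LOAD_LEVEL values:
#   load_at_or_above "$(cat /proc/loadavg)" 1 3 && echo "would alert"
```

With the load averages recorded during the first spike (65.71, 120.18, 85.84) both the old settings (5 minute average, level 6) and the new ones (1 minute average, level 3) would be exceeded.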
comment:38 Changed 3 years ago by ed
Have there been any more issues since Saturday?
comment:39 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.65
- Total Hours changed from 10.22 to 10.87
There haven't been any load spike suicides since the two on Saturday, but the load is more spiky:
The problems on Saturday could well have been mostly, or perhaps totally, because Redis wasn't running as BOA didn't create a directory for its process ID file. Or perhaps the suicide thresholds are now set at too low a level? I haven't spent the time to read the updated suicide script to work out what is needed to trigger one.
Using the BOA defaults for MySQL has resulted in MySQL having 1/2 the RAM it had before, this has probably contributed to the changed behaviour, I think we should breach the "use stock BOA settings where possible" policy, ticket:670, and make changes to the MySQL settings, see ticket:587#comment:13, the time for that comment has been included in the time for this one.
I still haven't had time to have a close look at the logs from Saturday, but I'm also not sure that it's worth spending any time on this now?
The documentation on wiki:PuffinServer needs quite a lot of updating, once that has been done then this ticket and ticket:670 can probably be closed.
This ticket and ticket:670 are going to end up totalling over 16 hours, this means that this BOA upgrade will have taken twice as long as the last one, which took 8 hours, see wiki:PuffinServer#Upgradetickets for the totals.
comment:40 Changed 3 years ago by chris
Very sorry that when doing this update I forgot to run octopus up-stable all, this might be the cause of the cron tasks stopping, see ticket:724#comment:6
There is also another BOA update that is outstanding, ticket:721.
I'll do that update, and this time not forget to run octopus up-stable all, after midnight tonight, so it comes out of the May maintenance budget, unless I hear otherwise.
comment:41 Changed 3 years ago by chris
- Status changed from new to closed
- Resolution set to fixed
Closing as it's been superseded by ticket:721 for the 2.2.3 update.
Jim most definitely. This is important. I'm adding him now. And probably Paul; tbc; I have an email out with him asking if he'll take more of a lead on code publishing as it's not working for Sam - so let's see if Paul will also go cc on this.