Version 139 (modified by chris, 2 years ago)
Puffin
puffin.webarch.net is an 8GB RAM, 14 CPU core Debian Wheezy virtual server which replaced NewLiveServer and DevelopmentServer for running the Transition Network Drupal sites. It went live in early 2013.
This server was migrated to run off a ZFS server in October 2013 (see ticket:593), and it was upgraded from Squeeze to Wheezy on 17th November 2013 (see ticket:535).
It was agreed to call this server puffin at the ttech meeting on 22nd November 2012, see ticket:463. The install and initial configuration of this server was tracked on ticket:466, see also the other PuffinServer#migrationtickets. Other services from the old server were migrated to PenguinServer.
System updates were recorded on ticket:218 and are currently recorded on ticket:692. BOA update tickets are listed at PuffinServer#Upgradetickets.
Munin Stats
There are munin stats for the server available here
See ticket:555#comment:13 for the notes regarding the installation of the MySQL munin stats package. See ticket:677#comment:3 for the Redis plugin install notes.
Sometimes the IO State graph stops; this can be fixed by deleting the lock files, see ticket:555#IOstategraph.
Some BOA upgrades change the Redis password. When that happens it needs to be copied from /etc/redis/redis.conf to /etc/munin/plugin-conf.d/munin-node and munin-node needs restarting, see ticket:730.
We trialled New Relic in 2013 (see ticket:586), but this isn't ongoing.
HTTP Stats
The wiki:PiwikServer generates stats from the humans visiting the server and some of these stats have been made public on wiki:WebStats.
There are some notes on analysing the raw Nginx stats on wiki:WebServerLogs and Webalizer stats for Puffin are available using the same username/password as this Trac site.
There is a wiki:ErrorCodeCheck script which emails the total number of HTTP errors each day, see ticket:483#comment:63 for a list of the total for August, September and October 2013.
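A daily error count of this kind could be produced along the following lines. This is a minimal sketch, not the actual ErrorCodeCheck script: the function name `count_http_errors` is hypothetical, and it assumes an Apache-style access log in which the status code is the ninth whitespace-separated field (which holds for well-formed request lines such as `GET / HTTP/1.1`).

```shell
# Count 4xx and 5xx responses in an access log.
# $1: path to the log file (assumed to be in Apache-style combined format).
# Hypothetical helper, not the real ErrorCodeCheck script.
count_http_errors() {
    # Field 9 is the status code when the request line has three parts.
    awk '$9 ~ /^[45][0-9][0-9]$/ { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_http_errors /var/log/nginx/awstats.log` would print a single number, assuming that log is the one to be analysed. Malformed request lines shift the fields, so the result is an approximation.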
Load Spikes
The documentation of the load spike suicides that the server suffered from in 2013 has been archived to wiki:PuffinServerBoaLoadSpikes as that documentation is now outdated.
When the server was updated to BOA-2.2.3 on ticket:721 the scripts in /var/xdrago/ were changed. However, the load spike issue hasn't been finally resolved, see ticket:670#comment:22.
Tickets
Most of the "live server" tickets relate to puffin, but the older ones, prior to ticket:466, are for previous servers.
Current live server tickets
Closed live server tickets
Barracuda Octopus Aegir
The server uses Octopus to manage Aegir and also the updates to the Transition Network Drupal site. This system is installed and upgraded using Barracuda. The Barracuda Octopus Aegir combination is documented on the BOA wiki.
The initial BOA install script output has been saved on ticket:466#comment:22 and the updates are now documented on tickets listed at PuffinServer#Upgradetickets.
MariaDB
The MySQL root password is available in /root/.my.cnf.
Tuning of the MySQL server is being tracked on ticket:587.
We have set MySQL to use a RAM disk for temp tables, see ticket:591.
BOA installs MariaDB as the MySQL server using the debs from the MariaDB site, see /etc/apt/sources.list.d/mariadb.list. These are the current (2013-01-13) packages which are installed (note that only the config files remain for php5-mysql, as PHP is now installed from source code by BOA):
dpkg -l | grep -i mysql
ii  libdbd-mysql-perl  4.021-1+b1             amd64  Perl5 database interface to the MySQL database
ii  libmysqlclient16   5.1.72-2               amd64  MySQL database client library
ii  libmysqlclient18   5.5.34+maria-1~wheezy  amd64  Virtual package to satisfy external depends
ii  mariadb-common     5.5.34+maria-1~wheezy  all    MariaDB database common files (e.g. /etc/mysql/conf.d/mariadb.cnf)
ii  mysql-common       5.5.34+maria-1~wheezy  all    MariaDB database common files (e.g. /etc/mysql/my.cnf)
ii  mytop              1.6-6                  all    top like query monitor for MySQL
rc  php5-mysql         5.3.27-1~dotdeb.0      amd64  MySQL module for php5
ii  python-mysqldb     1.2.3-2                amd64  Python interface to MySQL
Nginx
BOA used to install Nginx from dotdeb but now compiles it from source; the dotdeb config files remain:
dpkg -l | grep -i nginx
rc  nginx-common  1.4.1-1~dotdeb.0  all  small, powerful, scalable web/proxy server - common files
The only change made to the default nginx configuration during the initial install was to move the key and cert it was using out of the way and symlink to the *.transitionnetwork.org ones, see ticket:466#comment:25 and also ticket:707#comment:21.
The other change made from the default BOA config was to enable Munin graphs, see wiki:PuffinServer#nginxconfigchanges.
php-fpm
Please note that the version of php-fpm that the http://transitionnetwork.org/ site needs to be running to work properly is:
/etc/init.d/php53-fpm
The config file for it is /opt/local/etc/php53-fpm.conf and when it is running it is listed in top and ps as php-fpm:
ps -lA | grep php
1 S     0 29482     1  0  80   0 - 188067 -      ?        00:00:00 php-fpm
5 S    33 29483 29482  2  80   0 - 205351 -      ?        00:01:32 php-fpm
5 S    33 29484 29482  2  80   0 - 199726 -      ?        00:01:28 php-fpm
...
Please note the settings that we changed from the default BOA ones in /opt/local/etc/php53-fpm.conf below.
When the server boots, another version of php-fpm is also started. It is listed in top and ps as php5-fpm and is this one:
/etc/init.d/php5-fpm
It is configured via files in /etc/php5/fpm/ and should be stopped if it is found to be running:
/etc/init.d/php5-fpm stop
It was stopped from running at runlevel 2 by deleting this symlink (see ticket:560#comment:17):
/etc/rc2.d/S01php5-fpm -> ../init.d/php5-fpm
But that didn't solve the problem, see ticket:580.
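Since the wanted daemon shows up as php-fpm and the unwanted one as php5-fpm, a check can look for a process named exactly php5-fpm. The following is a minimal sketch; the function name `has_php5_fpm` is hypothetical, and it reads process names on standard input, one per line.

```shell
# Succeeds if a process named exactly "php5-fpm" appears on stdin
# (one process name per line). The name has_php5_fpm is hypothetical.
has_php5_fpm() {
    # -x matches whole lines only, so "php-fpm" does not count;
    # -F treats the name as a fixed string, -q suppresses output.
    grep -qxF 'php5-fpm'
}
```

It could be used as `ps -A -o comm= | has_php5_fpm && /etc/init.d/php5-fpm stop`, which stops the stray daemon only when it is actually running.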
Redis
Tickets related to Redis issues:
- ticket:730 Redis Munin stats stop working after BOA upgrade
- ticket:554 Site slow down and MySQL load increase
- ticket:677 Spike in MyISAM (search) database activity, Redis unable to cache such requests
Redis Munin graphs:
Upgrading BOA
The steps are documented in UPGRADE.txt. To upgrade everything run these commands; the process can take around 30 minutes:
sudo -i
screen
cd
wget -q -U iCab http://files.aegir.cc/BOA.sh.txt
bash BOA.sh.txt
barracuda up-stable
octopus up-stable all
Useful links:
- BOA Changelog (HEAD) on Github
- BOA repository on Github
- Barracuda open issues
- Octopus open issues
Note also the new hotfix tool (around line 102 of CHANGELOG.txt at time of writing) that allows post release fixes and system tweaks to be applied between full stable releases - i.e. without doing a full update to HEAD.
Upgrade tickets
The time each upgrade takes has been collected here due to concerns about how long the upgrades were taking, see ticket:629#comment:11
- BOA-2.2.4 ticket:725
- BOA-2.2.3 ticket:721
- BOA-2.2.2 ticket:707 and also ticket:670
- BOA-2.1.3 ticket:629 (Total Hours: 8h 11m)
- BOA-2.1.1 ticket:612 (Total Hours: 1h 45m)
- BOA-2.0.9 ticket:547 (Total Hours: 1h 6m)
- BOA-2.0.8 ticket:530 (Total Hours: 1h 6m)
- BOA-2.0.7 ticket:529 (Total Hours: 45m)
- BOA-2.0.5 ticket:466#comment:26
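To see how much time the recorded upgrades add up to, the durations can be summed with a short helper. This is a sketch only; the function name `sum_durations` is hypothetical, and it covers only the five upgrades that have a total recorded in the list above.

```shell
# Sum durations written like "8h 11m" or "45m", one per line on stdin,
# and print the total as "<hours>h <minutes>m".
# The name sum_durations is hypothetical.
sum_durations() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # awk numeric conversion uses the leading digits, so "8h" is 8
            if ($i ~ /^[0-9]+h$/) total += 60 * $i
            else if ($i ~ /^[0-9]+m$/) total += $i
        }
    } END { printf "%dh %dm\n", int(total / 60), total % 60 }'
}

# The five recorded totals from the list above:
printf '%s\n' '8h 11m' '1h 45m' '1h 6m' '1h 6m' '45m' | sum_durations
# prints "12h 53m"
```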
Munin config changes
BOA resets the Redis password on some upgrades, so it needs copying from /etc/redis/redis.conf to /etc/munin/plugin-conf.d/munin-node and munin-node needs restarting, see ticket:730.
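The current Redis password can be read out of the config file with a small helper. This is a minimal sketch: the function name `redis_password` is hypothetical, and it assumes the password is set with an unquoted `requirepass` directive containing no whitespace. The exact form of the password entry in the munin-node plugin config is not recorded on this page, so the sketch only prints the Redis side.

```shell
# Print the password set by the last active "requirepass" directive.
# $1: path to a Redis config file. The name redis_password is hypothetical.
# Assumes an unquoted password with no whitespace.
redis_password() {
    # Commented-out directives have "#" or "#requirepass" as their first
    # field, so they are skipped.
    awk '$1 == "requirepass" { pw = $2 } END { if (pw != "") print pw }' "$1"
}
```

Running `grep -cF "$(redis_password /etc/redis/redis.conf)" /etc/munin/plugin-conf.d/munin-node` then gives 0 when the Munin config no longer contains the current password, which is the cue to update it and restart munin-node.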
nginx config changes
To get the php-fpm munin stats working, the code starting with the "# chris 2014-04-14" comment needs adding to /var/aegir/config/server_master/nginx.conf in the nginx default server section:
#######################################################
### nginx default server
#######################################################
server {
  limit_conn   limreq 32; # like mod_evasive - this allows max 32 simultaneous connections from one IP address
  listen       *:80;
  server_name  _;
  location / {
    expires 60s;
    add_header Cache-Control "public, must-revalidate, proxy-revalidate";
    add_header Access-Control-Allow-Origin *;
    root   /var/www/nginx-default;
    index  index.html index.htm;
  }
}
server {
  listen       *:80;
  server_name  127.0.0.1;
  location /nginx_status {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
  }
  # chris 2014-04-14
  location ~ ^/fpm-(status|ping)$ {
    fastcgi_pass 127.0.0.1:9090;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_intercept_errors on;
    include fastcgi_params;
    access_log off;
    allow 127.0.0.1;
    allow 81.95.52.103;
    deny all;
  }
}
Logs for analysis on penguin (see wiki:WebServerLogs) are generated by the following, which was added to the http section of the /etc/nginx/nginx.conf file:
# log for awstats
log_format apache '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent"';
access_log /var/log/nginx/awstats.log apache;
mysql config changes
Settings in /etc/mysql/my.cnf are no longer changed from the default, see ticket:670 and ticket:587. Changes to /etc/mysql/my.cnf don't get clobbered when BOA is upgraded as we have set _CUSTOM_CONFIG_SQL=YES in /root/.barracuda.cnf.
System Updates
We don't use the BOA tool for updating packages:
barracuda up-stable system
This is because it is very slow, and after running the above command to update the system you also need to follow the steps documented above at PuffinServer#UpgradingBOA for php-fpm to get the Munin stats working again.
Nginx and PHP are compiled from source code, so the above command should be run when these need updating. For other updates use the wiki:AptitudeUpdateScript script and document the updates on ticket:692.
See also ticket:548#comment:33 for the steps that need to be followed after this to get BOA to work with the Session443 plugin.
CSF / LFD
To restart the firewall script:
csf -r
We have set the following variable in /root/.barracuda.cnf to ensure that the CSF / LFD changes are not clobbered by BOA:
_CUSTOM_CONFIG_CSF=YES
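Before a BOA upgrade it can be worth confirming that the protective flags are still in place. The following is a minimal sketch; the function name `cnf_flag_enabled` is hypothetical, and it assumes each setting sits on its own line as `NAME=YES`, with or without double quotes around the value.

```shell
# Succeeds if variable $2 is set to YES in config file $1.
# The name cnf_flag_enabled is hypothetical; optional double quotes
# around the value and surrounding whitespace are tolerated.
cnf_flag_enabled() {
    grep -Eq "^[[:space:]]*$2=\"?YES\"?[[:space:]]*\$" "$1"
}
```

For example, `cnf_flag_enabled /root/.barracuda.cnf _CUSTOM_CONFIG_CSF || echo "CSF config is not protected"` prints a warning when the flag is missing or set to something other than YES. The same check works for _CUSTOM_CONFIG_SQL.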
We could do with a link here to the ticket on which the CSF / LFD configuration had a lot of work done. Some changes to the load level alerting were made on ticket:707#comment:37.
False positives
BOA installs CSF / LFD, which automatically blocks IP addresses after too many failed SSH login attempts. If someone is blocked who shouldn't be, they can be unblocked like this:
csf -dr 81.95.52.66
To check if an IP address is blocked:
csf -g 81.95.52.66
See this ticket for problems caused by CSF / LFD blocking the monitoring server: ticket:544
Blocklists
Blocklists are configured in /etc/csf/csf.blocklists and some were enabled on ticket:589
Console and SSH Access
There is a Xen shell available for console access, see wiki:XenShell.
For developers and sysadmins there is SSH access, contact chris@… if you need an account creating.
The server is also running Mosh, the mobile shell, which is very handy when your internet connection is poor, for example on a train. Mosh was installed on ticket:673.
Cron
BOA controls the root crontab and any changes made there will be overwritten, so things that would normally be in the root crontab need to go into a user's crontab and use sudo. These are the ones in chris' crontab:
# delete metche backups which are more than a day old
# see https://tech.transitionnetwork.org/trac/ticket/531
28 11 * * * sudo /usr/local/bin/metche-clean -d
# set the clock after a reboot
# see /trac/ticket/599
@reboot sudo rdate -s ntp.demon.co.uk
# create a tmp dir on the ram disk for mysql
# see /trac/ticket/591
@reboot sudo mkdir /run/shm/mysql ; sudo chown mysql:mysql /run/shm/mysql
# ssl cert check
32 09 * * * sudo ssl-cert-check -qac "/etc/ssl/transitionnetwork.org/transitionnetwork.org.crt" -e "chris@webarchitects.co.uk"
To edit chris' crontab after logging in as another user:
sudo -i
export EDITOR=vim
crontab -e -u chris
Backupninja
backupninja has been installed and configured to back up to another server in the Sheffield colo. Two backup tasks have been configured in /etc/backup.d/: 10.sys, which backs up system settings such as the list of installed packages, and 20.mysql, which dumps all the MySQL databases into /var/backups/mysql and uses /etc/mysql/debian.cnf for authentication. In October 2013 we switched the server's filesystem to a ZFS server on the network, see ticket:593#comment:5. Filesystem backups are now done via ZFS snapshots, so the rsync backup was disabled, see ticket:535#comment:22.
Postfix
Two changes were made to the default postfix install: it was set to send root emails out, see ticket:466#comment:23, and it was configured to use TLS with the Transition Network cert, see ticket:466#comment:25.
Handy commands
There are some Bash aliases to quickly get around the system added by JK...
For root:
alias cdtn='cd /data/disk/tn/' # cd to tn directory
alias totn='su -s /bin/bash tn' # log into the tn user
# show file usages
alias duf='du -sk * | sort -n | perl -ne '\''($s,$f)=split(m{\t});for (qw(K M G)) {if($s<1024) {printf("%.1f",$s);print "$_\t$f"; last};$s=$s/1024}'\'
For tn:
alias la='ls -Al --color=auto'
alias lc='ls -ltcr --color=auto'
alias lk='ls -lSr --color=auto'
alias ll='ls -la --group-directories-first --color=auto'
alias lr='ls -lR --color=auto'
alias ls='ls -hF --color=auto'
alias lt='ls -ltr --color=auto'
alias lu='ls -ltur --color=auto'
alias lx='ls -lXB --color=auto'
Vim config
To make vim the default editor for root the following was added to /root/.bashrc:
export EDITOR="vim"
To make config files nicer to read in vim the following was added to /root/.vimrc:
syntax on
A /root/.vim/filetype.vim file was also created with the following in it:
au BufRead,BufNewFile /etc/mysql/my.cnf, set ft=mycnf
autocmd BufRead,BufNewFile /etc/php5/fpm/* set syntax=dosini
autocmd BufRead,BufNewFile /opt/local/etc/php53-fpm.conf set syntax=dosini
au BufRead,BufNewFile /etc/nginx/*,/etc/nginx/conf.d/*,/var/aegir/config/server_master/nginx/*/* set ft=nginx
au BufRead,BufNewFile /data/disk/tn/config/server_master/nginx/vhost.d/* set ft=nginx
And a /root/.vim/syntax/ directory was created and mycnf.vim was created in it by downloading it from http://cvs.pld-linux.org/cgi-bin/cvsweb.cgi/packages/vim-syntax-mycnf/ and nginx.vim was downloaded from http://www.vim.org/scripts/script.php?script_id=1886
Migration Tickets
Tickets created during the migration of the http://www.transitionnetwork.org/ site from NewLiveServer to this server:
- ticket:466 Puffin install and configuration
- ticket:472 Script to copy files from NewLiveServer to puffin
- ticket:479 Transfer live transitionnetwork.org site to puffin
- ticket:480 Transfer news.transitionnetwork.org to puffin
- ticket:483 Nginx 502 Bad Gateway Errors with BOA (see the summary on ticket:483#comment:46)
- ticket:487 robots.txt files for development sites