Web Server Log File Analysis
The PiwikServer stats are very good at tracking people who consent to be tracked, but they don't track bots or abusive users who don't want to be tracked. The difference in traffic is quite stark. For example, on 26th June 2013 the JavaScript-based Piwik stats reported:
- 1106 visits
- 3314 page views
Whereas the raw log files showed roughly ten times as many page views:
- 7155 visits
- 32688 page views
Therefore, to get a handle on what the web servers are doing, as opposed to what real people are doing on the site, we need some tools other than Piwik. This page was created on ticket:555.
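As a rough cross-check of raw-log totals like those above, the following sketch counts requests and distinct client addresses in an access log. It is not how the figures on this page were produced: it assumes the client address is the first whitespace-separated field (as in the common and combined log formats), and the function name `log_summary` is hypothetical. Distinct addresses are only a crude proxy for visits.

```shell
#!/bin/bash
# Rough sketch: count requests and distinct client addresses in an access log.
# Assumption: the client address is the first whitespace-separated field.
log_summary() {
    local log_file="$1"
    awk '
        { hits++; if (!($1 in seen)) { seen[$1] = 1; ips++ } }
        END { printf "requests=%d unique_ips=%d\n", hits, ips }
    ' "$log_file"
}
```

For example, `log_summary /var/log/nginx/access.log` prints a single line of the form `requests=N unique_ips=M`.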
webalizer
Note: these logs are no longer generated.
These logs are password protected and available at https://penguin.transitionnetwork.org/webalizer/puffin/ and they are good for getting an overview of bandwidth, hits and visitors (for an example, see the puffin_webalizer_daily_usage_201307.png attachment).
PenguinServer is set up to get a copy of the Nginx awstats.log file every day via logrotate (see AwStatsInstall#Copyingthelogstopenguin). On Penguin there is a puffin user account which has this crontab:
05 07 * * * /usr/local/bin/puffin-webalizer
This runs /usr/local/bin/puffin-webalizer, which contains:
#!/bin/bash
DATE=$(date "+%Y-%m-%d")
LOG_FILE=/home/puffin/nginx/puffin-nginx-$DATE.log
STATS_DIR=/web/penguin.transitionnetwork.org/www/webalizer/puffin
cd $STATS_DIR
webalizer -p -n transitionnetwork.org -o $STATS_DIR $LOG_FILE
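The script above assumes the dated log file has already arrived from Puffin. A minimal sketch of a guarded variant follows; it is not part of the installed script. The function names `puffin_log_path` and `run_webalizer_if_log_present` are hypothetical, while the paths and webalizer options are copied from the script above.

```shell
#!/bin/bash
# Hypothetical helper: print the dated log path used by puffin-webalizer.
# Defaults to today's date when no date argument (YYYY-MM-DD) is given.
puffin_log_path() {
    local day="${1:-$(date "+%Y-%m-%d")}"
    echo "/home/puffin/nginx/puffin-nginx-${day}.log"
}

# Hypothetical guard: only run webalizer when the day's log exists and is non-empty.
run_webalizer_if_log_present() {
    local log_file stats_dir
    log_file="$(puffin_log_path "$1")"
    stats_dir=/web/penguin.transitionnetwork.org/www/webalizer/puffin
    if [ ! -s "$log_file" ]; then
        echo "No log at $log_file, skipping" >&2
        return 1
    fi
    webalizer -p -n transitionnetwork.org -o "$stats_dir" "$log_file"
}
```

Returning a non-zero status when the log is missing means cron can report the skipped run instead of webalizer failing on an absent file.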
logstalgia
Logstalgia provides a real-time display of log files. Install it on your local machine, for example:
sudo aptitude install logstalgia
Then pipe the logs into it via ssh. For example, these commands show a real-time display from each of the three servers:
ssh puffin.webarch.net sudo tail -f /var/log/nginx/access.log | logstalgia --sync
ssh parrot.webarch.net sudo tail -f /home/*/logs/access.log | logstalgia --sync
ssh penguin.webarch.net sudo tail -f /var/log/nginx/*.access.log | logstalgia --sync
The screenshot (see the logstalgia-puffin.png attachment) doesn't do it justice. The last two numbers of each IP address have been removed from the image:
For more info see https://code.google.com/p/logstalgia/ and the videos here https://www.youtube.com/user/Logstalgia
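If the display is going to be shared, client addresses can be masked before they reach logstalgia, in the same spirit as the edited screenshot. This is a sketch only and not how the screenshot was produced: it assumes each log line starts with an IPv4 address, that GNU sed is available (for `-u`, unbuffered output, so the display stays live), and the function name `mask_ip` is hypothetical. The last two octets are replaced with zeros so that the field still looks like a valid address.

```shell
#!/bin/bash
# Hypothetical filter: zero the last two octets of a leading IPv4 address.
# Assumes GNU sed; -u keeps output unbuffered for use with tail -f.
mask_ip() {
    sed -u -E 's/^([0-9]+\.[0-9]+)\.[0-9]+\.[0-9]+/\1.0.0/'
}
```

It slots into the existing pipelines, for example `ssh puffin.webarch.net sudo tail -f /var/log/nginx/access.log | mask_ip | logstalgia --sync`.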
goaccess
To get an overview of a log file you can run goaccess on the server and load a specific log file. For example, on puffin this is the current log:
goaccess -f /var/log/nginx/access.log
And this is yesterday's:
goaccess -f /var/log/nginx/access.log.1
This displays totals (see the goaccess-puffin.png attachment for an example).
When we upgrade to Wheezy (see ticket:535) we should set up Goaccess to generate an HTML / email report per day.
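A daily HTML report job might look something like the sketch below. Nothing here is installed on the servers. It assumes a goaccess release that accepts `-o` with an `.html` file and `--log-format=COMBINED` (older releases may differ), that the Nginx log is in combined format, and GNU `date`. `REPORT_DIR`, the report file naming, the `DRY_RUN` switch and the function name `daily_report` are all hypothetical. Emailing the report is left out.

```shell
#!/bin/bash
# Sketch of a daily HTML report job. REPORT_DIR and the file naming are hypothetical.
REPORT_DIR="${REPORT_DIR:-/var/www/goaccess}"

daily_report() {
    # Default to yesterday's rotated log and yesterday's date (GNU date syntax).
    local log_file="${1:-/var/log/nginx/access.log.1}"
    local day="${2:-$(date -d yesterday "+%Y-%m-%d")}"
    local out="$REPORT_DIR/report-$day.html"
    if [ -n "$DRY_RUN" ]; then
        # Print the command instead of running it.
        echo "goaccess -f $log_file --log-format=COMBINED -o $out"
        return 0
    fi
    goaccess -f "$log_file" --log-format=COMBINED -o "$out"
}
```

Running `DRY_RUN=1 daily_report` shows the command that would be executed, which is a cheap way to check the paths before adding the job to a crontab.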
For more information see the goaccess web site at http://goaccess.prosoftcorp.com/
Attachments
- logstalgia-puffin.png (33.5 KB), added by chris 3 years ago: Logstalgia display of Nginx access logs on Puffin
- goaccess-puffin.png (29.5 KB), added by chris 3 years ago: Goaccess display of Nginx log file on Puffin
- puffin_webalizer_daily_usage_201307.png (3.0 KB), added by chris 3 years ago: Puffin Webalizer stats 2013-07-12