<?xml version="1.0"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Transition Technology: Ticket #893: BOA Cron Jobs</title>
    <link>http://localhost:8080/trac/ticket/893</link>
    <description>&lt;p&gt;
All the BOA cron jobs were stopped on &lt;a class="closed ticket" href="http://localhost:8080/trac/ticket/846#comment:88" title="maintenance: Load Spikes on BOA PuffinServer (closed: fixed)"&gt;ticket:846#comment:88&lt;/a&gt;. This ticket is for looking at them all and deciding which, if any, are needed.
&lt;/p&gt;
</description>
    <language>en-us</language>
    <image>
      <title>Transition Technology</title>
      <url>/trac/chrome/site/TransitionNetwork-Logo-Web-Small.jpg</url>
      <link>http://localhost:8080/trac/ticket/893</link>
    </image>
    <generator>Trac 0.12.5</generator>
    <item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Thu, 24 Dec 2015 12:42:52 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/893#comment:1</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:1</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;1.0&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;1.0&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
I have updated the &lt;a class="wiki" href="http://localhost:8080/trac/wiki/PuffinServer?action=diff&amp;amp;version=191&amp;amp;old_version=189"&gt;wiki:PuffinServer?action=diff&amp;amp;version=191&amp;amp;old_version=189&lt;/a&gt; to reflect the current status of &lt;a class="wiki" href="http://localhost:8080/trac/wiki/PuffinServer"&gt;PuffinServer&lt;/a&gt; and created a new &lt;a class="wiki" href="http://localhost:8080/trac/wiki/BoaCronJobs"&gt;BoaCronJobs&lt;/a&gt; wiki page where I have written up a brief description of each BOA script following a quick read of them. I don't think any of them are worth re-enabling; we probably should have stopped them all years ago, as BOA itself looks like the cause of the killing sprees and suicides.
&lt;/p&gt;
&lt;p&gt;
However I think the memory allocation for MySQL is worth tweaking:
&lt;/p&gt;
&lt;pre class="wiki"&gt;
mysqltuner.pl
 &amp;gt;&amp;gt;  MySQLTuner 1.2.0 - Major Hayden &amp;lt;major@mhtx.net&amp;gt;
 &amp;gt;&amp;gt;  Bug reports, feature requests, and downloads at http://mysqltuner.com/
 &amp;gt;&amp;gt;  Run with '--help' for additional options and output filtering
[OK] Logged in using credentials from debian maintenance account.
-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.5.47-MariaDB-1~wheezy-log
[OK] Operating on 64-bit architecture
-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB +Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 203M (Tables: 3)
[--] Data in InnoDB tables: 933M (Tables: 1279)
[--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17)
[!!] Total fragmented tables: 139
-------- Security Recommendations  -------------------------------------------
[OK] All database users have passwords assigned
-------- Performance Metrics -------------------------------------------------
[--] Up for: 19h 59m 12s (7M q [99.977 qps], 81K conn, TX: 20B, RX: 988M)
[--] Reads / Writes: 92% / 8%
[--] Total buffers: 2.8G global + 20.4M per thread (50 max threads)
[OK] Maximum possible memory usage: 3.8G (21% of installed RAM)
[OK] Slow queries: 0% (116/7M)
[OK] Highest usage of available connections: 57% (29/50)
[!!] Key buffer size / total MyISAM indexes: 193.0M/214.7M
[!!] Key buffer hit rate: 89.9% (2K cached / 280 reads)
[OK] Query cache efficiency: 45.3% (5M cached / 11M selects)
[!!] Query cache prunes per day: 681699
[OK] Sorts requiring temporary tables: 0% (176 temp sorts / 158K sorts)
[!!] Temporary tables created on disk: 27% (43K on disk / 157K total)
[OK] Thread cache hit rate: 99% (29 created / 81K connections)
[!!] Table cache hit rate: 0% (128 open / 21K opened)
[OK] Open file limit used: 0% (4/196K)
[OK] Table locks acquired immediately: 99% (1M immediate / 1M locks)
[OK] InnoDB data size / buffer pool: 933.8M/1.5G
-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    MySQL started within last 24 hours - recommendations may be inaccurate
    Temporary table size is already large - reduce result set size
    Reduce your SELECT DISTINCT queries without LIMIT clauses
    Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
    key_buffer_size (&amp;gt; 214.7M)
    query_cache_size (&amp;gt; 128M)
    table_cache (&amp;gt; 128)
&lt;/pre&gt;&lt;p&gt;
So these variables in &lt;tt&gt;/etc/mysql/my.cnf&lt;/tt&gt; were changed:
&lt;/p&gt;
&lt;pre class="wiki"&gt;#key_buffer_size         = 193M
key_buffer_size         = 256M
#query_cache_size        = 128M
query_cache_size        = 512M
#join_buffer_size        = 4M
join_buffer_size        = 12M
#tmpdir                  = /tmp
tmpdir                  = /dev/shm/mysql
&lt;/pre&gt;&lt;p&gt;
And MySQL was restarted.
&lt;/p&gt;
&lt;p&gt;
Note that the &lt;tt&gt;chris&lt;/tt&gt; crontab already contained:
&lt;/p&gt;
&lt;pre class="wiki"&gt;# create a tmp dir on the ram disk for mysql
# see https://trac.transitionnetwork.org/trac/ticket/591
@reboot sudo mkdir /run/shm/mysql ; sudo chown mysql:mysql /run/shm/mysql
&lt;/pre&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Sun, 27 Dec 2015 12:03:16 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/893#comment:2</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:2</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.33&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;1.0&lt;/em&gt; to &lt;em&gt;1.33&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
I have been keeping an eye on the site and the Munin graphs. Although this is a very quiet time of year for the site, I doubt the bots adhere to holidays, and the effect of stopping all the BOA root cron jobs has been quite dramatic. I have posted some graphs here: &lt;a class="wiki" href="http://localhost:8080/trac/wiki/BoaCronJobs#BOACronJobs"&gt;wiki:BoaCronJobs#BOACronJobs&lt;/a&gt;
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Fri, 01 Jan 2016 13:12:44 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/893#comment:3</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:3</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.25&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;1.33&lt;/em&gt; to &lt;em&gt;1.58&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
I have just made some tweaks to some server settings, doubling the Redis memory in &lt;tt&gt;/etc/redis/redis.conf&lt;/tt&gt;:
&lt;/p&gt;
&lt;pre class="wiki"&gt;#maxmemory 1024MB
maxmemory 2048MB
&lt;/pre&gt;&lt;p&gt;
As it had hit the 1GB limit.
&lt;/p&gt;
&lt;p&gt;
And in &lt;tt&gt;/etc/nginx/nginx.conf&lt;/tt&gt; adjusting the settings to suit the number of CPUs:
&lt;/p&gt;
&lt;pre class="wiki"&gt;#worker_processes  28;
# this should match cpus
worker_processes  4;
events {
  multi_accept on;
  # https://easyengine.io/tutorials/nginx/optimization/
  worker_connections 1024;
  # http://nginx.2469901.n2.nabble.com/Tuning-workers-and-connections-td3192878.html
  use epoll;
}
&lt;/pre&gt;
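&lt;p&gt;
As a rough sketch of what these Nginx settings imply, the snippet below works out the upper bound on simultaneous connections, assuming the usual rule of thumb that the limit is worker_processes multiplied by worker_connections. The function name &lt;tt&gt;max_clients&lt;/tt&gt; is hypothetical and only for illustration; the values 4 and 1024 are taken from the configuration above.
&lt;/p&gt;

```python
import os


def max_clients(worker_processes, worker_connections):
    """Upper bound on connections nginx can hold open at once.

    Assumes the rule of thumb: worker_processes * worker_connections.
    """
    return worker_processes * worker_connections


# Number of CPUs visible on the machine running this script; on the
# server this is the figure worker_processes is meant to match.
print("CPUs:", os.cpu_count())

# Values from the nginx.conf excerpt above.
print("Max connections:", max_clients(4, 1024))
```

&lt;p&gt;
When Nginx is proxying, each client request can use more than one connection, so the practical limit may be lower than this figure.
&lt;/p&gt;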
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Sun, 03 Jan 2016 18:55:18 GMT</pubDate>
      <title>attachment set</title>
      <link>http://localhost:8080/trac/ticket/893</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;attachment&lt;/strong&gt;
                set to &lt;em&gt;puffin-2016-01-03_multips_memory-week.png&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Sun, 03 Jan 2016 19:02:46 GMT</pubDate>
      <title>attachment set</title>
      <link>http://localhost:8080/trac/ticket/893</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;attachment&lt;/strong&gt;
                set to &lt;em&gt;puffin_2016-01-03_load-month.png&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Sun, 03 Jan 2016 19:02:59 GMT</pubDate>
      <title>attachment set</title>
      <link>http://localhost:8080/trac/ticket/893</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;attachment&lt;/strong&gt;
                set to &lt;em&gt;puffin_2016-01-03_http_loadtime-month.png&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Sun, 03 Jan 2016 19:12:30 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/893#comment:4</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:4</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.5&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;1.58&lt;/em&gt; to &lt;em&gt;2.08&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
I have spent some time looking at the &lt;a class="ext-link" href="https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/index.html"&gt;&lt;span class="icon"&gt;​&lt;/span&gt;Munin graphs&lt;/a&gt; and the Nginx changes done on &lt;a class="new ticket" href="http://localhost:8080/trac/ticket/893#comment:3" title="defect: BOA Cron Jobs (new)"&gt;ticket:893#comment:3&lt;/a&gt; reduced the memory usage:
&lt;/p&gt;
&lt;p&gt;
&lt;a style="padding:0; border:none" href="http://localhost:8080/trac/attachment/ticket/893/puffin-2016-01-03_multips_memory-week.png"&gt;&lt;img src="http://localhost:8080/trac/raw-attachment/ticket/893/puffin-2016-01-03_multips_memory-week.png" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
Looking at MySQL we have:
&lt;/p&gt;
&lt;pre class="wiki"&gt; &amp;gt;&amp;gt;  MySQLTuner 1.2.0 - Major Hayden &amp;lt;major@mhtx.net&amp;gt;
 &amp;gt;&amp;gt;  Bug reports, feature requests, and downloads at http://mysqltuner.com/
 &amp;gt;&amp;gt;  Run with '--help' for additional options and output filtering
[OK] Logged in using credentials from debian maintenance account.
-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.5.47-MariaDB-1~wheezy-log
[OK] Operating on 64-bit architecture
-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB +Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 203M (Tables: 3)
[--] Data in InnoDB tables: 987M (Tables: 1279)
[--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17)
[!!] Total fragmented tables: 139
-------- Security Recommendations  -------------------------------------------
[OK] All database users have passwords assigned
-------- Performance Metrics -------------------------------------------------
[--] Up for: 10d 6h 14m 42s (82M q [92.669 qps], 886K conn, TX: 249B, RX: 11B)
[--] Reads / Writes: 93% / 7%
[--] Total buffers: 3.3G global + 28.4M per thread (50 max threads)
[OK] Maximum possible memory usage: 4.6G (25% of installed RAM)
[OK] Slow queries: 0% (629/82M)
[!!] Highest connection usage: 100%  (51/50)
[OK] Key buffer size / total MyISAM indexes: 256.0M/214.7M
[!!] Key buffer hit rate: 93.1% (61K cached / 4K reads)
[OK] Query cache efficiency: 46.9% (66M cached / 140M selects)
[!!] Query cache prunes per day: 133202
[OK] Sorts requiring temporary tables: 0% (2K temp sorts / 963K sorts)
[OK] Temporary tables created on disk: 22% (234K on disk / 1M total)
[OK] Thread cache hit rate: 99% (51 created / 886K connections)
[!!] Table cache hit rate: 0% (128 open / 178K opened)
[OK] Open file limit used: 0% (6/196K)
[OK] Table locks acquired immediately: 99% (16M immediate / 16M locks)
[OK] InnoDB data size / buffer pool: 987.4M/1.5G
-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    Reduce or eliminate persistent connections to reduce connection usage
    Increasing the query_cache size over 128M may reduce performance
    Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
    max_connections (&amp;gt; 50)
    wait_timeout (&amp;lt; 3600)
    interactive_timeout (&amp;lt; 28800)
    query_cache_size (&amp;gt; 512M) [see warning above]
    table_cache (&amp;gt; 128)
&lt;/pre&gt;&lt;p&gt;
Based on the &lt;a class="ext-link" href="https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/index.html"&gt;&lt;span class="icon"&gt;​&lt;/span&gt;Munin graphs&lt;/a&gt; and past experience I think the cache sizes are probably OK, but I have increased the max connections in &lt;tt&gt;/etc/mysql/my.cnf&lt;/tt&gt;:
&lt;/p&gt;
&lt;pre class="wiki"&gt;#max_connections         = 50
#max_user_connections    = 50
max_connections         = 75
max_user_connections    = 75
&lt;/pre&gt;&lt;p&gt;
Since stopping all the BOA root cron jobs we still have a dramatic reduction in the load on the server (highest recorded weekly spike according to Munin was 3.14) and no more load spikes. The last &lt;tt&gt;lfd&lt;/tt&gt; alert was on 23rd Dec 2015:
&lt;/p&gt;
&lt;pre class="wiki"&gt;Date: Wed, 23 Dec 2015 11:47:28 +0000 (GMT)
From: root@puffin.webarch.net
To: chris@webarchitects.co.uk
Subject: lfd on puffin.webarch.net: High 5 minute load average alert - 72.36
&lt;/pre&gt;&lt;p&gt;
&lt;a style="padding:0; border:none" href="http://localhost:8080/trac/attachment/ticket/893/puffin_2016-01-03_load-month.png"&gt;&lt;img src="http://localhost:8080/trac/raw-attachment/ticket/893/puffin_2016-01-03_load-month.png" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
And a great improvement in page load times:
&lt;/p&gt;
&lt;p&gt;
&lt;a style="padding:0; border:none" href="http://localhost:8080/trac/attachment/ticket/893/puffin_2016-01-03_http_loadtime-month.png"&gt;&lt;img src="http://localhost:8080/trac/raw-attachment/ticket/893/puffin_2016-01-03_http_loadtime-month.png" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
So far it appears to be safe to say that the cause of the load spikes was BOA itself rather than any external cause, but when the Xmas holidays are over perhaps things will change.
&lt;/p&gt;
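&lt;p&gt;
For the &lt;tt&gt;max_connections&lt;/tt&gt; increase above, a minimal sketch of the worst-case memory arithmetic may be useful. It assumes that the MySQLTuner "Maximum possible memory usage" figure is the global buffers plus the per-thread buffers multiplied by the maximum number of connections, which is consistent with both reports in this ticket. The function name &lt;tt&gt;max_memory_gib&lt;/tt&gt; is hypothetical, and the inputs are the rounded values printed by MySQLTuner, so the results are approximate (the tool itself reported 4.6G for 50 connections).
&lt;/p&gt;

```python
def max_memory_gib(global_buffers_gib, per_thread_mib, max_connections):
    """Worst-case MySQL memory in GiB.

    Assumed formula: global buffers + per-thread buffers * max connections.
    """
    return global_buffers_gib + per_thread_mib * max_connections / 1024


# Rounded figures from the second MySQLTuner report:
# "Total buffers: 3.3G global + 28.4M per thread (50 max threads)"
before = max_memory_gib(3.3, 28.4, 50)
after = max_memory_gib(3.3, 28.4, 75)

print(f"50 connections: about {before:.1f}G")
print(f"75 connections: about {after:.1f}G")
```

&lt;p&gt;
On these rounded inputs the worst case rises by roughly 0.7G, which should still be well within the installed RAM if the 25% figure reported above is accurate.
&lt;/p&gt;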
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Sun, 03 Jan 2016 19:32:00 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/893#comment:5</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:5</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.1&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;2.08&lt;/em&gt; to &lt;em&gt;2.18&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
I have updated the &lt;a class="wiki" href="http://localhost:8080/trac/wiki/PuffinServer#LoadSpikes"&gt;wiki:PuffinServer#LoadSpikes&lt;/a&gt; documentation to reflect what has happened in the last couple of weeks.
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Mon, 01 Feb 2016 11:36:00 GMT</pubDate>
      <title></title>
      <link>http://localhost:8080/trac/ticket/893#comment:6</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:6</guid>
      <description>
        &lt;p&gt;
A monthly Redis restart has been added to the root crontab, see &lt;a class="closed ticket" href="http://localhost:8080/trac/ticket/900#comment:5" title="maintenance: Unusal High Load on Puffin (closed: fixed)"&gt;ticket:900#comment:5&lt;/a&gt;
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Thu, 03 Mar 2016 13:46:58 GMT</pubDate>
      <title></title>
      <link>http://localhost:8080/trac/ticket/893#comment:7</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:7</guid>
      <description>
        &lt;p&gt;
One side-effect from the BOA cronjobs being commented out appears to be a massive growth in the size of various cache tables, see &lt;a class="new ticket" href="http://localhost:8080/trac/ticket/907" title="maintenance: TN Drupal database size (new)"&gt;ticket:907&lt;/a&gt;
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Wed, 30 Mar 2016 10:18:07 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/893#comment:8</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/893#comment:8</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.25&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;2.18&lt;/em&gt; to &lt;em&gt;2.43&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
Replying to &lt;a href="http://localhost:8080/trac/ticket/893#comment:6" title="Comment 6 for Ticket #893"&gt;chris&lt;/a&gt;:
&lt;/p&gt;
&lt;blockquote class="citation"&gt;
&lt;p&gt;
A monthly Redis restart has been added to the root crontab, see &lt;a class="closed ticket" href="http://localhost:8080/trac/ticket/900#comment:5" title="maintenance: Unusal High Load on Puffin (closed: fixed)"&gt;ticket:900#comment:5&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Redis ran out of memory yesterday (29th March), so a monthly restart isn't enough; I have changed it to a restart on the 1st and 15th of each month. I needed to restart several services today due to high loads caused by Redis running out of memory, see the &lt;a class="ext-link" href="https://penguin.transitionnetwork.org/munin/transitionnetwork.org/puffin.transitionnetwork.org/index.html"&gt;&lt;span class="icon"&gt;​&lt;/span&gt;Munin stats for details&lt;/a&gt;.
&lt;/p&gt;
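&lt;p&gt;
To illustrate what the new schedule changes, the sketch below compares the longest interval between restarts for the two schedules during 2016. It assumes the earlier monthly restart ran on the 1st of each month, which is not stated in this ticket; the function names are hypothetical.
&lt;/p&gt;

```python
from datetime import date


def restart_dates(year, days=(1, 15)):
    """Return every restart date in `year` for the given days of the month."""
    return [date(year, month, day) for month in range(1, 13) for day in days]


def longest_gap_days(dates):
    """Return the longest interval, in days, between consecutive dates."""
    ordered = sorted(dates)
    return max((later - earlier).days
               for earlier, later in zip(ordered, ordered[1:]))


# New schedule from this comment: the 1st and 15th of each month.
twice_monthly = restart_dates(2016)

# Assumed old schedule: the 1st of each month only.
monthly = restart_dates(2016, days=(1,))

print("Longest gap, 1st and 15th:", longest_gap_days(twice_monthly), "days")
print("Longest gap, monthly:", longest_gap_days(monthly), "days")
```

&lt;p&gt;
Under that assumption the longest run without a restart drops from 31 days to 17 days, which is shorter than the roughly four weeks it took Redis to fill its memory in March.
&lt;/p&gt;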
      </description>
      <category>Ticket</category>
    </item>
 </channel>
</rss>