<?xml version="1.0"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Transition Technology: Ticket #666: Parrot lockups</title>
    <link>http://localhost:8080/trac/ticket/666</link>
    <description>&lt;p&gt;
We have this on the console; I'm going to reboot it:
&lt;/p&gt;
&lt;pre class="wiki"&gt;[28800.164426]  [&amp;lt;ffffffffa001c1ba&amp;gt;] ? do_get_write_access+0x22c/0x452 [jbd2]
[28800.164435]  [&amp;lt;ffffffff81066360&amp;gt;] ? wake_bit_function+0x0/0x23
[28800.164443]  [&amp;lt;ffffffff8104b51c&amp;gt;] ? try_to_wake_up+0x289/0x29b
[28800.164453]  [&amp;lt;ffffffffa001c402&amp;gt;] ? jbd2_journal_get_write_access+0x22/0x33 [jbd2]
[28800.164475]  [&amp;lt;ffffffffa006289e&amp;gt;] ? __ext4_journal_get_write_access+0x4e/0x56 [ext4]
[28800.164492]  [&amp;lt;ffffffffa0042b8e&amp;gt;] ? ext4_reserve_inode_write+0x37/0x73 [ext4]
[28800.164508]  [&amp;lt;ffffffffa0042c05&amp;gt;] ? ext4_mark_inode_dirty+0x3b/0x1c4 [ext4]
[28800.164528]  [&amp;lt;ffffffffa005bdc7&amp;gt;] ? ext4_journal_start_sb+0xd4/0x10e [ext4]
[28800.164543]  [&amp;lt;ffffffffa0042eb0&amp;gt;] ? ext4_dirty_inode+0x30/0x46 [ext4]
[28800.164553]  [&amp;lt;ffffffff81109ead&amp;gt;] ? __mark_inode_dirty+0x25/0x14a
[28800.164560]  [&amp;lt;ffffffff8110138b&amp;gt;] ? file_update_time+0x101/0x130
[28800.164569]  [&amp;lt;ffffffff810b6835&amp;gt;] ? __generic_file_aio_write+0x16e/0x293
[28800.164578]  [&amp;lt;ffffffff810b69b3&amp;gt;] ? generic_file_aio_write+0x59/0x9f
[28800.164588]  [&amp;lt;ffffffff810f0316&amp;gt;] ? do_sync_write+0xce/0x113
[28800.164596]  [&amp;lt;ffffffff810fcd0c&amp;gt;] ? filldir+0x0/0xb7
[28800.164605]  [&amp;lt;ffffffff810549b1&amp;gt;] ? _local_bh_enable_ip+0x22/0x8f
[28800.164613]  [&amp;lt;ffffffff81066332&amp;gt;] ? autoremove_wake_function+0x0/0x2e
[28800.164626]  [&amp;lt;ffffffff8130f1a1&amp;gt;] ? _spin_lock_bh+0x9/0x25
[28800.164626]  [&amp;lt;ffffffff810f0c68&amp;gt;] ? vfs_write+0xa9/0x102
[28800.164632]  [&amp;lt;ffffffff810f0d18&amp;gt;] ? sys_pwrite64+0x57/0x77
[28800.164639]  [&amp;lt;ffffffff81011b42&amp;gt;] ? system_call_fastpath+0x16/0x1b
[28800.164657] INFO: task apache2:31559 blocked for more than 120 seconds.
[28800.164665] "echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[28800.164675] apache2       D 0000000000000000     0 31559  28011 0x00000000
[28800.164691]  ffffffff8149f1f0 0000000000000286 0000000000000000 ffffffff81274f4f
[28800.164711]  ffff880002dc99d8 ffff8800bd1ca000 000000000000f9e0 ffff880002dc9fd8
[28800.164730]  00000000000157c0 00000000000157c0 ffff8800bd9746a0 ffff8800bd974998
[28800.164752] Call Trace:
[28800.164762]  [&amp;lt;ffffffff81274f4f&amp;gt;] ? sch_direct_xmit+0x7f/0x14c
[28800.164773]  [&amp;lt;ffffffff81066253&amp;gt;] ? bit_waitqueue+0x10/0xa0
[28800.164787]  [&amp;lt;ffffffffa001c1ba&amp;gt;] ? do_get_write_access+0x22c/0x452 [jbd2]
[28800.164798]  [&amp;lt;ffffffff81066360&amp;gt;] ? wake_bit_function+0x0/0x23
[28800.164812]  [&amp;lt;ffffffffa001c402&amp;gt;] ? jbd2_journal_get_write_access+0x22/0x33 [jbd2]
[28800.164833]  [&amp;lt;ffffffffa006289e&amp;gt;] ? __ext4_journal_get_write_access+0x4e/0x56 [ext4]
[28800.164852]  [&amp;lt;ffffffffa0042b8e&amp;gt;] ? ext4_reserve_inode_write+0x37/0x73 [ext4]
[28800.164871]  [&amp;lt;ffffffffa0042c05&amp;gt;] ? ext4_mark_inode_dirty+0x3b/0x1c4 [ext4]
[28800.164890]  [&amp;lt;ffffffffa005bdc7&amp;gt;] ? ext4_journal_start_sb+0xd4/0x10e [ext4]
[28800.164908]  [&amp;lt;ffffffffa0042eb0&amp;gt;] ? ext4_dirty_inode+0x30/0x46 [ext4]
[28800.164921]  [&amp;lt;ffffffff81109ead&amp;gt;] ? __mark_inode_dirty+0x25/0x14a
[28800.164932]  [&amp;lt;ffffffff8110138b&amp;gt;] ? file_update_time+0x101/0x130
[28800.164943]  [&amp;lt;ffffffff810b6835&amp;gt;] ? __generic_file_aio_write+0x16e/0x293
[28800.164958]  [&amp;lt;ffffffff8125227b&amp;gt;] ? sock_aio_write+0x0/0xbc
[28800.164969]  [&amp;lt;ffffffff8100cc43&amp;gt;] ? xen_make_pte+0x7b/0x83
[28800.164980]  [&amp;lt;ffffffff810b69b3&amp;gt;] ? generic_file_aio_write+0x59/0x9f
[28800.164992]  [&amp;lt;ffffffff810f0316&amp;gt;] ? do_sync_write+0xce/0x113
[28800.165003]  [&amp;lt;ffffffff81066332&amp;gt;] ? autoremove_wake_function+0x0/0x2e
[28800.165015]  [&amp;lt;ffffffff810ce24c&amp;gt;] ? handle_mm_fault+0x3b8/0x80f
[28800.165027]  [&amp;lt;ffffffff810f0c68&amp;gt;] ? vfs_write+0xa9/0x102
[28800.165038]  [&amp;lt;ffffffff810f0d7d&amp;gt;] ? sys_write+0x45/0x6e
[28800.165049]  [&amp;lt;ffffffff81011b42&amp;gt;] ? system_call_fastpath+0x16/0x1b
[125024.867759] hrtimer: interrupt took 38561246 ns
[1412520.196163] INFO: task mysqld:7928 blocked for more than 120 seconds.
[1412520.196183] "echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1412520.196191] mysqld        D 0000000000000000     0  7928  18454 0x00000000
[1412520.196203]  ffff8800bfa69530 0000000000000286 0000000000000000 0000000000000000
[1412520.196215]  0007ffffffffffff 0000000000000001 000000000000f9e0 ffff880074cb5fd8
[1412520.196226]  00000000000157c0 00000000000157c0 ffff880002ce1530 ffff880002ce1828
[1412520.196236] Call Trace:
[1412520.196259]  [&amp;lt;ffffffffa00232bf&amp;gt;] ? jbd2_log_wait_commit+0xbf/0x112 [jbd2]
[1412520.196273]  [&amp;lt;ffffffff81066332&amp;gt;] ? autoremove_wake_function+0x0/0x2e
[1412520.196293]  [&amp;lt;ffffffffa003fb41&amp;gt;] ? ext4_sync_file+0x199/0x25c [ext4]
[1412520.196304]  [&amp;lt;ffffffff8110d6e0&amp;gt;] ? vfs_fsync_range+0x73/0x9e
[1412520.196319]  [&amp;lt;ffffffff8110d78a&amp;gt;] ? do_fsync+0x28/0x39
[1412520.196325]  [&amp;lt;ffffffff8110d7b9&amp;gt;] ? sys_fsync+0xb/0x10
[1412520.196333]  [&amp;lt;ffffffff81011b63&amp;gt;] ? sysret_check+0x17/0x5a
[1412520.196341]  [&amp;lt;ffffffff81011b42&amp;gt;] ? system_call_fastpath+0x16/0x1b
&lt;/pre&gt;</description>
    <language>en-us</language>
    <image>
      <title>Transition Technology</title>
      <url>/trac/chrome/site/TransitionNetwork-Logo-Web-Small.jpg</url>
      <link>http://localhost:8080/trac/ticket/666</link>
    </image>
    <generator>Trac 0.12.5</generator>
    <item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Wed, 08 Jan 2014 11:51:38 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>http://localhost:8080/trac/ticket/666#comment:1</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/666#comment:1</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.32&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;0.0&lt;/em&gt; to &lt;em&gt;0.32&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
The server is back up.
&lt;/p&gt;
&lt;p&gt;
The console errors were not recent; this is from &lt;tt&gt;/var/log/kern.log.2.gz&lt;/tt&gt;:
&lt;/p&gt;
&lt;pre class="wiki"&gt;Dec 25 01:28:54 parrot kernel: [1412520.196163] INFO: task mysqld:7928 blocked for more than 120 seconds.
Dec 25 01:28:54 parrot kernel: [1412520.196183] "echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 01:28:54 parrot kernel: [1412520.196191] mysqld        D 0000000000000000     0  7928  18454 0x00000000
Dec 25 01:28:54 parrot kernel: [1412520.196203]  ffff8800bfa69530 0000000000000286 0000000000000000 0000000000000000
Dec 25 01:28:54 parrot kernel: [1412520.196215]  0007ffffffffffff 0000000000000001 000000000000f9e0 ffff880074cb5fd8
Dec 25 01:28:54 parrot kernel: [1412520.196226]  00000000000157c0 00000000000157c0 ffff880002ce1530 ffff880002ce1828
Dec 25 01:28:54 parrot kernel: [1412520.196236] Call Trace:
Dec 25 01:28:54 parrot kernel: [1412520.196259]  [&amp;lt;ffffffffa00232bf&amp;gt;] ? jbd2_log_wait_commit+0xbf/0x112 [jbd2]
Dec 25 01:28:54 parrot kernel: [1412520.196273]  [&amp;lt;ffffffff81066332&amp;gt;] ? autoremove_wake_function+0x0/0x2e
Dec 25 01:28:54 parrot kernel: [1412520.196293]  [&amp;lt;ffffffffa003fb41&amp;gt;] ? ext4_sync_file+0x199/0x25c [ext4]
Dec 25 01:28:54 parrot kernel: [1412520.196304]  [&amp;lt;ffffffff8110d6e0&amp;gt;] ? vfs_fsync_range+0x73/0x9e
Dec 25 01:28:54 parrot kernel: [1412520.196319]  [&amp;lt;ffffffff8110d78a&amp;gt;] ? do_fsync+0x28/0x39
Dec 25 01:28:54 parrot kernel: [1412520.196325]  [&amp;lt;ffffffff8110d7b9&amp;gt;] ? sys_fsync+0xb/0x10
Dec 25 01:28:54 parrot kernel: [1412520.196333]  [&amp;lt;ffffffff81011b63&amp;gt;] ? sysret_check+0x17/0x5a
Dec 25 01:28:54 parrot kernel: [1412520.196341]  [&amp;lt;ffffffff81011b42&amp;gt;] ? system_call_fastpath+0x16/0x1b
&lt;/pre&gt;&lt;p&gt;
I can't see anything in the logs to indicate why it was not responding today.
&lt;/p&gt;
&lt;p&gt;
There is also nothing I can see in the munin logs, &lt;a class="ext-link" href="https://penguin.transitionnetwork.org/munin/transitionnetwork.org/parrot.transitionnetwork.org/"&gt;&lt;span class="icon"&gt;​&lt;/span&gt;https://penguin.transitionnetwork.org/munin/transitionnetwork.org/parrot.transitionnetwork.org/&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
I was alerted to the lack of response from the server by this email:
&lt;/p&gt;
&lt;pre class="wiki"&gt;From: munin@penguin.webarch.net
Date: Wed, 08 Jan 2014 11:25:23 +0000
Subject: parrot.transitionnetwork.org Munin Alert
transitionnetwork.org :: parrot.transitionnetwork.org :: eth0 errors
        UNKNOWNs: errors is unknown, errors is unknown.
&lt;/pre&gt;&lt;p&gt;
And I couldn't connect via SSH.
&lt;/p&gt;
&lt;p&gt;
It's possible that it would have recovered without intervention.
&lt;/p&gt;
&lt;p&gt;
Closing this ticket as I can't think of anything else to do on it and the server is up and running now.
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Wed, 08 Jan 2014 11:51:56 GMT</pubDate>
      <title>status changed; resolution set</title>
      <link>http://localhost:8080/trac/ticket/666#comment:2</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/666#comment:2</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;status&lt;/strong&gt;
                changed from &lt;em&gt;new&lt;/em&gt; to &lt;em&gt;closed&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;resolution&lt;/strong&gt;
                set to &lt;em&gt;fixed&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Wed, 15 Jan 2014 15:03:20 GMT</pubDate>
      <title>cc, status, summary changed; resolution deleted</title>
      <link>http://localhost:8080/trac/ticket/666#comment:3</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/666#comment:3</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;cc&lt;/strong&gt;
              &lt;em&gt;aland&lt;/em&gt; added
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;status&lt;/strong&gt;
                changed from &lt;em&gt;closed&lt;/em&gt; to &lt;em&gt;reopened&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;resolution&lt;/strong&gt;
                &lt;em&gt;fixed&lt;/em&gt; deleted
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;summary&lt;/strong&gt;
                changed from &lt;em&gt;Parrot isn't responding&lt;/em&gt; to &lt;em&gt;Parrot lockups&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
&lt;a class="wiki" href="http://localhost:8080/trac/wiki/ParrotServer"&gt;wiki:ParrotServer&lt;/a&gt; locked up again today. Again there was nothing in the logs, and I stopped and restarted it at the Xen level.
&lt;/p&gt;
&lt;p&gt;
I have reopened this ticket to keep an eye on this issue and also added Alan as a CC.
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Wed, 15 Jan 2014 15:04:46 GMT</pubDate>
      <title></title>
      <link>http://localhost:8080/trac/ticket/666#comment:4</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/666#comment:4</guid>
      <description>
        &lt;p&gt;
The alert I got about the problem from munin was as before:
&lt;/p&gt;
&lt;pre class="wiki"&gt;From: munin@penguin.webarch.net
Date: Wed, 15 Jan 2014 13:55:22 +0000
Subject: parrot.transitionnetwork.org Munin Alert
transitionnetwork.org :: parrot.transitionnetwork.org :: eth0 errors
        UNKNOWNs: errors is unknown, errors is unknown.
&lt;/pre&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Wed, 02 Apr 2014 10:28:07 GMT</pubDate>
      <title>status changed; resolution set</title>
      <link>http://localhost:8080/trac/ticket/666#comment:5</link>
      <guid isPermaLink="false">http://localhost:8080/trac/ticket/666#comment:5</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;status&lt;/strong&gt;
                changed from &lt;em&gt;reopened&lt;/em&gt; to &lt;em&gt;closed&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;resolution&lt;/strong&gt;
                set to &lt;em&gt;fixed&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
Closing this, hoping the fix for the NFS/ZFS server has resolved it; see &lt;a class="closed ticket" href="http://localhost:8080/trac/ticket/618#comment:5" title="maintenance: Migrate Penguin and Parrot to the ZFS fileserver (closed: fixed)"&gt;ticket:618#comment:5&lt;/a&gt;
&lt;/p&gt;
      </description>
      <category>Ticket</category>
    </item>
 </channel>
</rss>