Ticket #147 (closed defect: fixed)

Opened 6 years ago

Last modified 6 years ago

Migration of live server

Reported by: chris
Owned by: chris
Priority: major
Milestone:
Component: Live server
Keywords:
Cc: ed, john, jim
Estimated Number of Hours: 14.0
Add Hours to Ticket: 0
Billable?: yes
Total Hours: 21.7

Description

This is a ticket to track the migration of wiki:LiveServer (transitionnetwork.gaiahost.coop) to wiki:NewLiveServer (quince.webarch.net). We haven't got the final go-ahead yet, but because we want to migrate by the end of the month I have made an "at risk" start on this.

The hosting side of things is all done apart from sorting out the backups.

The virtual server has been set up, accounts have been created for jim and john and their public keys have been installed so they should be able to connect to it.

A copy of the live site has been set up here and it's now available for testing:

http://live.quince.webarch.net/

It has the same settings as the dev and test sites to ensure that it doesn't send out unwanted emails to users.

One issue I found is that the cacherouter module isn't in svn any more, so it was copied across manually. There are also some Drupal errors, which I think might be due to missing modules, but I haven't got to the bottom of this yet (more on these issues here: wiki:NewLiveServer).

One thing we need to decide is which PHP accelerator to use -- no Apache/PHP/MySQL optimisations at all have been done so far.

Change History

comment:1 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 3.0
  • Total Hours changed from 0.0 to 3.0

comment:2 Changed 6 years ago by chris

I forgot to add, what about the sites that are on the gaia server but we are not using at the moment:

comment:3 Changed 6 years ago by jim

Well done Chris.

OK, the SVN repo isn't quite right and the best version to presently pull from is the DEV branch, which is 100% up to date. I do plan to finish tagging the latest release in TEST and LIVE very soon so let me know if you want this done first. DEV is king for now though.

As for the PHP accelerator, Drupal.org has lots of documentation on the best approach. I also recommend adding memcached so that CacheRouter can use it for a huge speed increase.

Lots more on a 'proper' performant Drupal setup here: http://www.chapterthree.com/blog/josh_koenig/project_mercury_preconfigured_drupalvarnish_ec2_ami

comment:4 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.0
  • Total Hours changed from 3.0 to 4.0

OK, the SVN repo isn't quite right and the best version to presently pull from is the DEV branch, which is 100% up to date. I do plan to finish tagging the latest release in TEST and LIVE very soon so let me know if you want this done first. DEV is king for now though.

OK, this is a good way to test how fast svn is now, switching from https://svn.webarch.net/transition/code/trunk/ to https://svn.webarch.net/transition/code/branches/DEV/

cd /web/transitionnetwork.org/www/
date ; svn switch https://svn.webarch.net/transition/code/branches/DEV/ ; date
  Sat Oct 16 17:10:51 UTC 2010
  ...
  svn: Failed to add directory 'sites/all/modules/cacherouter': a versioned directory of the same name already exists
  Sat Oct 16 17:11:11 UTC 2010
rm -rf sites/all/modules/cacherouter
cd sites/all/modules/
svn up

20 seconds for the svn switch... which seems OK?
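The two timestamps above (17:10:51 and 17:11:11) give the 20 seconds by subtraction. As a rough sketch, the same measurement can be wrapped in a small helper; the function name `elapsed` is hypothetical, and `date +%s` is assumed to be available (it is on GNU and BSD systems, though not strictly POSIX):

```shell
#!/bin/sh
# Hypothetical helper: run a command, then report the whole seconds it took
# on stderr, preserving the command's own exit status.
elapsed() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s" >&2
    return "$status"
}

# Example (commented out so the sketch has no side effects):
# elapsed svn switch https://svn.webarch.net/transition/code/branches/DEV/
```

Because the timing goes to stderr, the output of the wrapped command can still be piped or redirected as usual.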

I have installed memcached and php5-memcache and it's running with the default debian settings, apart from the memory limit of 64M having been changed to 128M, see /etc/memcached.conf. I have looked at Cache Router and could find the settings there so I used the ones here (I'm not sure if shared shouldn't be FALSE since we only have one memcache server and one site?):

$conf['cacherouter'] = array(
  'default' => array(
    'engine' => 'memcache',
    'server' => array('127.0.0.1:11211'),
    'shared' => TRUE,
  ),
);
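To double-check the memory limit mentioned above, a minimal sketch that reads it back from the memcached config file follows. It assumes the Debian-style format where the cap is set by a line of the form `-m <megabytes>`; the function name `memcached_mem_limit` is made up for illustration:

```shell
#!/bin/sh
# Print the memory cap (in MB) set by the "-m" line of a memcached config file.
# The default path is the Debian location mentioned in this ticket.
memcached_mem_limit() {
    conf=${1:-/etc/memcached.conf}
    # Comment lines start with "#", so their first field never equals "-m".
    # If several "-m" lines exist, the last one is reported.
    awk '$1 == "-m" { limit = $2 } END { if (limit != "") print limit }' "$conf"
}

# Example (commented out): memcached_mem_limit /etc/memcached.conf
```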

But I'm not sure this has sped things up -- 1000 requests for the front page, 100 at a time, from the dev server take about 5 seconds with memcached:

ab -v 4 -n 1000 -H "Accept-Encoding: gzip" -c 100 "http://live.quince.webarch.net/"
 Concurrency Level:      100
 Time taken for tests:   4.732 seconds
 Complete requests:      1000
 Failed requests:        0

But with the file cache settings it's closer to 4 seconds:

ab -v 4 -n 1000 -H "Accept-Encoding: gzip" -c 100 "http://live.quince.webarch.net/"
 Concurrency Level:      100
 Time taken for tests:   3.876 seconds
 Complete requests:      1000
 Failed requests:        0

I guess some more testing and tweaking is required...
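For comparison, the two timings above can be turned into throughput figures (requests divided by seconds). This is only arithmetic on the numbers already quoted; the helper name `rps` is hypothetical:

```shell
#!/bin/sh
# Requests per second = completed requests / time taken in seconds.
rps() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n / t }'
}

rps 1000 4.732   # memcached run
rps 1000 3.876   # file cache run
```

That works out to roughly 211 requests per second with memcached against roughly 258 with the file cache, so the file cache run was about 18% quicker in this single test. One run of each is not enough to draw a firm conclusion.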

Do we want to install Project Mercury and the Varnish Accelerating Proxy now?

If we do, then this article seems to be a guide we could follow: Deploy High Performance Drupal Sites with Mercury on Debian 5 (Lenny).

comment:5 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 3.0
  • Total Hours changed from 4.0 to 7.0

Regarding APC vs Memcache there is an interesting thread here. It might be the case that memcache doesn't add much for our usage, though I guess it would if we had one or more dedicated memcache servers: http://groups.drupal.org/node/73513

I forgot to click submit changes on a ticket yesterday -- I did a further 3 hours of tweaking and was going to post this comment (I'm adding those hours to this comment):

I've sorted some more things out; the items still outstanding:

Please comment if there is anything else!

comment:6 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.0
  • Total Hours changed from 7.0 to 8.0

Migrating the MediaWiki site http://wiki.transitionnetwork.org/ will be done at the same time, see ticket:148. There is a script to make this easy; on quince run:

/usr/local/bin/mysql-update-from-mediawiki.webarch.net

comment:7 Changed 6 years ago by jim

I guess you're still sorting APC/Cacherouter out as I'm looking around the site right now and am getting very slow responses plus this error at the head of the page:

Notice: Use of undefined constant CACHE_PERMENANT - assumed 'CACHE_PERMENANT' in /web/transitionnetwork.org/www/sites/all/modules/cacherouter/engines/memcache.php on line 66

According to anecdotal evidence here maybe it's an APC issue?? PERMENANT is spelled wrong, too...

Also getting time-outs:

Fatal error: Maximum execution time of 30 seconds exceeded in /web/transitionnetwork.org/www/sites/all/modules/cacherouter/engines/memcache.php on line 185

All at 9.30pm on 27 Oct.

Is Cacherouter's setup correct or are you still setting up APC etc?

Anyway, as for other issues:

  • I've used the DotDeb on my machine because Debian's GD PHP library is ancient. I recommend sticking to the setup on LIVE though, so best to carry on with ImageMagick unless you're averse to it somehow.
  • Are those Drupal errors still happening? Do you want me to look into them?
  • Personally I reckon we need Varnish or a Mercury-type setup soon... Controversy is growing for TN's activities and there's always the chance of us being Slashdotted. Not important for launch, obviously, but nice to have a migration plan in place. What have you learned Chris?

comment:8 Changed 6 years ago by chris

I guess you're still sorting APC/Cacherouter out as I'm looking around the
site right now and am getting very slow responses

It's been using memcache since I last worked on it, this is what we have in /web/transitionnetwork.org/www/sites/default/settings.php:

$conf['cache_inc'] = './sites/all/modules/cacherouter/cacherouter.inc';
$conf['cacherouter'] = array(
  'default' => array(
    'engine' => 'memcache',
    'server' => array('127.0.0.1:11211'),
    'shared' => TRUE,
  ),
);

plus this error at the
head of the page:

Notice: Use of undefined constant CACHE_PERMENANT - assumed 'CACHE_PERMENANT' in /web/transitionnetwork.org/www/sites/all/modules/cacherouter/engines/memcache.php on line 66

According to anecdotal evidence here maybe it's an APC issue?? PERMENANT
is spelled wrong, too...

This is weird, I don't get the error and I can't see it in the reports here: https://live.quince.webarch.net/admin/reports/dblog

There was only one use of CACHE_PERMENANT, on line 66 of /web/transitionnetwork.org/www/sites/all/modules/cacherouter/engines/memcache.php:

if ($expire == CACHE_TEMPORARY || $expire == CACHE_PERMENANT) {

And yes it's a typo, see: http://drupal.org/node/936966

I have fixed it and committed it here: https://svn.webarch.net/transition/code/branches/DEV/sites/all/modules/cacherouter/engines/memcache.php

comment:9 Changed 6 years ago by chris

Also getting time-outs:

Fatal error: Maximum execution time of 30 seconds exceeded in /web/transitionnetwork.org/www/sites/all/modules/cacherouter/engines/memcache.php on line 185

All at 9.30pm on 27 Oct.

OK, I have now also got the errors. After setting up memcached to log verbosely I found the same errors in /var/log/memcached.log as mentioned here: http://drupal.org/node/734026#comment-2681912

<10 add /tmp/cache_lock 0 0 1
>10 NOT_STORED

I haven't got to the bottom of this, but for now I have changed cacherouter back to use the filecache.
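For context on the log lines above: in the memcached protocol, `add` stores a value only if the key does not already exist, and the server answers NOT_STORED otherwise. One possible reading, not confirmed in this ticket, is that the lock key was already present each time the module tried to take it. The sketch below is an analogy only; it does not talk to memcached, and it imitates the add-if-absent behaviour with a lock file and the shell's noclobber option. The function name `try_lock` is hypothetical:

```shell
#!/bin/sh
# Analogy only: emulate memcached's "add" (store only if absent) with a file.
try_lock() {
    # With noclobber (set -C), ">" refuses to overwrite an existing file,
    # so the redirection succeeds for the first caller only.
    if ( set -C; : > "$1" ) 2>/dev/null; then
        echo STORED
    else
        echo NOT_STORED
        return 1
    fi
}

# Example (commented out):
# try_lock /tmp/example.lock   # first call prints STORED
# try_lock /tmp/example.lock   # second call prints NOT_STORED
```

A lock taken this way stays in place until something removes it, which is why a stale lock can block every later attempt.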

Is Cacherouter's setup correct or are you still setting up APC etc?

APC is installed and the mediawiki site is using it, I'll do some testing and see if APC makes the site faster than using the file cache.

  • I've used the DotDeb on my machine because Debian's GD PHP library is ancient. I recommend sticking to the setup on LIVE though, so best to carry on with ImageMagick unless you're averse to it somehow.

Fine by me, any pointers regarding what I need to do to get Drupal to use ImageMagick?

  • Are those Drupal errors still happening? Do you want me to look into them?

No, sorry I should have made that clear sooner, they have all gone.

  • Personally I reckon we need Varnish or a Mercury-type setup soon... Controversy is growing for TN's activities and there's always the chance of us being Slashdotted. Not important for launch, obviously, but nice to have a migration plan in place. What have you learned Chris?

That I'd need a day or so to play with it. I was thinking of following this article and testing it on the dev server: http://library.linode.com/development/frameworks/php/project-mercury/debian-5-lenny This would have to be done after the server switch due to time constraints.

comment:10 Changed 6 years ago by jim

Re: ImageMagick/GD etc... Use the best match to LIVE unless there's a better way.

In any case, to change the image processing in Drupal: Once the right underlying libraries are in place, it's a case of going to https://live.quince.webarch.net/admin/build/modules to enable the correct ImageAPI backend module. Then you'd go to https://live.quince.webarch.net/admin/settings/imageapi and ensure it's set up properly. It's all pretty simple.

Re: Cacherouter & other drupal issues...
I wonder if the correct SVN checkout will have fixed many of these.

comment:11 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 4.0
  • Total Hours changed from 8.0 to 12.0

Re: ImageMagick/GD etc... Use the best match to LIVE unless there's a better way.

The current live server has a very different setup in this regard since it's running FreeBSD... I'll look at it...

Re: Cacherouter & other drupal issues...
I wonder if the correct SVN checkout will have fixed many of these.

Hopefully :-)

I haven't had much joy with APC: it didn't seem any quicker and was generating a lot of error messages (see wiki:NewLiveServer#apc), so the site is using the filecache.

I've done another update of the database from gaia but it has resulted in some errors on pages like this one https://live.quince.webarch.net/user

warning: call_user_func_array() [function.call-user-func-array]: First argument is expected to be a valid callback, 'notifications_access_user' was given in /web/transitionnetwork.org/www/includes/menu.inc on line 452.

This needs tracking down...

comment:12 Changed 6 years ago by jim

Those kinds of errors (call_user_func_array) generally stem from Drupal not being able to find a function in code that is referenced in the database. I cleared the caches and the issue went away.

I would suggest that LIVE's codebase is not quite a match for the DEV branch that Quince uses (probably due to those merge errors I have had over the months). In any case, clearing the caches removed the reference to the incorrect/outdated/moved function and told Drupal to recheck and re-cache its hooks.

In other words I think it's a non-issue.

comment:13 Changed 6 years ago by chris

clearing the caches removed the reference to the incorrect/outdated/moved function and told Drupal to recheck and re-cache its hooks

Thanks for that, do you, by any chance, know what the SQL commands would be to achieve this -- if I add these to the script for syncing the database between servers it would make it work better... :-)
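One possible starting point, offered as a sketch rather than a confirmed answer: emit a TRUNCATE statement for every table named `cache` or `cache_*`. This assumes the site's cache tables follow that naming and have no table prefix. Drupal's own "clear caches" action may also rebuild things that plain SQL does not, so this might not fully reproduce the fix described above. The function name `cache_truncate_sql` and the `DB_NAME` variable are placeholders, not names from this ticket:

```shell
#!/bin/sh
# Read table names on stdin, one per line, and print a TRUNCATE statement for
# each one that is exactly "cache" or starts with "cache_".
cache_truncate_sql() {
    awk '/^cache(_[A-Za-z0-9_]+)?$/ { printf "TRUNCATE TABLE `%s`;\n", $0 }'
}

# Hypothetical usage (commented out; DB_NAME is a placeholder):
# mysql -N -e 'SHOW TABLES' "$DB_NAME" | cache_truncate_sql | mysql "$DB_NAME"
```

Printing the statements first, before piping them back into mysql, makes it easy to review exactly which tables would be emptied.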

comment:14 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.5
  • Total Hours changed from 12.0 to 13.5

http://wiki.transitionnetwork.org/ and http://static.transitionnetwork.org/ have been migrated to the new live server but there is a 24 hour TTL on the DNS so you might not get the new server yet...

The Drupal migration will happen tomorrow with a 1 min TTL :-)
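To see which TTL a resolver is currently handing out, the second column of a `dig +noall +answer` line can be read off (24 hours is 86400 seconds, 1 minute is 60). A minimal sketch, with `answer_ttls` as a hypothetical helper name:

```shell
#!/bin/sh
# Print the TTL column (field 2) of each answer line produced by
# "dig +noall +answer", e.g. "www.example.org. 60 IN A 192.0.2.1".
answer_ttls() {
    awk 'NF >= 5 { print $2 }'
}

# Example (commented out so the sketch makes no network requests):
# dig +noall +answer www.transitionnetwork.org A | answer_ttls
```

Note that a caching resolver reports the time remaining on its cached copy, so the number counts down between queries; asking the authoritative name server directly shows the configured value.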

comment:15 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 2.5
  • Total Hours changed from 13.5 to 16.0

The server switch has been done.

Outstanding items:

comment:16 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 0.5
  • Total Hours changed from 16.0 to 16.5

In any case, to change the image processing in Drupal: Once the right underlying libraries are in place, it's a case of going to https://www.transitionnetwork.org/admin/build/modules to enable the correct ImageAPI backend module. Then you'd go to https://www.transitionnetwork.org/admin/settings/imageapi and ensure it's set up properly.

OK, that's done and seems to work, I think. If anyone knows some good tests for it, please run them or point me in the right direction.

comment:17 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 0.2
  • Total Hours changed from 16.5 to 16.7

There was an ImageMagick issue:

	The specified ImageMagick path /usr/local/bin/convert does not exist.

https://www.transitionnetwork.org/admin/reports/event/618152

I have added a symlink from /usr/local/bin/convert (the path on the old live server) to the correct path, /usr/bin/convert, but it would also be good to track down where we need to change this setting.
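The symlink workaround can be sketched as a small guard that only creates the link when nothing already exists at the old path. The helper name `link_if_missing` is hypothetical; the two paths in the commented example are the ones quoted above:

```shell
#!/bin/sh
# Create a compatibility symlink at the old path unless something is
# already there (a file, a directory, or an existing symlink).
link_if_missing() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
    else
        ln -s "$target" "$link" && echo "linked: $link -> $target"
    fi
}

# Example (commented out; would need root on the server):
# link_if_missing /usr/bin/convert /usr/local/bin/convert
```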

comment:18 Changed 6 years ago by jim

  • Status changed from new to closed
  • Resolution set to fixed

It's in: https://www.transitionnetwork.org/admin/settings/imageapi/config

I've updated it to the correct path so the symlink isn't needed now.

comment:19 Changed 6 years ago by ed

  • Status changed from closed to reopened
  • Resolution fixed deleted

getting some imagemagick burble at the of things I'm editing with images in them:
(http://www.transitionnetwork.org/blogs/ed-mitchell/2010-04/creating-news-item)

# ImageMagick command: /usr/bin/convert 'sites/default/files/uploaded/u4/news-item-guide-1.png' -resize 500x319! -quality '80' 'sites/default/files/resize/uploaded/u4/news-item-guide-1-500x319.png'
# ImageMagick output:
# ImageMagick command: /usr/bin/convert 'sites/default/files/uploaded/u4/news-item-guide-2.png' -resize 500x315! -quality '80' 'sites/default/files/resize/uploaded/u4/news-item-guide-2-500x315.png'
# ImageMagick output:
# ImageMagick command: /usr/bin/convert 'sites/default/files/uploaded/u4/news-item-guide-3.png' -resize 500x271! -quality '80' 'sites/default/files/resize/uploaded/u4/news-item-guide-3-500x271.png'
# ImageMagick output:
# ImageMagick command: /usr/bin/convert 'sites/default/files/uploaded/u4/news-item-guide-4.png' -resize 500x319! -quality '80' 'sites/default/files/resize/uploaded/u4/news-item-guide-4-500x319.png'
# ImageMagick output:

comment:20 Changed 6 years ago by jim

  • Status changed from reopened to closed
  • Resolution set to fixed

Debugging was on... now it's not.

comment:21 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 4.0
  • Total Hours changed from 16.7 to 20.7

I have done a lot of tidying of documentation for the dev server today, I hoped to also do the same for the live server documentation but that will have to be done another day.

This page is now clear about what is running on the dev server:

http://kiwi.transitionnetwork.org/

The dev server docs are up to date:

https://tech.transitionnetwork.org/trac/wiki/DevelopmentServer

And I have rewritten the live2dev and wiki-live2dev scripts so that they work with the new live server, see the wiki:DevelopmentServer documentation for more on these.

comment:22 Changed 6 years ago by chris

  • Add Hours to Ticket changed from 0.0 to 1.0
  • Total Hours changed from 20.7 to 21.7

I've done some tidying up of the documentation, wiki:NewLiveServer, and I still need to document how the FTP server is set up.

Also phpmyadmin https://quince.transitionnetwork.org/phpmyadmin and the php info https://quince.transitionnetwork.org/info/ and apc info https://quince.transitionnetwork.org/info/apc.php have been moved off the www.transitionnetwork.org domain name to quince.transitionnetwork.org and a static page at http://quince.transitionnetwork.org/ lists the sites running on the server.

comment:23 Changed 6 years ago by chris

This is now finished -- the docs are up to date: wiki:NewLiveServer.
