Ticket #590 (assigned defect)
Drupal performance improvements
Reported by: | jim | Owned by: | jim |
---|---|---|---|
Priority: | minor | Milestone: | Maintenance |
Component: | Drupal modules & settings | Keywords: | |
Cc: | ed, chris | Estimated Number of Hours: | 0.05 |
Add Hours to Ticket: | 0 | Billable?: | yes |
Total Hours: | 16.95 |
Description (last modified by jim) (diff)
This ticket is to track the work and changes done within the Drupal sphere in relation to performance enhancements done since #585.
More information is needed and will come when ticket:586 New Relic Monitoring for BOA is completed.
I also note that many of these cleanup operations will also help make the move to D7 smoother and better.
Summary of actions and status
TODO
O) Stop making so many URL aliases for non-relevant pages, clean up url_alias table -- 1/4-1/2 hour, medium reward, only risk is that some already broken links might break... Per chat with Ed, only these will be removed (plus related tweaks to Pathauto settings):
- 3,579 entries where src = node/%/feed
- 1,856 entries where src = user/%/contact
- = 5,435 or ~11% of entries in url_alias
L) Review slow query log, explain queries, tweak as necessary/flag poorly behaving modules. 2-4 hours, high reward, low risk... Keep looking at the slow query log and adjust Drupal or find patches as necessary. ALSO related Reduce your server's resource usage by moving MySQL temporary directory to tmpfs... Have opened ticket for this: #591 for Chris.
Done
A) Remove spam taxonomy entries 1/2 hour, Low risk, low reward -- See item 8 below. A simple delete from taxo term table where length > 50 is worth doing IMHO, and everything I saw that would be clobbered is spam.
B) Try a Taxonomy Cleanup: 3 hours, Medium risk, medium reward -- style module to try to merge terms with the same names and clean up the link tables back to nodes. Further, we can remove any taxonomies or relations to certain CTs that don't really add value.
D) Review Views caching 1 hour, low risk, high reward -- Utilise Views Content Cache this was done a while back but I think -- done (task 12) in comment 21.
F) Force blocks to be cached appropriately (and be rendered/included only as needed) 1-2 hours, medium reward, low risk -- BOA packages the Block Cache Alter module, which makes sure Drupal only renders blocks when needed. Potential small but nice boost quickly across the whole site. -- per comment 22, block caching is disabled by other modules so this will have to go on hold for now.
H) Remove CustomError module altogether 1/2 hour, low risk, low reward -- We should take out the PHP code from the 403 section of CustomError and put it into a simple page entry. See comment 6 below as this has happened for 404s (which need no PHP). We can then remove the CustomError module altogether, saving lots of sessions. I would go ahead and do this but since the 403 page has various displays depending on user type, I wanted to raise it here as it *may* have side effects. Or not...
I) Re-enable block caching. 2-6 hours, high risk, high reward -- Per comment 24, a module (probably Content Access) is stopping Drupal caching blocks, which for some of them means a fair amount of pointless overhead. We need to somehow get around this and get blocks cached if possible. R&D mainly, perhaps with some hacking/patching - but I'd stop short of doing this if so.
K) Add & enable Views Lite Pager on big views. 1 hour, low risk, low reward -- Using this module stops a heavy count query on views with pagers -- recommended for large sites.
M) Take control of Cron, and maximise time pages are cached for. .25h, high reward, low risk -- Cron is wiping the page cache, so we need to install https://drupal.org/project/elysia_cron so we can clear the page less often, and run other things when we want and the site is quieter. Now need per minute resolution set to get the best, see comment 33 and 34 for more...
N) Replace Admin Menu 1.x with 3.x -- will happen when #582 occurs, marking complete here -- 5 mins, high reward, low risk -- could be the cause of some load spikes as it occasionally goes mad and does 2000-5000 queries
Change History
comment:1 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 2.25
- Status changed from new to assigned
- Description modified (diff)
- Total Hours changed from 0.0 to 2.25
comment:2 Changed 3 years ago by jim
- Description modified (diff)
Added proposal E: Review site features... moved old E to E.2...
Ed, do you want to drive this on your return? If you make a list of things you feel we've outgrown or have fallen aside along the way, I can review the likely impact of those... I've added E.1 and E.2 for starters.
Also added time estimates to proposals.
comment:3 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.35
- Total Hours changed from 2.25 to 2.6
- Description modified (diff)
Adding missed time, and proposal F & G.
comment:4 Changed 3 years ago by chris
Using ESI could make a massive difference but I expect it would be a quite big job to set it all up?
Basically everything in a page that is the same for every user would (hopefully) be cached and served directly by Nginx and wouldn't need php-fpm / mysql and the small bits of pages that have content that is different for each user, eg your username, would be generated by php-fpm / mysql and included, rather like SSI, see:
comment:5 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.35
- Total Hours changed from 2.6 to 2.95
- Description modified (diff)
Well the NginX & Drupal parts are covered by the ESI module and BOA's config, which can use the SSI tags nginx needs. So the main tasks around G are to try it on the STG site, and start to implement/tweak the ESI settings on each part of the page that's relevant...:
- Page 'outer', including user menus
- Blocks and regions - the handful of user-specific blocks can be set up
- Key pages and panels - we'd do these by order of hits, so homepage, user register and the main panels templates for news, blogs and other content types.
- All the other bits - then we're into the 20% of the 80/20 rule, so we can review the other pages and blocks/panes as needed.
Assuming this all goes well we can start to implement it on PROD.
But you're right, there's a lot to do in terms of reviewing pages and content areas/blocks/panes. Luckily the server and interaction is covered by BOA out the box with the ESI module, so we're talking config rather than open heart surgery.
Note also that this proposal (G) also overlaps with (F) in that when we've set each block to be 'Per site', 'Per user', 'Per page' or 'don't cache' combinations, we'll have basically nailed the site structure/"Page 'outer'" piece in the first bullet above. So I'd recommend doing F then G.
Summary updated.
comment:6 follow-up: ↓ 8 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 1.0
- Total Hours changed from 2.95 to 3.95
- Description modified (diff)
I just realised that 404 pages START A FREAKING SESSION!
E.g. go to any page in the site, no session cookie... Go to a 404 like http://www.transitionnetwork.org/asses-of-doom and there's a session cookie.
THIS IS BAD! So new action:
10.. After checking, the CustomError module is starting a session on 404 pages, causing general slowness. So I've:
- created new page http://www.transitionnetwork.org/404/page-not-found which is just like the old CE one, minus the search form and minus the 'if you've come from the old wiki' bit which is not really relevant these days.
- disabled CE handling of 404s.
According to running SELECT COUNT(*) FROM sessions WHERE uid = 0; against the database, there are currently 2489 anonymous sessions active over the last 2 days. This fix should have the effect of stopping 404s generating new ones for anonymous users, which I reckon is many of them.
This brings me to proposal H above, regarding doing this for 403s and removing CustomError altogether.
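The session check above can be sketched as a small repeatable query. This is a minimal sketch, assuming a simplified stand-in for the sessions table (only uid, sid and timestamp) held in an in-memory SQLite database; the helper names and demo rows are hypothetical, and on the live site the same SELECT would run against MySQL.

```python
import sqlite3
import time


def count_anonymous_sessions(conn, since_seconds=None, now=None):
    """Count sessions owned by the anonymous user (uid = 0).

    If since_seconds is given, only sessions touched within that many
    seconds before `now` are counted.
    """
    if since_seconds is None:
        row = conn.execute(
            "SELECT COUNT(*) FROM sessions WHERE uid = 0"
        ).fetchone()
    else:
        now = int(time.time()) if now is None else now
        row = conn.execute(
            "SELECT COUNT(*) FROM sessions WHERE uid = 0 AND timestamp >= ?",
            (now - since_seconds,),
        ).fetchone()
    return row[0]


def build_demo_db():
    """Hypothetical demo data: three anonymous sessions, one logged-in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions (uid INTEGER, sid TEXT PRIMARY KEY, timestamp INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sessions VALUES (?, ?, ?)",
        [(0, "a", 1000), (0, "b", 900), (0, "c", 100), (7, "d", 1000)],
    )
    return conn
```

Running the count before and after a change like the 404 fix gives a rough before/after comparison, provided the same time window is used both times.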
comment:7 Changed 3 years ago by jim
I've also added the new page (and the proposed 403/permission-denied page in H) to the robots.txt module page so they aren't indexed, and excluded them from the XML sitemap.
comment:8 in reply to: ↑ 6 Changed 3 years ago by chris
Replying to jim:
- created new page http://www.transitionnetwork.org/404/page-not-found which is just like the old CE one, minus the search form and minus the 'if you've come from the old wiki' bit which is not really relevant these days.
There are still a *lot* of links to the old wiki URL's and a lot of people come to the site following these old links so the text about how to find old content is probably still needed. I can dig up some stats on this if you would like.
comment:9 Changed 3 years ago by jim
True... we can add back if Ed thinks it's an issue, BUT 3 years is a long time on the web and people can probably work out where they are without being told where they came from!
In a large heading it used to say:
"Woha there! No page was found for the address you requested. If you have come from a link to the old Transition Towns website - hello, and please have a look through our list of most likely pages you may be after..."
Now says (in same sized heading):
404! Woha there! No page was found for the address you requested.
Again, Ed is welcome to edit the page as he sees fit as it's plain content.
Also: I've wiped the anonymous sessions from the DB, (may affect a handful of form submissions) but gives us a clean slate to see if this (task 10) makes a difference.
comment:10 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.65
- Total Hours changed from 3.95 to 4.6
- Description modified (diff)
Actioned:
11.. OK so I've gone ahead and done H, set up http://www.transitionnetwork.org/403/permission-denied, tested (works fine after a tweak) and removed CustomError. We can roll back if any issues arise by reinstalling CE and putting the contents of the two new pages back inside -- I don't foresee any issues now I've tested, and that's one pointless module gone.
We have a slightly slimmer site with less overheads! And less sessions/better caching! Merry Christmas!
comment:11 Changed 3 years ago by ed
E: no go for now. needs discussion.
A, B, D, F: yes
C, G: sound sexy - might be one or the other - this work is going to eat up a lot of time - so keep an eye on timings
remember we're in utilitarian mode but not kill mode or tinkering with strategy stuff
comment:12 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.15
- Total Hours changed from 4.6 to 4.75
- Description modified (diff)
Thanks sir. Updating description...
comment:13 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to -0.05
- Total Hours changed from 4.75 to 4.7
- Description modified (diff)
Correction...
comment:14 Changed 3 years ago by jim
Doing A: Created a new admin VBO view that allows filtering of taxo entries and uses Drupal's APIs to delete the terms safely... https://www.transitionnetwork.org/admin/reports/terms
comment:15 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.75
- Total Hours changed from 4.7 to 5.45
- Description modified (diff)
A done.. the view filters by any keyword; I chose '|', which seemed to be used to break up LOADs of spammy long terms.. Scanned by eye before deleting, around 1200 terms gone.
I also searched for 'Adult', 'payday' etc and removed the handful left...
ED: you can use this tool again (Reports -> Taxonomy terms cleanup) to safely remove batches of offending tags that contain, start with etc various words/symbols.
comment:16 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 5.45 to 5.7
11.. continued..
Which begs the question: Where/how are these spam keywords being generated? It seems the answer is that the 'tags' vocab is available on user profiles. I'd imagine this is pointless so I've removed it, which should reduce future spam terms appearing.
At https://www.transitionnetwork.org/admin/content/taxonomy/2 (the tags list page), there are 43 spammy terms with "::" in, e.g "Health and Fitness::Health::Foot Health::Recreation and Sports::Exercise::Sports::hand-wrist-pain::Fitness Equipment::Foot-Health::misc::gen::uncat::miscellaneous::General::Others::Shopping::uncharacterized" -- they're gone now too
I've done a little more cleaning and will move on...
comment:17 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.35
- Total Hours changed from 5.7 to 6.05
11.. In terms of the rest of B (taxo cleanup, A done), I think we need to do DB checks and remove links (in term_node table) to terms that no longer exist. I'll see if any modules or queries are posted that do this, otherwise it's just a join or two.
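The "join or two" could look something like the following. It is a minimal sketch, assuming cut-down versions of the term_data and term_node tables in an in-memory SQLite database; the helper names and demo rows are hypothetical, and any real delete should be tried on a copy of the database first.

```python
import sqlite3

# Links in term_node whose term no longer exists in term_data.
ORPHAN_LINKS_SQL = """
SELECT tn.nid, tn.tid
FROM term_node tn
LEFT JOIN term_data td ON td.tid = tn.tid
WHERE td.tid IS NULL
ORDER BY tn.nid, tn.tid
"""


def find_orphan_links(conn):
    """Return (nid, tid) pairs that point at a missing term."""
    return conn.execute(ORPHAN_LINKS_SQL).fetchall()


def delete_orphan_links(conn):
    """Delete the orphaned links and return how many rows were removed."""
    cur = conn.execute(
        "DELETE FROM term_node WHERE tid NOT IN (SELECT tid FROM term_data)"
    )
    conn.commit()
    return cur.rowcount


def build_demo_db():
    """Hypothetical demo data: two valid links and two orphaned ones."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT);
        CREATE TABLE term_node (nid INTEGER, vid INTEGER, tid INTEGER);
        INSERT INTO term_data VALUES (1, 2, 'energy'), (2, 2, 'food');
        INSERT INTO term_node VALUES (10, 10, 1), (11, 11, 2), (12, 12, 99), (13, 13, 98);
        """
    )
    return conn
```

Listing the orphans before deleting them keeps the cleanup reviewable, in the same spirit as scanning the VBO view by eye.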
...
Actually, I've extended the above view with another tab: Term cleanup - by Relation. This needed a new module to temporarily be installed, Term Node Count which allows us to scan the system for taxonomy terms that have no nodes related to them... Looks like there's >900 of these...
Cleaning up now.
comment:18 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.35
- Total Hours changed from 6.05 to 6.4
- Description modified (diff)
Lots of spammy things cleaned up, like: Bankruptcy Lawyers, Brain Injury Attorney, Auto Loans, Air Max 2012 etc.. And a load of unused/old tags with nothing attached like: burn out, community power, buying club etc.
I've opened ~10% of the term links on the first couple of pages and none showed anything associated with the terms with the standard Drupal term view page... So I merrily deleted all 905 of them.
I checked in mysql for terms with >50 character length name and there are only 23 now -- and of those only one looked spammy (some look useless but that's a different game!) and it was: "Watch Free !!!Toronto vs Hamilton CFL live Streaming Online on 3 september". However, that was then removed by clearing out the 905, above. The same query now returns 23, down from 525 originally.
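The long-name check is easy to keep around as a query. This is a minimal sketch, assuming a cut-down term_data table in an in-memory SQLite database and assuming the Tags vocabulary has vid 2 (inferred from the admin URL in comment 16, not confirmed); the helper names and demo rows are hypothetical.

```python
import sqlite3


def long_terms(conn, vid, min_length=50):
    """Return (tid, name) for terms in one vocabulary whose name is longer
    than min_length characters.

    Note: on MySQL, CHAR_LENGTH() counts characters while LENGTH() counts
    bytes; SQLite's LENGTH() counts characters for text values.
    """
    return conn.execute(
        "SELECT tid, name FROM term_data "
        "WHERE vid = ? AND LENGTH(name) > ? ORDER BY tid",
        (vid, min_length),
    ).fetchall()


def build_demo_db():
    """Hypothetical demo data: one short tag, one long tag, one long term
    in a different vocabulary."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO term_data VALUES (?, ?, ?)",
        [(1, 2, "help"), (2, 2, "x" * 60), (3, 1, "y" * 70)],
    )
    return conn
```

Filtering by vocabulary avoids the false positives mentioned for the Forum structure and Geographic region vocabularies.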
Note, this was in the Tags vocab only, some false positives appeared for Forum structure vocab, and the Geographic region vocab had a few areas with no entries -- the latter relating to proposal E.1.
OK so done with A and B for now... 1200 + 905 + 43 + a few others is >2150 terms gone.. That leaves 4468 in the database, meaning ~33% of the terms are now gone. We could still merge similar terms ('help', 'Help', 'HELP', etc), but I'll leave that for another day.
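For anyone checking the figures, the arithmetic above works out as follows. This is just a sketch using the numbers quoted in this comment; the ">2150" is treated as exactly 2150, which is an assumption since the "few others" were not counted.

```python
# Counts quoted in the comments above.
removed_by_keyword = 1200   # terms containing '|'
removed_unused = 905        # terms with no nodes attached
removed_double_colon = 43   # terms containing '::'

removed_known = removed_by_keyword + removed_unused + removed_double_colon

# "a few others" brings the total to just over 2150; use 2150 as a floor.
removed_total = 2150
remaining = 4468

share_removed = removed_total / (removed_total + remaining)
print(f"known removals: {removed_known}")
print(f"share of terms removed: {share_removed:.1%}")
```

So roughly a third of the original terms are gone, matching the ~33% quoted.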
Updating summary
comment:19 Changed 3 years ago by jim
I'll leave Term Node Count in place for now... we might want to do this again in future.
comment:20 Changed 3 years ago by jim
Finally the view is now in Reports -> Tags cleanup, and has two tabs: Tags by name filter and Tags with no content related.
comment:21 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 1.05
- Total Hours changed from 6.4 to 7.45
- Description modified (diff)
12.. Onto proposal D: Check/improve views caching...
The principle around views cache times is that if it's relatively regularly updated, 1/2 hour is fine. Less, or less important is 1 hour. Other ancillary stuff can be 6 hours (the next level in Views). Also views that are simple to generate, will only cache their results and not rendered output (e.g. block list of titles), whilst complex displays will store the output (e.g. slideshow). Node view lists (blog, news etc) will only store results, not rendered as they are big for the cache pages -- we can review this later.
...
All done! OK so a LOT of views were uncached, now are... some user-centric views are now microcached for 5 mins, others that aren't so important have a much longer holding time.
The upshot will be a faster site, but a HUGE cache_views table. I don't see why this would matter really, provided it doesn't get too big. I'll monitor from here for a week or so, and in Munin the burst of MySQL insert/update activity from 15:25-16:25 today (7 Sep) marks the work I did and the start of the new caching regime for comparison purposes.
So that's D done.
comment:22 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 7.45 to 7.7
13.. F is not possible at present (without patching/changing system) because CAPTCHA and Content Access have disabled the block cache on the performance page; "Note that block caching is inactive when modules defining content access restrictions are enabled.".
So this goes on the back burner for now, and we move to some preliminary ESI work for proposal G...
comment:24 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 1.15
- Total Hours changed from 7.7 to 8.85
- Description modified (diff)
13.. continued... OK so there are conflicting reports on whether the Block Cache Alter module (above) is able to make a difference if core's block cache is disabled. I think not, but regardless I've enabled it and applied the correct settings to every block as this was a 15 min job. The reason for going ahead with this (apart from the fact that it might work) is that we now have a correct record of what blocks should be cached, and how.
We now need to track down which module is disabling them. Once we know what's causing it we can hopefully get around it and instantly gain a good boost in performance. This then becomes proposal I: Re-enable block caching.
Another part of what I just did was to split the 'top user links' bar with login/account into two, one for anonymous, one for authenticated. This way we can cache the former for every page (as it uses the page path to return users to the page they were on when logging in), and the latter means we can cache it once for the current user. Each version is now shown to the correct user.
The reason this was needed is that these blocks are currently executing evaluated PHP, which is php calling php on a string with eval() -- this is very slow. So the work I've outlined allows us to now grab these blocks and push them into code, which is proposal J: Convert inline PHP into module code and features.
Finally, this required the CSS to be corrected (as it was looking for just the specific block), so I've tweaked that and also noticed Ben had left LOTS of SASS debugging code in the CSS. I've now stripped that out for a minor client speed/bandwidth improvement.
Summary updated.
comment:25 Changed 3 years ago by jim
- Estimated Number of Hours changed from 0.0 to 0.05
- Description modified (diff)
Ensuring it's clear F has been done, even if it requires I to complete... formatting of done items
comment:26 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 8.85 to 8.95
Further to my last, it seems modules using hook_node_grants() cause this... So I ran $ grep -R _node_grants * and it appears the answer is 3 modules do :(
- OG in contrib/og/modules/og_access/og_access.module
- nodeaccess_userreference in contrib/nodeaccess_userreference/nodeaccess_userreference.module
- content_access in contrib/content_access/content_access.module
Hmm... I'll try to find a way around since the modules are critical... Or we abandon the block caching.
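The grep above can also be done as a small script that reports which module prefix implements the hook. This is a minimal sketch, assuming modules live in .module or .inc files under some root directory; the function names, the demo module name and the temporary directory layout are all hypothetical.

```python
import os
import re
import tempfile

# Matches e.g. "function og_access_node_grants(" and captures "og_access".
GRANTS_RE = re.compile(r"^\s*function\s+(\w+)_node_grants\s*\(", re.MULTILINE)


def modules_implementing_node_grants(root):
    """Walk `root` and map each implementing module prefix to its files."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if not fname.endswith((".module", ".inc")):
                continue
            path = os.path.join(dirpath, fname)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for match in GRANTS_RE.finditer(fh.read()):
                    hits.setdefault(match.group(1), []).append(path)
    return hits


def demo_scan():
    """Build a throwaway tree with one implementing module and scan it."""
    with tempfile.TemporaryDirectory() as root:
        mod_dir = os.path.join(root, "contrib", "example_access")
        os.makedirs(mod_dir)
        with open(os.path.join(mod_dir, "example_access.module"), "w", encoding="utf-8") as fh:
            fh.write(
                "<?php\n"
                "function example_access_node_grants($account, $op) {\n"
                "  return array();\n"
                "}\n"
            )
        with open(os.path.join(mod_dir, "other.module"), "w", encoding="utf-8") as fh:
            fh.write("<?php\nfunction other_cron() {}\n")
        return sorted(modules_implementing_node_grants(root))
```

Unlike a bare grep for _node_grants, this only matches function definitions, so calls and comments mentioning the hook are not counted.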
comment:27 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 1.75
- Total Hours changed from 8.95 to 10.7
- Description modified (diff)
13.. More on block caching: It now works!
I needed to apply a core block module patch that comes with blockcache_alter -- this tells the block system to ignore that some modules are using hook_node_grants(). I've now done some checks and can confirm this is faster...
Further, the Context module also interferes, and indeed uses its own version of the block load/render code I patched from core... So I patched this similarly and block caching has started to work.
According to Devel, without caching the homepage makes ~10 calls to block_block that each take 0.22-0.75ms to generate, or about 3ms over 10 queries. With caching they are all gone, saving queries and calls. This should be a nice little boost to the site.
So that means I and F are now done and working, summary updated.
NOTE we now need to keep an eye out for any caching oddness, though I'd expect this for blocks in the CMS only really.
comment:28 Changed 3 years ago by jim
Note also that the blocks on the homepage are quick/dirty to build. Some others around the site are much more complex (showing bits of menu, user options etc), so the improvement will be better in other pages.
comment:29 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 10.7 to 10.95
Patch for this added to our repo: https://github.com/transitionnetwork/transitionnetwork.org-d6.profile/blob/master/patches/context-blockcache-alter-enable.patch
And the makefile is updated to include this patch on next build.
TODO is the makefile change to apply block cache alter patch to core -- though it seems Context is overriding this so it's of limited use for now. Should be added though. I'll add a note to #582 about this so it gets covered on next update.
comment:30 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.75
- Total Hours changed from 10.95 to 11.7
- Description modified (diff)
I've now gone through and set block caching and timeouts where appropriate on all blocks (just did ones assigned normally before, now covers ones from Context too).
Also added Views Lite Pager (Proposal K), which on a big site like ours should save a heavy node count query just to work out what pager numbers to display. Instead of the normal next, prev, last, first, 1-9 options, it now just has 'next' and 'previous'.
Am now working through major views and adding the new lite pager at the same time as I check/tweak the caching.
Added to our makefile recipe. Updated summary.
comment:31 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.75
- Total Hours changed from 11.7 to 12.45
- Description modified (diff)
14.. Proposal K (Views Lite pager) is done on the 'main' views (blogs, news, initiatives, projects etc), and I've done a little more on D (views cache settings) as I went along to make sure these central views also cached their rendered HTML.
I'm now done for a few days, pending New Relic addition and the new information that brings.
Summary updated.
comment:32 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 12.45 to 12.7
- Description modified (diff)
Proposal L added: (underway already) Review slow query log, explain queries, tweak as necessary/flag poorly behaving modules. and raised #591 for Chris to examine if we can move the mysql disk hitting temp tables into memory.
Also adding some nice links to the excellent 2bits.com...
comment:33 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 1.0
- Total Hours changed from 12.7 to 13.7
- Description modified (diff)
OK though I said I was done, further investigation shows that Redis (Drupal page cache) is most likely being cleared by cron runs -- there's a consistent hourly spike in the memory usage chart around 1/2 past the hour...
This link backs this theory up: http://www.metaltoad.com/blog/how-drupals-cron-killing-you-your-sleep-simple-cache-warmer
And this module might be the solution... https://drupal.org/project/elysia_cron -- we can clear the page cache every 3 to 6 hours, and run other things when we want and the site is quieter.
Which makes up Proposal M. Summary updated, adding research time.
comment:34 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.5
- Total Hours changed from 13.7 to 14.2
- Description modified (diff)
OK so I'm still playing!
Added Elysia cron per last comment, and have set some less important things to happen every 2, 3, 6 or 24 hours depending on what it is.
Items changed from hourly:
- Every 2h: filter_cron, messaging_cron, node_cron
- 3h: system_cron
- 6h: backup_migrate_cron, media_youtube_cron, spambot_cron, xmlsitemap_cron, xmlsitemap_node_cron, xmlsitemap_taxonomy_cron
- Daily (at 5am): captcha_cron, logintoboggan_cron, revision_deletion_cron
- Weekly (5am on Sunday): update_cron (already does this via settings, but this ensures it)
Note I used http://2bits.com/drupal-performance/improving-performance-drupals-cron-using-elysia-cron-module.html as a template/good practice. All can be altered by Developer users at https://www.transitionnetwork.org/admin/build/cron/settings
The next enhancement will be to stagger these items across the minutes, but this means making the site cron run every minute or two or 5... The Aegir control panel allows this, and now we've Elysia Cron there is no performance hit since each item is only executed per its schedule. This has extra advantages, like indexing new content every 1/2 hour instead to keep the search system fresher...
I'll wait to see how this change goes before scheduling a quick (10 minute) tweak to allow per minute cron resolution in Aegir. Soon hopefully though.
comment:35 Changed 3 years ago by jim
Note, apart from the cron tweak, and another pass at the slow query log, I'm putting the rest of this ticket on hold until after the New Relic work, and Ed's return from holidays -- don't want to eat any more budget without confirmation.
I'm happy with the improvements so far though, and again many things will help or be re-usable in D7. Now let's let it bed in...
comment:36 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.05
- Total Hours changed from 14.2 to 14.25
Also, note to self and other developers: always disable * UI modules when you're done with them... Just ran this:
drush @www.transitionnetwork.org dis -y views_ui rules_ui context_ui imagecache_ui
As none of those are needed now development/tweaking is completed.
No more tinkering now...!
comment:37 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.3
- Total Hours changed from 14.25 to 14.55
Having seen some really good looking graphs for the Redis memory and mysql/php reductions since 20.30 last night, I've now set the cron to run every minute so we have the resolution we want, and then made some channels in Elysia cron that look like this:
fast = */30 * * * * = Every half hour (at 0 and 30 past the hour).
- search_cron
- votingapi_cron
comms = 50 */2 * * * = every 2 hours at 50 past (staggered every minute)
Sends notifications etc out, speaks to other websites, gets feeds
- feeds_cron
- job_scheduler_cron
- mailchimp_cron
- media_youtube_cron
- messaging_cron
- mollom_cron
- notifications_cron
- ping_cron
default catch all = 20 */2 * * * = every 2 hours at 20 past the hour.
These are standard tasks that don't need any real urgency, and can be moved to other groups when/if we need to.
- abuse_cron
- aggregator_cron (should be in comms really)
- ctools_cron
- date_timezone_cron
- node_cron
- path_redirect_cron
- piwik_cron
- rules_cron
- trigger_cron
cleanup = 10 */12 * * * = 10 past at midnight and midday, staggered.
Key tasks that, especially when run hourly, kill the cache. Now every 12 hours at 10 past.
- filter_cron = big cache killer
- revision_deletion_cron
- spambot_cron
- system_cron = big cache killer
daily = 40 5 * * * = at 5:40am, though tasks are staggered every minute or two.
Daily tasks that
- backup_migrate_cron
- captcha_cron
- logintoboggan_cron
- update_cron
- xmlsitemap_cron
- xmlsitemap_node_cron
- xmlsitemap_taxonomy_cron
weekly = 5:42 am on Sundays
- update_cron = checks for module updates
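To sanity-check when each channel fires, the rules above can be run through a tiny matcher. This is a minimal sketch, assuming standard five-field crontab semantics and supporting only the forms used here ('*', '*/n' and plain numbers); it is not Elysia Cron's own parser, the function names are hypothetical, and the weekly channel is left out because no rule string is given for it above.

```python
from datetime import datetime

# Channel rules exactly as listed in the comment above.
CHANNELS = {
    "fast": "*/30 * * * *",
    "comms": "50 */2 * * *",
    "default": "20 */2 * * *",
    "cleanup": "10 */12 * * *",
    "daily": "40 5 * * *",
}


def field_matches(spec, value):
    """Match one cron field; supports '*', '*/n' and a single integer."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    return value == int(spec)


def rule_matches(rule, when):
    """True if the five-field rule fires at the given minute."""
    minute, hour, dom, month, dow = rule.split()
    cron_dow = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day)
        and field_matches(month, when.month)
        and field_matches(dow, cron_dow)
    )


def channels_due(when):
    """Names of the channels whose rule fires at `when`, sorted."""
    return sorted(name for name, rule in CHANNELS.items() if rule_matches(rule, when))
```

For example, channels_due(datetime(2013, 9, 8, 0, 10)) picks out only the cleanup channel, which shows the staggering: no two channels share a minute with these rules.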
comment:38 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 14.55 to 14.65
- Description modified (diff)
OK, found this: http://2bits.com/admin-menu/admin-menu-module-popular-yet-occasionally-problematic.html and https://drupal.org/node/1031950
Hence, I'll do this (Proposal N) when #582 happens: Replace Admin Menu 1.x with 3.x -> could be the cause of some load spikes as it occasionally goes mad and does 2000-5000 queries! Added to #582
comment:39 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.5
- Total Hours changed from 14.65 to 15.15
Some new enhancements/notes:
1: The RSS feeds are now all cached for 3 hours regardless of HTTP(S) protocol. They're also cached in the Nginx Speedcache AND Redis so it works for logged in and non-logged in users. This removes the need for redirects from HTTPS -> HTTP.
2: I've extended the cache time for speedcache items from 10 seconds to 15 minutes. This is for logged-out users only, and means things will stay in cache longer than before. Per the BOA docs:
Note that default cache TTL used in Speed Booster is just 10 seconds for
both logged in and anonymous visitors, and 24 hours for known bots.
To force 15 minutes cache TTL, use:
sites/all/modules/nginx_cache_quarter.info (platform-wide)
Which is what I did: touch nginx_cache_quarter.info in the current TN platform and main site, plus touch nginx_cache_hour.info on the news site...
I'll keep an eye out to see what effect this has.
comment:40 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 15.15 to 15.25
Advagg now added and enabled -- BOA auto configures it and it'll mean long cache times are available, plus less requests per page (for more client-side speed).
comment:41 follow-up: ↓ 42 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.5
- Total Hours changed from 15.25 to 15.75
- Description modified (diff)
I think we're creating a lot of URL aliases for things that don't need aliasing... I think therefore we can change the alias settings and then clean the unwanted items from the url_alias table, which is a heavily contended table.
This is Proposal O - updating description.
Long story short, we can nearly halve the size of a very busy table that's hit on every request many times -- that's gotta be a good tweak!
Ed?
comment:42 in reply to: ↑ 41 ; follow-up: ↓ 43 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0.0 to 0.25
- Total Hours changed from 15.75 to 16.0
Replying to jim:
I think we're creating a lot of URL aliases for things that don't need aliasing... I think therefore we can change the alias settings and then clean the unwanted items from the url_alias table, which is a heavily contended table.
How would you decide which ones are not needed?
Could we change them into a list of Nginx redirects?
My concern is that we serve a hell of a lot of 404 errors already, so far this month:
HTTP Response Code | Number |
---|---|
Code 200 - OK | 378866 |
Code 206 - Partial Content | 25 |
Code 301 - Moved Permanently | 25317 |
Code 302 - Found | 21523 |
Code 304 - Not Modified | 14032 |
Code 400 - Bad Request | 194 |
Code 403 - Forbidden | 21028 |
Code 404 - Not Found | 48411 |
Code 405 - Method Not Allowed | 27 |
Code 408 - Request Timeout | 4 |
Code 500 - Internal Server Error | 83 |
Code 502 - Bad Gateway | 359 |
Code 503 - Service Unavailable | 15988 |
Total number of redirects, 301 and 302 is 46,840 and the total number of 404's is 48411.
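The totals above can be reproduced from the table. This is a small sketch using only the figures quoted in this comment; the variable names are arbitrary.

```python
# Response counts for the month so far, copied from the table above.
RESPONSES = {
    200: 378866, 206: 25, 301: 25317, 302: 21523, 304: 14032,
    400: 194, 403: 21028, 404: 48411, 405: 27, 408: 4,
    500: 83, 502: 359, 503: 15988,
}

total = sum(RESPONSES.values())
redirects = RESPONSES[301] + RESPONSES[302]
not_found = RESPONSES[404]

print(f"total responses: {total}")
print(f"redirects (301 + 302): {redirects} ({redirects / total:.1%})")
print(f"404s: {not_found} ({not_found / total:.1%})")
```

Expressing the 404s as a share of all responses makes month-to-month comparisons easier than raw counts.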
As far as I'm concerned 301 and 302's are good if they result in the user getting what they are looking for, 404's are bad.
comment:43 in reply to: ↑ 42 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.1
- Total Hours changed from 16.0 to 16.1
Replying to chris:
How would you decide which ones are not needed?
Well firstly URL aliases are in addition to the built-in paths (like /node/XXX and user/YYY), so all content that is unaliased is always available.
Secondly, the vast majority of useful aliases will be untouched by my changes -- these are for content, blogs, normal feeds and pages etc. I'm talking about removing pointless aliases that add no value. It's very unlikely these are being used much, and if they are we can always add the key ones back again.
Finally, having spoken with Ed, I'm really talking about all the aliases that match these source criteria:
- user/%/feed
- node/%/feed
These plain don't work and represent several thousand entries. Ed and I agree they should go and that removing them will have no impact. I'll also change the Pathauto settings to avoid pointless aliasing.
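A minimal sketch of what that cleanup might look like, assuming a simplified url_alias table with src and dst columns. It uses an in-memory SQLite database as a stand-in for the site's MySQL database, and the sample rows are made up. Note that % in a LIKE pattern matches any run of characters, so it is worth counting the matches before deleting anything.

```python
import sqlite3

# In-memory SQLite stands in for the real MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url_alias (pid INTEGER PRIMARY KEY, src TEXT, dst TEXT)"
)
conn.executemany(
    "INSERT INTO url_alias (src, dst) VALUES (?, ?)",
    [
        # Made-up sample rows for illustration only.
        ("node/12", "news/some-story"),
        ("node/12/feed", "news/some-story/feed"),
        ("user/7", "people/alice"),
        ("user/7/feed", "people/alice/feed"),
    ],
)

# The two source patterns named above.
patterns = ["user/%/feed", "node/%/feed"]


def count_matching(pattern):
    """Count aliases whose source path matches a LIKE pattern."""
    row = conn.execute(
        "SELECT COUNT(*) FROM url_alias WHERE src LIKE ?", (pattern,)
    ).fetchone()
    return row[0]


# Count first, so the numbers can be sanity-checked before anything is removed.
before = {pattern: count_matching(pattern) for pattern in patterns}

for pattern in patterns:
    conn.execute("DELETE FROM url_alias WHERE src LIKE ?", (pattern,))
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM url_alias").fetchone()[0]
print(before, remaining)
```

On the real site the same statements would be run against a backup first, since the counts there are in the thousands.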
Could we change them into a list of Nginx redirects?
Maybe, but the point was to remove the useless ones, not just move a bloated set of aliases further down the stack.
Updating summary.
Also, per my chat with Ed, a few items here are to be moved to a new ticket as they're 'pre-D7 migration cleanup tasks'... summary rearranged pending the other ticket creation.
comment:45 Changed 3 years ago by jim
- Description modified (diff)
Moved TODOs to new ticket: #606 Site upgrade tasks -- pre-migration cleanup
comment:46 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.05
- Total Hours changed from 16.1 to 16.15
- Description modified (diff)
Due to work on #582 to include Views Content Cache, we can now revisit Proposal D (Review Views caching) and extend many view cache lifetimes to days, thanks to the better cache invalidation logic provided by that module.
comment:47 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.3
- Total Hours changed from 16.15 to 16.45
I've changed the Blog (inc SR) and News views to use Views Content Cache:
- Minimum lifetime = 5 mins
- Max lifetime = 5 days.
Note that Views Content Cache will invalidate/clear the appropriate views cache whenever content of the type it monitors changes. So in the News example, if a new Transition News node is added, or an existing one is updated or deleted, then the News views will have their cache cleared and be forced to render fully for the next request. Otherwise they can stay in cache for up to 5 days, which is now safe since they cannot change without Views Content Cache reacting.
This should take a reasonable amount of load off whilst simultaneously making the site more responsive to changes in content. Before this, these views were pinned to 30-minute intervals regardless of what happened.
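The decision described above can be sketched roughly as follows, using the 5 minute and 5 day settings listed earlier. This is not the module's code: the function name and arguments are hypothetical, and the treatment of the minimum lifetime (a cached copy is served until it is at least that old, even if content has changed) is an assumption about how the setting behaves.

```python
from datetime import datetime, timedelta

MIN_LIFETIME = timedelta(minutes=5)  # from the settings listed above
MAX_LIFETIME = timedelta(days=5)     # from the settings listed above


def cache_is_usable(cached_at, last_content_change, now):
    """Decide whether a cached view render may still be served.

    cached_at: when the view output was cached.
    last_content_change: when monitored content last changed, or None.
    now: the current time.
    """
    age = now - cached_at
    if age >= MAX_LIFETIME:
        # Too old: always re-render, whether or not content changed.
        return False
    if age < MIN_LIFETIME:
        # Assumed behaviour: within the minimum lifetime the cached copy
        # is served even if content has just changed.
        return True
    # Between the two limits: usable only if nothing changed since caching.
    return last_content_change is None or last_content_change <= cached_at


now = datetime(2012, 1, 10, 12, 0)
print(cache_is_usable(now - timedelta(days=1), None, now))
print(cache_is_usable(now - timedelta(days=1), now - timedelta(hours=1), now))
```

The first call models an untouched News view cached a day ago; the second models the same view after a News node was edited an hour ago.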
I'll await feedback or any problems that might crop up before I continue applying this across the complete set of views.
comment:48 Changed 3 years ago by ed
sounds good
comment:49 Changed 3 years ago by jim
- Priority changed from critical to major
Since there have been no reports of any issues from the Views Content Cache, and performance has improved recently, we can downgrade this ticket.
I'll apply Views Content Cache to the rest of the key views in the coming weeks.
comment:50 Changed 3 years ago by jim
- Add Hours to Ticket changed from 0.0 to 0.5
- Priority changed from major to minor
- Description modified (diff)
- Total Hours changed from 16.45 to 16.95
Views Content Cache work is now done. It might have some side effects; Ed, please shout if anything doesn't appear to be updating or is wonky in a view.
All the key stuff on this ticket is done bar a few low priority/impact ones... Downgrading.
Ed to close if he's happy with Drupal-level performance.
Work already done and sent in email, plus additional tweaks since then from item 6...
1.. Downloaded and processed the slow query log with https://github.com/LeeKemp/mysql-slow-query-log-parser -- example output file attached at https://tech.transitionnetwork.org/trac/attachment/ticket/585/sql.report.log
2.. (on call) Installed and used DB Tuner module to add an index for field_region_value field on content_field_region table.
3.. Looked at the last items and worked back... The 'access' table seems to cause a lot of the slow queries, so I've used DB Tuner module to add indexes for:
ALTER TABLE {access} ADD INDEX mask (mask)
ALTER TABLE {access} ADD INDEX status (status)
ALTER TABLE {access} ADD INDEX type (type)
4.. Disabled DB Tuner, enabled Variable Cleanup... Removed the following variables not needed that have not been cleaned up by their modules:
5.. cleared the caches to force the smaller variable table to be loaded.
6.. Examined the slow log and saw lots of queries of the form SELECT name FROM users WHERE LOWER(mail) = LOWER(XXX); investigation showed these are caused by LoginToboggan, which allows people to log in via email too. However, the LOWER() calls make these queries very slow. On MySQL they are also unnecessary (since string comparisons are case-insensitive under the default collations). I've patched our PROD version of LoginToboggan to avoid the LOWER() calls, and to scan the email column only if there's an '@' in the username field... I'll convert these changes to a patch so the updates will become part of #582.
7.. I notice some calls to the News Sharing Engine are slow, so I've upped the caching to 6 hours for all views queries there.
8.. PROPOSAL A & B: I notice there are plenty of spam entries in the Taxonomy tables, and listing all entries with > 50 chars shows ~200 entries that could safely be removed. Adding this to 'proposed fixes' list above needing approval. Taxonomy could use a cleanup.
9.. PROPOSAL C: I see plenty of SELECT * FROM variable calls, which imply a cache clear due to a variable being set. In normal use variables shouldn't be set (admin screens tend to do this), so I'd like to find out which module is causing this and patch/remove it.
more to come...
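As a rough illustration of the login lookup change in item 6 (not the actual patch), the logic might look like the sketch below. It uses an in-memory SQLite database as a stand-in, with COLLATE NOCASE on the columns to mimic MySQL's case-insensitive default collations. The function name and the sample user are hypothetical.

```python
import sqlite3

# SQLite stand-in; COLLATE NOCASE mimics MySQL's case-insensitive default
# collations, so no LOWER() call is needed in the queries (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "uid INTEGER PRIMARY KEY, "
    "name TEXT COLLATE NOCASE, "
    "mail TEXT COLLATE NOCASE)"
)
# Made-up sample account.
conn.execute(
    "INSERT INTO users (name, mail) VALUES (?, ?)",
    ("alice", "Alice@Example.org"),
)


def resolve_username(login):
    """Return the account name for a login string, or None if unknown."""
    if "@" in login:
        # Only look at the mail column when the input could be an email.
        row = conn.execute(
            "SELECT name FROM users WHERE mail = ?", (login,)
        ).fetchone()
        if row:
            return row[0]
    # Otherwise (or if no email matched) treat the input as a username.
    row = conn.execute(
        "SELECT name FROM users WHERE name = ?", (login,)
    ).fetchone()
    return row[0] if row else None


print(resolve_username("alice@example.org"))
print(resolve_username("bob"))
```

Comparing the bare column rather than LOWER(mail) matters because wrapping a column in a function generally stops the database from using an ordinary index on that column.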