Changes between Version 15 and Version 18 of Ticket #590


Timestamp:
09/07/13 15:19:12 (3 years ago)
Author:
jim
Comment:

Lots of spammy things cleaned up, like: Bankruptcy Lawyers, Brain Injury Attorney, Auto Loans, Air Max 2012, etc. Also a load of unused/old tags with nothing attached, like: burn out, community power, buying club, etc.

I've opened ~10% of the term links on the first couple of pages and none showed anything associated with the term on the standard Drupal term view page, so I merrily deleted all 905 of them.
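Rather than spot-checking term pages by hand, the "nothing attached" test could be approximated with a query. The following is only a sketch run against an in-memory SQLite stand-in: the `term_data` and `term_node` table names follow the Drupal 6 layout, which is an assumption (Drupal 7 uses `taxonomy_term_data` and `taxonomy_index`), and the sample rows and vocabulary id are made up for illustration.

```python
import sqlite3

# Stand-in schema: table and column names are assumed (Drupal 6 style),
# and the rows below are illustrative, not real site data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT);
CREATE TABLE term_node (nid INTEGER, tid INTEGER);
INSERT INTO term_data VALUES (1, 1, 'burn out'), (2, 1, 'buying club'), (3, 1, 'help');
INSERT INTO term_node VALUES (10, 3);
""")


def unused_terms(conn, vid):
    """Return (tid, name) for terms in one vocabulary that no node references."""
    return conn.execute(
        """
        SELECT td.tid, td.name
        FROM term_data td
        LEFT JOIN term_node tn ON tn.tid = td.tid
        WHERE td.vid = ? AND tn.tid IS NULL
        ORDER BY td.tid
        """,
        (vid,),
    ).fetchall()


orphans = unused_terms(conn, 1)  # vid 1 is a hypothetical id for the Tags vocab
print(orphans)
```

A list like this is best treated as a set of candidates for deletion: a term can have no nodes attached and still matter, for example as a parent in a hierarchy.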

I checked in mysql for terms with a name longer than 50 characters and there are only 23 now. Of those only one looked spammy (some look useless but that's a different game!) and it was: "Watch Free !!!Toronto vs Hamilton CFL live Streaming Online on 3 september". However, that one was removed by clearing out the 905, above. The same query now returns 23, down from 525 originally.
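The exact query isn't recorded here, but a long-name check of this kind might look like the sketch below. It uses an in-memory SQLite table as a stand-in: the `term_data` table and its columns are assumed (Drupal 6 style), the vocabulary id is hypothetical, and apart from the spam title quoted above the rows are invented.

```python
import sqlite3

# Stand-in for the Drupal term table; names and ids are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO term_data (tid, vid, name) VALUES (?, ?, ?)",
    [
        (1, 1, "community power"),
        (2, 1, "Watch Free !!!Toronto vs Hamilton CFL live Streaming Online on 3 september"),
        (3, 2, "A forum container whose name happens to run well past fifty characters"),
    ],
)

TAGS_VID = 1  # hypothetical vocabulary id for the Tags vocab


def long_named_terms(conn, vid, min_len=50):
    """Return (tid, name) for terms in one vocabulary whose name exceeds min_len characters."""
    # SQLite's LENGTH() counts characters for text values; in MySQL the
    # character-counting equivalent is CHAR_LENGTH() (LENGTH() counts bytes).
    return conn.execute(
        "SELECT tid, name FROM term_data WHERE vid = ? AND LENGTH(name) > ? ORDER BY tid",
        (vid, min_len),
    ).fetchall()


suspects = long_named_terms(conn, TAGS_VID)
print(suspects)
```

Filtering on the vocabulary id keeps long names from other vocabularies out of the result, which is the kind of false positive a length-only check can produce.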

Note, this was in the Tags vocab only. Some false positives appeared for the Forum structure vocab, and the Geographic region vocab had a few areas with no entries (the latter relates to proposal E.1).


OK, so done with A and B for now. 1200 + 905 + 43 + a few others is >2150 terms gone. That leaves 4468 in the database, meaning ~33% of the terms are now gone. We could still merge similar terms ('help', 'Help', 'HELP', etc.), but I'll leave that for another day.
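As a quick check on the tally above, the quoted figures can be added up directly. The "few others" are not quantified, so this sketch leaves them out and the totals are lower bounds; the variable names are illustrative only.

```python
# Batch sizes and remaining count as quoted in the comment above.
removed_batches = [1200, 905, 43]
remaining = 4468

removed = sum(removed_batches)        # lower bound: excludes the "few others"
original_total = removed + remaining  # lower bound on the starting term count
fraction_removed = removed / original_total

print(f"removed >= {removed}, started with >= {original_total}, "
      f"share removed is about {fraction_removed:.1%}")
```

The three named batches alone come to roughly a third of the original terms, in line with the ~33% quoted.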

Updating summary

  • Ticket #590

    • Property Add Hours to Ticket changed from 0.75 to 0.35
    • Property Total Hours changed from 5.45 to 6.4
  • Ticket #590 – Description

    v15 v18
      7   7  = Proposed fixes =
      8   8
      9      === Agreed to do by Ed ===
     10      '''B) Try a Taxonomy Cleanup''': 3 hours, Medium risk, medium reward -- style module to try to merge terms with the same names and clean up the link tables back to nodes. Further, we can remove any taxonomies or relations to certain CTs that don't really add value.
     11
          9  === To Do: Agreed by Ed ===
     12  10  '''D) Review Views caching''' 1-2 hours, low risk, high reward -- this was done a while back but I think
     13  11

     15  13
     16  14
     17      === To do once above done if spare time (JK to monitor, EM to confirm) ===
         15  === To do: depending on spare time (JK to monitor, EM to confirm) ===
     18  16  '''C) Find Variable table writes and kill them''' -- 3-8 hours, medium reward, low risk -- Per item 9 below, I see plenty of SELECT * FROM variable calls, which imply a cache clear due to a variable being set. In normal use variables shouldn't be set (admin screens tend to do this), so I'd like to try to see what module is causing this and patch/remove it.
     19  17

     24  22  '''A) Remove spam taxonomy entries''' 1/2 hour, Low risk, low reward -- See item 8 below. A simple delete from taxo term table where length > 50 is worth doing IMHO, and nothing I saw that would be clobbered is not spam.
     25  23
         24  '''B) Try a Taxonomy Cleanup''': 3 hours, Medium risk, medium reward -- style module to try to merge terms with the same names and clean up the link tables back to nodes. Further, we can remove any taxonomies or relations to certain CTs that don't really add value.
         25
     26  26  '''H) Remove CustomError module altogether''' ~~1/2 hour, low risk, low reward -- We should take out the PHP code from the 403 section of CustomError and put it into a simple page entry. See comment 6 below as this has happened for 404s (which need no PHP). We can then remove the CustomError module altogether, saving lots of sessions. I would go ahead and do this but since the 403 page has various displays depending on user type, I wanted to raise it here as it *may* have side effects. Or not...~~
     27  27
     28  28
     29  29  === On hold for now ===
         30  '''B.2) More Taxonomy cleanup''': 2 hours, Low risk, low reward -- try to merge terms with the same names.
         31
     30  32  '''E) Review site features, kill what we don't really need''' Low risk, '''game changer''', care needed, medium reward -- Let's start reviewing the site and make a short-list of things that have had their day. Ed, do you want to drive this on your return?
     31  33  * '''E.1) Remove 'Geographic region' and related taxonomy and Hierarchical Select modules''' 1 hour, low reward, low risk -- never really been used and is effectively a duplicate of the location field. Let's kill it!