Drupal scaling and performance tuning - Part 2

Drupal is highly relational, so MySQL plays a big role in Drupal performance. As I explained in Drupal scaling and performance tuning - Part 1, we were able to tune Apache to handle a much higher load during high-traffic hours. But MySQL didn't give us a chance to rest.

As the first step, we decided to look at the MySQL slow query log to identify bad queries. The server logged every query that took more than 2 seconds to process, and most of them involved node permissions, user sessions, the access log, cache, comments, watchdog and node contents.

The server load had gone up, and the server was almost frozen most of the time. I knew that the solution would end up being a modified MySQL configuration file.

But which parameters did I need to change, and to what values, as I had done for Apache?

Then I came across a tool called MySQLTuner-perl, a script written in Perl that lets you review a MySQL installation quickly and make adjustments to increase performance and stability. The current configuration variables and status data are retrieved and presented in a brief format along with some basic performance suggestions.

Tools used
  1. MySQLTuner-perl
  2. Maatkit (power tools for open-source databases)

MySQL Optimization - I will not list every configuration setting here, as MySQLTuner gives you a good guide
  1. MySQLTuner
    1. Download MySQLTuner (wget https://github.com/rackerhacker/MySQLTuner-perl/blob/master/mysqltuner.pl). Note that this URL points to the GitHub page for the file, so use the raw file link from that page to get the script itself.
    2. Make it executable (chmod +x mysqltuner.pl)
    3. Run MySQLTuner (./mysqltuner.pl). You need your MySQL root password to execute it.
    4. Read the output carefully, especially the recommendations at the end. It shows exactly which variables you should adjust in the [mysqld] section of your my.cnf (on Debian and Ubuntu the full path is /etc/mysql/my.cnf). Whenever you change your my.cnf, make sure that you restart MySQL. You can then run MySQLTuner again to see if it has further recommendations to improve MySQL performance. This way, you can optimize MySQL step by step.
  2. Maatkit
    1. mk-duplicate-key-checker helped me find duplicate indexes and foreign keys on MySQL tables.
    2. Removed all duplicate indexes and foreign keys. This helped MySQL process queries more smoothly.
    3. mk-query-digest and mk-query-profiler helped to profile and test new configurations and modifications to the database.
    4. mk-variable-advisor helped to double-check the changes and recommendations made by MySQLTuner.
  3. Converted the comments, node and users tables from MyISAM to InnoDB
  4. Ran mysqlcheck -o -A -p to optimize the other tables in the Drupal database
After going through a few cycles of the steps above, we got the MySQL server to a level where it could handle a much higher load without any hiccups. The slow query log didn't report any slow queries. The number of entries in SHOW PROCESSLIST was always below 20 - 30.
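
The engine conversion in step 3 could be scripted roughly as follows. This is only a sketch: the helper name innodb_conversion_sql and the optional table prefix are assumptions for illustration, not something from the original setup. It only builds the ALTER TABLE statements, so you can review them before running them against the database.

```python
# Hypothetical helper: builds the SQL needed to switch tables to InnoDB.
# The table names come from step 3 above; the optional prefix is an
# assumption for sites that use a Drupal table prefix.

def innodb_conversion_sql(tables, prefix=""):
    """Return one ALTER TABLE ... ENGINE=InnoDB statement per table name."""
    return [f"ALTER TABLE `{prefix}{name}` ENGINE=InnoDB;" for name in tables]


statements = innodb_conversion_sql(["comments", "node", "users"])
for statement in statements:
    print(statement)
```

Take a database backup before running the generated statements, since converting a large table locks it while the data is rebuilt.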

Other Optimizations used
  1. Disabled the watchdog and Statistics modules in Drupal to reduce the read/write load on MySQL
  2. Uninstalled all unused modules.
Finally I was able to get our Oxygentank database to a state where it could breathe freely without running at full power all the time.

But I was not happy with this setup, since the second MySQL master server was always sitting idle without helping the primary master server handle its load.

I used NGINX to solve this problem.

I will discuss in Part 3 how I used NGINX to share the MySQL load between databases / Drupal nodes with the help of Memcached.

Drupal scaling and performance tuning - Part 1

As WSO2Con 2011 was a huge hit in the IT sector, I had to face a problem with wso2.org (the Oxygentank developer portal): the site couldn't handle sustained heavy traffic.

The old system was NGINX (as the load balancer) fronting 4 Drupal nodes running on Apache2, with master - master MySQL replication.

But during high traffic our servers took more than 1 minute (on average) to respond to a request. As NGINX's fail timeout was 45 seconds, users got a 504 Gateway Timeout most of the time. At the same time, the Apache server load went up to 30 - 70. First, as a quick solution, we spun up 2 new servers and routed the load across them. That helped us cope for a few hours, but then there was again a huge delay when the site was loading. We then found that the MySQL server load had also gone up, the process list had grown, and the MySQL server was taking longer to process queries than before. The root cause of this new problem was that the 6 Drupal nodes had started to stress the MySQL servers continually.

Then I started to ask myself some questions.

OK, then why didn't we see this new problem before we plugged in the new nodes? The answer was simple: the Drupal nodes couldn't stress the DB because they froze during high traffic.

Before jumping into Apache, our WSO2 infrastructure team revisited our monitoring system (Ganglia) and found that the servers were running on swap most of the time.

As a solution, I started to tune Apache to avoid the memory swapping problem. I saw that there were some misconfigurations that made Apache take more memory once its processes started to handle traffic. Because of that configuration, the server could easily be exhausted.

MaxClients was one of the key parameters I had to change during the Apache memory optimization process.

#### Apache Memory Optimization

  • Find the non-swapped physical memory Apache uses (RES)
    • Run the top command and press shift + m to find the highest RES value used by an Apache process
    • Run service apache2 stop and type free -m
    • Note the used memory and subtract it from the total (this gives you the FREE MEMORY POOL)
    • Multiply the FREE MEMORY POOL by 0.8 to find the AVAILABLE APACHE POOL (this leaves the server a 20% memory reserve for burst periods)
  • Calculate MaxClients
    • Divide the AVAILABLE APACHE POOL by the highest RES memory used by Apache (step 1). This gives the MaxClients number
    • Open apache2.conf and change the MaxClients value
  • Other tweaks
    • Set KeepAlive On (keep it off if you haven't kept the 20% memory reserve)
    • Set KeepAliveTimeout to the lowest value (this prevents connections from hanging; if you experience high latency to your server, set it to 2-5 seconds)
    • Set Timeout to a reasonable value like 10 - 40 (keep it low)
    • Set MaxKeepAliveRequests within 70-150 (if you have a good idea of the number of objects in one page, set it to match the object count)
    • Set MinSpareServers equal to 10-25% of MaxClients
    • Set MaxSpareServers equal to 25-50% of MaxClients
    • Set StartServers equal to MaxSpareServers
    • Set MaxRequestsPerChild somewhere between 400 - 700 (if you see rapid memory growth in Apache child processes) and 10000
  • Apply Changes to Apache
    • Run service apache2 start
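
The arithmetic in the steps above can be sketched in a few lines. This is a minimal illustration, not the exact script I used: the function name apache_prefork_settings is made up, and the input figures (8192 MB total, 1024 MB used with Apache stopped, 60 MB highest RES) are hypothetical values chosen only to show the calculation.

```python
import math


def apache_prefork_settings(total_mb, used_without_apache_mb,
                            max_apache_res_mb, reserve_percent=20):
    """Derive MaxClients and spare-server ranges from memory figures in MB."""
    # FREE MEMORY POOL: total memory minus what is used while Apache is stopped.
    free_pool_mb = total_mb - used_without_apache_mb
    # AVAILABLE APACHE POOL: keep a reserve (20% by default) for burst periods.
    apache_pool_mb = free_pool_mb * (100 - reserve_percent) / 100
    # MaxClients: how many worst-case Apache processes fit in the pool.
    max_clients = math.floor(apache_pool_mb / max_apache_res_mb)

    def percent_range(low, high):
        # Whole-number (min, max) bounds for a percentage range of MaxClients.
        return (math.ceil(max_clients * low / 100),
                math.floor(max_clients * high / 100))

    return {
        "MaxClients": max_clients,
        "MinSpareServers": percent_range(10, 25),
        "MaxSpareServers": percent_range(25, 50),
    }


# Hypothetical figures, as read from free -m and top on an example server.
settings = apache_prefork_settings(total_mb=8192,
                                   used_without_apache_mb=1024,
                                   max_apache_res_mb=60)
for name, value in settings.items():
    print(name, value)
```

Rounding MaxClients down is deliberate: rounding up would let the worst case exceed the pool and push the server back into swap.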

Once I restarted our servers with the above settings, they could handle 150% of the traffic on the Apache side, and the server load was always below 2 (during a burst the server load was 1-2).

This solved my Apache hanging problem. Then I ran another load test to verify my settings. I was amazed by the improvement, but that hadn't solved my problem.

The server was taking a bit long to respond, but I didn't see any connection drops.

Now, what was wrong with my settings? Why was the server response slow? MySQL!!

I will discuss in Part 2 how I addressed this problem.