Game Development, Design, Programming, and Marketing
Tuning Facebook Apps: Apache and MySQL Performance
Facebook apps are getting hugely popular. Once you’ve got a great idea to implement on a social network it can be tough getting it off the ground. The Facebook team says that one of the best indicators of an app’s success is its speed and responsiveness. We’ve already seen the Facebook site grow a little too big for its britches – its performance can be downright discouraging on some older machines and browsers. But what can you do to increase app responsiveness and performance? Here are some tricks to get your app, your Apache server, and your MySQL database ready for the big time.
The first thing I’m going to mention is that if your app is designed and implemented poorly then you’ll find tuning Apache and MySQL to be highly dissatisfying. Best practices for web apps, especially those hosted on other platforms, include rendering a page with minimal content and filling in resource-intensive information with AJAX, minimizing the number of SQL queries and caching their results, caching REST queries where appropriate, etc.
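To make the caching advice concrete, here is a minimal sketch of a time-limited, in-memory cache for query results. It is written in Python just for brevity (the same idea ports to PHP), and everything in it is an assumption for illustration: the `TTLCache` class, the `fetch_high_scores` stand-in for an expensive SQL or REST call, and the 30 second lifetime are not from any particular app.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._entries = {}          # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


calls = []


def fetch_high_scores():
    # Hypothetical stand-in for an expensive SQL query or REST call.
    calls.append(1)
    return [("alice", 9001), ("bob", 4200)]


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_compute("high_scores", fetch_high_scores)
second = cache.get_or_compute("high_scores", fetch_high_scores)
print(first == second, len(calls))
```

The second lookup never touches the "database", which is the whole point. This is a single-process sketch; once you run several web server processes you would probably want a shared cache instead.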
You can’t just stick to the outdated scripting pattern of “run a boatload of queries and render the page”. When you hit the 8 second mark Facebook will render an error to your user! The penalty for this is astronomical. Just look at the countless Facebook apps whose pages, walls, and discussion boards are just rant after rant about “how come this dont lode?” and “me two!!!” threads. But even well-engineered apps can simply buckle under the staggering demand social apps put on servers. A spike in popularity can kill an app almost as fast as it meteorically rose into the public spotlight. Apps like PackRat are a great example.
To do any kind of performance testing you’ll need a reliable suite of tools to test exactly how responsive your app is. I find the best kind of performance testing is compound testing of throughput for an entire page render. Usually these page renders make a handful of REST calls, a few SQL queries, and run your template engine.
A great tool for this is Apache Bench (ab). You just point ab at a URL on your server and tell it how many requests to run and how many to fire off concurrently. A simple test is to run a batch of 10,000 requests, 100 at a time:
ab -n 10000 -c 100 http://localhost/facebook_app.php
The output from Apache Bench tells you a lot: longest request, requests per second, etc. Here’s a quickie from Google:
Server Hostname:        www.google.com
Server Port:            80

Document Path:          /
Document Length:        4952 bytes

Concurrency Level:      100
Time taken for tests:   4.366902 seconds
Complete requests:      1000
Failed requests:        841
   (Connect: 0, Length: 841, Exceptions: 0)
Write errors:           0
Total transferred:      5323026 bytes
HTML transferred:       5003374 bytes
Requests per second:    229.00 [#/sec] (mean)
Time per request:       436.690 [ms] (mean)
Time per request:       4.367 [ms] (mean, across all concurrent requests)
Transfer rate:          1190.32 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       65   84  66.8     72     562
Processing:    72  310 385.6    149    3562
Waiting:       71  149 244.7     81    3504
Total:        138  394 388.6    222    3640

Percentage of the requests served within a certain time (ms)
  50%    222
  66%    492
  75%    556
  80%    578
  90%    830
  95%   1097
  98%   1661
  99%   2136
 100%   3640 (longest request)
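If you are wondering how the summary figures relate to each other, the arithmetic is simple enough to check by hand. The sketch below (Python, used only as a calculator) recomputes them from the three raw numbers in the report above; the variable names are mine. One side note: the 841 "failed" requests are all of type Length, which ab reports when a response body differs in size from the first one, so on a dynamic page they are usually not real errors.

```python
# Figures copied from the ab report above.
complete_requests = 1000
concurrency = 100
time_taken_s = 4.366902

requests_per_second = complete_requests / time_taken_s
# Mean time per request as seen by one of the 100 concurrent clients.
time_per_request_ms = concurrency * time_taken_s * 1000 / complete_requests
# Mean time per request across all concurrent clients.
time_per_request_all_ms = time_taken_s * 1000 / complete_requests

print(f"{requests_per_second:.2f} req/s")       # 229.00 req/s
print(f"{time_per_request_ms:.3f} ms")          # 436.690 ms
print(f"{time_per_request_all_ms:.3f} ms")      # 4.367 ms
```

The number to watch for a Facebook app is the tail of the percentile table, not the mean: a page that averages well but has a slow 99th percentile will still show errors to some users.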
Apache comes pre-configured with a lot of modules turned on that are frequently not used. Even if you are using them you might ask yourself, “do I really need that feature?” The Red Hat default Apache 2 package has all these modules in it that I turned off:
- authn_auth_digest_module (shared)
- authn_file_module (shared)
- authn_alias_module (shared)
- authn_anon_module (shared)
- authn_dbm_module (shared)
- authn_default_module (shared)
- authz_host_module (shared)
- authz_user_module (shared)
- authz_owner_module (shared)
- authz_groupfile_module (shared)
- authz_dbm_module (shared)
- authz_default_module (shared)
- ldap_module (shared)
- authnz_ldap_module (shared)
- include_module (shared)
- env_module (shared)
- ext_filter_module (shared)
- deflate_module (shared)
- headers_module (shared)
- setenvif_module (shared)
- mime_module (shared)
- dav_module (shared)
- status_module (shared)
- autoindex_module (shared)
- info_module (shared)
- dav_fs_module (shared)
- vhost_alias_module (shared)
- negotiation_module (shared)
- dir_module (shared)
- actions_module (shared)
- speling_module (shared)
- userdir_module (shared)
- proxy_module (shared)
- proxy_balancer_module (shared)
- proxy_ftp_module (shared)
- proxy_http_module (shared)
- proxy_connect_module (shared)
- suexec_module (shared)
- cgi_module (shared)
- perl_module (shared)
- proxy_ajp_module (shared)
- python_module (shared)
- ssl_module (shared)
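Turning off that many modules by hand is tedious, so here is a rough sketch of a helper that comments out the matching LoadModule lines in a config file's text. It is Python, and the `disable_modules` function, the sample config, and the module paths in it are assumptions for illustration; check where your distro actually keeps its LoadModule directives, and test the resulting config before restarting Apache.

```python
import re

# Matches an active "LoadModule <name> <path>" directive.
LOAD_MODULE = re.compile(r"^\s*LoadModule\s+(\S+)\s+\S+")


def disable_modules(config_text, modules_to_disable):
    """Comment out LoadModule lines whose module name is in modules_to_disable."""
    output = []
    for line in config_text.splitlines():
        match = LOAD_MODULE.match(line)
        if match and match.group(1) in modules_to_disable:
            line = "#" + line
        output.append(line)
    return "\n".join(output)


# Hypothetical excerpt of an httpd.conf.
sample = """LoadModule status_module modules/mod_status.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule autoindex_module modules/mod_autoindex.so"""

print(disable_modules(sample, {"status_module", "autoindex_module"}))
```

Lines that are already commented out are left alone, and anything not in the set (rewrite_module here) stays loaded. Disable a few modules at a time and re-run your benchmark, since some of the modules in the list above are ones your app may quietly depend on.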
Next you’ll want to make sure that extra features are turned off. Things like HostnameLookups carry a big performance hit and have been off by default since Apache 1.3. There are still more subtle settings to check, like symlink handling: if FollowSymLinks is not set, or SymLinksIfOwnerMatch is set, Apache makes extra system calls on each request to check for symlinks. For a full rundown see the Apache Performance Notes.
When you start to get a lot of simultaneous requests you’ll notice Apache starting to grab bigger and bigger chunks of resources. Many distros ship with the prefork MPM as the default. Switching to the worker MPM will save you a lot of resources. Unfortunately PHP is not a thread-safe environment, so you’ll need to switch PHP over to FastCGI. On Red Hat you can switch your Apache over to the worker MPM by editing /etc/sysconfig/httpd and adding this line:
# Set Apache to the Worker MPM on Red Hat
HTTPD=/usr/sbin/httpd.worker
Switching PHP over to the FastCGI environment isn’t hard either. You install the PHP CGI package (and probably a few dependencies, too) for your distro and change the PHP handler in Apache to:
Options ExecCGI Indexes
AddHandler fcgid-script .php
FCGIWrapper /usr/lib/cgi-bin/php5 .php
Now you’ve got a nice, snappy PHP server capable of churning out tons of concurrent requests to your Facebook app users. For most apps you’ll also be maintaining app-specific data. Maybe it’s their high score or their favorite vampire. You do not want to store private user information in your app, ever. If you want to hold user information for caching reasons, put it in memory – or at the very least in a flat file on a RAM drive. So you’ll likely be holding tables of user ids and some simple data associated with each user.
While your SQL fundamentals class told you to make your data as atomic as possible, sometimes this just isn’t the answer. Keep your table structures simple; joins can kill performance if you don’t keep an eye on them.
First things first, you’ll need to keep an eye on MySQL and make sure that you’re not running any “slow queries”. You can set a threshold for this and log every slow query to a file. You can even log queries that don’t use an index. Add these lines (while testing only; the extra disk access is a performance hit):
set-variable=long_query_time=1
log-slow-queries=/var/log/mysql/log-slow-queries.log
log-queries-not-using-indexes
Keep an eye on it. You’ll catch glaring queries right away. There are also subtle problems I’ve run into. If you mismatch the variable type you pass to a query (say, comparing an indexed string column against a number), MySQL can scan the entire table, skipping your index entirely! Beyond that you’ll want to make sure the heaviest queries are cached – or in the case of leaderboards, only update them when something changes! It might be as simple as storing the score for 10th place in your leaderboard. When someone cracks the leaderboard, rewrite the entire thing into cache. Use this push method as often as possible. With the cache-miss-then-fill method you can run into some nasty race conditions.
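Here is roughly what I mean by the push method for a leaderboard, as a minimal Python sketch. The `LeaderboardCache` class and its fields are made up for illustration, it keeps everything in one process's memory, and it assumes each user only appears once; a real app would persist scores in the database and push the rebuilt board into whatever cache it uses.

```python
class LeaderboardCache:
    """Keeps a top-N list in memory and rewrites it only when a score qualifies."""

    def __init__(self, size=10):
        self.size = size
        self.entries = []      # (score, user_id), highest score first
        self.rewrites = 0      # how many times the cached board was rebuilt

    def threshold(self):
        """Score of the last place on a full board, or None while there is room."""
        if len(self.entries) < self.size:
            return None
        return self.entries[-1][0]

    def submit(self, user_id, score):
        """Push a new score; return True if it changed the cached board."""
        cutoff = self.threshold()
        if cutoff is not None and score <= cutoff:
            return False       # cheap path: one comparison, no rewrite
        updated = self.entries + [(score, user_id)]
        updated.sort(key=lambda entry: entry[0], reverse=True)
        self.entries = updated[: self.size]
        self.rewrites += 1
        return True


# A board of 3 keeps the example short; the idea is the same for 10.
board = LeaderboardCache(size=3)
for user_id, score in [(1, 50), (2, 70), (3, 60), (4, 10), (5, 65)]:
    board.submit(user_id, score)

print(board.entries)    # [(70, 2), (65, 5), (60, 3)]
print(board.rewrites)   # 4
```

Readers always get the cached board, and the only work on a typical score submission is a single comparison against the last-place score. The board is rebuilt by the writer that changed it, so page renders never have to fill the cache themselves.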
Stay tuned and happy hacking!

