Tizag.com Webmaster Tutorials - A collection of webmaster tutorials from HTML to PHP.

Monday, November 14, 2011

Make PHP apps fast, faster, fastest

Part 3: Cache your data in memory with the Memcache daemon


The previous two articles in this series present techniques to hasten your PHP applications. Part 1 presents XCache, a PHP extension that caches PHP opcodes in memory. XCache avoids the costly and (strictly) unnecessary expense of recompiling the same PHP code to deliver a page. XCache is free and open source software, takes little time to install and provides a substantial return. Part 2 presents XDebug, a PHP extension to profile your PHP code. XDebug is something akin to a software x-ray: It peers into your application, exposes its inner workings, and reveals how your code spends its cycles. Armed with XDebug metrics -- and not before -- you can optimize your code to tweak algorithms, reduce bottlenecks, and relieve excessive memory use.

Note: XCache is intended for production use. XDebug is best used during development because its accounting overhead is wasteful on a live machine.

This time, let's explore a third and especially effective performance enhancer. The Memcache daemon, called memcached, is a high-performance distributed object cache. Installed between your application and your data store, memcached persists your objects in RAM.

The Memcache PHP extension provides a simple application program interface (API) to access the cache. To use the cache, call the API to determine whether an object was previously cached. If so, simply retrieve the object and continue processing. Otherwise, go to the database, fetch the requisite data, map it to an object, and add it to the cache. From there, memcached minimizes or eliminates database queries for information you've processed before.

If XCache and XDebug are turbochargers, memcached is a jet engine. Prepare to light the afterburner.

The need for more speed

Typically, a PHP application's most time-consuming task is data retrieval. Indeed, the time to fetch information from a repository -- be it a file or database server -- far exceeds the lapse required to compile and perhaps even execute a PHP program. The time required to connect to a database server is one delay, waiting for the query to finish adds additional hesitation, and the transfer of results incurs even more latency. Further, if your code uses objects, there's a tax-incurred mapping flat rows to objects.

MySQL can speed the query phase using its query cache. You can also replicate a database (one master, many copies) to share the burden of query processing among many CPUs. However, the contents of the MySQL query cache become obsolete as soon as an underlying table changes. Moreover, the query cache is hit only if a query is identical to a previous query. Replication has limitations, too. A database write cannot be distributed, for example.

Ultimately, while the query cache and database replication are purposeful and have a place in an overall workload management strategy (the query cache consumes some memory, but is otherwise cheap; replication minimizes the risk of catastrophic downtime), connection and transmission time remain.

The Memcache PHP extension caches objects in RAM. Each cache hit replaces a roundtrip to a database server, making your application run faster. You'll likely find that memcached also (indirectly) improves the performance of your database server; because memcached acts as a surrogate persistent store, fewer requests reach your database server, allowing it to be more responsive to the queries that do arrive.

You can run memcached on one or more servers, and the contents of the cache are duplicated among all nodes. If servers fail, the client API software reroutes your cache reads and writes to a working alternate.

Unlike XCache, you must modify your code to integrate memcached. However, if you've carefully isolated your database access code within a handful of object methods, your modifications will likely be slight and centralized.

Written by Danga Interactive, the Memcache daemon is free and open source software licensed per the liberal terms of the Berkeley Software Distribution (BSD) License. The daemon should readily build on UNIX® and Linux® systems, and can also build on Mac OS X and Microsoft® Windows®. Many Linux distributions offer a memcached package; check your package repository. If you use Mac OS X or Windows and prefer the convenience of a pre-built binary, you can find such software on the Web by way of a simple Google search.

(Re)building PHP

Let's build, install, and deploy memcached on Debian Linux. To accelerate the process and allow you to test memcached separately from your existing Web server infrastructure, use the XAMPP Apache distribution as a basis for the build. XAMPP is easy to install and contains Apache V2, MySQL, PHP V4 and V5, Perl, many libraries, and many Web applications, such as phpMyAdmin. Starting with XAMPP is ideal if you've never built a Linux, Apache, MySQL, and PHP (LAMP) stack from scratch or if you want to avoid the hassles associated with such an endeavor.

Note: If you've built PHP from source before and retained your files, simply add the --enables-memcache option to your list of configure switches and skip the steps shown prior to building memcached and the PHP Memcache extension.

To build and deploy memcached, you need the XAMPP distribution, including the XAMPP development files, the source code for the PHP version included with XAMPP, and the source code to memcached and the PHP extension for Memcache. You can download the XAMPP binaries and the XAMPP development files (required for building additional components) from XAMPP. You can also use wget to grab the software quickly:

$ wget 'http://www.apachefriends.org/download.php?xampp-linux-1.6.tar.gz'       $ wget 'http://www.apachefriends.org/download.php?xampp-linux-devel-1.6.tar.gz' 

The former tarball (a tarball is a compressed .tar file and typically ends with the suffix .tar.gz) contains the binaries; the latter tarball contains header files needed to build code against the XAMPP system.

Although you can anchor XAMPP anywhere in your file system, install the package into /opt. Also install the development files into /opt. Using /opt makes the rest of the build process much easier. Use the -C option to tar to extract the files directly into /opt, as follows:


Listing 1. Extract files directly into /opt
                 $ sudo mkdir /opt $ tar xzf xampp-linux-1.6.tar.gz -C /opt $ tar xzf  xampp-linux-devel-1.6.tar.gz -C /opt $ ls -CF /opt/lampp RELEASENOTES error/  info/  logs/  phpsqliteadmin/ backup/  etc/  lampp*  man/  sbin/ bin/  htdocs/  lib/  manual/  share/ build/  icons/  libexec/ modules/ tmp/ cgi-bin/ include/ licenses/ phpmyadmin/ var/ 

Next, download and extract the source for the version of PHP included with XAMPP (XAMPP V1.6 bundled with PHP V4.4.6); download the code for PHP V4.4.6 from PHP.net. Again, wget makes the task a breeze:

$ wget http://us2.php.net/get/php-4.4.6.tar.bz2/from/www.php.net/mirror $ tar xjf php-4.4.6.tar.bz2 $ cd php-4.4.6 

Next, modify XAMPP's PHP build script to rebuild PHP to enable Memcache. You can find the original build script (and other build scripts) in /opt/lampp/share/lampp/configures.tar.gz. Extract the PHP V4 build script using:

$ tar xzfv /opt/lampp/share/lampp/configures.tar.gz \   php/configure-php4-oswald 

Open configure-php4-oswald and add --enable-memcache. (You may also find it necessary to remove options, such as those specific to Oracle and PostgreSQL, if your system doesn't have those databases.) Listing 2 shows the modified script used on the test system to rebuild PHP. (The PHP build process depends on many utilities and development libraries, such as Flex, Bison, libxml, and PCRE. You may need to install additional packages to prepare for these builds, depending on the contents of your Linux distribution and your PHP configuration.)


Listing 2. The modified XAMPP script to rebuild PHP
                 (     cd /opt/lampp/bin     rm phpize phpextdist php-config php     rm -rf /opt/lampp/include/php )  make distclean  export PATH="/opt/lampp/bin:$PATH"  export CFLAGS="-O6 -I/opt/lampp/include/libpng \     -I/opt/lampp/include/ncurses \     -I/opt/lampp/include -L/opt/lampp/lib"   ./configure \     --prefix=/opt/lampp \     --with-apxs2=/opt/lampp/bin/apxs \     --with-config-file-path=/opt/lampp/etc \     --with-mysql=/opt/lampp \     --enable-inline-optimation \     --disable-debug \     --enable-memcache \     --enable-bcmath \     --enable-calendar \     --enable-ctype \     --enable-dbase \     --enable-discard-path \     --enable-exif \     --enable-filepro \     --enable-force-cgi-redirect \     --enable-ftp \     --enable-gd-imgstrttf \     --enable-gd-native-ttf \     --with-ttf \     --enable-magic-quotes \     --enable-memory-limit \     --enable-shmop \     --enable-sigchild \     --enable-sysvsem \     --enable-sysvshm \     --enable-track-vars \     --enable-trans-sid \     --enable-wddx \     --enable-yp \     --with-ftp \     --with-gdbm=/opt/lampp \     --with-jpeg-dir=/opt/lampp \     --with-png-dir=/opt/lampp \     --with-tiff-dir=/opt/lampp \     --with-freetype-dir=/opt/lampp \     --without-xpm \     --with-zlib=yes \     --with-zlib-dir=/opt/lampp \     --with-openssl=/opt/lampp \     --with-expat-dir=/opt/lampp \     --enable-xslt \     --with-xslt-sablot=/opt/lampp \     --with-dom=/opt/lampp \     --with-ldap=/opt/lampp \     --with-ncurses=/opt/lampp \     --with-gd \     --with-imap-dir=/opt/lampp \     --with-imap-ssl \     --with-imap=/opt/lampp \     --with-gettext=/opt/lampp \     --with-mssql=/opt/lampp \     --with-mysql-sock=/opt/lampp/var/mysql/mysql.sock \     --with-mcrypt=/opt/lampp \     --with-mhash=/opt/lampp \     --enable-sockets \     --enable-mbstring=all \     --with-curl=/opt/lampp \     --enable-mbregex \     --enable-zend-multibyte \     --enable-exif \     --enable-pcntl \     --with-mime-magic \     --with-iconv   make   sudo make install   exit 1 

By the end of the script, your installation of XAMPP has a new, independent working copy of a memcache-capable PHP V4. If you'd like to test your build, stop all running copies of Apache and MySQL, including those outside of XAMPP, and run the command:

$ sudo /opt/lampp/lampp start 

This launches the XAMPP versions of Apache and MySQL, including your new PHP V4 module. (If you want to keep your production Apache server running undisturbed, you can edit the file /opt/lampp/etc/httpd.conf and change the Listen port parameter to 8080 (or another free port). You can then launch XAMPP's Apache server separately with the command:

sudo /opt/lampp/bin/apachectl start 

Point your browser to http://localhost, and you should see something akin to Figure 1.


Figure 1. XAMPP for Linux splash page: Proof that your installation of XAMPP and PHP was successful
XAMPP for Linux splash page: Proof that your installation of XAMPP and PHP was successful

To verify that your build of PHP is installed, click the phpinfo() link at left. You should see statistics similar to those shown in Figure 2. The version of PHP should be 4.4.6, the build options should include --enable-memcache, and the working version of php.ini should reside in /opt/lampp/lib.


Figure 2. Proof that your custom build of PHP replaced XAMPP's own
Proof that your custom build of PHP replaced XAMPP's own

Building the PHP Memcache extension

With a new PHP in place, the next step is to build and install the PHP Memcache extension. The extension provides the API required to access the Memcache daemon's services.

Begin by retrieving the source code of the Memcache extension. You can again use wget to get the source with the command:

$ wget http://pecl.php.net/get/memcache-2.1.0.tgz 

The process to build the Memcache extension is identical to that of other PHP extensions:

  1. Change to the directory of the source
  2. Run phpize, followed by ./configure, make, and make install.
  3. Be sure to use your new versions of phpize.
  4. Place /opt/lampp/bin in your shell's PATH before any directory containing another version of phpize:
    $ cd memcache-2.1.0 $ export PATH=/opt/lampp/bin:$PATH $ phpize $ ./configure $ make ... $ sudo make install Installing shared extensions:     /opt/lampp/lib/php/extensions/no-debug-non-zts-20020429/ 

The build places a new file named memcache.so in the extensions directory. To load and apply the extension, you must edit php.ini. Open the XAMPP PHP configuration file, /opt/lampp/etc/php.ini and add the lines from Listing 3.


Listing 3. Edit php.ini
                 extension=memcache.so memcache.allow_failover = 1  memcache.max_failover_attempts=20  memcache.chunk_size =8192  memcache.default_port = 11211  

Line 1 loads the Memcache extension. The other four lines are parameters to control the extension. In order, from top to bottom:

memcache.allow_failover
A Boolean to control whether the Memcache extension fails over to other servers if a connection error occurs. The default value is 1 (true).
memcache.max_failover_attempts
An integer to limit how many servers to connect to persist or retrieve data. If memcache.allow_failover is false, this parameter is ignored. The default is 20.
memcache.chunk_size
An integer that controls how large data transfers are. The default is 8192 bytes (8 KB), but you may get better performance with a setting of 32768 (32 KB).
memcache.default_port
Another integer for the TCP port to use to connect to Memcache. Unless you modify it, the default is the high, unprivileged pot 11211.

To determine whether your build is now fully functional, restart the XAMPP Apache Web server using this command:

$ sudo /opt/lampp/bin/apachectl restart 

If you revisit the XAMPP phpinfo() page, you should see a section for Memcache that resembles Figure 3.


Figure 3. View your Memcache settings through phpinfo()
View your Memcache settings through phpinfo()

Building the Memcache daemon

There's one more step to this (seemingly lengthy) process: Build and deploy the Memcache daemon, which manages the RAM cache for your data. The daemon depends on libevent, so you must build and deploy that library before compiling memcached:

$ wget http://www.monkey.org/~provos/libevent-1.3b.tar.gz $ wget http://www.danga.com/memcached/dist/memcached-1.2.1.tar.gz 

Next, extract the tarballs to yield a directory for each bundle:

$ tar xzf memcached-1.2.1.tar.gz  $ tar xzf libevent-1.3b.tar.gz 

To continue, build each package in turn, beginning with the library. To keep all the files contained in /opt, use the --prefixoption when you run configure. The instructions below build and install libevent.


Listing 4. Edit php.ini
                 $ cd ../libevent-1.3b $ ./configure --prefix=/opt/lampp ... $ make $ sudo make install ... /usr/bin/install -c .libs/libevent.lai /opt/lampp/lib/libevent.la /usr/bin/install -c .libs/libevent.a /opt/lampp/lib/libevent.a chmod 644 /opt/lampp/lib/libevent.a ranlib /opt/lampp/lib/libevent.a PATH="$PATH:/sbin" ldconfig -n /opt/lampp/lib ---------------------------------------------------------------------- Libraries have been installed in:    /opt/lampp/lib 

And the next commands build and install the memcached binary.


Listing 5. Edit php.ini
                 $ cd ../memcached-1.2.1 $ ./configure --prefix=/opt/lampp ... $ make $ sudo make install ... /usr/bin/install -c memcached /opt/lampp/bin/memcached /usr/bin/install -c memcached-debug /opt/lampp/bin/memcached-debug 

Launching memcached is simple:

./memcached -d -m 2048 -l ip-address -p 11211 

The -d option runs memcached as a daemon, rather than in the foreground. -m number allocates number megabytes to this process instance. (On some systems, you may be required to run multiple memcached instances to access all the memory available for caching. See the Memcache documentation for more information). -l ip-address -p 11211 makes the daemon listen to IP address ip-address and port 11211, respectively. Substitute your IP address. If you choose a different port for memcached, make sure php.ini reflects that port.

Caching your data

With installation behind you, it's time to put Memcache to work.

Memcache is generally used to store objects, but it can store any PHP variable that can be serialized, such as a string. Memcache has a procedural and an object-oriented API. No matter which variant you use, you must provide four arguments to store a variable in the cache:

A unique key
The key is used to retrieve the associated data from the cache. If each of your records has a unique ID, that will likely suffice as a cache key, but you can concoct other schemes to suit your needs.
The variable to cache
The variable can be any type so long as it can be serialized to be persisted and de-serialized to be retrieved.
A Boolean to enable on-the-fly compression via zlib
Compression saves memory in the cache -- albeit at the expense of processing the data on save and restore.
An expire time specified in seconds
When a cached datum expires, it is removed automatically. If you set this value to 0, the item is never expired from the cache. Use the Memcache API delete() function to remove such a perennial object.

The code below uses the object-oriented Memcache extension API to create, store, and retrieve a simple object.


Listing 6. Creating, caching, and retrieving a PHP object
                 connect(  '198.168.0.1'  );     }             return $instance;           }            function close( ) {         if ( isset( $instance ) ) {             $instance->close();             $instance = null;          }     } }  class myDatum {     var $values = array(         'description' => null,         'price'       => null,         'weight'      => null     );    function myDatum( $settings ) {      foreach ( $settings as $key => $value ) {             if ( ! array_key_exists( $key, $this->values ) ) {                 die( "Unknown attribute: $key" );             }                          $this->values{ $key } = $value;         }  }    function set( $key=null, $value=null ) {      if ( is_null( $key ) ) {             die( "Key cannot be null" );         }                  if ( ! array_key_exists( $key, $this->values ) ) {             die( "Unknown attribute: $key" );         }                  if ( ! is_null( $value ) ) {             $this->values{ $key } = $value;         }                  return( $this->values{ $key } );  }    function get( $key=null ) {      return( $this->set( $key, null ) );  } }   $datum = new myDatum(      array(         'description' => 'ball',         'price'       => 1.50,         'weight'      => 10     )  );  print $datum->get( 'description' ) . "\n";  $cache = myCache::getInstance( );  if ( $cache->set( $datum->get( 'description' ), $datum, false, 600 ) ) {     print( 'Object added to cache' . "\n");  }  $cache->close();   $new_cache = myCache::getInstance( );  $new_datum = $new_cache->get( $datum->get( 'description' ) );  if ( $new_datum !== false ) {     print( 'Object retrieved from cache' . "\n");      print $new_datum->get( 'description' ) . "\n"; }  print_r( $new_cache->getExtendedStats() );  $new_cache->close(); ?> 

The myCache class is a singleton and provides a solitary, open connection to the cache. The myDatum class is representative of all objects: It has a series of attributes (implemented as a hash), as well as one getter and one setter method. To create amyDatum object, pass a hash of values to its constructor. To set an attribute, call set() with an attribute name as a string and a value. To get an attribute, call get() with the attribute name.

The last several lines in the code above create an object, store it in the cache, and retrieve it. The penultimate line, part of the Memcache API, shows the statistics of the cache. If you run Listing 6 using your new PHP command-line interpreter, you should see something like Listing 7.


Listing 7. Creating, caching, and retrieving a PHP object
                 ball Object added to cache Object retrieved from cache ball Array (     [72.51.41.164:11211] => Array         (             [pid] => 865             [uptime] => 3812845             [time] => 1173817644             [version] => 1.1.12             [rusage_user] => 0.043993             [rusage_system] => 0.038994             [curr_items] => 1             [total_items] => 5             [bytes] => 145             [curr_connections] => 1             [total_connections] => 8             [connection_structures] => 3             [cmd_get] => 5             [cmd_set] => 5             [get_hits] => 5             [get_misses] => 0             [bytes_read] => 683             [bytes_written] => 1098             [limit_maxbytes] => 67108864         ) ) 

Simple, no? If the constructor for myDatum were typical, it would likely be given an ID and would query a database to yield a specific row (for example, find the student with social security number 123-45-6789). You can extend the constructor to query the cache for the ID first. If found, simply return that object. Otherwise, construct the object, cache it, and return it.

If you have a group of Debian Linux systems, you can copy or export (through NFS) /opt/lampp and run memcached on multiple systems. Running memcached concurrently on two or more machines removes a single point of failure and expands the capacity of the cache. Use the addServer() API function to build a list of available memcached servers.

No comments: