Part 2: Profile your PHP application to find, diagnose, and hasten plodding code
Part 1 of this "Make PHP apps fast, faster, fastest" series demonstrates how you can accelerate your entire site using XCache, a PHP opcode cache. XCache (just one of several caching packages you can evaluate) retains the output of the compilation process, eliminating otherwise redundant work. As long as a page doesn't change, the cached version of the page suffices as proxy. When a page is revised, the cached version is invalidated and replaced.
An opcode cache -- and an opcode optimizer, which is often provided in the same package -- is a dirt-cheap technique to hasten the responsiveness of your site. Many cache packages are free and open source, and you need not change any code to realize the benefits.
Of course, in some applications, the time required to translate a file of PHP source code to its corresponding opcodes is insignificant compared to actual execution time. Connecting to a remote database server, querying a store with inefficient SQL queries, and oodles of busy work parsing and manipulating data can be far more lengthy and, hence, expensive or even wasteful. Good network design and smart database construction can alleviate lag time and slow queries, and you can turn to your friendly neighborhood specialist for assistance, if necessary. If your code is plodding, however, you may want to take the matter into your own hands.
But where to start? As is widely recommended, tweaking code before it's complete is foolhardy -- even a first-cut implementation may be suitably fast. When your code is functional and accurate -- and seemingly or actually slow -- your first step is to quantify, or benchmark, its performance. Attempts to optimize code without such diagnostics are a veritable stab in the dark.
A simple performance metric is wall clock time, or measuring the real-world delay between a request for a page and a finished rendering. For some circumstances -- say, running the Web server, database, and browser locally on your own workstation -- wall clock time can be informative. However, wall clock time is merely anecdotal for most other permutations, such as those that include network latency; a trafficked, live Web server; or an active database.
A far more accurate measure -- one that can even gauge the time taken to run a single-source code statement -- can be taken using a code profiler. A profiler, typically implemented as an extension to the PHP runtime engine, records the delta between the start of a statement and its end, notes the delta between the start and end of a procedure, and captures the total time required to craft a response to an incoming request. With such plumbs, you can target the statement, loop, function, class, or library that is most sluggish. If time isn't the issue, but rather memory usage, a good profiler can also reveal component footprints.
One popular profiler for PHP is Xdebug, which also provides server hooks to interactively debug PHP applications. (See "A better way to debug" for more information. Another part of this series will look at interactive and advanced debugging.) Xdebug is easy to build from source and installs as a Zend extension. (Binaries are available for some platforms.) When in place, each request for a PHP-based page generates a data set viewable in KCacheGrind.
Building and installing Xdebug
Assuming that you have access to the PHP utilities phpize and php-config, and assuming that you have access to your system's php.ini configuration file, you should need only a few minutes to install and set up Xdebug. The instructions presented below are for Linux®, although the steps required for Mac OS X are virtually identical. (You can find precompiled versions of Xdebug for Microsoft® Windows® on the Xdebug Web site.)
The latest version of Xdebug is V2.0.0RC3 (although the final version -- V2.0.0 -- may be available by the time you read this). Download and unpack the tarball, and change to the subdirectory of the source code. Ensure that phpize and php-config are in your shell's PATH, and prepare to build with phpize.
Listing 1. Set up Xdebug
$ wget http://www.xdebug.org/files/xdebug-2.0.0RC3.tgz $ tar xzf xdebug-2.0.0RC3.tgz $ cd xdebug-2.0.0RC3/xdebug-2.0.0RC3 $ phpize Configuring for: PHP Api Version: 20020918 Zend Module Api No: 20020429 Zend Extension Api No: 20050606 |
The product of phpize is a script -- aptly named configure -- that configures the rest of the build process. To build Xdebug, simply type ./configure followed immediately by make.
Listing 2. Build Xdebug
$ ./configure checking build system type... i686-apple-darwin8.8.1 checking host system type... i686-apple-darwin8.8.1 checking for egrep... grep -E ... $ make ... Build complete. (It is safe to ignore warnings about tempnam and tmpnam). |
The make command yields the Xdebug extension, xdebug.so. All that's left to do is install it with sudo make install.
$ sudo make install Installing shared extensions: /usr/lib/php/extensions/no-debug-non-zts-20020429/ |
Note: If you ran the last command in a terminal window, select and copy the directory emitted in the last step. You'll need it in the next step.
Finally, to visualize profile data, you must have KCacheGrind and GraphViz. Linux distributions that include the K Desktop Environment (KDE) are likely to have KCacheGrind and GraphViz already. Otherwise, it should be easy to find versions suitable for your flavor of Linux. Debian users can quickly install KCacheGrind and GraphViz and all the packages' dependencies using the Advanced Packaging Tool (APT).
Listing 3. Install KCacheGrind
$ apt-cache search kcachegrind valgrind-callgrind - call-graph skin for valgrind kcachegrind - visualisation tool for valgrind profiling output kcachegrind-converters - format converters for KCachegrind profiling visualisation tool $ apt-cache search graphviz graphviz - rich set of graph drawing tools graphviz-dev - graphviz Libs and Headers against which to build applications graphviz-doc - additional documentation for graphviz libdeps-renderer-dot-perl - DEPS renderer plugin using GraphViz/dot ... $ sudo apt-get install kcachegrind graphviz ... |
If KDE is not installed on your system, KCacheGrind, GraphViz, and all prerequisites require approximately 90 MB of disk space.
After the Xdebug extension is installed, you're ready to enable and configure the extension. Open php.ini in a text editor, and add the following lines.
Listing 4. Enable and configure the extension
zend_extension = /usr/lib/php/extensions/no-debug-non-zts-20020429/xdebug.so xdebug.profiler_output_dir = "/tmp/xdebug/" xdebug.profiler_enable = Off xdebug.profiler_enable_trigger = 1 |
The first line, zend_extension, loads the Xdebug extension. The second line names the directory to place profiler output. If necessary, create the named directory and change its mode to grant the Web server user write access.
The third line disables the profiler. However, the fourth line enables the profiler whenever the HTTP GET or POST parameterXDEBUG_PROFILE is set. (If you want the profiler to be on continuously, change Off in the third line to On.)
After adding the lines and verifying that your output directory is writable, restart your Web server. As with other PHP extensions, to verify that Xdebug is installed and available, create a bare-bones PHP program to call phpinfo() and scan the results. You should see something resembling Figure 1. (Portions of the entire output have been elided for brevity.)
Figure 1. Phpinfo indicates whether Xdebug is installed
You can also scroll down to the Zend logo. Xdebug should appear next to the logo if Xdebug has been properly loaded and configured.
To profile your code, simply point your browser to your PHP application. If you set the profiler to work on a case-by-case basis per trigger, append your URL with XDEBUG_PROFILE=1 or, as show below, embed the parameter within a form.
As an example, let's profile the simple ACME Fibonacci Maker, fibonacci.php, shown in Listing 5. For convenience, theXDEBUG_PROFILE parameter is set within the form in a hidden variable. (When the code moves to production, Xdebug is likely to be disabled, rendering this variable harmless.)
Listing 5. Fibonacci.php
"st", 2 => "nd", 3 => "rd" ); if ( $_REQUEST['n'] < 4 || $_REQUEST['n'] > 20 ) { $suffix = $suffix[$n]; } else { $suffix = 'th'; } echo ' |
Point your browser to http://localhost/fibonacci.php (or the appropriate URL) and type a number -- say, 16. The result -- the 16th element in the Fibonacci series -- is shown in Figure 2.
Figure 2. Sample Fibonacci application
If you list the contents of the profiler output directory (named in php.ini), you should see a file named something like cachegrind.out.951917687. The prefix cachegrind.out. is fixed. By default, the numerical suffix is a CRC32 hash of the directory path to the fibonacci.php file. Hence, if each of your applications lives in its own directory, output from each application is segregated by file name. (If you prefer to associate output with time, add this line:
xdebug.profiler_output_name = timestamp |
to php.ini.)
From your terminal window, launch KCacheGrind and open cachegrind.out.951917687. A new window resembling Figure 3 should open momentarily.
Figure 3. KCacheGrind application
Click the Callees tab, double-click the highlighted line of source code, and choose Source File from the Grouping list. Your view should change to resemble Figure 4.
Figure 4. Peering at the results
As you might expect, virtually all the processing time (99.87 percent of the 70,989 milliseconds) was spent executing 3,193 invocations of the function fib(). To hasten this application, which becomes proportionally slower as you proceed further into the Fibonacci sequence, the costly rework of recalculating Fibonacci numbers would have to be avoided. Indeed, the ACME Fibonacci Maker is an excellent place to apply computational reuse.
A smarter version of the fib() function is shown below. The new version trades memory for time because it retains intermediate computations for later use. Figure 5 shows the analysis: Instead of 3,192 function calls, only 30 were required (and only half of those computed any result), and time was reduced to 20 milliseconds.
Listing 6. Updated fib() function
function fib($nth = 1) { static $fibs = array(); if ( ! empty ($fibs[$nth] ) ) { return( $fibs[$nth] ); } if ( $nth < 2 ) { $fibs[$nth] = $nth; } else { $fibs[$nth - 1] = fib( $nth - 1 ); $fibs[$nth - 2] = fib( $nth - 2 ); $fibs[$nth] = $fibs[$nth - 1] + $fibs[$nth -2]; } return( $fibs[$nth] ); } ?> |
Figure 5. Faster Fibonacci function
While a single run of an application can point to valid problems -- just try the 50th element of the Fibonacci sequence in the original application above -- it's typical to collect statistics over several invocations and look at patterns.
If you retain the default "crc32" naming scheme, each time you run fibonacci.php, the data file is overwritten. However, you can alter that behavior and append subsequent runs to the same file by setting xdebug.profiler_append = 1 in php.ini. Restart the Web server after making the change.
An example of aggregate data after three runs of the Fibonacci Maker is shown in Figure 6. Total time was a little more than 2 seconds; 99.97 percent of the time was spent in fib(). Figure 6 shows the Call Graph tab, which is generated by the dot utility of GraphViz. Detailed use of KCacheGrind is beyond the scope of this article, but thorough documentation is available online.KCacheGrind can slice and dice data in many ways, and the proper view depends on which problem you want to solve.
Figure 6. Aggregate profiling data
While it's difficult to demonstrate meaningful profiling without a substantial body of code, the next example does show the amount and kind of information you can yield from more exemplary code. Listing 7 shows a (contrived) application to assemble toy rockets. The rocket is made of several parts, and each part requires a certain time to produce. In PHP, a class represents each part, and an instance method represents the construction time for each part. You can think of the toy as an application and each part as a feature.
Listing 7. A set of PHP classes to emulate construction of a toy
build( MINUTE ); } function build( $delay = 0 ) { if ( $delay <= 0 ) return; while ( $delay-- > 0 ) { } } } class Capsule extends Part { function Capsule() { parent::Part(); $this->build( CAPSULE * MINUTE ); } } class Booster extends Part { function Booster() { parent::Part(); $this->build( BOOSTER * MINUTE ); } } class Stage extends Part { function Stage() { parent::Part(); $this->build( STAGE * MINUTE ); } } class SpaceShip { var $booster; var $capsule; var $stages; function SpaceShip( $numberStages = 3 ) { $this->booster = new Booster(); $this->capsule = new Capsule(); $this->stages = array(); while ( $numberStages-- >= 0 ) { $stages[$numberStages] = new Stage(); } } } $toys = array(); $count = PRODUCTION; while ( $count-- >= 0 ) { $toys[] = new SpaceShip( 2 ); } ?> |
Running the code produces a new data file. Again, load that data into KCacheGrind. If you switch to the Source and Call Graphtabs, your view should resemble Figure 7.
Figure 7. A profile of the spaceship application
The Flat Profile pane (left) shows all the functions (methods) called during the application. The left-most column shows a quasi-cumulative total, the second column shows individual discrete measurements for each method, and the third column lists the number of times the method was called. Conveniently, the colored squares are reflected in the call graph, making it easy to correlate timings with the sequence of events.
Obviously, the time to construct the stages is the most expensive. The overhead required to build each part (represented byPart's constructor) is the next most costly operation. It's also interesting to see the very slight expense for PHP's own define()function.
Finally, you can also peek into how memory was used. Choose Memory and Class from the drop-down menus near the top, and switch to the Types and Caller Map tabs at top and bottom. Your screen should look something like Figure 8.
Figure 8. Memory usage within the spaceship application
No comments:
Post a Comment