Squiz Labs Blog - The latest news from the R&D division of Squiz®


PHP_CodeSniffer general memory improvements

PHP_CodeSniffer version 1.3.6 introduced memory improvements when using the summary report. You can take a look at the improvements that were reported at the time when running over the Symfony2 codebase.

Obviously, the summary report is only useful when running PHP_CodeSniffer from the command line and viewing the output. It won't be used in continuous integration systems, or in text editors, or by developers actually fixing errors.

But these improvements were just the start of a bigger change to try and restructure the PHP_CodeSniffer reporting system to stop it relying on so much memory during a run over a large number of files with a large number of errors. These changes are now in development and are available in a new git branch for testing right now.

To test these new changes, I again ran the PEAR coding standard over a Symfony2 checkout. Symfony2 doesn't use the PEAR coding standard, so this is a good way to generate a lot of errors and warnings over a large code base.

The PEAR standard generates 29,572 errors and 11,752 warnings in 2049 different files. Here is a summary of the time and memory usage, as reported by PHP_Timer:

Type of run                                    Time (seconds)   Memory (MB)
Old code with warnings (--report=full)         54               189.25
Old code without warnings (--report=full -n)   52               140.75
New code with warnings (--report=full)         52               75.00
New code without warnings (--report=full -n)   51               75.00

Clearly, the memory improvements are significant: memory usage drops by roughly 60% with warnings shown and by roughly 47% without. There is probably a bit more that can be squeezed out depending on the type of report being shown, but that is pretty close to what can be expected from reports like CSV, XML and Checkstyle, which are typically used by automated tools and continuous integration systems.

Smaller code bases and/or fewer errors and warnings will further reduce the memory usage, and you may actually see larger improvements on your own projects.

To summarise the changes: instead of keeping errors and warnings in memory so reports can be generated at the end of a run, PHP_CodeSniffer now writes a partial report to a file after each file is processed. At the end, the reporting system is given this cached data and is able to add headers and footers before the report is finally output to the screen. If the report is going to be written to a file instead, that file will be used throughout the run for the partial reports. Otherwise, a temp file is written to the current directory and removed once the run has been completed.
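The idea can be sketched in a few lines. This is a minimal Python illustration of the general technique only (write each file's partial report to a cache file as soon as that file has been checked, then wrap the cached data in a header and footer at the end). It is not PHP_CodeSniffer's actual code: the function name `run_with_cached_report`, the `check_file` callback and the report layout are all hypothetical.

```python
import io
import os
import shutil
import tempfile


def run_with_cached_report(paths, check_file, out, cache_dir="."):
    """Check each path, caching partial reports on disk instead of in memory.

    `check_file` is a hypothetical callback returning a list of
    (line, message) tuples for one file. Returns the violation count.
    """
    # Temp file in cache_dir (the current directory by default).
    fd, cache_path = tempfile.mkstemp(suffix=".report", dir=cache_dir)
    total = 0
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as cache:
            for path in paths:
                violations = check_file(path)
                total += len(violations)
                # Write this file's partial report straight away, so its
                # violations do not need to be kept until the end of the run.
                for line, message in violations:
                    cache.write(f"{path}:{line}: {message}\n")

        # End of run: add a header, stream the cached body, add a footer.
        out.write("REPORT\n")
        with open(cache_path, encoding="utf-8") as cache:
            shutil.copyfileobj(cache, out)
        out.write(f"TOTAL: {total}\n")
    finally:
        # The cache file is removed once the run has completed.
        os.remove(cache_path)
    return total


if __name__ == "__main__":
    # Made-up results standing in for a real coding standard check.
    fake_results = {"a.php": [(3, "Missing doc comment")], "b.php": []}
    buffer = io.StringIO()
    run_with_cached_report(["a.php", "b.php"], fake_results.get, buffer)
    print(buffer.getvalue(), end="")
```

The point of the design is that memory use depends on the largest single file's violations rather than on the total for the whole run.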

If you would like to try out these changes, you can grab the code from a new branch on the git repository. The easiest way to do this is to run the following commands:

git clone git://github.com/squizlabs/PHP_CodeSniffer.git
cd PHP_CodeSniffer
git checkout report-memory-improvements
php scripts/phpcs /path/to/code ...

To achieve these improvements, it has been necessary to remove a couple of rarely used features. So far, two things have gone: support for multi-file sniffs (of which there are none in the core distribution), and the shortcut for printing both the summary and source reports at the same time. Both reports can still be printed together using the standard reporting arguments.

Other than that, PHP_CodeSniffer should still work exactly the same. If not, please get in touch and let me know. The best way is to contact me on Twitter, or submit bug reports and pull requests.


PHP_CodeSniffer 1.4.1 released

PHP_CodeSniffer version 1.4.1 has just been uploaded to PEAR and is now available to install. This release includes a few important changes for developers who maintain their own standards and sniffs.

Ignore Patterns

In version 1.3.6, ignore patterns were changed so that they are checked against the relative path of a file instead of the absolute path. This change was made to allow standards to define ignore patterns that didn't have to assume where the code was installed. But there were only very specific cases where using relative paths was better than absolute paths, and all existing ignore patterns needed to be checked to ensure they still worked. Some didn't, and I felt that was important enough to revert the change.

So from version 1.4.1, ignore patterns are now checked against the absolute path of a file again. If you need the ability to check an ignore pattern against the relative path of a file, you can specify this in the ruleset.xml file:

<!--
    Patterns can be specified as relative if you would
    like the relative path of the file checked instead of the
    full path. This can sometimes help with portability.

    The relative path is determined based on the paths you
    pass into PHP_CodeSniffer on the command line.
-->
<exclude-pattern type="relative">^/tests/*</exclude-pattern>
<exclude-pattern type="relative">^/data/*</exclude-pattern>
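To see why the relative type can help with portability, here is a small Python sketch of the two matching modes. It makes two assumptions: that `*` in a pattern is expanded to `.*` and the result is treated as a regular expression, and that the relative path is the file path with the command-line path stripped from the front. The helper `is_ignored` and the example paths are hypothetical and only illustrate the idea.

```python
import re


def is_ignored(pattern, file_path, base_path, relative=False):
    """Return True if file_path matches the ignore pattern.

    Assumption: "*" is shorthand for ".*" and the rest of the pattern
    is an ordinary regular expression.
    """
    regex = pattern.replace("*", ".*")
    subject = file_path
    base = base_path.rstrip("/")
    if relative and file_path.startswith(base):
        # Check only the part of the path below the directory being checked.
        subject = file_path[len(base):]
    return re.search(regex, subject) is not None


if __name__ == "__main__":
    path = "/var/www/project/tests/FooTest.php"
    base = "/var/www/project"
    print(is_ignored("^/tests/*", path, base, relative=True))
    print(is_ignored("^/tests/*", path, base))
```

Under these assumptions, an anchored pattern such as `^/tests/*` only matches when it is checked against the relative path, because the absolute path starts with wherever the code happens to be installed.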

The T_INLINE_ELSE Token

Natively, PHP doesn't tokenize the question mark or colon in an inline IF statement as a special type of token. PHP_CodeSniffer has always converted the question mark into a token called T_INLINE_THEN but it has always left the colon as T_COLON. Sniffs that look for inline IF statements typically look for the T_INLINE_THEN token and then look ahead for a T_COLON to find the ELSE component.

But now, the colon in an inline IF statement is tokenized as T_INLINE_ELSE. For example:

<?php
$foo = ($bar === true ? 'yes' : 'no');

Will tokenize as:

Process token 0 on line 1 [lvl:0;]: T_OPEN_TAG => <?php\n
Process token 1 on line 2 [lvl:0;]: T_VARIABLE => $foo
Process token 2 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 3 on line 2 [lvl:0;]: T_EQUAL => =
Process token 4 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 5 on line 2 [lvl:0;]: T_OPEN_PARENTHESIS => (
Process token 6 on line 2 [lvl:0;]: T_VARIABLE => $bar
Process token 7 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 8 on line 2 [lvl:0;]: T_IS_IDENTICAL => ===
Process token 9 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 10 on line 2 [lvl:0;]: T_TRUE => true
Process token 11 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 12 on line 2 [lvl:0;]: T_INLINE_THEN => ?
Process token 13 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 14 on line 2 [lvl:0;]: T_CONSTANT_ENCAPSED_STRING => 'yes'
Process token 15 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 16 on line 2 [lvl:0;]: T_INLINE_ELSE => :
Process token 17 on line 2 [lvl:0;]: T_WHITESPACE =>  
Process token 18 on line 2 [lvl:0;]: T_CONSTANT_ENCAPSED_STRING => 'no'
Process token 19 on line 2 [lvl:0;]: T_CLOSE_PARENTHESIS => )
Process token 20 on line 2 [lvl:0;]: T_SEMICOLON => ;
Process token 21 on line 2 [lvl:0;]: T_WHITESPACE => \n

Existing sniffs will need to be modified to ensure they work correctly.
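As a rough illustration of the kind of change involved, the Python sketch below walks a token list shaped like the output above and finds the ELSE part of an inline IF by looking for T_INLINE_ELSE rather than T_COLON. It is a language-neutral sketch, not sniff code: in a real sniff the tokens come from PHP_CodeSniffer and the token codes are PHP constants, while here they are plain strings, and the function `find_inline_else` is hypothetical.

```python
# Token codes copied from the tokenizer output above (tokens 0 to 21).
CODES = [
    "T_OPEN_TAG", "T_VARIABLE", "T_WHITESPACE", "T_EQUAL", "T_WHITESPACE",
    "T_OPEN_PARENTHESIS", "T_VARIABLE", "T_WHITESPACE", "T_IS_IDENTICAL",
    "T_WHITESPACE", "T_TRUE", "T_WHITESPACE", "T_INLINE_THEN", "T_WHITESPACE",
    "T_CONSTANT_ENCAPSED_STRING", "T_WHITESPACE", "T_INLINE_ELSE",
    "T_WHITESPACE", "T_CONSTANT_ENCAPSED_STRING", "T_CLOSE_PARENTHESIS",
    "T_SEMICOLON", "T_WHITESPACE",
]
TOKENS = [{"code": code} for code in CODES]


def find_inline_else(tokens, then_index):
    """Return the index of the T_INLINE_ELSE matching the T_INLINE_THEN.

    Nested inline IF statements are skipped by counting how many
    T_INLINE_THEN tokens are still waiting for their own T_INLINE_ELSE.
    Returns None if no matching token is found.
    """
    depth = 0
    for i in range(then_index + 1, len(tokens)):
        code = tokens[i]["code"]
        if code == "T_INLINE_THEN":
            depth += 1
        elif code == "T_INLINE_ELSE":
            if depth == 0:
                return i
            depth -= 1
    return None


if __name__ == "__main__":
    then_index = CODES.index("T_INLINE_THEN")
    print(then_index, find_inline_else(TOKENS, then_index))
```

A sniff that still looks ahead for T_COLON would find nothing in this token list, which is why existing sniffs need to be checked.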

Changing the Type of Messages

A problem coding standard developers used to face was that messages could not be changed from errors to warnings, or from warnings to errors. This meant that a new sniff had to be written, or they'd have to live with the existing one, even though it didn't match their coding standard. Sometimes severity levels could be used to work around the problem. But version 1.4.1 now allows you to change the type of any message:

<!--
    You can also change the type of a message from error to
    warning and vice versa.
-->
<rule ref="Generic.Commenting.Todo.CommentFound">
  <type>error</type>
</rule>
<rule ref="Squiz.Strings.DoubleQuoteUsage.ContainsVar">
  <type>warning</type>
</rule>

Mixed Line Endings

PHP_CodeSniffer has trouble tokenizing files that contain mixed line endings. The newline character that is found at the end of the first line is used throughout the tokenizing process to save a lot of processing time, but this may result in line endings being missed. Even the PHP tokenizer can give mixed results when the line endings do not match, although you are unlikely to see any errors on the screen. If processing JS code, you are more likely to see PHP notices from sniffs that get confused.

In version 1.4.1, a special internal warning is added to every file that contains mixed line endings. The warning lets you know that PHP_CodeSniffer may have problems checking that file. It won't stop all the PHP notices from being shown, but it will at least tell you why they are there. And for continuous integration systems that hide the notices, you'll get a notification in the error report so you are informed.

This warning has the code Internal.LineEndings.Mixed and can be overridden in a ruleset.xml file in the same way the Internal.NoCodeFound message can be. This allows you to change the message, change the type to an error, or hide it completely by setting the severity to 0.
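If you want to find affected files yourself before running a check, mixed line endings are easy to detect. The Python sketch below shows one way to do it; it is not how PHP_CodeSniffer detects them internally, and both function names are hypothetical.

```python
import re


def line_endings_used(content):
    """Return the set of distinct line endings found in content.

    "\r\n" is listed first in the pattern so a Windows line ending is
    counted once, not as a separate "\r" and "\n".
    """
    return set(re.findall(r"\r\n|\r|\n", content))


def has_mixed_line_endings(content):
    """Return True if content uses more than one kind of line ending."""
    return len(line_endings_used(content)) > 1


if __name__ == "__main__":
    print(has_mixed_line_endings("<?php\r\n$a = 1;\n"))
    print(has_mixed_line_endings("<?php\n$a = 1;\n"))
```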

Everything Else

Apart from these changes, a lot of work has gone into testing and fixing issues with the new PSR-1 and PSR-2 standards that were released in version 1.4.0.

Special thanks to the 5 developers who directly contributed code to this branch. Your help and contributions are always very much appreciated.

You can view the full changelog on the PHP_CodeSniffer download page.

Stay up to date on all PHP_CodeSniffer changes, including new features and releases, by subscribing to the RSS feed or following me on Twitter.


PHP_CodeSniffer 1.4.0 released

PHP_CodeSniffer version 1.4.0 has just been uploaded to PEAR and is now available to install. This release includes two new coding standards: PSR-1 and PSR-2.

A lot of effort has gone into compiling these standards from existing sniffs, writing new sniffs, and making core changes to PHP_CodeSniffer to support new features that they require. I think they are a good representation of the written standards, and they give you another couple of great generic coding standards that you may choose to adopt in your own projects.

Special thanks to the 3 developers who directly contributed code to this branch. Your help is always very much appreciated.

I'd like to also thank everyone who tested the PSR-1 and PSR-2 standards as they were being developed over the last couple of months, and those developers who helped me on the PHP-FIG mailing list as I tried to decipher them. Your bug reports, pull requests and general feedback were always helpful.

But a lot more testing of these new standards needs to be done, so please keep the feedback coming through.

Apart from the new standards, version 1.4.0 contains a couple of other nice changes, including a new notify-send report and the ability to explain a standard. You can view the full changelog on the PHP_CodeSniffer download page.



PHP_CodeSniffer memory improvements to summary report

Memory is the first thing to run out during a really big PHP_CodeSniffer run when a large number of errors and warnings are generated. The reason for this is pretty simple: PHP_CodeSniffer keeps the errors and warnings in memory so it can print consolidated error reports at the end.

The problem is that when you first introduce a coding standard to an existing code base, you're probably going to have a lot of errors and warnings being generated over a lot of files, so you might run out of memory.

So how do you identify the files that need the most work (so you can cut down the number of generated errors) when you can't view the report?

A new memory usage improvement for PHP_CodeSniffer's summary report can help. When you are printing the summary report by itself, PHP_CodeSniffer will no longer generate any error or warning messages. The summary report only shows the number of violations per file and not the detail, so this does not affect the functionality of the summary report in any way.

Note: if you tell PHP_CodeSniffer to also show sources in the summary report (the -s option) the errors and warnings have to be generated and stored, so you will not get any memory usage improvements.

To test this new change, I ran the PEAR coding standard over a Symfony2 checkout. Obviously, Symfony2 doesn't use the PEAR coding standard, so this is a good way to generate a lot of errors and warnings over a large code base.

The PEAR standard generates 29,091 errors and 11,446 warnings in 2007 different files. Here is a summary of the time and memory usage, as reported by PHP_Timer:

Type of run                                       Time (seconds)   Memory (MB)
Old code with warnings (--report=summary)         56               178.00
Old code without warnings (--report=summary -n)   56               130.50
New code with warnings (--report=summary)         51               79.25
New code without warnings (--report=summary -n)   51               79.25

When running with the old code, you can save a significant amount of memory by just ignoring warnings. When you do this, PHP_CodeSniffer won't even generate the warning messages and will instead generate only the error messages. This is a memory usage improvement that was implemented quite a while ago.

But the new code still uses far less memory than the existing code, even when the existing code ignores warnings. With warnings, the memory usage is less than half.
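For anyone who wants to check the comparison, the ratios can be worked out directly from the summary report figures above. This is just a few lines of Python arithmetic; the function name `memory_ratio` is made up for the example.

```python
def memory_ratio(new_mb, old_mb):
    """Return new memory usage as a fraction of old memory usage."""
    return new_mb / old_mb


# Figures from the summary report table above (memory in MB).
with_warnings = memory_ratio(79.25, 178.00)
without_warnings = memory_ratio(79.25, 130.50)

if __name__ == "__main__":
    print(f"New vs old, with warnings:    {with_warnings:.1%}")
    print(f"New vs old, without warnings: {without_warnings:.1%}")
```

With warnings, the new code uses about 44.5% of the memory of the old code, and about 60.7% when the old code ignores warnings.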

Obviously, these figures will change depending on the number of files being checked and the number of errors and warnings being generated, but you should see some memory improvements when running the summary report from now on.

This change will be released in PHP_CodeSniffer 1.3.6. I don't have a release date set, but it should be in the next couple of weeks. In the meantime, you can grab and run the latest code from GitHub.


PHP_CodeSniffer 1.3.5 released

PHP_CodeSniffer version 1.3.5 has just been uploaded to PEAR and is now available to install. This release has been focused primarily on bug fixing. In particular, a number of bugs affecting Windows users have been resolved and all Windows users are encouraged to upgrade to this new version.

Special thanks to the 4 developers who directly contributed code to this branch. Your help is always very much appreciated.

You can view the full changelog on the PHP_CodeSniffer download page.



PHP_CodeSniffer development moved to Git

Quite a lot of PEAR packages are being moved to hosting on GitHub, and Squiz Labs is starting to host projects over there as well, so I've moved development of PHP_CodeSniffer to GitHub and will be closing down the existing SVN repository.

The major benefit of this change is easier patch submission and acceptance. A lot of great developers submit features and bug fixes for PHP_CodeSniffer and I'm sure many will like the ability to submit pull requests instead of packaging up diff files.

Bug tracking remains on the PEAR website for now, and I'll of course continue accepting diffs on issue reports as I currently do.

You'll find PHP_CodeSniffer here: https://github.com/squizlabs/PHP_CodeSniffer

The README has all the info you need to get PHP_CodeSniffer installed from either PEAR or Git, as well as links to the bug tracker and news sources.


PHP_CodeSniffer 1.3.1 released

PHP_CodeSniffer version 1.3.1 has just been uploaded to PEAR and is now available to install. This release has been focused primarily on bug fixing, but given the amount of time since the 1.3.0 release, there were also a few features that made it in. Some of the highlights in this release are:

  • A new blame report for Mercurial (--report=hgblame)
  • Removed all dependencies on PEAR so PHP_CodeSniffer can run completely from an SVN checkout
  • The ability to check code from STDIN
  • Relative path support for command line report arguments
  • 19 bug fixes

Special thanks to the 10 developers who directly contributed code to this branch. Your help is always very much appreciated.

You can view the full changelog on the PHP_CodeSniffer download page.



PHP_CodeSniffer 1.3.0 released

After more than 12 months of development, it's finally done. PHP_CodeSniffer version 1.3.0 (stable) has just been uploaded to PEAR and is now available to install.

Only bug fixes have been added since the 1.3.0RC2 release, but the changes between the 1.2 and 1.3 branches are numerous. If you're just catching up with this new branch, read the release posts for the earlier releases on this branch (and the articles linked within) to get an idea of what has changed.

Thanks to everyone who has downloaded and tested the 3 previous releases on this branch. Despite not being stable, these test releases were downloaded more than 8000 times, providing invaluable testing time for such a large architectural change.

And a special thanks to the 13 developers who directly contributed code to this branch. Your help is always very much appreciated.

Developers who have built custom coding standards are reminded to read the 1.3.0 upgrade guide to ensure your standards work with this new version.

You can view the full changelog on the PHP_CodeSniffer download page.



Closure support in PHP_CodeSniffer

Quite a few sniffs included with PHP_CodeSniffer check various aspects of function declarations, from commenting to spacing and everything in between. But the rules for anonymous functions (closures) tend to be different. They are used inside function calls or assigned to variables mid-function. It doesn't make sense to enforce the same commenting and spacing rules in these contexts. Doing so would probably make the code harder to read, which is not the goal of a coding standard.

PHP's tokenizer does not have a token specifically for closures, so for this code:

<?php
function myFunc($foo)
{
    $callback = function ($bar) use ($foo) {
        $bar += $foo;
    };

}//end myFunc()
?>

You end up with a token stack like this:

 *** START SCOPE MAP ***
Start scope map at 1: T_FUNCTION => function
Process token 2 []: T_WHITESPACE =>  
Process token 3 []: T_STRING => myFunc
Process token 4 []: T_OPEN_PARENTHESIS => (
* skipping parenthesis *
Process token 7 []: T_WHITESPACE => \n
Process token 8 []: T_OPEN_CURLY_BRACKET => {
=> Found scope opener for 1 (T_FUNCTION)
...
Process token 11 [opener:8;]: T_VARIABLE => $callback
Process token 12 [opener:8;]: T_WHITESPACE =>  
Process token 13 [opener:8;]: T_EQUAL => =
Process token 14 [opener:8;]: T_WHITESPACE =>  
Process token 15 [opener:8;]: T_FUNCTION => function
* token is an opening condition *
* searching for opener *
        ...
        Process token 27 []: T_OPEN_CURLY_BRACKET => {
        => Found scope opener for 15 (T_FUNCTION)
        ...
        Process token 38 [opener:27;]: T_CLOSE_CURLY_BRACKET => }
        => Found scope closer for 15 (T_FUNCTION)
...
Process token 42 [opener:8;]: T_CLOSE_CURLY_BRACKET => }
=> Found scope closer for 1 (T_FUNCTION)
*** END SCOPE MAP ***

Notice that there are two T_FUNCTION tokens in there (tokens 1 and 15). The second is actually a closure, but sniffs can't tell the difference between them.

Instead of forcing sniff developers to check each function to see if it has a name, PHP_CodeSniffer will have a new token to listen for from version 1.3.0RC3 (not yet released): T_CLOSURE.

After the main token processing, the PHP tokenizer built into PHP_CodeSniffer does some additional work. You'll now see this at the end of PHP_CodeSniffer's verbose output:

*** START ADDITIONAL PHP PROCESSING ***
* token 15 on line 4 changed from T_FUNCTION to T_CLOSURE
*** END ADDITIONAL PHP PROCESSING ***

This means that all existing sniffs that listen for the T_FUNCTION token will now ignore closures by default. For all sniffs included with PHP_CodeSniffer, this is the correct behaviour. But if you have been making assumptions in your sniffs that closures also appear as T_FUNCTION tokens, you'll need to make a change to support 1.3.0. Just make sure you register both T_FUNCTION and T_CLOSURE tokens in your sniff, or when searching for next and previous tokens, make sure you pass an array with T_FUNCTION and T_CLOSURE instead of a single token.
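The Python sketch below illustrates both halves of this on a simplified token list: the "does this function have a name" check that re-tags closures, and a token search that accepts a set of token types instead of a single one. It is a language-neutral sketch under stated assumptions, not PHP_CodeSniffer's implementation. Token codes are plain strings rather than PHP constants, the name check is simplified (a function keyword followed by an opening parenthesis, ignoring whitespace and a by-reference `&`), and `retag_closures` and `find_next` are hypothetical helpers.

```python
def retag_closures(tokens):
    """Change T_FUNCTION to T_CLOSURE (in place) for functions with no name."""
    skip = ("T_WHITESPACE", "T_BITWISE_AND")
    for i, token in enumerate(tokens):
        if token["code"] != "T_FUNCTION":
            continue
        # Move to the first token after the keyword that is not whitespace or "&".
        j = i + 1
        while j < len(tokens) and tokens[j]["code"] in skip:
            j += 1
        # A named function has a T_STRING here; a closure goes straight to "(".
        if j < len(tokens) and tokens[j]["code"] == "T_OPEN_PARENTHESIS":
            token["code"] = "T_CLOSURE"
    return tokens


def find_next(tokens, codes, start):
    """Return the index of the first token at or after start whose code is in codes."""
    for i in range(start, len(tokens)):
        if tokens[i]["code"] in codes:
            return i
    return None


if __name__ == "__main__":
    # Simplified token codes for: function myFunc() { $callback = function () {}; }
    codes = [
        "T_FUNCTION", "T_WHITESPACE", "T_STRING", "T_OPEN_PARENTHESIS",
        "T_CLOSE_PARENTHESIS", "T_OPEN_CURLY_BRACKET", "T_VARIABLE", "T_EQUAL",
        "T_FUNCTION", "T_WHITESPACE", "T_OPEN_PARENTHESIS",
        "T_CLOSE_PARENTHESIS", "T_OPEN_CURLY_BRACKET", "T_CLOSE_CURLY_BRACKET",
        "T_SEMICOLON", "T_CLOSE_CURLY_BRACKET",
    ]
    tokens = retag_closures([{"code": code} for code in codes])
    # Searching for T_FUNCTION alone no longer finds the closure.
    print(find_next(tokens, {"T_FUNCTION"}, 1))
    print(find_next(tokens, {"T_FUNCTION", "T_CLOSURE"}, 1))
```

Searching for a single token type after the re-tagging skips the closure entirely, which is the behaviour change that sniffs relying on the old tokens need to account for.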

This change also means you can start writing targeted sniffs to enforce rules for closures. Coding standards tend to be fairly relaxed on how they can be used and how they should be defined, leading to a variety of different styles. If you are already enforcing a specific style in your standard, and you are using closures, it might be a good time to sit down and think about what standards you want to put around them.

If you can't wait for the next release, you can play around with this new feature by installing PHP_CodeSniffer from the SVN source. Note that this will replace your existing PHP_CodeSniffer install:

pear uninstall PHP_CodeSniffer
svn co http://svn.php.net/repository/pear/packages/PHP_CodeSniffer/trunk PHP_CodeSniffer
cd PHP_CodeSniffer
pear install -f package.xml



PHP_CodeSniffer 1.3.0RC2 released

PHP_CodeSniffer version 1.3.0 release candidate 2 has just been uploaded to PEAR and is now available to install. This release has been focused primarily on bug fixing, but (typically) there were also a few features that made it in.

I'd like to especially thank the 6 generous patch contributors for this release and all those who took the time to submit bugs and test the fixes.

All new features have been documented and will be available on the PHP_CodeSniffer documentation page once the documentation regenerates over the weekend.

Developers who have built custom coding standards are reminded to read the 1.3.0 upgrade guide to ensure your standards are ready for the 1.3.0 stable release. No release date is set, but it's a fairly easy upgrade process and doing it early will give you plenty of time for testing.

You can view the full changelog on the PHP_CodeSniffer download page.


