Squiz Labs Blog - The latest news from the R&D division of Squiz®

PHP_CodeSniffer 2.0.0 released

Nineteen months ago, I started work on a project to allow PHP_CodeSniffer to fix the problems that it finds. Doing this required a lot of changes to the core classes, a lot of iteration and refactoring of the fixing and testing code, and an enormous amount of time and testing across many PHP projects to ensure I am confident enough to release something that actually modifies code. I could keep writing unit tests forever, but I've finally got to a point where I am happy to release this first version of the PHP Code Beautifier and Fixer (PHPCBF), for when you just can't be bothered fixing coding standard errors yourself.

I originally started with a goal of being able to fix the most commonly found errors when checking PHP projects using the PSR2 coding standard, but I expanded that goal to include all coding standards (including custom standards) and supported file types (PHP, JS and CSS). So as of version 2.0.0, when you run PHP_CodeSniffer and get your standard report of the errors found, you will now be able to see which of those errors can be automatically corrected by PHPCBF. Sample report output can be seen on the wiki: https://github.com/squizlabs/PHP_CodeSniffer/wiki/Fixing-Errors-Automatically

Custom Coding Standards

If you have a custom coding standard that uses the sniffs included with PHP_CodeSniffer, you will find that many of the errors your standard finds are already fixable with PHPCBF. PHP_CodeSniffer comes with just over 600 unique errors and PHPCBF is able to fix just over half of those. The rest of the errors are almost entirely made up of things that cannot be automatically fixed, such as variable names and changes to comparison operators.

If you have your own custom sniffs, you are able to add auto-fixing code to them using the new auto-fixing tools built into PHP_CodeSniffer. You have access to all the normal token-based functions you are used to, as well as some new fixer-specific functions to do things like replacing token content and adding content and newlines to tokens. PHPCBF is also able to group a set of changes so that they are all applied or rejected together, detect sniffs trying to fix the same piece of code in the same run, and detect and resolve conflicting changes that are being applied.

The Diff Report

Along with the new PHPCBF script, PHP_CodeSniffer adds a new report type: the diff report. If you don't like the idea of a script fixing errors automatically, you can instead ask PHP_CodeSniffer to output a diff of the fixes that it would make using the command line argument --report=diff. If you like what you see, you can simply change the phpcs command to phpcbf, leave all the command line options the same, and let PHPCBF patch your files.

More Features

Version 2.0.0 brings a lot more changes than just auto-fixing.

View the full 2.0.0 changelog at PEAR or Github

Thank You

So many developers have helped test this auto-fixing code over the last year and a half. More than twenty developers have directly contributed code to make the auto-fixing more accurate, and more still have reported bugs to help diagnose issues with the fixing.

But I'd like to send special thanks to Alexander Obuhovich (@aik099 on Github and Twitter) for both testing PHPCBF and for being so active on Github issues and PRs, and to the developers working on the WordPress coding standards (https://github.com/WordPress-Coding-Standards/WordPress-Coding-Standards) for working with me to find solutions for running and testing complex custom coding standards and for trying out the auto-fixing code in their standard.

Version 1.5.6

I've also released version 1.5.6 today and I'm keeping the 1.5 branch around for a little while longer to make it easier to upgrade to 2.0, especially for developers who need to change custom coding standards. If you are having problems with the 2.0 version, please use 1.5.6 in the meantime.

View the full 1.5.6 changelog at PEAR or Github.

Stay up to date on all PHP_CodeSniffer changes, including new features and releases, by subscribing to the RSS feed or following me on Twitter.


Analysis of Coding Conventions

Coding standards tend to vary between projects. Even projects that use the same written standard can vary in a number of ways as standards tend to leave a lot of room to apply your own coding style in various areas. To get a better idea of what coding styles PHP developers are using, I used PHP_CodeSniffer to analyse and report on a number of big and small public PHP projects hosted on Github.

The result of the analysis is here: http://squizlabs.github.io/PHP_CodeSniffer/analysis/

There are currently 63 projects being analysed, with trend data recorded for about the last 8 months. The report currently lists 42 different coding style conventions ranging from "how many spaces surround the concat operator" to the classic "tabs or spaces for line indent". The coding conventions are listed from most contentious to least contentious. So the ones at the top show the most variations between projects while the ones at the bottom are generally accepted by all projects.

Each coding convention provides a list of projects that use each of the style variations within it. Clicking on any of those projects will take you to a project-specific report showing the coding conventions used within it and how often the same style is followed within the project. So you can take a look at the coding conventions for CakePHP, Drupal, Symfony2 or even PHP_CodeSniffer itself. 

How did you pick the projects to analyse?

A combination of popular PHP projects on Github, projects that I've used, and projects that have received some attention while the report was being built.

Can you include my project in the report?

If it is a public PHP project on Github, I sure can. All the repositories that are being analysed are listed in the repos.json file.

For each one, I need to know the name of the project, the URL (organisation/repository), the path in the project where PHP_CodeSniffer should start analysing code (default: /), any files or directories that should be ignored (default: none), and a list of additional PHP file extensions to check beyond .php and .inc (default: none). Here is an example:

{
    "name": "My Project",
    "url": "squizlabs/my-project",
    "path": "src",
    "ignore": "3rdparty/*,includes/*",
    "extensions": "phtml,module"
}

There are also 63 other examples in the repos.json file to take a look at. Submit a PR or send me a gist on Twitter. If you have no idea what the values should be, get me the Github URL and I'll figure it out for you.
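
The defaults described above can be sketched in code. This is only an illustration: normaliseRepoEntry() and splitList() are hypothetical helpers, not part of the scripts that build the report, and they assume list values are comma-separated strings as in the example entry.

```php
<?php
// Hypothetical helper: split a comma-separated string into a clean list.
function splitList($value)
{
    $parts = array_map('trim', explode(',', $value));
    return array_values(array_filter($parts, 'strlen'));
}

// Hypothetical helper: apply the documented defaults to one repos.json entry.
function normaliseRepoEntry(array $entry)
{
    foreach (array('name', 'url') as $required) {
        if (empty($entry[$required]) === true) {
            throw new InvalidArgumentException('Missing required field: '.$required);
        }
    }

    // Defaults as described in the text: start at /, ignore nothing,
    // and check no extensions beyond .php and .inc.
    $defaults = array('path' => '/', 'ignore' => '', 'extensions' => '');
    $entry    = array_merge($defaults, $entry);

    $entry['ignore']     = splitList($entry['ignore']);
    $entry['extensions'] = array_merge(array('php', 'inc'), splitList($entry['extensions']));
    return $entry;
}

$json = '{
    "name": "My Project",
    "url": "squizlabs/my-project",
    "path": "src",
    "ignore": "3rdparty/*,includes/*",
    "extensions": "phtml,module"
}';

$full    = normaliseRepoEntry(json_decode($json, true));
$minimal = normaliseRepoEntry(array('name' => 'Tiny', 'url' => 'example/tiny'));
```

With only a name and URL supplied, the entry falls back to analysing from / with nothing ignored and only .php and .inc files checked.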

How is the report generated?

From version 2.0.0a2, PHP_CodeSniffer collects metrics about the code it is checking. These are used to determine coding conventions and can be displayed using --report=info. The coding convention report uses a custom PHP_CodeSniffer report that outputs these metrics as a JSON file instead of to the screen. To save a lot of time, a custom coding standard is used to limit checks to just the sniffs that generate the metrics that the report wants to use. Some ugly custom wrapper scripts deal with updating the repositories, keeping trend data for each coding convention, and generating the HTML reports. The whole thing runs through HHVM to make it as fast as possible.
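
As a rough sketch of how collected metrics could be turned into a "which style, and how consistently" figure for one project: the input shape below (style value mapped to an occurrence count) and the summariseMetric() function are assumptions for illustration, not the format or code used by the actual report.

```php
<?php
// Hypothetical: reduce the counts for one metric to the dominant style
// and the percentage of occurrences that follow it.
function summariseMetric(array $counts)
{
    $total = array_sum($counts);
    if ($total <= 0) {
        return null;
    }

    // Sort by count, highest first, keeping the style names as keys.
    arsort($counts);
    reset($counts);

    return array(
        'style'   => key($counts),
        'percent' => round((current($counts) * 100 / $total), 2),
    );
}

// Made-up counts for a "line indent" style metric.
$summary = summariseMetric(array('tabs' => 50, 'spaces' => 950));
```

Here the project would be reported as using spaces, followed in 95% of the recorded cases.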


PHP_CodeSniffer 2.0.0a2 released

I've just released the second alpha of PHP_CodeSniffer version 2.0.0. This update brings a new type of report, performance improvements, and Phar distribution for easy download and testing.

Information Report

PHP_CodeSniffer now comes with an information report that is able to show you information about how your code is written rather than checking that it conforms to a standard. You can read more about this report, and see example output, on the wiki.

Performance Improvements

A number of minor performance improvements have gone into this version, which will probably only be obvious when checking very large code bases. As a result, there are a few important changes to know about:

  • Line length warnings will now be shown for lines that refer to licence and VCS information. The line length sniff previously ignored these lines, which meant that it had to run a regular expression on every line it checked.
  • The $tokens array has a new length index that you can use to determine the length of the token's content rather than having to call strlen() yourself and deal with character encoding.
  • The use of in_array() when checking the PHP_CodeSniffer_Tokens static vars impacted performance significantly, so they have been restructured so that you can also use isset() on them.
  • Custom reports can now specify a $recordErrors member var that, when set to FALSE, will tell PHP_CodeSniffer that it doesn't need to record errors during the run. This gives a significant memory saving if you are using a custom report to output summary information rather than a full list of errors found.
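
To illustrate the isset() change, here is a minimal standalone sketch. The $emptyTokens array is a hand-written stand-in (an assumption, not the real contents of any PHP_CodeSniffer_Tokens static var) in which each token code is both key and value, so a key lookup can replace a scan over the values. It assumes the tokenizer extension is loaded so that the T_* constants exist.

```php
<?php
// Hypothetical stand-in for one of the PHP_CodeSniffer_Tokens static vars.
// Each token code is stored as both the key and the value, so the array
// still works with in_array() but can also be checked with isset().
$emptyTokens = array(
    T_WHITESPACE  => T_WHITESPACE,
    T_COMMENT     => T_COMMENT,
    T_DOC_COMMENT => T_DOC_COMMENT,
);

// Linear scan over the values.
function isListedSlow($code, array $list)
{
    return in_array($code, $list, true);
}

// Single hash lookup on the key.
function isListedFast($code, array $list)
{
    return isset($list[$code]);
}

$slowHit  = isListedSlow(T_COMMENT, $emptyTokens);
$fastHit  = isListedFast(T_COMMENT, $emptyTokens);
$fastMiss = isListedFast(T_STRING, $emptyTokens);
```

Both functions give the same answer; the isset() version simply avoids walking the array on every call, which matters when the check runs for every token in a large code base.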

Phar Distribution

For the first time, PHP_CodeSniffer's phpcs and phpcbf commands are now available as Phar files. The Phars are still in testing and are known to not work with HHVM, but are an easy way to try out the new 2.0 alpha versions.

curl -OL https://github.com/squizlabs/PHP_CodeSniffer/releases/download/2.0.0a2/phpcs.phar
php phpcs.phar /path/to/code

curl -OL https://github.com/squizlabs/PHP_CodeSniffer/releases/download/2.0.0a2/phpcbf.phar
php phpcbf.phar /path/to/code

Everything Else

Besides these major changes, there are a number of bug fixes and improvements to automatic code fixing. Thanks to all the developers who directly contributed code to this release.

View the full 2.0.0a2 changelog at PEAR or Github


PHP_CodeSniffer 2.0.0 alpha1 released

I've just released the first alpha of PHP_CodeSniffer version 2.0.0. This update brings an often requested feature: the ability for PHP_CodeSniffer to automatically fix the problems that it finds. It also contains a complete rewrite of the comment parsing sniffs, finally removing what I feel is the poorest code in PHP_CodeSniffer: the comment parsing classes.

Fixing Errors Automatically with PHPCBF

PHP_CodeSniffer now comes with a second script: phpcbf, the PHP Code Beautifier and Fixer. This script piggy-backs off PHP_CodeSniffer to provide fixes for a lot of common errors. It will never fix 100% of the errors PHP_CodeSniffer finds as many require a developer to make a decision (e.g., can you use === here?) or may cause the code to execute differently if changed (e.g., uppercasing a constant name). But there are still a lot of errors that can be corrected without causing issues. Out of the 566 unique error messages that PHP_CodeSniffer can currently produce, version 2.0.0a1 is able to correct 202 of them automatically, which is about 35%. When you run PHP_CodeSniffer and get a report of the errors found, you will now be able to see which of those errors can be automatically corrected.

Along with this new script, PHP_CodeSniffer adds a new report type: the diff report. If you don't like the idea of PHP_CodeSniffer fixing errors automatically, you can instead ask it to output a diff of the fixes that it would make using the command line argument --report=diff. If you like what you see, you can simply change the phpcs command to phpcbf, leave all the command line options the same, and let PHP_CodeSniffer patch your files.

All this new functionality is documented on the wiki, and includes sample output, so please take a read: https://github.com/squizlabs/PHP_CodeSniffer/wiki/Fixing-Errors-Automatically

Plugin for Sublime Text

I use Sublime Text 2 for development, as do some of the other developers at Squiz Labs. I've been working with one of these developers to build a plugin for Sublime Text that will run PHP_CodeSniffer on the current file, show which errors can be fixed automatically, fix the errors for you if you decide to have them fixed, and show a diff of what it did. I've added quite a few features to PHP_CodeSniffer to make this plugin feel as integrated as possible and I find it incredibly useful for tidying up my code before committing.

The plugin is still being documented, and made to work with Sublime Text 3 due to some issues with the diff library, but it will be released through Package Control as soon as they are worked out (or sooner, for ST2 only if needed). In the meantime, you can clone the repository yourself or just watch it for activity if you are interested in knowing when it is ready: https://github.com/squizlabs/sublime-PHP_CodeSniffer

Adding Fixes to Custom Sniffs

If you have your own custom sniffs and want to correct errors automatically, you need to make a couple of changes. The first thing you need to do is call the addFixableError() or addFixableWarning() methods instead of addError() and addWarning(), to let PHP_CodeSniffer know the error can be fixed. You can then make use of the new PHP_CodeSniffer_Fixer class to replace token values and add content to tokens, modifying the token stack as you go.

Here is a simple example, to ensure that comments don't appear at the end of a code line ($stackPtr is the comment token):

$error = 'Comments may not appear after statements';
$phpcsFile->addError($error, $stackPtr, 'Found');

To fix this, we just need to add a newline before the comment. Other sniffs will do things like fix alignment and spacing of the comment for us. It is also important that we check if the fixer is enabled and if we are supposed to be fixing this specific error message. We end up with code like this:

$error = 'Comments may not appear after statements';
$fix   = $phpcsFile->addFixableError($error, $stackPtr, 'Found');
if ($fix === true && $phpcsFile->fixer->enabled === true) {
    $phpcsFile->fixer->addNewlineBefore($stackPtr);
}

If you are changing multiple tokens in a single fix, using a changeset will ensure that they are either all applied at once, or not applied at all. This is important if a partial change would lead to a parse error, or another equally bad outcome. In the following example, a changeset is used to ensure that all content between the array braces is removed, even if the content spans multiple lines. If one of the tokens to be removed has already been modified by another sniff, the whole changeset will be ignored and PHP_CodeSniffer will attempt to apply this changeset on a second run through the file.

$error = 'Empty declaration must have no space between parentheses';
$fix   = $phpcsFile->addFixableError($error, $stackPtr, 'SpaceFound');
if ($fix === true && $phpcsFile->fixer->enabled === true) {
    $phpcsFile->fixer->beginChangeset();
    for ($i = ($arrayStart + 1); $i < $arrayEnd; $i++) {
        $phpcsFile->fixer->replaceToken($i, '');
    }

    $phpcsFile->fixer->endChangeset();
}
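
The all-or-nothing behaviour can be modelled with a small standalone toy. This is not the real PHP_CodeSniffer_Fixer: the ToyFixer class, its internals and its return values are all hypothetical, and it only mimics the idea that a changeset is rejected as a whole when one of its tokens has already been modified in the current pass.

```php
<?php
// Toy model of changeset behaviour. This is NOT PHP_CodeSniffer_Fixer;
// the class and its internals are hypothetical.
class ToyFixer
{
    private $tokens;
    private $fixed   = array();
    private $pending = null;

    public function __construct(array $tokens)
    {
        $this->tokens = $tokens;
    }

    public function beginChangeset()
    {
        $this->pending = array();
    }

    public function replaceToken($ptr, $content)
    {
        if ($this->pending !== null) {
            // Inside a changeset: queue the change instead of applying it.
            $this->pending[$ptr] = $content;
            return true;
        }

        if (isset($this->fixed[$ptr]) === true) {
            // Another fix already changed this token during this pass.
            return false;
        }

        $this->tokens[$ptr] = $content;
        $this->fixed[$ptr]  = true;
        return true;
    }

    public function endChangeset()
    {
        $pending       = (array) $this->pending;
        $this->pending = null;

        // Reject the whole set if any target token was already modified.
        foreach (array_keys($pending) as $ptr) {
            if (isset($this->fixed[$ptr]) === true) {
                return false;
            }
        }

        foreach ($pending as $ptr => $content) {
            $this->replaceToken($ptr, $content);
        }

        return true;
    }

    public function getContents()
    {
        return implode('', $this->tokens);
    }
}

$tokens = array('array', '(', ' ', "\n", ' ', ')');

// First pass: another fix touches token 3 before the changeset runs.
$first = new ToyFixer($tokens);
$first->replaceToken(3, "\n");
$first->beginChangeset();
for ($i = 2; $i < 5; $i++) {
    $first->replaceToken($i, '');
}
$firstApplied = $first->endChangeset();

// Second pass: nothing conflicts, so every queued change is applied.
$second = new ToyFixer($tokens);
$second->beginChangeset();
for ($i = 2; $i < 5; $i++) {
    $second->replaceToken($i, '');
}
$secondApplied = $second->endChangeset();
```

In the first pass the changeset is dropped and the whitespace between the parentheses is left untouched; in the second pass all three tokens are emptied together and the content becomes array().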

Take a look at the PHP_CodeSniffer_Fixer class for all the methods you can use: https://github.com/squizlabs/PHP_CodeSniffer/blob/phpcs-fixer/CodeSniffer/Fixer.php

Removal of the Comment Parser

The other major change is the removal of the comment parsing classes. If you have written custom sniffs that either use the comment parsing classes or extend sniffs that do, you are going to need to review your sniffs and possibly rewrite them. This has already been done for all included sniffs, so there are quite a few examples to help you get started.

The best way to explain the change is to show an example of how PHP_CodeSniffer has changed the way it handles doc comments internally. If PHP_CodeSniffer is processing the following comment:

/**
 * PHP_CodeSniffer tokenises PHP code.
 *
 * @author    Greg Sherwood <gsherwood@squiz.net>
 * @copyright 2006-2012 Squiz Pty Ltd
 */

It would have previously tokenized it like this (spaces have been replaced by periods):

T_DOC_COMMENT => /**\n
T_DOC_COMMENT =>  * PHP_CodeSniffer.tokenises.PHP.code.\n
T_DOC_COMMENT =>  *\n
T_DOC_COMMENT =>  * @author....Greg.Sherwood.<gsherwood@squiz.net>\n
T_DOC_COMMENT =>  * @copyright.2006-2012.Squiz.Pty.Ltd\n
T_DOC_COMMENT =>  */

This format makes it very hard to look for specific things like a comment tag name, and makes it very hard to make fixes to the comment automatically. So PHP_CodeSniffer will now tokenize the comment like this:

T_DOC_COMMENT_OPEN_TAG => /**
T_DOC_COMMENT_WHITESPACE => \n
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_STAR => *
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_STRING => PHP_CodeSniffer.tokenises.PHP.code.
T_DOC_COMMENT_WHITESPACE => \n
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_STAR => *
T_DOC_COMMENT_WHITESPACE => \n
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_STAR => *
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_TAG => @author
T_DOC_COMMENT_WHITESPACE => ....
T_DOC_COMMENT_STRING => Greg.Sherwood.<gsherwood@squiz.net>
T_DOC_COMMENT_WHITESPACE => \n
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_STAR => *
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_TAG => @copyright
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_STRING => 2006-2012.Squiz.Pty.Ltd
T_DOC_COMMENT_WHITESPACE => \n
T_DOC_COMMENT_WHITESPACE => .
T_DOC_COMMENT_CLOSE_TAG => */

As you can see, this is a significant number of extra tokens, but allows for much finer control over how a comment is processed. The T_DOC_COMMENT token has also been removed, replaced instead by T_DOC_COMMENT_OPEN_TAG. If you have a sniff listening for T_DOC_COMMENT, make sure you change it in your register() method and anywhere else it is used throughout your sniffs.

Here is a fairly complex example of a sniff rewrite: the Squiz FunctionComment sniff.

old: https://github.com/squizlabs/PHP_CodeSniffer/blob/master/...
new: https://github.com/squizlabs/PHP_CodeSniffer/blob/phpcs-fixer/...

And here is the commit of the changes required for the new comment tokenizer: https://github.com/squizlabs/PHP_CodeSniffer/commit/...

Essentially, instead of using the comment parser functions to get at tag values, you can use the standard PHP_CodeSniffer functions to navigate the token stack and look for what you need. This allows you to get a precise location of each piece of the comment and make fixes to it if you need to. You can also report more accurately on where an error has occurred.
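
As a minimal standalone sketch of that idea, the code below walks a hand-built token list that mirrors the new tokenizer output and returns the position and content of a tag's value. The findTagValue() function and the reduced token array are assumptions for illustration; inside a real sniff you would work on the array returned by $phpcsFile->getTokens() instead.

```php
<?php
// Hand-built tokens modelled on the new doc comment tokenizer output.
// Only the 'type' and 'content' indexes are included in this sketch.
$tokens = array(
    array('type' => 'T_DOC_COMMENT_OPEN_TAG', 'content' => '/**'),
    array('type' => 'T_DOC_COMMENT_WHITESPACE', 'content' => "\n"),
    array('type' => 'T_DOC_COMMENT_WHITESPACE', 'content' => ' '),
    array('type' => 'T_DOC_COMMENT_STAR', 'content' => '*'),
    array('type' => 'T_DOC_COMMENT_WHITESPACE', 'content' => ' '),
    array('type' => 'T_DOC_COMMENT_TAG', 'content' => '@author'),
    array('type' => 'T_DOC_COMMENT_WHITESPACE', 'content' => '    '),
    array('type' => 'T_DOC_COMMENT_STRING', 'content' => 'Greg Sherwood <gsherwood@squiz.net>'),
    array('type' => 'T_DOC_COMMENT_WHITESPACE', 'content' => "\n"),
    array('type' => 'T_DOC_COMMENT_WHITESPACE', 'content' => ' '),
    array('type' => 'T_DOC_COMMENT_CLOSE_TAG', 'content' => '*/'),
);

// Hypothetical helper: find the value string that follows a tag on the same line.
function findTagValue(array $tokens, $tagName)
{
    $count = count($tokens);
    for ($i = 0; $i < $count; $i++) {
        if ($tokens[$i]['type'] !== 'T_DOC_COMMENT_TAG'
            || $tokens[$i]['content'] !== $tagName
        ) {
            continue;
        }

        for ($j = ($i + 1); $j < $count; $j++) {
            $type = $tokens[$j]['type'];
            if ($type === 'T_DOC_COMMENT_STRING') {
                return array('ptr' => $j, 'content' => $tokens[$j]['content']);
            }

            // Stop at the end of the line or at any non-whitespace token.
            if ($type !== 'T_DOC_COMMENT_WHITESPACE' || $tokens[$j]['content'] === "\n") {
                break;
            }
        }

        return null;
    }

    return null;
}

$author    = findTagValue($tokens, '@author');
$copyright = findTagValue($tokens, '@copyright');
```

Because the value is its own token, the sketch returns its exact position (index 7 here), which is what makes it possible to report an error on, or apply a fix to, just that piece of the comment.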

Everything Else

Besides these major changes, there are a few important new features that have made it into version 2.0.0a1, including the ability to write your own custom report classes, the ability to set default command line argument values in ruleset XML files, and an easier way for custom sniff developers to skip through the token tree.

View the full 2.0.0a1 changelog at PEAR or Github


PHP_CodeSniffer 1.4.7 and 1.5.0RC4 released

PHP_CodeSniffer versions 1.4.7 and 1.5.0RC4 have just been uploaded to PEAR and are now available to install. Version 1.4.7 is primarily a bug fix release, but also contains a new JUnit report format, a few new sniff settings, and a change to the PSR2 standard based on recently added errata.

Version 1.5.0RC4 additionally adds the ability to restrict violations to specific error messages by using the --sniffs command line argument. Previously, this only restricted violations to an entire sniff and not individual messages.

Special thanks to the 7 developers who directly contributed code to these releases. Your help is always very much appreciated.

View the full 1.4.7 changelog at PEAR or Github
View the full 1.5.0RC4 changelog at PEAR or Github


PHP_CodeSniffer 1.4.6 and 1.5.0RC3 released

PHP_CodeSniffer versions 1.4.6 and 1.5.0RC3 have just been uploaded to PEAR and are now available to install. Version 1.4.6 is primarily a bug fix release, but also contains a new JSON report format, a huge number of sniff docs, and a few new sniffs (mostly in the Squiz standard).

Version 1.5.0RC3 additionally contains a fix for the --report-file command line argument, support for a new T_GOTO_LABEL token, and a change to PHP_CodeSniffer::isCamelCaps() to allow acronyms at the start of a string if not performing a strict check.

Special thanks to the 9 developers who directly contributed code to these releases. Your help is always very much appreciated. And special thanks to Spencer Rinehart for writing so much documentation for the included sniffs.

View the full 1.4.6 changelog at PEAR or Github
View the full 1.5.0RC3 changelog at PEAR or Github


PHP_CodeSniffer 1.4.5 and 1.5.0RC2 released

PHP_CodeSniffer versions 1.4.5 and 1.5.0RC2 have just been uploaded to PEAR and are now available to install. Version 1.4.5 is primarily a bug fix release, although there are a few new sniffs and sniff settings that some developers may find useful. In addition to these changes, 1.5.0RC2 contains big changes to the way rulesets are processed to make them more predictable and to add a couple of new features.

From version 1.5.0RC2 onwards, ruleset processing has much better support for relative paths and detection of directories of sniffs. This may mean that sniffs that were not previously being included in a standard are now included correctly. Please check your standards to see if any new sniffs are being included. The best way to do this is to use the -e command line argument. For example, phpcs --standard=mystandard.xml -e will print a list of sniffs that will be run over your code.

Version 1.5.0RC2 also includes the ability to exclude whole directories of sniffs inside a ruleset and the ability to pass multiple standards to PHP_CodeSniffer on the command line. For example, phpcs --standard=PEAR,Squiz,mystandard.xml /path/to/code will run 3 standards against your code.

Special thanks to the 4 developers who directly contributed code to these releases. Your help is always very much appreciated.

You can view the full 1.4.5 changelog on the PHP_CodeSniffer 1.4.5 download page and the full 1.5.0RC2 changelog on the PHP_CodeSniffer 1.5.0RC2 download page.


PHP_CodeSniffer 1.4.4 released

PHP_CodeSniffer version 1.4.4 has just been uploaded to PEAR and is now available to install. This is primarily a bug fix release, although there are a couple of nice new sniff features that some developers may find useful, including a new sniff to run CSS Lint on your CSS files.

Special thanks to the 6 developers who directly contributed code to this release. Your help is always very much appreciated.

You can view the full changelog on the PHP_CodeSniffer download page.


PHP_CodeSniffer 1.4.3 released

PHP_CodeSniffer version 1.4.3 has just been uploaded to PEAR and is now available to install.

This is primarily a bug fix release, although support for the upcoming PHP 5.5 T_FINALLY token has been added, a few PSR-2 issues have been fixed and a change has been made to improve Composer support.

Special thanks to the 4 developers who directly contributed code to this branch. Your help is always very much appreciated.

You can view the full changelog on the PHP_CodeSniffer download page.

Just a special note to say that my wife and I are expecting another baby in less than 2 weeks, so PHP_CodeSniffer development is likely to reduce significantly over the Christmas/New Year period. But please keep sending in bug reports and pull requests. I'll review them all when I get back, or if I get some time over the holidays.   


PHP_CodeSniffer general memory improvements

PHP_CodeSniffer version 1.3.6 introduced memory improvements when using the summary report. You can take a look at the improvements that were reported at the time when running over the Symfony2 codebase.

Obviously, the summary report is only useful when running PHP_CodeSniffer from the command line and viewing the output. It won't be used in continuous integration systems, or in text editors, or by developers actually fixing errors.

But these improvements were just the start of a bigger change to try and restructure the PHP_CodeSniffer reporting system to stop it relying on so much memory during a run over a large number of files with a large number of errors. These changes are now in development and are available in a new git branch for testing right now.

To test these new changes, I again ran the PEAR coding standard over a Symfony2 checkout. Symfony2 doesn't use the PEAR coding standard, so this is a good way to generate a lot of errors and warnings over a large code base.

The PEAR standard generates 29,572 errors and 11,752 warnings in 2,049 different files. Here is a summary of the time and memory usage, as reported by PHP_Timer:

Type of run                                     Time (seconds)   Memory (MB)
Old code with warnings (--report=full)          54               189.25
Old code without warnings (--report=full -n)    52               140.75
New code with warnings (--report=full)          52               75.00
New code without warnings (--report=full -n)    51               75.00

Clearly, the memory improvements are significant. There is probably a bit more that can be squeezed out depending on the type of report being shown, but that is pretty close to what can be expected from reports like CSV, XML and Checkstyle, that are typically used by automated tools and continuous integration systems.

Smaller code bases and/or fewer errors and warnings will further reduce the memory usage and you may actually see larger improvements on your own projects.

To summarise the changes: instead of keeping errors and warnings in memory so reports can be generated at the end of a run, PHP_CodeSniffer now writes a partial report to a file after each file is processed. At the end, the reporting system is given this cached data and is able to add headers and footers before it is finally output to screen. If the report is going to be written to a file instead, that file will be used throughout the run for the partial reports. Otherwise, a temp file is written to the current directory and removed once the run has been completed.
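
The general technique can be sketched in a few lines of standalone PHP. This is an illustration of the caching idea only, not the actual reporting code: runWithCachedReport(), the report layout and the use of the system temp directory (rather than the current directory) are all assumptions.

```php
<?php
// Hypothetical sketch: write a partial report after each file, then wrap
// the cached text in a header and footer once the run has finished.
function runWithCachedReport(array $files)
{
    $cache = tempnam(sys_get_temp_dir(), 'phpcs-report');
    if ($cache === false) {
        throw new RuntimeException('Could not create a temp file');
    }

    $totalErrors = 0;
    foreach ($files as $path => $errors) {
        $partial = '';
        foreach ($errors as $line => $message) {
            $partial .= $path.':'.$line.': '.$message."\n";
        }

        // Append this file's report; its errors no longer need to stay in memory.
        file_put_contents($cache, $partial, FILE_APPEND);
        $totalErrors += count($errors);
    }

    $cached = file_get_contents($cache);
    unlink($cache);

    return "REPORT\n".$cached.'TOTAL: '.$totalErrors." error(s)\n";
}

// Made-up results: file path => (line number => message).
$report = runWithCachedReport(
    array(
        'a.php' => array(3 => 'Missing doc comment'),
        'b.php' => array(),
    )
);
```

Only a running total and the current file's messages are held in memory at any time, which is why memory usage in the sketch stays flat no matter how many errors are found.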

If you would like to try out these changes, you can grab the code from a new branch on the git repository. The easiest way to do this is to run the following commands:

git clone git://github.com/squizlabs/PHP_CodeSniffer.git
cd PHP_CodeSniffer
git checkout report-memory-improvements
php scripts/phpcs /path/to/code ...

To achieve these improvements, it has been necessary to remove a couple of rarely used features. So far, only support for multi-file sniffs (of which there are none in the core distribution) has been removed and there is no longer a shortcut to print both the summary and source reports at the same time, but this can still be done using the standard reporting arguments.

Other than that, PHP_CodeSniffer should still work exactly the same. If not, please get in touch and let me know. The best way is to contact me on Twitter, or submit bug reports and pull requests.


Squiz Labs

R & D division of Squiz Pty Ltd

Open source web experience management solutions

Squiz Labs is the research and development division of Squiz, the company behind the Squiz Suite of web experience management and development tools.

Our PHP and JavaScript developers are responsible for producing innovative enterprise-quality software while our interface designers and testing team ensure our products both look great and are easy to use.