About
This parser was inspired by HTML Parser for PHP-4, but it was written from the ground up for PHP 5 and optimized for speed.
The Simple HTML Parser might be described as a stream parser. It divides HTML into nodes (tags, text, or comments) and returns them one at a time in the order they appear in the source.
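To make the stream-parsing idea concrete, here is a minimal sketch in PHP. It is not this library's code or API: the class name SketchStreamParser, the method nextNode(), and the array shape it returns are all hypothetical, chosen only to illustrate splitting HTML into tag, text, and comment nodes and handing them back one at a time in source order. It also ignores the hard cases a real parser has to handle, such as attribute values containing ">" or embedded scripts.

```php
<?php
// Illustrative sketch only. SketchStreamParser and nextNode() are
// hypothetical names, not part of the Simple HTML Parser library.
class SketchStreamParser
{
    private $html;
    private $len;
    private $pos = 0;

    public function __construct($html)
    {
        $this->html = $html;
        $this->len = strlen($html);
    }

    // Returns the next node as array('type' => ..., 'value' => ...),
    // or null once the whole input has been consumed.
    public function nextNode()
    {
        if ($this->pos >= $this->len) {
            return null;
        }

        // Comments are checked first so that "<!--p></p-->" is read as
        // one comment rather than being cut at the first ">".
        if (substr($this->html, $this->pos, 4) === '<!--') {
            $end = strpos($this->html, '-->', $this->pos + 4);
            $stop = ($end === false) ? $this->len : $end + 3;
            return $this->take('comment', $stop);
        }

        // Anything else starting with "<" is treated as a tag that runs
        // up to the next ">".
        if ($this->html[$this->pos] === '<') {
            $end = strpos($this->html, '>', $this->pos + 1);
            $stop = ($end === false) ? $this->len : $end + 1;
            return $this->take('tag', $stop);
        }

        // Otherwise the node is text, running up to the next "<".
        $end = strpos($this->html, '<', $this->pos);
        $stop = ($end === false) ? $this->len : $end;
        return $this->take('text', $stop);
    }

    // Cuts out the node ending just before $stop and advances the cursor.
    private function take($type, $stop)
    {
        $value = substr($this->html, $this->pos, $stop - $this->pos);
        $this->pos = $stop;
        return array('type' => $type, 'value' => $value);
    }
}

// Usage: pull nodes one at a time until the input is exhausted.
$parser = new SketchStreamParser('<p>Hi<!-- note --></p>');
while (($node = $parser->nextNode()) !== null) {
    echo $node['type'], ': ', $node['value'], "\n";
}
// Prints:
// tag: <p>
// text: Hi
// comment: <!-- note -->
// tag: </p>
```

Because the caller pulls one node at a time, nothing beyond the current node has to be built or held in memory, which is the main appeal of a stream parser over one that builds a full document tree.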
Usage
Download
You can get the latest version here. The version number is listed as a property of the class.
This software is released under the GNU Lesser General Public License.
To Do
Things I would like to see added:
- Handle embedded JavaScript better: Currently, if the JavaScript outputs any HTML tags, the parser will get confused unless the code block is wrapped in HTML comment tags or a CDATA block.
Change Log
- Oct 22, 2008 - Accepted commented-out tags such as "<!--p></p-->" as valid comments. Patch contributed by Roman Kirillov http://sigizmund.com
If you make any modifications to this library, please email me a patch so that you can help improve the library for everyone.