HTML_Safe has just been released as a PEAR package, in a first beta, 1.0.0RC1. Its main goal is to strip all potentially dangerous content from HTML. SafeHTML uses HTMLSax to parse the HTML.

Dangerous content within HTML includes:

  • opening tag without its closing tag 
  • closing tag without its opening tag 
  • any of these tags: “base”, “basefont”, “head”, “html”, “body”, “applet”, “object”,
    “iframe”, “frame”, “frameset”, “script”, “layer”, “ilayer”, “embed”, “bgsound”,
    “link”, “meta”, “style”, “title”, “blink”, “xml” etc.
  • any of these attributes: on*, data*, dynsrc
  • javascript:/vbscript:/about: etc. protocols
  • expression/behavior etc. in styles
  • any other active content
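
To make a few of these rules concrete, here is a minimal sketch of how the attribute, protocol and style checks could look. The function name is_dangerous_attribute() is hypothetical and is not part of HTML_Safe; the package's real implementation may differ, and this sketch does not decode HTML entities, which a real filter would have to do.

```php
<?php
// Hypothetical helper, NOT part of HTML_Safe: flags an attribute as
// dangerous using three of the rules from the list above.
function is_dangerous_attribute(string $name, string $value): bool
{
    $name = strtolower($name);

    // Rule: on*, data* and dynsrc attributes
    if (preg_match('/^(on|data)/', $name) || $name === 'dynsrc') {
        return true;
    }

    // Remove whitespace and control characters before looking at the
    // value, because browsers tolerate them inside a scheme name.
    $compact = strtolower(preg_replace('/[\s\x00-\x1f]+/', '', $value));

    // Rule: javascript:, vbscript: and about: protocols
    if (preg_match('/^(javascript|vbscript|about):/', $compact)) {
        return true;
    }

    // Rule: expression/behavior inside inline styles
    if ($name === 'style' && preg_match('/expression|behavior/', $compact)) {
        return true;
    }

    return false;
}
```

With this sketch, an onclick attribute or an href starting with javascript: is flagged, while an ordinary href pointing to an http:// URL is not.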

I like small classes that offer useful functionality, and HTML_Safe is very easy to use:


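// Load the PEAR package (the HTML_Safe class lives in HTML/Safe.php)
require_once 'HTML/Safe.php';
$safehtml = new HTML_Safe();
// $doc holds the untrusted HTML; parse() returns the cleaned markup
$result = $safehtml->parse($doc);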

It is interesting because it deals only with security: there is no configuration for stripping some tags and keeping others, which could itself become a security problem.

The beta was released only today, so it may take some time before it is available for download from PEAR. In the meantime, you can get it from the SafeHTML website.