
HTML purifying in php

I am writing a PHP class that has to remove all potentially dangerous elements and bogus HTML tags (such as bad links) from HTML source.

Usually I would use the HTML Purifier library or something similar,
but this project requires self-written code.

There are two conditions:

  1. The code cannot be larger than 3 kB
  2. It should execute really fast

I wrote something that could do the job: http://pihost.pl/purify.php
but I do not know if it is safe enough to use.

My question is:
Is there any way to test it properly?
Or maybe someone has a quick, small and tested library like this?

Ascon asked Dec 07 '10 20:12


1 Answer

An important thing to consider -- how does your purifier react to broken/malformed HTML? To combat that situation, I would suggest running it through PHP tidy first to clean up the HTML, before you purify it.
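As a rough illustration of the tidy step, here is a minimal sketch that repairs the markup before it reaches the purifier. The function name `normalizeHtml` and the chosen config options are my own assumptions, not part of the linked code, and it only does anything if the tidy extension is installed.

```php
<?php
// Hypothetical helper (not from the original purifier): repair malformed
// markup with the tidy extension so the purifier sees well-formed input.
function normalizeHtml(string $html): string
{
    if (!extension_loaded('tidy')) {
        // Assumption: without tidy we pass the input through unchanged;
        // the purifier must still run on the result either way.
        return $html;
    }

    $config = [
        'show-body-only' => true, // return only the body fragment, no <html>/<head>
        'output-xhtml'   => true, // close every tag explicitly
        'wrap'           => 0,    // do not re-wrap long lines
    ];

    $repaired = tidy_repair_string($html, $config, 'utf8');

    // tidy_repair_string() returns false on failure; fall back to the input.
    return $repaired === false ? $html : $repaired;
}

// Usage: normalize first, then hand the result to your own purifier.
$clean = normalizeHtml('<p>unclosed <b>bold');
```

Note that tidy only fixes structure. It does not remove scripts or event handlers, so it is a pre-processing step and not a replacement for the purifier itself.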

If you want a series of tests, you can try checking out the tests that HTMLPurifier uses.
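If you want something quick to start with before porting those tests, a small harness along these lines might help. This is only a sketch: `findUnsafeOutputs`, the three sample vectors and the regex checks are my own illustrative choices, and the regexes are crude heuristics that will miss many attacks. The purifier is passed in as a callable because I do not know the interface of your class.

```php
<?php
// Hypothetical harness: run known attack strings through a purifier and
// report the ones whose output still looks dangerous.
function findUnsafeOutputs(callable $purify, array $vectors): array
{
    // Heuristic patterns (an assumption, far from exhaustive).
    $dangerous = [
        '/<\s*script\b/i',          // a surviving <script> tag
        '/<[^>]*\son\w+\s*=/i',     // an inline event handler inside a tag
        '/<[^>]*javascript\s*:/i',  // a javascript: URL inside a tag
    ];

    $failures = [];
    foreach ($vectors as $vector) {
        $output = $purify($vector);
        foreach ($dangerous as $pattern) {
            if (preg_match($pattern, $output) === 1) {
                $failures[] = $vector;
                break; // one match is enough to flag this vector
            }
        }
    }
    return $failures;
}

// A few sample inputs; a real list should be much longer.
$xssVectors = [
    '<script>alert(1)</script>',
    '<img src=x onerror=alert(1)>',
    '<a href="javascript:alert(1)">x</a>',
];

// Replace 'htmlspecialchars' with a callable that invokes your purifier.
$failures = findUnsafeOutputs('htmlspecialchars', $xssVectors);
```

An empty `$failures` array only means none of these particular patterns survived, so treat it as a smoke test and not as proof that the purifier is safe.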

Vivin Paliath answered Sep 26 '22 11:09