Lisp HTML sanitizer

Lately, I have been thinking a lot about how to let webapp users edit rich text easily while staying secure and injection-free. Until recently, I would just use the trane-bb module of CL-Trane and make users type BBCode into a textarea, since many users are familiar with it and I’d be able to convert their BBCode to safe HTML easily. However, all JavaScript WYSIWYG editors produce HTML, which is not that surprising. I googled around, read a bit about the issues with BBCode, Textile and other markup languages, and came to agree with Jeff Atwood (Is HTML a Humane Markup Language?) that HTML is the one genuinely friendly markup language. I was pleasantly surprised to see that Bese’s fork of Franz’s phtml actually supports HTML sanitizing, and (having contributed quite a bit to Bese a few years ago) not surprised at all that this feature is not described or documented anywhere. So, if you’re worried about accepting HTML (and if you’ve decided to accept HTML from users, you should be worried!), check this out:

darcs get