You have a few more options...
If the HTML is really XHTML, you can parse it as XML.

If the HTML is not XHTML, but the only tags it leaves unclosed are a limited set of void elements (br, hr, img and the like), you can convert those tags to their self-closing form and then parse the HTML as XML.
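As a rough illustration of the second option, here is a minimal Python sketch. It assumes the input is otherwise well-formed: the tags listed in VOID_TAGS are the only unclosed ones, attribute values are quoted and contain no ">" characters, and there are no HTML-only entities such as &nbsp;. The names VOID_TAGS, close_void_tags and parse_html_as_xml are made up for this example; only re and xml.etree.ElementTree come from the standard library.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: these are the only tags left unclosed in the input.
VOID_TAGS = ("br", "hr", "img", "input", "meta", "link")

# Matches <br>, <br/>, <img src="x"> and so on, capturing the tag name
# and its attributes. The \b keeps "br" from matching inside "<broken>".
_VOID_RE = re.compile(
    r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(VOID_TAGS),
    re.IGNORECASE,
)


def close_void_tags(html):
    """Rewrite every listed void tag in its self-closing form."""
    return _VOID_RE.sub(lambda m: "<%s%s/>" % (m.group(1), m.group(2)), html)


def parse_html_as_xml(html):
    """Close the void tags, then hand the result to an XML parser."""
    return ET.fromstring(close_void_tags(html))


if __name__ == "__main__":
    root = parse_html_as_xml('<div><p>Hello<br>world</p><img src="a.png"></div>')
    print(root.find("img").get("src"))
```

If the document really is XHTML (the first option), the conversion step is unnecessary and ET.fromstring can be called on it directly. A regex rewrite like this is only reasonable because the set of tags is small and known in advance; it is not a general HTML fixer.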
I have had...