I've wrestled with this one, and I now take a "location specific" approach to it.
If the caption is displayed over a thumbnail, the whole caption - title and comment/description - is an active link. On a folder thumbnail it opens the folder, and on an image thumbnail it opens the lightbox (or the slide page, in the case of a "slide page" skin). In short, the captions have the same anchors as the thumbnail images themselves.
To avoid collisions (a user-entered anchor nested inside the caption's own anchor), I strip all HTML tags out of the user-entered title and comment/description when they're used in this context. This does mean that they lose things like embedded bold and italic tags as well as embedded anchors, but that's just too bad.
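For anyone who wants to try the same thing, here is a rough sketch of the tag stripping in Python. It is not the code my skin actually uses; the name `strip_tags` is made up for the example, and it assumes the caption is an ordinary HTML fragment. It keeps the text inside the tags and throws away the tags themselves.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and ignores every tag (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.parts.append(data)


def strip_tags(caption: str) -> str:
    """Return the caption with all HTML tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(caption)
    parser.close()  # flushes any text still buffered after the last tag
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    print(strip_tags('A <b>bold</b> day at <a href="beach.html">the beach</a>'))
```

Because entities are decoded along the way, the result is plain text, so it should be escaped again (for example with `html.escape`) before it goes back into the page inside the thumbnail's anchor.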
For use elsewhere, like below a thumbnail or in the caption area of the lightbox, I leave the user-entered stuff alone.