I have a bunch of files from which I am parsing out a simple tag with this regex:
<institution content-type="division">(.+?)?<\/institution>

Everything is going fine until I hit this result:
2) School of Chemical &amp; Biomolecular Engineering

This is complaining because it thinks the ) is unmatched. How do I get it to accept whatever is in the capture? I tried playing around with \Q and \E, but that isn't working. To be clear, I won't know what is in this tag; it could be anything at all, and the tag is simple enough that I don't want to use an XML parser.
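
For illustration only, here is a minimal sketch of one way this error can arise. It assumes the captured text is later reused inside another pattern, which the post does not actually say. The post names no language, so the sketch uses Python, with `re.escape` as a stand-in for the \Q...\E quoting mentioned above. The sample line and the names `TAG_RE`, `extract_division`, and `compiles_as_pattern` are hypothetical.

```python
import re

# The pattern from the post; "/" needs no escaping in Python.
TAG_RE = re.compile(r'<institution content-type="division">(.+?)?</institution>')

# Hypothetical input line containing the problematic value.
SAMPLE = ('<institution content-type="division">'
          '2) School of Chemical &amp; Biomolecular Engineering'
          '</institution>')


def extract_division(text):
    """Return the captured tag content, or None if the tag is absent."""
    match = TAG_RE.search(text)
    return match.group(1) if match else None


def compiles_as_pattern(fragment):
    """Report whether a string is itself a valid regular expression."""
    try:
        re.compile(fragment)
        return True
    except re.error:
        return False


division = extract_division(SAMPLE)

# Assumed later step: reusing the raw capture as a pattern fails,
# because the ")" is parsed as regex syntax rather than literal text.
raw_is_valid = compiles_as_pattern(division)

# Escaping the capture first makes every character match literally.
escaped = re.escape(division)
found_literal = re.search(escaped, SAMPLE) is not None
```

The extraction itself succeeds here. In this sketch the unmatched-parenthesis error only appears when the captured string is fed back in as a pattern, so the quoting belongs around the interpolated value at that later step, not around the capture group.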

submitted by /u/sirhalos