Some time ago I wrote Entity enzyme, or The pacman effect strikes back. It was an article about the pacman effect of the ampersand in WordPress and how to try to solve it with a simple enzyme. Recently I’ve discovered that it’s an escaping / unescaping issue still unresolved in WordPress 2.1, and it’s somewhat nastier than I initially thought.
If you want to show an HTML entity code literally in a post, for example the code for an ampersand (&amp;), there is no way to get it right: WordPress will always resolve the entity code.
Thus, if you write &amp;, you get & after the first save roundtrip. If you try to escape the & of &amp; as &amp;, obtaining &amp;amp; (this looks weird, but it’s the way to do it in plain HTML), then the first time you save the post you get &amp; back, and the second time you save it you get & back.
In general, if you write & followed by any number of amp; (like &amp;amp;amp;amp;amp;amp;), WordPress will make the & eat up one amp; on each save roundtrip, hence the pacman effect.
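The behaviour described above can be modelled in a few lines. This is only a sketch in Python, not WordPress code: the function names save_roundtrip and pacman_history are made up for illustration, and the assumption is that each save decodes exactly one level of entities, which is what the observed behaviour suggests.

```python
import html


def save_roundtrip(text: str) -> str:
    """Hypothetical model of one save: decode a single level of entities.

    html.unescape makes one pass over the string, so "&amp;amp;" becomes
    "&amp;" and not "&". This mirrors the observed behaviour; it is not
    the actual WordPress code.
    """
    return html.unescape(text)


def pacman_history(text: str, limit: int = 20) -> list[str]:
    """Return the text after each save until it stops changing."""
    history = [text]
    for _ in range(limit):
        saved = save_roundtrip(history[-1])
        if saved == history[-1]:
            break  # nothing left for the ampersand to eat
        history.append(saved)
    return history


if __name__ == "__main__":
    for step, value in enumerate(pacman_history("&amp;amp;amp;")):
        print(step, value)
```

Under that assumption, an ampersand followed by n copies of amp; needs n saves before it is reduced to a bare &.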
But the title of my previous post about this issue was “the pacman effect strikes back,” and for a reason: the problem affects not only the post content but also the custom fields, which corrupts the last resort we WordPress bloggers have for storing content AS IS. And this is where I get really upset.
Last Tuesday I found the culprit and asked the wp-hackers list whether they considered it a bug or not. I’m still waiting for an answer, so I hope this post will help me broaden the question and understand whether I need to submit it to WordPress Trac or go in for a hack myself.