Found the Culprit of the Pacman Effect

Some time ago I wrote “Entity enzyme, or The pacman effect strikes back”, an article about the pacman effect of the ampersand in WordPress and how to work around it with a simple enzyme. Recently I’ve discovered that it’s an escaping / unescaping issue still unresolved in WordPress 2.1, and that it’s somewhat nastier than I initially thought.

If you want to write HTML entity codes in a post, and you need to show the code of the ampersand itself, for example (it’s &amp;), then there is no way to get it right: WordPress will always resolve an entity code.

Thus, if you write &amp;, you’ll get & after the first save roundtrip. If you try to escape the & of &amp; with another &amp;, obtaining &amp;amp; (this looks weird but it’s the way to do it in plain HTML), then the first time you save the post you get &amp; back, and the second time you save it you get & back.

In general, if you write & followed by any number of amp; (like &amp;amp;amp;amp;amp;amp;amp;), then WordPress will make the & eat up one amp; at each save roundtrip, hence the pacman effect.
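To make the roundtrip behaviour concrete, here is a minimal sketch in Python. It is only a toy model of what I observe, not WordPress’s actual code: the function names `save_roundtrip` and `simulate` are hypothetical, and the assumption is that each save decodes every &amp; into & exactly once.

```python
def save_roundtrip(text: str) -> str:
    """Toy model of one save: every '&amp;' is decoded to '&' once.

    Assumption: this mimics the observed behaviour only; it is not
    the code WordPress actually runs.
    """
    return text.replace("&amp;", "&")


def simulate(text: str, saves: int) -> list:
    """Return the text before the first save and after each save."""
    history = [text]
    for _ in range(saves):
        text = save_roundtrip(text)
        history.append(text)
    return history


if __name__ == "__main__":
    # The '&' eats one 'amp;' per save until only '&' is left.
    for step, value in enumerate(simulate("&amp;amp;amp;", 3)):
        print(step, value)
```

Under this model, once the text is down to a bare & further saves leave it alone, which matches what I see: the entity code can never survive in the stored post.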

But the title of my previous post about this issue was “the pacman effect strikes back” for a reason: the problem affects not only the post content but also the custom fields, thus corrupting the last resort we WordPress bloggers have for storing content AS IS. And this is where I get really upset.

Last Tuesday I found the culprit and asked the wp-hackers list whether they considered it a bug or not. I’m still waiting for an answer, so I hope this post will help me broaden the question and understand whether I need to submit it to WordPress Trac or go for a hack myself.
