How to write a safe catch-all RegExp

In my Chili recipes I use the regexp /(?:.|\n)/ when I want to match each and every character. A dot alone is not enough, because a dot doesn’t match \n (AKA new line AKA line feed AKA 0x0A). Since I strip every \r (AKA carriage return AKA 0x0D) from the input before trying a match, the regexp I use is almost safe. Almost, because it ignores \u2028 and \u2029, two additional (Unicode) line terminators that a dot doesn’t match either.

In short, and in general, a safe catch-all regexp is /(?:\w|\W)/ which means any word char or any non-word char.
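
To make the difference concrete, here is a minimal sketch in plain JavaScript. The variable names and the sample comment string are mine, chosen only for illustration; they do not come from Chili. It checks each pattern against the four line terminators, then tries both patterns on a block comment that contains a \u2028.

```javascript
// The four characters a plain dot refuses to match in JavaScript.
const lineTerminators = ["\n", "\r", "\u2028", "\u2029"];

const dotOnly = /^.$/;
const dotOrNewline = /^(?:.|\n)$/;
const catchAll = /^(?:\w|\W)$/;

// For a given regexp, report which line terminators it matches.
function report(re) {
  return lineTerminators.map((ch) => re.test(ch));
}

const dotResults = report(dotOnly);               // matches none of them
const dotOrNewlineResults = report(dotOrNewline); // matches \n only
const catchAllResults = report(catchAll);         // matches all four

// A block comment whose body contains a Unicode line separator.
const sample = "/* first\u2028second */";
const fragile = /\/\*(?:.|\n)*?\*\//;
const safe = /\/\*(?:\w|\W)*?\*\//;

const fragileMatches = fragile.test(sample); // false: stuck at \u2028
const safeMatches = safe.test(sample);       // true

console.log(dotResults, dotOrNewlineResults, catchAllResults);
console.log(fragileMatches, safeMatches);
```

The \r case is included only for completeness: it does not matter when every \r has been removed from the input beforehand.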

So, if you are concerned with these Unicode details and can’t wait till the next release of Chili, I advise you to search for (?:.|\n) and replace every occurrence with (?:\w|\W).

How to add semantics to WordPress posts

Browsing the WordPress ideas repository, I found one that could help WordPress bridge the gap between a blog tool and a knowledge base: Structured Blogging.

The fact that it got 60 votes but only a 50% star rating means two things:

  1. voters are lazy thinkers
  2. it’s a difficult task at many levels

Nonetheless it’s a good idea, because this is, and will be for many years to come, the blogging era, in which millions of authors post content to the Internet. There is no good reason for that content to be unstructured except laziness and complexity.

Nothing can be done about laziness, but complexity can be substantially reduced. The solution needs:

  • a mechanism for adding XML tags from some dictionary
  • a template system for showing XML content appropriately

On the path to something readily usable (I hope Matt will provide it in WordPress 3.0), a mechanism could be a jQuery plugin for the visual editor, and a template system could be based on Enzymes.

Autocompletion

I think that TinyMCE can be extended with plugins, but I don’t know how. jQuery itself has an autocompletion plugin that could be used this way. When an author presses the < key, the autocompletion gets triggered and the popup … pops up 🙂

If you have ever used an IME (for Japanese, for example), you should know what I mean. Options are organized in a hierarchy and you get the most relevant ones at the top, depending on the few characters already typed. XML dictionaries could be ajaxed and locally cached; they could be plugins that you install into your blog, or web services; they could comply with international standards or be completely custom.

The most-relevant-first feature of the autocompletion popup would be a big help for authors. If I have already opened a tag, say Tag, then the most relevant options after pressing < could be:

  1. the closing tag for that Tag
  2. any tag that could be inside that Tag (properly sorted if needed)
  3. any root of the cached dictionaries (if possible in Tag)
  4. an option for accessing additional dictionaries (if possible in Tag)
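
The ordering above can be sketched in a few lines of JavaScript. Everything here is an assumption made for illustration: the dictionary shape (root, children, allowsRoots), the sample recipe and review dictionaries, and the function name rankSuggestions are hypothetical and do not belong to any existing plugin.

```javascript
// Hypothetical dictionary shape: a root tag, the tags allowed inside each
// tag, and the tags inside which another dictionary's root may be opened.
const recipeDictionary = {
  root: "recipe",
  children: { recipe: ["title", "ingredient", "step"] },
  allowsRoots: ["step"],
};
const reviewDictionary = {
  root: "review",
  children: { review: ["rating", "summary"] },
  allowsRoots: [],
};

const MORE_DICTIONARIES = "(more dictionaries...)";

// openTag: name of the innermost open tag, or null at the top level.
// typed: the characters typed so far after "<".
function rankSuggestions(openTag, typed, dictionaries) {
  const ranked = [];

  // 1. The closing tag for the currently open tag.
  if (openTag !== null) ranked.push("/" + openTag);

  // 2. Tags allowed inside the open tag, in dictionary order.
  for (const dict of dictionaries) {
    ranked.push(...(dict.children[openTag] || []));
  }

  // Options 3 and 4 apply only where a new root is possible.
  const rootsAllowed =
    openTag === null ||
    dictionaries.some((dict) => dict.allowsRoots.includes(openTag));

  // 3. Roots of the cached dictionaries.
  if (rootsAllowed) ranked.push(...dictionaries.map((dict) => dict.root));

  // Drop duplicates, then keep only names starting with the typed characters.
  const matching = [...new Set(ranked)].filter((name) =>
    name.startsWith(typed)
  );

  // 4. The option for accessing additional dictionaries always comes last.
  if (rootsAllowed) matching.push(MORE_DICTIONARIES);
  return matching;
}

const cached = [recipeDictionary, reviewDictionary];
const insideRecipe = rankSuggestions("recipe", "", cached);
const insideStep = rankSuggestions("step", "", cached);
const afterTypingI = rankSuggestions("recipe", "i", cached);
console.log(insideRecipe, insideStep, afterTypingI);
```

A real popup would also need the proper sorting hinted at in point 2. The sketch simply keeps the order in which each dictionary lists its tags.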

Templates

A PHP template for properly rendering XML content is very easy to write, and Enzymes could do the work behind the scenes. In fact, one Enzymes feature is the ability to easily transclude all the content of a post or a page by means of the special char *

The XML to HTML template can be transformed into an Enzymes template with a couple of PHP instructions, thus allowing you to almost copy and paste already available templates. Then you could apply a solution like this:

  1. write a private post or page with the XML content, using the autocompletion feature described above (say this post is #123)
  2. write a public post or page with the Enzymes statement {[123.* /template.php]} in its content

The post in 1. needs to be private because you don’t want it to appear on the blog as is, and 2. makes the transformation happen. As an added benefit, you can display that semantic content inside a larger context, and you can transclude it wherever you want in your WordPress blog (as many times as you like).