AdSense Enzymes

AdSense Enzymes are very simple.

At the simplest level of abstraction, I can directly transclude the custom field in which I’ve stored the ad code. I can do it either by means of a statement in the content of a post or a page, or by means of a call to the metabolize function (available in Enzymes 1.1) in the PHP code of a WordPress template file.

The former method is useful when I want to place an ad unit in a particular/variable position inside the content of a post or a page; the latter method is useful when I want to place an ad unit in a general/constant position inside the blog.

For example, if I put the statement {[1.ad001]} here, it would render the ad unit right here, because in the first post I’ve stored the ad code in a custom field called ad001. But to make the ad unit appear before any post, I need to find the line in the index.php file of my default theme that reads

and replace it with this line

Taking the abstraction one step further, I’d like to store the string 1.ad001 in a new home field, so that I can provide a separation layer that makes it possible for me to replace the ad code simply by changing the content of a field, rather than having to change the php file again.

Indirect transclusion is not directly available in Enzymes, but it can be easily achieved by means of a simple enzyme like this

    preg_match( '/'.$this->e['substrate'].'/', $this->substrate, $matches );
    return $this->item( $matches['sub_id'], $matches['sub_key'] );

I’ve called this enzyme get, and I’ve put it into the first post. So the Enzymes statement becomes {[1.get(1.home)]} and the edited line for the index.php file becomes

I’m currently using the latter for my blog home, so I don’t have to worry about placing ads every time I post a new log, and the former for my pages, so that I can place the ads inside the content, in a position that I hope will fit better.

Why does Google limit the number of ad units per page to three? Is there a technical reason?

Number of submatches of a regular expression

Chili development started because I found a bug at the core of Code Highlighter, and wanted to fix it. The bug was inside the snippet used to count the number of submatches of a given regular expression.

That number is central to the working of a clever parsing engine, which matches one big expression against the target once, rather than matching many smaller expressions many times.

The big expression is built by alternating the smaller ones, so that each of them becomes a submatch of the big expression, as in this example: (A)|(B)|(C).

But that is just the tip of the iceberg, because each of the smaller expressions can in turn have many submatches, which add up to the total number of submatches returned in an array as the result of the big match.

If a match is found, then submatches[0], which holds the global match, is certainly not empty, and neither is submatches[x], where x is the index of the first smaller expression that matched.

In the above example, if A, B, and C each had 0 submatches, then submatches[1] would be non-empty if A matched, otherwise submatches[2] would be non-empty if B matched, otherwise submatches[3] would be non-empty if C matched.

But if the number of submatches of A were nA, of B nB, and of C nC, then the x for (A) would be 1, for (B) it would be (1+nA)+1, and for (C) it would be (1+nA)+(1+nB)+1.

Now we have all the info required for detecting the smaller expression that matched the target: look at submatches[1], else at submatches[2+nA], or else at submatches[3+nA+nB]. The first of them that is non-empty belongs to the expression that matched.
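The index arithmetic can be sketched in JavaScript. This is only an illustration, not Chili's actual code: the three smaller expressions, the field names name, source and groups, and the functions buildBig and whichMatched are all made up for the example, and the submatch counts are written in by hand.

```javascript
// Hypothetical smaller expressions; `groups` is the hand-counted number of
// submatches of each one (nA, nB, nC in the text).
const parts = [
  { name: "number", source: "(\\d+)(\\.\\d+)?", groups: 2 }, // nA = 2
  { name: "word",   source: "[a-z]+",           groups: 0 }, // nB = 0
  { name: "quoted", source: '"([^"]*)"',        groups: 1 }, // nC = 1
];

// Build (A)|(B)|(C) and record the index x of each wrapping submatch.
function buildBig(parts) {
  let next = 1; // submatches[0] is the global match
  const indexed = parts.map((p) => {
    const entry = { name: p.name, index: next };
    next += 1 + p.groups; // skip the wrapper and its inner submatches
    return entry;
  });
  const big = new RegExp(parts.map((p) => "(" + p.source + ")").join("|"));
  return { big, indexed };
}

// Return the name of the smaller expression that matched, or null.
function whichMatched(target) {
  const { big, indexed } = buildBig(parts);
  const submatches = big.exec(target);
  if (!submatches) return null;
  // JavaScript leaves the submatches of unused alternatives undefined.
  const hit = indexed.find((e) => submatches[e.index] !== undefined);
  return hit ? hit.name : null;
}
```

With these counts the wrapping submatches sit at indexes 1, 4 (that is 2+nA) and 5 (that is 3+nA+nB), so whichMatched("hello") inspects submatches[4] to find that the second expression matched.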

The number of open parentheses is related to the number of submatches. However, it is not exactly that number, due to the following exceptions:

  • parentheses are also used for temporary grouping
  • parentheses can be escaped, and considered part of the target
  • the escaping device can be escaped

Instead of trying to count the open parentheses by means of only one expression that accounts for the above exceptions, I’ve found it’s cleaner to use three separate steps.

  1. re = X.replace( /\\./g, "%" )
    this removes any escaped character
  2. re = re.replace( /\[.*?\]/g, "%" )
    this removes any character class
  3. nX = ( re.match( /\((?!\?)/g ) || [] ).length
    this matches all the open parentheses not followed by a “?”

In particular:

  1. This step disables any escaped backslash or open parenthesis (as well as any other escaped character, but I don’t care). This way I’m done with the issue of escaping based on the use of the backslash sign. X represents the source of the regular expression under examination
  2. This step disables any open parenthesis inside any character class (as well as any other character inside any character class, but I don’t care). In fact those open parentheses could have been written without escaping them, because inside a character class they are literal by default
  3. This step is just the classical short definition of what describes a submatch in a regular expression. nX represents the number of submatches of X
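Put together, the three steps fit in one small function. This is a sketch under the assumption that the regular expression is passed in as source text (for example the source property of a RegExp); the function name numberOfSubmatches is chosen here for illustration.

```javascript
// Count the submatches of a regular expression given as source text.
function numberOfSubmatches(source) {
  const re = source
    .replace(/\\./g, "%")      // step 1: disable any escaped character
    .replace(/\[.*?\]/g, "%"); // step 2: disable any character class
  // step 3: count the open parentheses not followed by "?"
  return (re.match(/\((?!\?)/g) || []).length;
}

// Usage: take the source of existing RegExp objects.
const counts = [/(a)(b)/, /(?:a)(b)/, /\(a\)/, /[(](x)/].map(
  (r) => numberOfSubmatches(r.source)
);
```

The order of the steps matters: escaped characters are removed first, so an escaped “]” cannot end a character class early in step 2. One caveat for newer JavaScript engines: a named group such as (?<name>x) does capture but starts with “(?”, so this count would miss it.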

Enzymes 1.1 Released Today – updated

Changes

  • the title and the excerpt are now directly supported, together with the content
  • any location of a WordPress theme is now supported

Example

  • open the php file you are interested in (eg: sidebar.php)
  • select the location where you want the enzyme’s result to appear (eg: before the last closing ‘div’)
  • paste there a code like this
  • save and test
  • generic custom field keys are now supported. Just wrap your international, multiworded key in a pair of ‘=’ and you are done (any ‘=’ your key may include must be escaped by ‘\’)

Example

  • escaped: {[.=水=]}
  • metabolized: mizu (水) means water
  • no bug fixes, hence no need to upgrade, except for getting the new features
  • backward compatibility preserved, hence no reason not to upgrade 🙂

Files