How to work around the Same Origin policy

How do you programmatically import content from a web page on another domain? I run into this problem from time to time, mostly when I want to try something in JavaScript without bothering to deploy a proper setup. Recently it came up again, and once again I hit the thick wall of the Same Origin policy.

Recent browsers support CORS, but when you are fiddling with a third-party page you can’t reasonably ask its owners to have their hosting provider allow requests from your domain of choice, such as jsfiddle.net. Your only remaining option is JSONP, and since the third-party page won’t serve JSONP on its own, something in between has to do it. When you look around for a solution involving JSONP and the Same Origin policy, you inevitably find YQL, used as a proxy.

{[ .yql-select | 1.hilite(=mysql=) ]}
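To make the proxy idea concrete, here is a minimal sketch of how such a query could be turned into a JSONP request URL. The helper name `buildYqlUrl`, the endpoint URL, and the `select * from html` query shape are all assumptions for illustration, not taken from the snippets in this post.

```javascript
// Hypothetical helper: builds a JSONP request URL for a YQL-style proxy.
// The endpoint and the query shape are assumptions, not part of the plugin.
var YQL_ENDPOINT = 'https://query.yahooapis.com/v1/public/yql';

function buildYqlUrl(targetUrl, callbackName) {
  // Escape double quotes so the target URL cannot break out of the
  // string literal inside the query.
  var safeUrl = String(targetUrl).replace(/"/g, '%22');
  var query = 'select * from html where url="' + safeUrl + '"';
  return YQL_ENDPOINT +
    '?q=' + encodeURIComponent(query) +
    '&format=json' +
    '&callback=' + encodeURIComponent(callbackName);
}
```

The `callback` parameter is what makes the response JSONP: the proxy is expected to answer with a script that calls the named function, which is why the request can be issued from a `<script>` tag regardless of the Same Origin policy.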

YQL is very attractive because it can act as a free proxy with a simple query. If you are lucky, you can find an example of how to use it with jQuery. So, after some programming work, you end up with a nice jQuery plugin like this:

{[ .proxyGet | 1.hilite(=javascript=) ]}

A little digression. Disco-tools is a European effort to bring order to the messy terminology used for skills and competences, a mess only made worse by the many different languages used in European CVs. So they decided to publish a nice thesaurus of all possible skills in the world, properly translated into many languages. Now at version 2, the thesaurus can be browsed like a tree. If you watch the network traffic while clicking around, you will notice Ajax calls that fetch a piece of the subtree, like this (source view):

{[ .disco-source | 1.hilite(=javascript=) ]}

You might then want to use my nice jQuery plugin to get this cross-domain content through YQL. Notice that disco-tools.eu is very messy: it returns a JSON object but sends the wrong Content-Type (text/html). That calls for further filtering, because YQL seems unable to refrain from honoring the received Content-Type and always “corrects” the response by wrapping it in the body of a fictitious HTML page. So you’d end up with something like this:

{[ .yql-usage | 1.hilite(=javascript=) ]}

For plain JSON objects, the provided filter is enough to extract the good JSON part. But sometimes, as in this case, the wrong Content-Type combines badly with the unescaped HTML special characters inside a complex JSON object, and YQL’s “correction” goes much further, because it absolutely cannot stand authoring errors. So you’d get something like this:

{[ .yql-result | 1.hilite(=html=) ]}

I had to highlight the snippet above as HTML, because that is what YQL thinks it is. For example, you can see that every open HTML tag gets properly closed.

THAT IS EXTREMELY HARD TO FIX IN GENERAL !!!
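To illustrate why, here is a small sketch of a naive unwrapping filter. It is a hypothetical stand-in, not the filter from my plugin, and both sample inputs are made up: the first mimics a plain JSON payload wrapped in a page, the second mimics what an HTML parser does to a payload that contains an unclosed tag.

```javascript
// Hypothetical filter: strip the wrapper page added by the HTML parser
// and try to parse whatever is left as JSON.
function extractJson(wrappedHtml) {
  var text = String(wrappedHtml)
    .replace(/<\/?(html|head|body|p)\b[^>]*>/gi, '') // drop wrapper tags only
    .trim();
  try {
    return JSON.parse(text);
  } catch (e) {
    return null; // the payload was altered beyond a simple unwrap
  }
}

// Plain JSON survives the round trip.
var plain = extractJson(
  '<html><body><p>{"id": 1, "label": "skills"}</p></body></html>');

// JSON carrying markup does not: the parser "repaired" the payload by
// closing the open tag, so a stray </b> now sits after the closing brace.
var mangled = extractJson(
  '<html><body><p>{"label": "<b>skills"}</b></p></body></html>');
```

Here `plain` is the parsed object while `mangled` is `null`. Undoing the repair would mean guessing which closing tags were added by the parser and which were in the original data, and that cannot be decided from the output alone.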

I had no other choice but to deploy my own proxy and give up on YQL.

{[ .my-proxy | 1.hilite(=php=) ]}

And use it like this:

{[ .my-usage | 1.hilite(=javascript=) ]}

to get the same result the browser would get:

{[ .my-result | 1.hilite(=javascript=) ]}
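The property that makes a home-grown proxy work where YQL fails is that it treats the upstream response as opaque text and never runs it through an HTML parser. As a rough sketch of that idea in JavaScript (this is not the PHP proxy above; the function name `toJsonp` and the choice to deliver the body as a string are assumptions), the JSONP wrapping step could look like this:

```javascript
// Hypothetical core of a pass-through JSONP proxy.
// The upstream body is passed along as text, never parsed as HTML.
function toJsonp(callbackName, rawBody) {
  // Only allow plain identifiers as callback names, to avoid script injection.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name: ' + callbackName);
  }
  // JSON.stringify turns the raw text into a safely quoted string literal.
  // U+2028 and U+2029 are escaped because older JavaScript engines reject
  // them inside string literals.
  var literal = JSON.stringify(String(rawBody))
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
  return callbackName + '(' + literal + ');';
}
```

On the client side, the callback receives the original body byte for byte and can call `JSON.parse` on it, whatever Content-Type the remote server claimed.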