Friday, January 11, 2013

Problems due to Same Origin Policy -- printing contents of a pdf from a url

The same origin policy is one of the biggest problems you run into when your page has to work with content hosted on another site.

Let us consider this scenario:

We have a url which hosts a pdf and we want to print it from our web page on click of a button. That is the point at which we must look at the Same Origin Policy.

What is Same Origin Policy : 

In computing, the same origin policy is an important security concept for a number of browser-side programming languages, such as JavaScript. The policy permits scripts running on pages originating from the same site – a combination of scheme, hostname, and port number – to access each other's methods and properties with no specific restrictions, but prevents access to most methods and properties across pages on different sites. Same origin policy also applies to XMLHttpRequest.

Let us consider these two urls :

http://www.example.com/dir/page.html  /  http://www.example.com/dir2/other.html --> Success, since the protocol, host and port are the same

Since the above two urls have the same protocol, host and port, resource sharing is possible here.

However, suppose it had been something like:

http://www.example.com/dir/page.html  /  http://www.goIbibo.com/dir2/other.html --> Failure

The above two links do not share the same host, so they are different origins and the same origin policy applies. Resource sharing is not possible here.
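The comparison above can be sketched in a few lines of JavaScript. This is only an illustration: the helper name isSameOrigin is made up for this post, and it relies on the standard URL object, whose origin property combines scheme, host and port.

```javascript
// Hypothetical helper: returns true when both urls share scheme, host and port.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Same scheme, host and port: same origin.
console.log(isSameOrigin(
  'http://www.example.com/dir/page.html',
  'http://www.example.com/dir2/other.html'
));

// Different host: different origin.
console.log(isSameOrigin(
  'http://www.example.com/dir/page.html',
  'http://www.goIbibo.com/dir2/other.html'
));
```

The path plays no part in the comparison, which is why /dir/page.html and /dir2/other.html still count as the same origin.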

Workarounds :

To enable developers to, in a controlled manner, circumvent the same origin policy, a number of "hacks" such as using the fragment identifier or the window.name property have been used to pass data between documents residing in different domains. With the HTML5 standard, a method was formalized for this: the postMessage interface, which is only available on recent browsers. JSONP can also be used to enable ajax-like calls to other domains.
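As a rough sketch of the postMessage route, the receiving page might check where a message came from before using it. The names TRUSTED_ORIGIN and readTrustedMessage are assumptions made up for this example; only the message event and its origin and data properties come from the browser.

```javascript
// Hypothetical origin that we are willing to accept messages from.
const TRUSTED_ORIGIN = 'http://www.example.com';

// Returns the message payload when it comes from the trusted origin,
// otherwise null.
function readTrustedMessage(event, trustedOrigin) {
  if (event.origin !== trustedOrigin) {
    return null;
  }
  return event.data;
}

// Browser-only wiring; skipped when there is no window object.
if (typeof window !== 'undefined') {
  window.addEventListener('message', function (event) {
    const data = readTrustedMessage(event, TRUSTED_ORIGIN);
    if (data !== null) {
      console.log('Received from trusted origin:', data);
    }
  });
}
```

The sending page would call postMessage on a reference to the other window and pass the target origin as the second argument. Note that this only passes messages between cooperating pages; it does not by itself give you the contents of a pdf on someone else's server.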

In other words, working around the policy is possible, but it is rarely straightforward.

Consider a classic scenario :

As stated earlier, you want to print a pdf hosted on another site from your web page. How can you do this?

Well, let us say you have fished out some code like this ::

Inside your body tag you have this ::


<input type="submit" class="btn-red" value="Print"
name="Submit" id="printbtn"
onclick="printPDF('http://www.irs.gov/pub/irs-pdf/fw4.pdf')" />

Inside your head tag you have this ::


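function printPDF(pdfUrl) {
    if (navigator.appName == 'Microsoft Internet Explorer') {
        // window.print() takes no arguments, so pdfUrl is ignored here
        // and the current page is printed instead of the pdf.
        window.print(pdfUrl, "_self");
    } else {
        // Navigate the current window to the pdf, then try to print it.
        var w = window.open(pdfUrl, "_self");
        w.print();
        w.close();
    }
}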

So you are trying to print the contents of the URL http://www.irs.gov/pub/irs-pdf/fw4.pdf using javascript. However, your web site and the host that serves the pdf are different. Hence this does not work.

Furthermore, window.print() always prints just the content of the current window, so all you will see is your own html markup and not the pdf.

So what do you do?

Probably you will now think: hang on, I can use this idea:


I can open an iframe and then call window.print() inside of it. I can probably get away with hiding the iframe by giving it position: absolute; left: -9999px.

This is the code that you make up :


function printPDF(pdfUrl) {
    var iframe = document.createElement('iframe'),
        iframeDocument;

    // Move the iframe off screen instead of showing it.
    iframe.style.position = 'absolute';
    iframe.style.left = '-9999px';
    iframe.src = pdfUrl;
    document.body.appendChild(iframe);

    // contentWindow is a window object, so take its document.
    if ('contentWindow' in iframe) {
        iframeDocument = iframe.contentWindow.document;
    } else {
        iframeDocument = iframe.contentDocument;
    }

    // Inject a script that asks the iframe to print itself.
    var script = iframeDocument.createElement('script');
    script.type = 'text/javascript';
    script.innerHTML = 'window.print();';
    iframeDocument.getElementsByTagName('head')[0].appendChild(script);
}

But even this idea will not work. The iframe holds a document from a different host, so the browser blocks our script from reaching into it, and the pdf contents are never available to print.

Idea No 2 ::

Some people might think: let's have a print.css and set display to none for every element except the one I want to print. But whatever you do, you still end up printing with window.print(), which prints the current page and cannot reach the pdf on the other host. So this will not work either.

Idea No 3 ::

Now you will say: alright, I will use pdf.js and extract the pdf into text. Then I will reconstruct the pdf from the text on my side and print it. But that idea is a flop as well, because pdf.js has to fetch the pdf from script and is therefore bound by the same origin policy too.

So is there no way of doing this?
Hang on, I did not say that.

The costliest way ::

The same origin policy can be relaxed using CORS.

What is CORS :

It stands for Cross-Origin Resource Sharing, a mechanism for controlling cross-site HTTP requests.

Cross-site HTTP requests are HTTP requests for resources from a different domain than the domain of the resource making the request. For instance, a resource loaded from Domain A (http://domaina.example), such as an HTML web page, makes a request for a resource on Domain B (http://domainb.foo), such as an image, using the img element (http://domainb.foo/image.jpg). This occurs very commonly on the web today: pages load a number of resources in a cross-site manner, including CSS stylesheets, images, scripts and other resources.

How it works :

The Web Applications Working Group within the W3C has proposed the new Cross-Origin Resource Sharing (CORS) recommendation, which provides a way for web servers to support cross-site access controls, which enable secure cross-site data transfers.  Of particular note is that this specification is used within an API container such as XMLHttpRequest as a mitigation mechanism, allowing the crossing of the same-domain restriction in modern browsers.  The information in this article is of interest to web administrators, server developers and web developers.  Another article for server programmers discussing cross-origin sharing from a server perspective (with PHP code snippets) is supplementary reading.  On the client, the browser handles the components of cross-origin sharing, including headers and policy enforcement.  The introduction of this new capability, however, does mean that servers have to handle new headers, and send resources back with new headers.
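To make the "new headers" part concrete, here is a minimal sketch of what the server hosting the pdf would have to do. The function name buildCorsHeaders and the ALLOWED_ORIGINS list are assumptions invented for this example; Access-Control-Allow-Origin is the actual response header defined by CORS.

```javascript
// Hypothetical list of origins that are allowed to read the pdf.
const ALLOWED_ORIGINS = ['http://www.example.com'];

// Builds the response headers for the pdf. The CORS header is only added
// when the requesting origin is on the allow list.
function buildCorsHeaders(requestOrigin, allowedOrigins) {
  const headers = { 'Content-Type': 'application/pdf' };
  if (allowedOrigins.includes(requestOrigin)) {
    headers['Access-Control-Allow-Origin'] = requestOrigin;
    // Tell caches that the response depends on the Origin request header.
    headers['Vary'] = 'Origin';
  }
  return headers;
}

console.log(buildCorsHeaders('http://www.example.com', ALLOWED_ORIGINS));
```

In a Node http server the result could be passed to res.writeHead(200, ...) with req.headers.origin as the first argument. The catch is that this code has to run on the server that hosts the pdf, not on yours.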

So all this is fine and dandy, but it is extremely difficult in practice: the CORS headers have to be sent by the server that hosts the pdf, and when that server belongs to someone else you usually cannot change what it sends.

Moral of the story: don't try to do this. If you have a url whose contents you want to print, host the same file from your own end. Things become much easier then. Otherwise it becomes a very difficult problem to handle.

I recently faced a similar problem, so I have shared what I learned while researching it. I sincerely hope it helps other people who are trying similar things.
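Once the pdf is served from your own site, the iframe idea from earlier has a chance of working, because the iframe and the page share an origin. The following is a sketch only: printLocalPdf and the path /docs/fw4.pdf are made-up names, and whether print() on an iframe holding a pdf behaves well depends on the browser's pdf viewer.

```javascript
// Loads a pdf served from our own origin into a hidden iframe and prints it.
// `doc` defaults to the page's document; it is a parameter only so the
// sketch can be exercised outside a browser.
function printLocalPdf(pdfPath, doc) {
  doc = doc || document;
  const iframe = doc.createElement('iframe');

  // Keep the iframe off screen.
  iframe.style.position = 'absolute';
  iframe.style.left = '-9999px';

  // Wait until the pdf has loaded before asking the iframe to print.
  iframe.onload = function () {
    iframe.contentWindow.focus();
    iframe.contentWindow.print();
  };

  iframe.src = pdfPath;
  doc.body.appendChild(iframe);
  return iframe;
}
```

The button from the earlier markup could then call printLocalPdf('/docs/fw4.pdf') with a path on your own host instead of the remote url.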

Good Bye .
