data URI scheme

From Wikipedia, the free encyclopedia

The data URI scheme is a URI scheme that provides a way to include data items in-line in a web page as if they were external resources. Data URIs tend to be simpler than alternative inclusion methods, such as MIME with cid or mid URIs. They are a form of Uniform Resource Locator, although they do not actually locate anything remotely. The data URI scheme is defined in RFC 2397 of the Internet Engineering Task Force (IETF).

Although the IETF published the data URI specification in 1998,[1] it never formally adopted the specification as a standard.[2] Nonetheless, the HTML 4.01 specification references the data URI scheme,[3] and data URIs are now implemented in most browsers.

Web browser support

Data URIs are currently supported by the following web browsers:

  • Gecko and its derivatives, such as Mozilla Firefox
  • Opera
  • KDE, through the KIO input/output system. This allows the KDE browser, Konqueror, to support data URIs.
  • Safari; although Safari's rendering engine, WebKit, is a derivative of Konqueror's KHTML engine, Mac OS X does not share the KIO slaves architecture so the implementations are not shared.
  • Google Chrome
  • Internet Explorer 8; Microsoft has limited support to certain "non-navigable" content, such as in <img> tags and CSS rules, for security reasons, including concerns that JavaScript embedded in a data URI may not be interpretable by script filters such as those used by web-based email clients.[4]

Advantages

  • HTTP request and header traffic is not required for embedded data, so data URIs consume less bandwidth whenever the overhead of encoding the inline content as a data URI is smaller than the HTTP overhead. For example, the required base64 encoding for an image 600 bytes long would be 800 bytes, so if an HTTP request required more than 200 bytes of overhead, the data URI would be more efficient.
  • When browsing a secure HTTPS web site, web browsers commonly require that all elements of a web page be downloaded over secure connections, or the user will be notified of reduced security due to a mixture of secure and insecure elements. HTTPS requests have significant overhead over common HTTP requests, so embedding data in data URIs may improve speed in this case.
  • Web browsers are typically configured to use a maximum of two concurrent connections to a server as per the HTTP specification, so inline data frees up a download connection for other content.
  • Environments with limited or restricted access to external resources may embed content when it is disallowed or impractical to reference externally. For example, an advanced HTML editing field could accept a pasted or inserted image and convert it to a data URI to hide the complexity of external resources from the user.
  • It's possible to manage a multimedia page as a single file.
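The break-even arithmetic in the first bullet can be checked with a short sketch. The function names and the 350-byte HTTP overhead figure are illustrative assumptions, and the fixed `data:image/png;base64,` prefix is ignored for simplicity.

```javascript
// Length of the padded base64 encoding of a resource of the given size.
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Extra bytes the base64 payload costs compared with the raw resource.
function base64Overhead(byteLength) {
  return base64Length(byteLength) - byteLength;
}

// True when inlining costs fewer extra bytes than a separate request,
// given an assumed per-request HTTP overhead (request plus response headers).
function inliningSavesBytes(byteLength, httpOverheadBytes) {
  return base64Overhead(byteLength) < httpOverheadBytes;
}

console.log(base64Length(600));            // 800
console.log(base64Overhead(600));          // 200
console.log(inliningSavesBytes(600, 350)); // true (350 is an assumed figure)
```

The real overhead of a request depends on the headers and cookies actually sent, so the threshold has to be measured per site rather than assumed.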

Disadvantages

  • Data URIs used more than once must be repeated each time they are used, increasing download time.
  • Data URIs cannot be cached separately from their containing documents and hence a data URI must be redownloaded every time its containing document is redownloaded.
  • Content must be re-encoded and re-embedded every time a change is made.
  • Internet Explorer through version 7 (some 70% of the market as of 2008 Q2) lacks support.
  • The beta version of Internet Explorer 8 limits data URIs to a maximum length of 32 kB.[4]
  • Data is included as a simple stream, and many processing environments (such as web browsers) may not support using containers (such as multipart/alternative or message/rfc822) to provide greater complexity such as metadata, data compression, or content negotiation.
  • Base64-encoded data URIs are roughly 33% larger in size than their binary equivalent. This is not quite as disadvantageous if the containing page is sent compressed using HTTP's Content-Encoding header.
  • Data URIs make it more difficult for security software to filter content.[5]
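The size penalty of base64, and the way compression of the containing page offsets it, can be observed with a small Node.js sketch. The pseudo-random bytes are a stand-in for real binary data such as a PNG, and the helper name and seed are assumptions made for illustration; exact sizes will vary with the input.

```javascript
const zlib = require('zlib');

// Deterministic pseudo-random bytes (a simple linear congruential generator),
// standing in for already-compressed binary data such as an image.
function pseudoRandomBytes(length, seed) {
  const bytes = Buffer.alloc(length);
  let state = seed >>> 0;
  for (let i = 0; i < length; i++) {
    state = (Math.imul(state, 1103515245) + 12345) >>> 0;
    bytes[i] = (state >>> 16) & 0xff;
  }
  return bytes;
}

const binary = pseudoRandomBytes(3000, 1);
const base64Text = binary.toString('base64');

// gzip is one of the codings a server may apply via Content-Encoding.
const gzipped = zlib.gzipSync(Buffer.from(base64Text, 'ascii'));

console.log('binary bytes:        ', binary.length);
console.log('base64 bytes:        ', base64Text.length);
console.log('gzipped base64 bytes:', gzipped.length);
```

Because each base64 character carries only six bits of information, a general-purpose compressor can usually recover most of the one-third expansion, although it cannot shrink the data below the size of the underlying binary.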

Format

data:[<MIME-type>][;charset=<encoding>][;base64],<data>

The ;base64 token indicates the encoding: if it is present, the data is encoded as base64. Without it, the data (as a sequence of octets) is represented using ASCII for octets within the range of safe URL characters and the standard %xx hexadecimal URL encoding for octets outside that range. If <MIME-type> is omitted, it defaults to text/plain;charset=US-ASCII. (As a shorthand, the type can be omitted while the charset parameter is still supplied.)
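The format rules above can be sketched as a pair of helpers for Node.js. The function names are hypothetical, the set of characters left unescaped is a conservative assumption, and the parser is deliberately minimal (for instance, it expects a lowercase data: prefix), so this is an illustration rather than a complete RFC 2397 implementation.

```javascript
// Percent-encode every octet outside a conservative "safe" set.
function percentEncode(bytes) {
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    out += /[A-Za-z0-9\-_.~]/.test(ch)
      ? ch
      : '%' + byte.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

// Reverse of percentEncode: turn %xx escapes back into octets.
function percentDecode(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const hex = text.slice(i + 1, i + 3);
    if (text[i] === '%' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      bytes.push(text.charCodeAt(i) & 0xff);
    }
  }
  return Buffer.from(bytes);
}

// Build a data URI from a Buffer, using base64 or percent-encoding.
function toDataUri(bytes, mimeType, useBase64) {
  const type = mimeType || '';
  return useBase64
    ? 'data:' + type + ';base64,' + bytes.toString('base64')
    : 'data:' + type + ',' + percentEncode(bytes);
}

// Split a data URI into media type, encoding flag, and decoded octets.
function parseDataUri(uri) {
  const comma = uri.indexOf(',');
  if (!uri.startsWith('data:') || comma === -1) {
    throw new Error('not a data URI');
  }
  let header = uri.slice('data:'.length, comma);
  const payload = uri.slice(comma + 1);
  const isBase64 = /;base64$/i.test(header);
  if (isBase64) header = header.slice(0, -';base64'.length);

  let mediaType = header;
  if (header === '') mediaType = 'text/plain;charset=US-ASCII'; // default
  else if (header.startsWith(';')) mediaType = 'text/plain' + header; // shorthand

  const data = isBase64 ? Buffer.from(payload, 'base64') : percentDecode(payload);
  return { mediaType, isBase64, data };
}

const greeting = Buffer.from('Hello, World!', 'ascii');
console.log(toDataUri(greeting, 'text/plain', false)); // data:text/plain,Hello%2C%20World%21
console.log(toDataUri(greeting, 'text/plain', true));  // data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
console.log(parseDataUri('data:,A%20brief%20note').mediaType); // text/plain;charset=US-ASCII
```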

Examples

HTML

An HTML fragment embedding a picture of a small red dot:

<img src="data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
vr4MkhoXe0rZigAAAABJRU5ErkJggg==" alt="Red dot" />

As demonstrated above, data URIs may contain whitespace for readability.
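As a rough check of the red dot example, the following Node.js sketch strips the whitespace, decodes the base64 payload, and reads the PNG header. The variable names are illustrative, and the byte offsets assume the usual PNG layout in which an IHDR chunk immediately follows the 8-byte signature.

```javascript
// The data URI from the HTML fragment above, line breaks included.
const redDotUri = `data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
vr4MkhoXe0rZigAAAABJRU5ErkJggg==`;

// Everything after the first comma is the payload; drop all whitespace.
const base64Part = redDotUri
  .slice(redDotUri.indexOf(',') + 1)
  .replace(/\s+/g, '');

const png = Buffer.from(base64Part, 'base64');

// Bytes 0-7: PNG signature. Bytes 16-19 and 20-23: IHDR width and height.
const signature = png.subarray(0, 8).toString('hex');
const width = png.readUInt32BE(16);
const height = png.readUInt32BE(20);

console.log(signature);     // 89504e470d0a1a0a
console.log(width, height); // 10 10
```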

CSS

A CSS rule that includes a background image:

ul.checklist > li.complete { margin-left: 20px; background:
  url('data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAABAAAAAQAQMAAAAlPW0iAAA
ABlBMVEUAAAD///+l2Z/dAAAAM0lEQVR4nGP4/5/h/1+G/5
8ZDrAz3D/McH8yw83NDDeNGe4Ug9C9zwz3gVLMDA/A6P9/A
FGGFyjOXZtQAAAAAElFTkSuQmCC') top left no-repeat; }

JavaScript

A JavaScript statement that opens an embedded subwindow, as for a footnote link:

window.open('data:text/html;charset=utf-8,%3C!DOCTYPE%20HTML%20PUBLIC%20%22-'+
  '%2F%2FW3C%2F%2FDTD%20HTML%204.0%2F%2FEN%22%3E%0D%0A%3Chtml%20lang%3D%22en'+
  '%22%3E%0D%0A%3Chead%3E%3Ctitle%3EEmbedded%20Window%3C%2Ftitle%3E%3C%2Fhea'+
  'd%3E%0D%0A%3Cbody%3E%3Ch1%3E42%3C%2Fh1%3E%3C%2Fbody%3E%0D%0A%3C%2Fhtml%3E'+
  '%0D%0A','_blank','height=300,width=400');

This example does not work with Internet Explorer 8 due to its security restrictions that prevent navigable file types from being used.[4]
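The percent-encoded string in the window.open example does not need to be written by hand. A minimal sketch, assuming the markup is available as an ordinary string, builds it with the standard encodeURIComponent function; the names html, htmlToDataUri, and footnoteUri are illustrative.

```javascript
// The markup that the example above embeds, with CRLF line endings.
const html =
  '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">\r\n' +
  '<html lang="en">\r\n' +
  '<head><title>Embedded Window</title></head>\r\n' +
  '<body><h1>42</h1></body>\r\n' +
  '</html>\r\n';

// Percent-encode the markup and prepend the data URI header.
function htmlToDataUri(markup) {
  return 'data:text/html;charset=utf-8,' + encodeURIComponent(markup);
}

const footnoteUri = htmlToDataUri(html);
console.log(footnoteUri);

// In a browser that permits it, the result could then be opened as before:
// window.open(footnoteUri, '_blank', 'height=300,width=400');
```

encodeURIComponent escapes more characters than the scheme strictly requires, which keeps the result safe to embed in a quoted string.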

Inclusion in HTML or CSS using PHP

Because base64-encoded data URIs are not human-readable, a website author might prefer to have the encoded data included in the page by a scripting language such as PHP. This has two advantages: if the included file changes, the HTML file needs no modification, and binary data stays separate from text-based formats. Disadvantages include greater server CPU use unless a server-side cache is used.

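<?php
// Build a base64 data URI for the given file and MIME type.
function data_url($file, $mime)
{
  $contents = file_get_contents($file);  // raw bytes of the file
  $base64   = base64_encode($contents);  // base64 text for the URI payload
  return ('data:' . $mime . ';base64,' . $base64);
}
?>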
 
<img src="<?php echo data_url('elephant.png','image/png'); ?>" alt="An elephant" />

Similarly, if CSS is processed by PHP, the above function may also be used:

<?php header('Content-type: text/css');?>
 
div.menu
{
  background-image:url(<?php echo data_url('menu_background.png','image/png'); ?>);
}

In either case, client-side or server-side feature detection or user-agent detection (for example, conditional comments) may be used to provide a standard http: URL to Internet Explorer and other older browsers that lack data URI support.
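The server-side cache mentioned earlier in this section could take many forms. The following is a minimal Node.js sketch rather than PHP, assuming an in-memory cache keyed by file path and invalidated by modification time; cachedDataUrl, the cache layout, and the temporary demo file are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// In-memory cache: file path -> { mtimeMs, mime, uri }.
const cache = new Map();
let encodeCount = 0; // counts real encodings, for demonstration only

// Return a base64 data URI for the file, re-encoding only when the
// file's modification time or the requested MIME type has changed.
function cachedDataUrl(file, mime) {
  const mtimeMs = fs.statSync(file).mtimeMs;
  const hit = cache.get(file);
  if (hit && hit.mtimeMs === mtimeMs && hit.mime === mime) {
    return hit.uri;
  }
  encodeCount += 1;
  const uri = 'data:' + mime + ';base64,' + fs.readFileSync(file).toString('base64');
  cache.set(file, { mtimeMs, mime, uri });
  return uri;
}

// Demonstration with a throwaway file in the system temp directory.
const demoFile = path.join(os.tmpdir(), 'data-uri-demo-' + process.pid + '.txt');
fs.writeFileSync(demoFile, 'hi');
const first = cachedDataUrl(demoFile, 'text/plain');
const second = cachedDataUrl(demoFile, 'text/plain'); // served from the cache
fs.unlinkSync(demoFile);

console.log(first);       // data:text/plain;base64,aGk=
console.log(encodeCount); // 1
```

A cache of this kind trades memory for CPU time; whether that is worthwhile depends on how large the embedded files are and how often pages are generated.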

See also

  • An alternative for attaching resources to an HTML document is MHTML, usually found in HTML email messages.
  • MIME, for the media types used in data URIs

References

  1. Masinter, L. (August 1998). "RFC 2397 - The "data" URL scheme". Internet Engineering Task Force. http://tools.ietf.org/html/rfc2397. Retrieved on 2008-08-12.
  2. "Proposed Standards". Official Internet Protocol Standards. Internet Society. 2009-01-04. http://www.rfc-editor.org/rfcxx00.html#Proposed. Retrieved on 2009-01-04.
  3. Raggett, Dave; Le Hors, Arnaud; Jacobs, Ian (1999-12-24). "Objects, Images, and Applets: Rules for rendering objects". HTML 4.01 Specification. W3C. http://www.w3.org/TR/1999/REC-html401-19991224/struct/objects.html#h-13.3.1. Retrieved on 2008-03-20.
  4. "data Protocol". MSDN. http://msdn.microsoft.com/en-us/library/cc848897%28VS.85%29.aspx. Retrieved on 2009-01-05.
  5. Masinter, L. (August 1998). "Security". RFC 2397 - The "data" URL scheme. Internet Engineering Task Force. 2. http://tools.ietf.org/html/rfc2397. Retrieved on 2008-08-12.
