Thursday, November 29, 2007

Encoding "data" URLs

"data" URLs can sometimes be useful for embedding small pieces of data when hosting a file is impossible or undesirable.

Data URLs support two encodings: standard URL escaping and base64. (Base64 output is technically URL-escaped as well, but none of the 65 symbols it uses, 64 digits plus the "=" padding character, actually requires escaping.) Which encoding produces the shorter URL depends on what fraction of the data's characters must be escaped. URL escaping triples the size of each escaped character (a percent sign followed by two hexadecimal digits) but leaves the other characters intact, whereas base64 increases the size of the whole payload by ~33%, turning every 3 bytes into 4 characters.
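As a rough illustration, here is a minimal Python sketch that builds a data URL with each encoding. The function names and the default "text/plain" media type are made up for this example. It also assumes a deliberately conservative escaping rule (everything except letters, digits and "_.-~" is escaped), which is stricter than URLs strictly require.

```python
import base64
from urllib.parse import quote


def data_url_escaped(data: bytes, mime: str = "text/plain") -> str:
    """Build a data URL whose payload is percent-escaped.

    Assumption: safe="" makes quote() escape every byte except
    ASCII letters, digits and "_.-~".
    """
    return f"data:{mime}," + quote(data, safe="")


def data_url_base64(data: bytes, mime: str = "text/plain") -> str:
    """Build a data URL whose payload is base64-encoded."""
    return f"data:{mime};base64," + base64.b64encode(data).decode("ascii")


sample = b"Hello, World!"
print(data_url_escaped(sample))  # data:text/plain,Hello%2C%20World%21
print(data_url_base64(sample))   # data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
```

With this sample the escaped payload is 19 characters and the base64 payload is 20, so escaping wins by a hair; text with more punctuation or any binary data would tip the balance the other way.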

This means that base64 is more efficient whenever x, the fraction of the document's characters that require escaping, exceeds 1/6 (~17%):

    3x + (1 - x) > 4/3
    2x + 1 > 4/3
    6x + 3 > 4
    6x > 1
    x > 1/6 ≈ 16.7%
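The break-even point can be checked empirically. The sketch below is only an illustration: the helper names are hypothetical, the escaping rule is the conservative one used by Python's quote() with safe="", and the fixed ";base64" marker in the URL header is ignored, as it is in the derivation above.

```python
import base64
from urllib.parse import quote


def payload_lengths(data: bytes) -> tuple[int, int]:
    """Return (escaped_length, base64_length) for the payload part only."""
    escaped = quote(data, safe="")    # each escaped byte becomes 3 characters
    encoded = base64.b64encode(data)  # every 3 bytes become 4 characters
    return len(escaped), len(encoded)


def prefer_base64(data: bytes) -> bool:
    """True when the base64 payload is strictly shorter than the escaped one."""
    escaped_len, base64_len = payload_lengths(data)
    return base64_len < escaped_len


# 6 bytes with exactly 1 escaped character (x = 1/6): the two encodings tie.
print(payload_lengths(b"abcde "))  # (8, 8)
# 6 bytes with 2 escaped characters (x = 1/3): base64 is shorter.
print(prefer_base64(b"abcd  "))    # True
```

In practice it is probably simplest to encode the payload both ways and keep the shorter result, since base64 padding and the exact escaping rules shift the threshold slightly for short inputs.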
