Jonasfj.dk/blog

Jonasfj.dk/Blog
A blog by Jonas Finnemann Jensen

December 29, 2012

Introducing BJSON.coffee for Binary JSON Serialization

Filed under: Computer,English by jonasfj at 2:15 pm

Lately, I’ve been working an web application which will need to save binary blobs inside JSON objects. Looking around the web it seems that base64 encoding is the method of choice in these cases. However, this adds a 30% overhead and decoding large base64 strings to Javascript typed arrays (ArrayBuffer) is an expensive tasks.

So I’ve been looking at different binary data formats: BSON, Protocol Buffers, Smile Format, UBJSON, BJSON and others. Eventually, I decided to give BJSON a try for the following reasons.

BJSON is easy to make a lightweight implementation
It can encapsulate any JSON object
BJSON documents can be represented as JSON objects with ArrayBuffers for binary blobs.

My primary motivation is the fact that BJSON can serialize ArrayBuffers, as an added bonus a BJSON encoding of JSON object is typically smaller than the traditional string encoding with JSON.stringify(). Now, I’m sure there is valid arguments to use another binary encoding of JSON objects, so I’m going to stop with the arguments and talk code instead…

Well, time to introduce BJSON.coffee, a CoffeeScript implementation of BJSON for modern browsers. Aparts from null, booleans, numbers, arrays and dictionaries also available JSON, the BJSON specification also defines the inclusion of binary data. The specification notes that “this is not fully transcodable“, but as you might have guessed BJSON.coffee uses ArrayBuffers to represent binary data.

Essentially, BJSON.serialize takes a JSON object that is allowed to contain ArrayBuffers and serializes to a single ArrayBuffer. While, BJSON.parse takes an ArrayBuffer and returns a JSON object which may contain ArrayBuffers.For those interested in using BJSON instead of a normal string encoding of JSON objects, there is both good and bad news. The bad news is that UTF-8 string encoding in modern browsers is so slow, that BJSON is slower than a conventional string encoding of JSON objects. Although, this might not be the case when/if the string encoding specification is implemented.

The good news is that the BJSON encoding is 5-10% smaller than the conventional string encoding of JSON objects. The table/terminal output below from my testing script, shows some common JSON objects harvested from common web APIs.

Test:                   size (JSON):   size (BJSON):  Compression:   Success:
bitly-burstphrases.json    17714          16372            16%       true    
bitly-clickrate.json         104             88            24%       true    
bitly-hotphrases.json      65665          60592            16%       true    
bitly-linkinfo.json          497            468            10%       true    
complex.json                 424            384            25%       true    
flickr-hottags.json         3610           3205            11%       true    
twitter-search.json        12968          11994            10%       true    
yahoo-weather.json          1241           1172             5%       true    
youtube-comments.json      26296          24938             5%       true    
youtube-featured.json     110873         104422             5%       true    
youtube-search.json        93202          88473             5%       true

BJSON.coffee is available at github.com/jonasfj/BJSON.coffee. It should work in all modern browsers with support for typed arrays, Firefox 15+, Chrome 22+, IE 10+, Opera 12.1+, Safari 5.1+. However, I have pushed a github page which runs unit-tests in the browser and shows compatibility results from other browsers using Browserscope. So please visit it here, click “run tests” and help figure out where BJSON.coffee works.

Update: Being bored today I decided to a quick jsperf benchmark of JSON.stringify and BJSON.serialize to see how much slower BJSON.serialize is. You can find the test here, which seems to suggest that BJSON.serialize might be unreasonably slow at the moment. However, it seems that slow UTF-8 encoding is responsible for much of this, and I believe it is possible to improve the current UTF-8 encoding speed.

Comments (0)

December 11, 2012

Last.fm Radio API Update – Marks the end of TheLastRipper

Filed under: Computer,English,TheLastRipper by jonasfj at 7:07 pm

Back in high school I started TheLastRipper, an audio stream recorder for last.fm. The project started as product for a school project on copyright and issues with piracy. It spawn from the unavailability of non-DRM infested music services. Which at the time drove many teenagers to piracy. Whilst, you the morality of recording internet radio can be argued. All the research we did at the time, showed it to be perfectly legal. Nevertheless, I’m quite happy that I didn’t have to defend this assertion.

Anyways, I’m glad to see that the music industry didn’t sleep for ever. These days we have digital music stores and I’m quite happy to pay for DRM-free music, and as bonus I get to support the artists as well. So it can hardly comes as surprise that I haven’t used TheLastRipper for years. Nor have I contributed to the project or attended bug reports for years. In fact it has been years since the project saw any active development.

I do occasionally get an email or see a bug report from an die-hard user of TheLastRipper, and it’s in the wake of such an email I’ve decide write this post. Partly because more people will ask why it stopped working, and partly because I had a lot of fun with project and it deserves a final post here at the end. Personally, I’ve long thought the project dead, I know the Linux client have been, so I was surprised to that anybody actually noticed it when the Radio API was updated.

As you may have guessed from the title of this post TheLastRipper has died from being unmaintained while last.fm have updated their APIs. Honestly, I’m quite surprised last.fm have continued to support their old unofficial API for as long as they have. So this is by no means the result of last.fm taking action against TheLastRipper. Truth be told, I’m not even sad that it’s finally dead. The past many years, state of TheLastRipper have been quite embarrassing. The code base is ugly, buggy and completely unmaintainable.

Over the years, TheLastRipper have been downloaded more than 475.000 times, distributed with magazines (Computer Bild) and featured in countless blog posts from around the world. Which, considering that this started as a high school project is pretty good. It’s certainly been a great adventure and I’ve worked with a lot of people from around the world. So here at the of my last post on TheLastRipper, I’d like to say thanks to all the bug reporters, comment posters, testers, developers and people who hopefully also had fun participating in this project.

Oh, and to the few die-hard users out there, I’ll recommend that you buy your digital music from one of the many DRM-free music stores you can find 🙂

Comments (8)