BSON and Why MongoDB Developed It

You might have heard about JSON (JavaScript Object Notation), but have you heard about BSON?

And no, it's not pronounced "bison".

There are a few key differences between JSON and BSON (Binary JSON).

JSON - the predecessor of BSON 🧬

To understand what BSON is and why it was developed, we first need to know about the origin of JSON.

JSON, or JavaScript Object Notation, is a human-readable data interchange format, specified in the early 2000s. Even though JSON is based on a subset of the JavaScript programming language standard, it's completely language-independent.

Here is an example of a JSON document:

{
  "_id": 1,
  "name": { "first" : "John", "last" : "Backus" },
  "contribs": [ "Fortran", "ALGOL", "Backus-Naur Form", "FP" ],
  "awards": [
    {
      "award": "W.W. McDowell Award",
      "year": 1967,
      "by": "IEEE Computer Society"
    }, {
      "award": "Draper Prize",
      "year": 1993,
      "by": "National Academy of Engineering"
    }
  ]
}

To keep storage flexible, the MongoDB team wanted to store data in the JSON format.
But using JSON as a data storage format raised some issues.

Issues with JSON

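📌 JSON only supports a limited number of basic data types. Most notably, JSON lacks support for dates and binary data.

📌 JSON is a text format, so objects and properties don't have a fixed length and carry no length information. A parser has to scan through each value to find where it ends, which makes traversal slower.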
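
The first issue is easy to see for yourself. Here is a minimal sketch using Python's standard `json` module; the `record` dictionary and its field names are made up for illustration. Encoding a date or raw bytes fails, and the usual workaround of converting them to strings throws away the type information.

```python
import base64
import json
from datetime import datetime, timezone

# Hypothetical record: the field names and values are illustrative only.
record = {
    "name": "example",
    "created": datetime(2020, 1, 1, tzinfo=timezone.utc),  # a date value
    "blob": b"\x89PNG",                                     # raw binary data
}

def try_dump(value):
    """Return the JSON text, or the error message if encoding fails."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return f"TypeError: {exc}"

date_result = try_dump({"created": record["created"]})
binary_result = try_dump({"blob": record["blob"]})
print(date_result)
print(binary_result)

# Common workaround: store both as strings. The data survives, the types don't.
as_strings = json.dumps({
    "name": record["name"],
    "created": record["created"].isoformat(),
    "blob": base64.b64encode(record["blob"]).decode("ascii"),
})
round_trip = json.loads(as_strings)
print(type(round_trip["created"]).__name__)  # the date comes back as a plain str
```

After the round trip, nothing in the JSON text says that `created` was ever a date, so every reader of the data has to know that by convention.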

In order to make MongoDB JSON-first, but still high performance and general purpose, BSON was invented to bridge the gap: a binary representation to store data in JSON format, optimized for speed, space, and efficiency.

What is BSON (anyway)?

When you think of BSON, remember that it is not something entirely different from JSON. It's like how EVs are better than regular gas cars (in some ways).


🚨 No, BSON is not a scam. It provides some power-ups 💪 over JSON that are particularly useful for storage. But for data transfers over the network, JSON is still the No. 1 choice.

BSON stands for "Binary JSON," and that's exactly what it was invented to be.

👉 BSON's binary structure encodes type and length information, which allows it to be traversed much more quickly compared to JSON.

👉 BSON adds some non-JSON-native data types, like dates and binary data, without which MongoDB would have been missing some valuable support.
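
To make the traversal point concrete, here is a rough sketch in Python of how a reader can jump over a field it doesn't care about. It is not how MongoDB or any driver actually implements this: `find_field` is a hypothetical helper, the document is built by hand with `struct`, and only three element types from the BSON spec are handled (0x02 string, 0x10 int32, 0x01 double).

```python
import struct

def cstring(data, pos):
    """Read a NUL-terminated name starting at pos; return (name, next_pos)."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("utf-8"), end + 1

def find_field(doc, wanted):
    """Return (type_byte, value_offset) for a top-level field, or None.

    Unwanted values are skipped using their length, never scanned.
    """
    (total,) = struct.unpack_from("<i", doc, 0)   # int32 document size
    pos = 4
    while pos < total - 1:                        # last byte is the 0x00 terminator
        type_byte = doc[pos]
        name, pos = cstring(doc, pos + 1)
        if name == wanted:
            return type_byte, pos
        if type_byte == 0x02:                     # string: int32 length + bytes
            (length,) = struct.unpack_from("<i", doc, pos)
            pos += 4 + length
        elif type_byte == 0x10:                   # int32: always 4 bytes
            pos += 4
        elif type_byte == 0x01:                   # double: always 8 bytes
            pos += 8
        else:
            raise ValueError(f"unsupported type 0x{type_byte:02x}")
    return None

# Hand-built document: {"bio": "<1000 characters>", "year": 1993}
bio = b"x" * 1000 + b"\x00"
body = (
    b"\x02bio\x00" + struct.pack("<i", len(bio)) + bio
    + b"\x10year\x00" + struct.pack("<i", 1993)
    + b"\x00"
)
doc = struct.pack("<i", 4 + len(body)) + body

type_byte, offset = find_field(doc, "year")
(year,) = struct.unpack_from("<i", doc, offset)
print(hex(type_byte), year)  # 0x10 1993
```

The 1000-character `bio` value is skipped with a single addition. A JSON parser would have to read every one of those characters just to find the closing quote.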

Let's look at two examples of JSON documents and the bytes they become when stored as BSON:

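{"hello": "world"} →
\x16\x00\x00\x00           // total document size (22 bytes, little-endian int32)
\x02                       // 0x02 = type String
hello\x00                  // field name (null-terminated)
\x06\x00\x00\x00world\x00  // field value: length 6 (includes the trailing \x00), then the bytes
\x00                       // 0x00 = type EOO ('end of object')

{"BSON": ["awesome", 5.05, 1986]} →
\x31\x00\x00\x00                               // total document size (49 bytes)
 \x04BSON\x00                                  // 0x04 = type Array, field name "BSON"
 \x26\x00\x00\x00                              // size of the embedded array document (38 bytes)
 \x02\x30\x00\x08\x00\x00\x00awesome\x00       // 0x02 = String, key "0", length 8, "awesome"
 \x01\x31\x00\x33\x33\x33\x33\x33\x33\x14\x40  // 0x01 = Double, key "1", value 5.05
 \x10\x32\x00\xc2\x07\x00\x00                  // 0x10 = Int32, key "2", value 1986
 \x00                                          // end of the array document
 \x00                                          // end of the outer document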
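
If you want to check the two byte listings above yourself, the following is a minimal sketch of an encoder in Python. It only covers the four types that appear in these examples (string, double, int32, array) and is nowhere near a full BSON implementation; `encode_element` and `encode_document` are names made up for this post. For real work, use the `bson` package that ships with a MongoDB driver.

```python
import struct

def encode_element(name, value):
    """Encode one key/value pair as: type byte, cstring name, payload."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        raw = value.encode("utf-8") + b"\x00"
        # String payload: int32 length (counting the trailing NUL), then bytes.
        return b"\x02" + key + struct.pack("<i", len(raw)) + raw
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)   # 8-byte double
    if isinstance(value, int) and not isinstance(value, bool):
        # Assumption: the value fits in 32 bits; larger ints raise struct.error.
        return b"\x10" + key + struct.pack("<i", value)
    if isinstance(value, list):
        # An array is stored as a document whose keys are "0", "1", "2", ...
        as_doc = {str(i): item for i, item in enumerate(value)}
        return b"\x04" + key + encode_document(as_doc)
    raise TypeError(f"unsupported type: {type(value).__name__}")

def encode_document(doc):
    """Encode a dict as: int32 total size, elements, 0x00 terminator."""
    body = b"".join(encode_element(k, v) for k, v in doc.items()) + b"\x00"
    return struct.pack("<i", 4 + len(body)) + body

hello = encode_document({"hello": "world"})
mixed = encode_document({"BSON": ["awesome", 5.05, 1986]})
print(hello)
print(len(hello), len(mixed))  # 22 49
```

The sizes match the `\x16` (22) and `\x31` (49) prefixes in the listings. Notice that the size has to be written before the body, which is why the sketch builds the body first and then prepends its length.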

You can learn more about the BSON grammar here 👉: BSON Specifications

JSON vs BSON

JSON and BSON are basically cousins 👫 that share some features, but not all of them.

One thing that sets BSON apart from JSON is better data type support. JSON has a single number type, while BSON distinguishes integers (32-bit and 64-bit) from floating-point numbers. 😎😎

This is a great advantage when it comes to data storage and operations, because the same operation can give different results on integers than on floating-point numbers: integer arithmetic is exact within its range, while floating-point arithmetic can round.
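
Here is a small Python sketch of why the distinction matters. The number `2**53 + 1` is chosen because it is the first integer a 64-bit double cannot represent exactly; the type codes in the comments (0x12 for int64, 0x01 for double) come from the BSON spec, and the rest is just illustration.

```python
import struct

big = 2**53 + 1                 # 9007199254740993

as_int = big                    # stays exact as an integer
as_float = float(big)           # the nearest double is 9007199254740992.0

# The same "+ 1" gives different answers depending on the numeric type.
int_result = as_int + 1         # 9007199254740994
float_result = as_float + 1.0   # rounds back to 9007199254740992.0

# Division is another everyday difference.
int_division = 7 // 2           # 3
float_division = 7.0 / 2        # 3.5

# BSON stores the two kinds with different type bytes and payloads,
# so a reader knows which one it is getting.
int64_payload = struct.pack("<q", as_int)      # would follow type byte 0x12
double_payload = struct.pack("<d", as_float)   # would follow type byte 0x01

print(int_result, float_result)
print(int_division, float_division)
print(int64_payload != double_payload)
```

With plain JSON, a value like `9007199254740993` is just "a number", and whether it survives intact depends on the parser that reads it.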