Storage overhead of TokuMX

TokuMX is an open source performance engine for MongoDB that dramatically increases MongoDB performance in large applications with demanding requirements. In addition to replacing B-tree indexing with more modern technology, TokuMX adds transaction support, document-level locking for concurrent writes, and replication.

TokuMX provides many features that stock MongoDB lacks. Beyond those listed above, TokuMX claims that stored documents are compressed, so I thought it would be a good idea to test that claim, since we had already tested the storage overhead of MongoDB.


Storage overhead of MongoDB GridFS

Let’s have a look at how MongoDB stores files with GridFS.

Each GridFS bucket uses two collections: <bucket>.files and <bucket>.chunks. With the default bucket name, fs, these are fs.files and fs.chunks.

The fs.files collection stores metadata about each file, while the fs.chunks collection stores the actual file data.

The file is divided into chunks, which are stored in the chunks collection. The default chunk size is 256 KB, so a file larger than 256 KB needs more than one chunk. The last chunk is not padded up to 256 KB: a 10 KB file needs only one chunk, and that chunk holds only 10 KB.

MongoDB allocates files on disk to persist the documents. The first file is 16 MB in size. Each subsequent file is double the size of the previous one, until the file size reaches 2 GB. If the smallfiles parameter is used, the maximum file allocation size is 512 MB.
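To make the arithmetic concrete, here is a minimal Python sketch of both rules, using only the sizes quoted above. The function names chunk_lengths and data_file_sizes are made up for this illustration and are not part of MongoDB or any driver; the code does not talk to a database at all.

```python
KB = 1024
MB = 1024 * KB

# Default GridFS chunk size as quoted in the text (an assumption taken from it).
CHUNK_SIZE = 256 * KB


def chunk_lengths(file_size, chunk_size=CHUNK_SIZE):
    """Return the byte length of every chunk needed for a file of file_size bytes.

    All chunks are full except possibly the last one, which holds the remainder.
    """
    full_chunks, remainder = divmod(file_size, chunk_size)
    return [chunk_size] * full_chunks + ([remainder] if remainder else [])


def data_file_sizes(count, first=16 * MB, cap=2048 * MB):
    """Return the sizes of the first `count` data files.

    Each file is double the previous one until the cap is reached;
    after that every file has the cap size.
    """
    sizes = []
    size = first
    for _ in range(count):
        sizes.append(size)
        size = min(size * 2, cap)
    return sizes


if __name__ == "__main__":
    print(chunk_lengths(10 * KB))                  # [10240]
    print(len(chunk_lengths(600 * KB)))            # 3
    print([s // MB for s in data_file_sizes(9)])   # [16, 32, 64, 128, 256, 512, 1024, 2048, 2048]
```

Calling data_file_sizes with cap=512 * MB models the smallfiles limit mentioned above; the first-file size in that mode is not stated here, so the sketch leaves it at the default.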