Facebook uses a NoSQL graph API called TAO that runs on top of sharded MySQL. Facebook started with MySQL databases, but its data requirements grew too large to serve from those databases directly. TAO turns the existing sharded MySQL master-slave pairs into a scalable, geo-distributed database cluster. It allows objects and associations to be stored persistently in the same MySQL instance and cached on the same set of servers.
Facebook previously used the InnoDB storage engine for managing social activity data, and it built its own embedded database, RocksDB, for some of its storage needs.
Facebook is the world’s largest social media platform, with over 2.91 billion monthly active users. It’s no surprise that such a large website requires a lot of data storage.
In this blog post, we will discuss the different databases that Facebook uses to store their data. Keep reading!
TAO
Facebook created a data model and API, called TAO, specifically for managing the connections in its social network. TAO is a geographically distributed data store that can quickly serve requests from Facebook’s high-demand workloads. It runs on thousands of machines, stores petabytes of data, and can handle a billion reads and millions of writes every second, replacing memcache in certain situations.
The TAO paper, available on the Facebook research website, walks through the system architecture.
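To make the "objects and associations" idea concrete, here is a minimal in-memory sketch in Python. It is not Facebook's implementation: the ToyTao class, its signatures, and the "user", "post", and "liked_by" type names are hypothetical, and the method names only loosely echo the kinds of operations the TAO paper describes. There is no MySQL, sharding, or caching here, just typed nodes and typed, time-stamped edges.

```python
import itertools
import time
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TaoObject:
    """A typed node, e.g. a user or a post (hypothetical shape)."""
    id: int
    otype: str
    data: dict = field(default_factory=dict)


class ToyTao:
    """In-memory stand-in for an object/association store (illustration only)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._objects = {}
        # (id1, association type) -> {id2: (timestamp, data)}
        self._assocs = defaultdict(dict)

    def obj_add(self, otype, **data):
        obj = TaoObject(next(self._ids), otype, data)
        self._objects[obj.id] = obj
        return obj.id

    def obj_get(self, obj_id):
        return self._objects.get(obj_id)

    def assoc_add(self, id1, atype, id2, ts=None, **data):
        # Adding the same (id1, atype, id2) edge again overwrites it.
        stamp = time.time() if ts is None else ts
        self._assocs[(id1, atype)][id2] = (stamp, data)

    def assoc_count(self, id1, atype):
        return len(self._assocs.get((id1, atype), {}))

    def assoc_range(self, id1, atype, pos=0, limit=10):
        # Return target ids, newest association first.
        edges = self._assocs.get((id1, atype), {})
        ordered = sorted(edges.items(), key=lambda kv: kv[1][0], reverse=True)
        return [id2 for id2, _ in ordered[pos:pos + limit]]


tao = ToyTao()
alice = tao.obj_add("user", name="Alice")   # id 1
bob = tao.obj_add("user", name="Bob")       # id 2
post = tao.obj_add("post", text="Hello")    # id 3
tao.assoc_add(alice, "authored", post, ts=1)
tao.assoc_add(post, "liked_by", alice, ts=2)
tao.assoc_add(post, "liked_by", bob, ts=3)

print(tao.assoc_count(post, "liked_by"))    # 2
print(tao.assoc_range(post, "liked_by"))    # [2, 1] (Bob liked it most recently)
```

Keeping the edges for one (source, type) pair together is what makes questions like "how many likes does this post have?" or "who liked it most recently?" cheap to answer in this toy version.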
MySQL
MySQL is a free, open-source database management system and one of the most popular database systems in use today. It is known for its speed, reliability, and ease of use. According to Facebook, it is the primary database used to store user data.
MySQL is a relational database system, meaning that it organizes data into tables and columns. It supports a wide range of features, including transactions, foreign keys, views, triggers, and stored procedures. MySQL also has a rich set of APIs that allow you to access its functionality from your own code.
Facebook uses MySQL because of its performance and capabilities. Its architecture suits high-volume interactive applications that demand rapid response times.
InnoDB
Facebook used the InnoDB storage engine for managing social activities such as likes and comments. InnoDB is a free, open-source storage engine for MySQL, not a separate database or a fork of the MySQL codebase. It was created by Innobase Oy, which was later acquired by Oracle Corporation.
InnoDB supports transactions and row-level locking, so multiple users can access the same data at the same time without fear of corruption. It also has better crash recovery capabilities than older non-transactional engines such as MyISAM.
Facebook uses InnoDB because it is a stable, reliable engine that can handle large amounts of data. InnoDB also has built-in support for foreign keys, which are used to enforce referential integrity between tables. This is important for Facebook, as it needs to ensure that data in different tables remains consistent.
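Here is a small sketch of what transactions and foreign keys buy you. To keep it runnable without a MySQL server, it uses Python's built-in sqlite3 module as a stand-in for InnoDB, so treat it as an illustration of the concepts rather than of InnoDB itself. The posts and likes tables and the like_posts helper are made up for this example and are not Facebook's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked; InnoDB checks them by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE likes ("
    " user_id INTEGER NOT NULL,"
    " post_id INTEGER NOT NULL REFERENCES posts(id),"
    " PRIMARY KEY (user_id, post_id))"
)
conn.execute("INSERT INTO posts (id, body) VALUES (1, 'hello')")
conn.commit()


def like_posts(conn, user_id, post_ids):
    """Record several likes as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for post_id in post_ids:
                conn.execute(
                    "INSERT INTO likes (user_id, post_id) VALUES (?, ?)",
                    (user_id, post_id),
                )
        return True
    except sqlite3.IntegrityError:
        # e.g. a like that points at a post that does not exist
        return False


ok = like_posts(conn, 7, [1])        # post 1 exists, so this commits
bad = like_posts(conn, 8, [1, 99])   # post 99 does not exist, so nothing is kept
like_count = conn.execute("SELECT COUNT(*) FROM likes").fetchone()[0]
print(ok, bad, like_count)           # True False 1
```

The second call fails on the like for post 99, and the rollback also discards the valid like for post 1 from the same transaction. That all-or-nothing behavior is the consistency guarantee described above.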
RocksDB
RocksDB is an open-source embedded key-value database developed by Facebook (Meta). It is designed to be fast and efficient, and it offers several advantages over InnoDB in terms of space efficiency.
RocksDB uses less disk space because it uses compression by default. It also stores data in a log-structured merge-tree (LSM), which buffers writes in memory and flushes them to disk as sorted files, so random writes become sequential ones.
Facebook needed something that could handle the scale of its data, which is why it developed RocksDB. Some of the benefits of RocksDB include:
- Compression is used by default, so data takes up less disk space.
- Its LSM design batches writes into sequential disk operations, which keeps writes fast and efficient.
- It is highly configurable, so you can tune it to your specific needs.
- It performs well, is adaptable, and supports both basic and advanced operations.
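To show the log-structured merge-tree idea mentioned above, here is a toy sketch in Python. It is not how RocksDB is implemented: real RocksDB adds a write-ahead log, multiple levels, bloom filters, compression, and much more. The ToyLSM class and its memtable_limit parameter are hypothetical names used only for this illustration.

```python
import bisect

_TOMBSTONE = object()  # marks a deleted key until compaction drops it


class ToyLSM:
    """Tiny LSM-style key-value store: a memtable plus sorted, immutable runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # recent writes, held in memory
        self.runs = []       # sorted lists of (key, value), newest run first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        self.put(key, _TOMBSTONE)

    def flush(self):
        # Write the whole memtable out as one sorted run (a sequential write).
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key, default=None):
        if key in self.memtable:
            value = self.memtable[key]
            return default if value is _TOMBSTONE else value
        for run in self.runs:  # newest first, so the latest value wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                value = run[i][1]
                return default if value is _TOMBSTONE else value
        return default

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in reversed(self.runs):  # oldest first; newer runs overwrite
            merged.update(run)
        live = [(k, v) for k, v in merged.items() if v is not _TOMBSTONE]
        self.runs = [sorted(live)]


db = ToyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # memtable is full, so it is flushed as run #1
db.put("a", 3)
db.delete("b")      # second flush: run #2 shadows run #1
db.put("c", 4)      # stays in the memtable
print(db.get("a"), db.get("b"), db.get("c"))   # 3 None 4
db.compact()
print(db.runs)      # [[('a', 3)]]
```

Notice the trade-off: writes are cheap because they only touch memory and append sorted files, while a read may have to check several runs until compaction merges them.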
Although RocksDB has many advantages, it lacked some features that were important for Facebook, such as replication support and a SQL layer. So Facebook built MyRocks, which uses RocksDB as a new storage engine for MySQL. With the help of MyRocks, it was able to improve space efficiency by 50%.
Some benefits of MyRocks include:
- MyRocks compresses data very efficiently, which means it takes up less disk space.
- It offers faster replication because its read-free replication feature avoids random reads.
- When loading data, it can write directly to the bottommost level and so avoid compaction, which leads to faster data loading.
Conclusion
In conclusion, Facebook uses MySQL with TAO, which enables it to operate at the massive scale required for such a large social network. Facebook also previously used InnoDB and RocksDB to store and manage data. Each database has its own advantages and disadvantages, but overall they are all fast, reliable, and efficient.