Showing posts from July, 2020

System design for Sharding a Database

Image
Database sharding is the process of splitting data across multiple servers so that an application can scale. The above diagram gives a clear idea of sharding a database. The upper part of the diagram shows a single database holding all the customers from 1 to 8. In the lower part, we have split that database into shards of 4. Sharding has many advantages: it makes the database much easier to maintain and easier to scale.

Let us design this database in detail. Our system has some requirements, so let's start with those and design the system with these assumptions in mind.

Requirements

Data size: Let's assume we have a few hundred TB of data.

Data partition: Data can be partitioned in many ways, and the right basis depends on the problem. We can partition by customer_id, location, items, inventory, and others.

Estimations

In this part, we will discuss rough estimations for our system.

Data part: In the