Part 2 - Distributed data

This part covers the move from data stored on a single machine to data distributed across multiple machines.

Why would you do that?

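* Scalability: the data volume or the read/write load outgrows what a single machine can handle
* Fault tolerance / Higher availability: other machines can take over when one fails
* Latency: data can be served from a location geographically close to the user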

Distributing data

* Replication
    * Copy of the same data on different nodes
    * Provides redundancy: if some nodes are unavailable, the data can still be served from the others
* Partitioning (also called sharding)
    * Splitting data into multiple sub-sets (partitions)

These two approaches can be combined: each partition (shard) can itself be replicated across several nodes.
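As a rough illustration of how partitioning and replication fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example, not something from these notes: the node names, the partition count, the replication factor, and the placement rule (hash the key to pick a partition, then put its copies on consecutive nodes). Real systems use more elaborate placement and rebalancing schemes.

```python
import hashlib

# Hypothetical cluster layout, chosen only for illustration.
NODES = ["node-a", "node-b", "node-c", "node-d"]
NUM_PARTITIONS = 8
REPLICATION_FACTOR = 2  # number of copies kept of each partition


def partition_for(key: str) -> int:
    """Partitioning: map a key to one partition using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def replicas_for(partition: int) -> list[str]:
    """Replication: place copies of a partition on consecutive nodes."""
    return [
        NODES[(partition + i) % len(NODES)]
        for i in range(REPLICATION_FACTOR)
    ]


def nodes_for(key: str) -> list[str]:
    """All nodes that hold a copy of the partition containing `key`."""
    return replicas_for(partition_for(key))


if __name__ == "__main__":
    for key in ["user:1", "user:2", "order:17"]:
        print(key, "-> partition", partition_for(key), "on", nodes_for(key))
```

Each key lives in exactly one partition, and each partition lives on `REPLICATION_FACTOR` distinct nodes, so losing a single node does not make any partition unavailable in this toy layout.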
