Limits of distributed systems

Distributed systems are fundamentally different from programs that run on a single machine.

Faults and partial failures

A partial failure occurs when one part of a system becomes faulty while the rest keeps operating properly. Partial failures are nondeterministic: anything involving multiple nodes and the network may sometimes work and sometimes fail unpredictably.

Unreliable networks

Distributed systems are typically designed with a shared-nothing approach: a bunch of isolated machines connected by a network. The only way for them to exchange data is by sending requests over that network.

Whenever you try to send a packet over the network, it may be lost or arbitrarily delayed. Likewise, the reply may be lost or delayed, so if you don’t get a reply, you have no idea whether the message got through.

Figure: typical networking faults

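A client cannot tell where the fault occurred, so how do we deal with such faults? With a timeout: after waiting for some time without a reply, the client gives up and assumes the request failed.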

Client software must be able to handle these faults gracefully.
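The timeout approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production client: `call_with_timeout`, `call_with_retries`, and the `remote_call` parameter are hypothetical names, and any Python callable stands in for the real network request.

```python
import queue
import threading


def call_with_timeout(remote_call, timeout_s):
    """Run remote_call() in a worker thread; wait at most timeout_s for its reply."""
    replies = queue.Queue(maxsize=1)
    # Daemon thread: an abandoned call does not keep the program alive.
    worker = threading.Thread(
        target=lambda: replies.put(remote_call()), daemon=True
    )
    worker.start()
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        # No reply in time. The request may have been lost, the remote node may
        # be down or slow, or only the reply was lost: the caller cannot tell.
        raise TimeoutError(f"no reply within {timeout_s}s") from None


def call_with_retries(remote_call, timeout_s=0.5, attempts=3):
    """Retry on timeout.

    Only safe if remote_call is idempotent, because a request that timed out
    may still have been processed by the remote node.
    """
    for _ in range(attempts):
        try:
            return call_with_timeout(remote_call, timeout_s)
        except TimeoutError:
            continue
    raise TimeoutError(f"gave up after {attempts} attempts")
```

Choosing `timeout_s` is a trade-off: a short timeout detects faults quickly but may give up on a node that is merely slow, while a long timeout means a long wait before a fault is noticed.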

Unreliable clocks

A fascinating chapter, which you should really read for yourself.

Unless you work in a real-time environment, which is unlikely unless you build mission-critical embedded systems, you can't trust the clock: neither the one on the node you're running on nor those of the other nodes on the network. That holds for the time-of-day clock, which can jump backward or forward when it is synchronized, and for the monotonic clock, whose values cannot be compared across machines. This has serious consequences in distributed systems because it complicates everything, even simple things such as writing an entry to a database.
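To see why an untrusted clock complicates even a simple database write, here is a small sketch of a last-write-wins register fed by two nodes whose clocks disagree. Everything in it is an assumption made for illustration: the class and function names, the 5 ms skew, and the timestamps are invented, and last-write-wins is only one possible conflict-resolution strategy.

```python
class LastWriteWinsRegister:
    """Keeps the value carrying the highest timestamp; lower ones are discarded."""

    def __init__(self):
        self.value = None
        self.timestamp_ms = None

    def write(self, value, timestamp_ms):
        if self.timestamp_ms is None or timestamp_ms > self.timestamp_ms:
            self.value, self.timestamp_ms = value, timestamp_ms
            return True
        return False  # silently dropped as "older"


def node_clock(true_time_ms, skew_ms):
    """Time-of-day reading of a node whose clock is off by skew_ms (hypothetical)."""
    return true_time_ms + skew_ms


def simulate_skewed_writes():
    register = LastWriteWinsRegister()
    # Node A's clock runs 5 ms fast; node B's clock is accurate.
    accepted_a = register.write("x=1", node_clock(true_time_ms=100, skew_ms=5))
    # B writes 2 ms later in real time, but its timestamp (102) is below A's (105).
    accepted_b = register.write("x=2", node_clock(true_time_ms=102, skew_ms=0))
    return register.value, accepted_a, accepted_b
```

Calling `simulate_skewed_writes()` shows the register keeping `"x=1"` even though `"x=2"` was written later in real time: the later write is dropped without any error being reported.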

Knowledge, Truth, and Lies

A node in a cluster cannot know anything for sure; it can only guess.

That's why in distributed systems:

* Truth is defined by the majority, a _quorum_ (and that's why you want to have an odd number of nodes in a cluster)
* For some things, only one node may be in charge (writing to a DB or a file, for instance)
* We can use fencing tokens to protect against a node that wrongly believes it is still the one in charge. Guarding against the Byzantine Generals problem (some nodes may "lie", i.e. send corrupted data) is a separate and harder problem
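The quorum and fencing-token ideas above can be sketched as follows. This is a simplified illustration, not a real lock service: `has_quorum`, `tolerated_failures`, `LockService`, and `FencedStorage` are hypothetical names, lease expiry is not modeled, and the sketch assumes every node reports its token honestly.

```python
def has_quorum(votes_in_favor, cluster_size):
    """A decision stands only if strictly more than half of all nodes agree."""
    return votes_in_favor > cluster_size // 2


def tolerated_failures(cluster_size):
    """Number of nodes that can fail while a majority can still be formed."""
    return (cluster_size - 1) // 2


class LockService:
    """Hands out a strictly increasing fencing token with every lock grant."""

    def __init__(self):
        self._last_token = 0

    def acquire(self):
        self._last_token += 1
        return self._last_token


class FencedStorage:
    """Rejects writes carrying a token lower than one it has already seen."""

    def __init__(self):
        self.highest_token = 0
        self.data = None

    def write(self, token, value):
        if token < self.highest_token:
            return False  # stale holder: its lock was granted to someone else since
        self.highest_token = token
        self.data = value
        return True
```

A cluster of 4 nodes tolerates only as many failures as a cluster of 3 (`tolerated_failures` returns 1 for both), which is why an odd node count is the usual choice. For fencing, suppose one client acquires a token, is paused for a long time, and a second client acquires the next token and writes. When the first client wakes up and writes with its older token, `FencedStorage.write` returns `False` and the data is left untouched.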
