How we can spend out time together in a way that matters.

I attended a master class with Priya Parker for her book tour, and she shared with a human-centered approach to gathering that will help everyone create meaningful experiences for work and play.

Smartphone

独家优惠奖金 100% 高达 1 BTC + 180 免费旋转




Control theory for fun and profit

FaunaDB is a distributed system. Like all distributed systems, we have the somewhat vexing problem of an unreliable network and faulty nodes (not byzantine faulty, just the regular slow or dead kind). The chance that a faulty node received a share of work in a request is greatly amplified when the result of one request requires a combination of data from several nodes. In such a scenario that is common for a distributed system-a faulty node could impact many requests. Methods to increase the reliability of your system in the presence of faulty nodes are therefore indispensable.

Whenever you need to request data over a potentially unreliable link and multiple nodes have the data there is an easy method to turn a set of unreliable links into a single virtual link that is both more reliable and faster in the tail in aggregate than any one of the individual links. Just issue requests redundantly. The fastest path of the available paths will return first. Of course, issuing redundant requests has its own problems: increased server load for machines servicing the requests and increased network traffic for both parties.

Instead of flooding the cluster with redundant messages in every scenario, we could try to determine when we have to send redundant messages. The most straightforward way to do this is to simply wait and see whether the result comes in fast enough. Instead of sending redundant requests right away, the node would wait for a specific delay and send a backup message when it did not receive an answer within the given time-frame.

This is all well and good in theory. There is, of course, an obvious pressing and practical concern: how does one figure out how long one should wait before sending the hedged request? Ideally, it’s small enough since each request that requires backup messages would take at least the time of the delay. But it can’t be too small either since the cluster would be flooded with backup messages. A simplistic and obvious answer is to measure it. We could determine the percentage of messages that we desire backup messages for, measure it directly, and adapt where needed.

Let’s say we aim to wait for the 98% percentile like in the paper. By gathering latency measurements for all our requests, we can determine the exact ‘backup request delay’ to make sure that only 2% of the requests trigger a redundant request.

In reality, it’s much more complex, by the time we have set our delay based on our measurements, the latencies might have changed due to external factors. Additionally, by setting this delay, the measurements will change, influencing the measurements which we rely on to take action can have bad side-effects.

In a naive approach where we set the delay each time exactly according to our measurements, we might accidentally make things worse. If due to an anomaly such as a netsplit, our latencies are severely skewed for only a split second we might set an extremely high delay. Setting such a high delay might affect our measurements which then results in an even higher delay, which again has an effect on the percentile, which in turn results in a higher delay, which again results in … . . .

I believe you get the point, a system that acts in a loop where it takes action on measurements that are influenced by its actions could quickly go rogue instead of stabilizing on an optimal value. There are many bad scenarios to avoid:

In essence, dealing with such a moving target raises even more questions: over what time slice should we measure? How should we evolve the time to wait if we measure the latencies continuously? Weighted average? Exponential? What weightings?

The ideal solution should react rapidly to changes in network conditions and converge to the right value without overshooting it. Now, in our case, getting this a little bit wrong either side isn’t the end of the world but it does have consequences: slow convergence or overshoot depending on the direction from which it arrives either means overly slugging request servicing or overconsumption of computing resources.

Fortunately, this kind of problem falls into a well-known area of study which is very common in electronics yet less often encountered in the wild in computer algorithms. In fact, the thermostat in your home has to solve similar issues. It needs to achieve a target by reacting to measurements that are influenced by its own actions (start heating) and external stimuli (an open door lets in the cold) in a constant loop.

Check out this visual analogy where a motor is controlling the position of a ball in response to external stimuli. This system has to decide on a course of action based on measurements of a constantly moving target that is influenced both by external stimuli (the hand that moves the ball out of center) as well as its own actions (tilting the plate).

This plate controller is doing exactly what we’d like our software to do. It has a setpoint (keep the ball in the center of the plate), it has an error (the distance of the ball to the setpoint) and it has a mechanism for reducing the error (tilting the plate). Our setpoint is the ratio of requests that trigger a backup read to those that don’t, our error is just the gap between the observed ratio and the target ratio and our mechanic to influence this ratio is the delay before issuing the backup request.

How is the plate being controlled? With a PID controller.

PID is an initialism standing for Proportional, Integral, Derivative which are the components used to tame the system and a PID controller is the embodiment of this function which has taken many forms over the years; mechanical (a pneumatic device), electrical (chips) and finally the programmatic implementation that can be used to optimize distributed systems.

The function combines the error (P), the accumulated error over time (I) and the predicted error (D) each multiplied by a tunable constant respectively. Why does this make sense?

By specifying a weight for these three factors (using the constants Kp, Ki, and Kd), we can finetune how our system should behave to prevent overshooting, accumulating errors or unstable situations.

The introduction of backup requests and a delay before we actually issue these requests was only the first step of the approach. By introducing the PID controller as a third element we can more readily adapt to delay when the circumstances change. With these three insights combined into one algorithm, we expect to retain all of the benefits of issuing backup requests while making fewer requests.

Add a comment

Related posts:

Mental Health For Writers

May is Mental Health Awareness month and it’s a perfect opportunity to look at how we can tend to our mental (and physical) health as writers. While any artistic or creative endeavor can bring with…

Labor Stands with Lina.

The 92 unions of the Texas Gulf Coast Area Labor Federation, AFL-CIO today reaffirmed their support for Harris County Judge Lina Hidalgo. We’re incredibly disappointed in the Houston Chronicle…

Custom Tribe Subdomains are LIVE

Most of our community is well aware that we have never created a “Road Map” in the traditional sense. To be quite honest, we dislike the word immensely. It sounds much too ‘set in stone’. We’ve…