Skip to main content

Stability of DLedger

· 5 min read
Xinyu Ma
Research Scientist @ Meta

In this post I want to analyze the stability of Dledger, a DAG-based distributed logging system.

Brief Introduction

In Dledger, every record can refer to some existing records by putting the name of those records into its content. This is done for 2 reasons:

  • Keep a causal ordering among records.
  • Make the whole system be directly or indirectly referred to by a small set of records, so we can synchronize on this set.

The tailing records are the records that have not been referred to by any other record. One thing we need to confirm is that the number of tailing records TT is small. Therefore we need to study the stability of TT.

We may assume that record generation is a Poisson procedure, that is, in a unit of time the number of records each node generate have a Poisson distribution with argument λ\lambda. Assume the system has nn nodes and the number does not change. Thus, total number of records generated is a Poisson distribution of nλn\lambda.

Since each record cannot be immediately received by another node after its generation, we assume that there is a propagation time Δt\Delta t, so a record is invisible to all nodes before this time interval, and visible to all after it.

Case 1: Refer to All

The easiest case is that each new record refers to all tailing records the node knows. Then, the number of tailing records in the system will be this record, plus all records generated in the past Δt\Delta t time. The average is Tmean=1+nλΔtT_{mean} = 1 + n\lambda\Delta t.

This system is stable, since the value of TT at one time point is not affected by its previous value. However, if TT is large enough, it may be not feasible to refer to all known tailing records in a single record.

Case 2: Refer to rr

In this case, a new record randomly pick r>1r>1 records from the tailing records to refer to. To simplify the calculation, we may assume that those rr records are sampled with replacement. We also assume that the stable value of TT is larger than nn, because otherwise it's degenerated to Case 1: r<nλΔtr < n\lambda\Delta t.

Equilibrium

In this section, we assume the point of equilibrium TT is reached, and calculate its value.

Suppose a new record is generated. At this time, the new record may see TT tailing records. But they are really tailing only before Δt\Delta t, since some records are generated during this time, but not visible yet. There are nλΔtn\lambda\Delta t new records generated, so by the assumption of equilibrium, there will be the same number nλΔtn\lambda\Delta t of tailing records that are referred to and not tailing any more. Therefore, if we randomly pick a record from the TT records, the probability of picking a record that is still tailing is

p=TnλΔtTp = \frac{T-n\lambda\Delta t}{T}

TT is large, so the number of tailing records that this new record refers to is approximately rprp. By the assumption of equilibrium, whenever we add 1 new tailing record, on average 1 tailing record should be removed. Thus, we have rp=1rp = 1 and

T=rr1nλΔtT = \frac{r}{r-1}n\lambda\Delta t

With Δt\Delta t small

To show the dynamics of TT, let's remove the assumption of equilibrium and take continuous approximation.

Suppose there are TT' tailing records. Then, nλdtn\lambda\mathrm{d}t records are generated, each of which makes rr references. Therefore, if we assume that TT' records are visible to these newly generated records, for one record, the probability of not being referred to is

(11T)rnλdt\left( 1 - \frac{1}{T'} \right)^{rn\lambda\mathrm{d}t}

Hence, the number of current tailing records will be

T=T(11T)rnλdt+nλdtT = T'\left( 1 - \frac{1}{T'} \right)^{rn\lambda\mathrm{d}t} + n\lambda\mathrm{d}t

Omit infinitesimals of higher order (by Taylor expansion) and get the dynamic equation:

dT=nλ[1+rTln(11T)]dt\mathrm{d}T = n\lambda\left[1 + rT\ln{\left(1 - \frac{1}{T}\right)}\right]\mathrm{d}t

In the formula above, dT\mathrm{d}T is always negative, which may come from the assumption that Δt\Delta t is small.

With batched generation

Here we use a different model: we don't assume Δt\Delta t is small, but records generation time is aligned with Δt\Delta t instead. That is, all records generated from [t,t+Δt][t, t+\Delta t] will be considered as generated at t+Δtt+\Delta t. Similarly, we have

T=T(11T)rnλΔt+nλΔtT = T'\left( 1 - \frac{1}{T'} \right)^{rn\lambda\Delta t} + n\lambda\Delta t

Let α=nλΔt\alpha=n\lambda\Delta t, which is the number of record generated after the dissemination time. Then, the condition when TT increase after Δt\Delta t is

(11T)rα+αT>1\left( 1-\frac{1}{T} \right)^{r\alpha} + \frac{\alpha}{T} > 1

Look at the left hand side. When T<αT < \alpha, the 2nd term will dominate and it will be greater than 1. After that, the 1st term will dominate, which is always less than 1. So the moving tread of TT is

  • When TT is less than the equilibrium point, TT will increase.
  • When TT is greater than the equilibrium point, TT will decrease.
  • As TT get larger, the decreasing speed will become slower. This is because as TT goes to infinity, we remove a fixed number (r1)α(r-1)\alpha of tailing records each time, which is relatively smaller compared to TT.

Conclusion

The number of tailing records will be stable in Dledger. The point of equilibrium is approximately proportional to the number of peers, the record generation rate, and the dissemination time. If the record generating speed is λ=300/s\lambda = 300/s, and Δt=100ms\Delta t = 100 ms, then n=20n=20 will lead to 600600 tailing records. If the speed is λ=1/min\lambda = 1/min, we can have n=360000n= 360000. This level of scalability can support a lot of applications.