CRDTs for Free
Conflict-Free Replicated Data Types (CRDTs) are one of the core technical building blocks behind Ossa's secure synchronization protocol. CRDTs are special data types that enable data synchronization without a central server and allow users to make document edits even while offline. To provide these features, CRDTs must satisfy certain mathematical properties.
These properties were defined by Shapiro et al., who required CRDTs to guarantee Strong Eventual Consistency (SEC). SEC ensures that an application's state eventually converges between replicas once they have received the same set of update operations. CRDTs satisfy SEC by making sure that concurrent operations commute. Here is the commutativity property stated mathematically (in Lean pseudocode):
def commutativity := forall st: CRDT, op1: Operation, op2: Operation,
concurrent(op1, op2) =>
st.apply(op1).apply(op2) == st.apply(op2).apply(op1)
In other words, given two concurrent operations op1 and op2, if you apply op1 and then op2 to a CRDT, its final state is the same as if you applied op2 and then op1.
As a result, application state converges to the same value, regardless of the order that updates were received.
In practice, this enables collaborative applications where devices can sync directly over peer-to-peer connections and users can update their documents while offline.
Designing data types that satisfy the required CRDT properties is not easy. Fortunately, there are many CRDT primitives for commonly used data structures like registers, sequences, sets, and maps. As an example, consider the simple Last Writer Wins CRDT that serves as a register.
struct LWW<Time, A> {
time: Time,
value: A,
}
impl<Time, A> CRDT for LWW<Time, A> {
type Op = LWW<Time, A>;
fn apply(self, op: LWW<Time, A>) -> LWW<Time, A> {
if self.time < op.time {
op
} else {
self
}
}
}
Here, a LWW register is a tuple of when the register was updated and its latest value.
When a new operation is received, the value with the latest timestamp is selected.
In real-world applications, a program's state is made up of many different primitive data types. For example, in the Ossa shared cookbook demo application, cookbooks and recipes require registers and maps. Luckily, CRDTs can be composed to easily create higher level data types that are also CRDTs. For example, in the demo application, recipes and cookbooks are defined as follows:
struct Recipe<Time> {
title: LWW<Time, String>,
ingredients: LWW<Time, String>,
instructions: LWW<Time, String>,
}
struct Cookbook<Time> {
title: LWW<Time, String>,
recipes: TwoPMap<RecipeId<Time>, Recipe<Time>>,
}
Fields of these structs are either register or map CRDTs.
In Ossa, users need to define a type for update operations and implement a trait in order to create a CRDT. Here are the update operation types for recipes and cookbooks.
pub enum RecipeOp<Time> {
Title(LWW<Time, String>),
Ingredients(LWW<Time, String>),
Instructions(LWW<Time, String>),
}
pub enum CookbookOp<Time> {
Title(LWW<Time, String>),
Recipes(TwoPMapOp<RecipeId<Time>, RecipeOp<Time>>),
}
Notice that operations are enums where each variant holds the update operation for its corresponding field.
Here are the CRDT trait implementations for recipes and cookbooks.
impl<Time> CRDT for Recipe<Time> {
type Op = RecipeOp<Time>;
fn apply(self, op: RecipeOp<Time>) -> Recipe<Time> {
match op {
RecipeOp::Title(t) => Recipe {
title: self.title.apply(t),
..self
},
RecipeOp::Ingredients(i) => Recipe {
ingredients: self.ingredients.apply(i),
..self
},
RecipeOp::Instructions(i) => Recipe {
instructions: self.instructions.apply(i),
..self
},
}
}
}
impl<Time> CRDT for Cookbook<Time> {
type Op = CookbookOp<Time>;
fn apply(self, op: CookbookOp<Time>) -> Cookbook<Time> {
match op {
CookbookOp::Title(t) => Cookbook {
title: self.title.apply(t),
..self
},
CookbookOp::Recipes(rs) => Cookbook {
recipes: self.recipes.apply(rs),
..self
},
}
}
}
These implementations simply pattern match on the operation's variant, and update the field by recursively calling apply.
While it is fairly straightforward to compose CRDTs in this manner, it is somewhat tedious.
Ossa alleviates this by providing a macro that automatically derives the update operation data type and CRDT trait implementation.
Now the CRDTs for recipes and cookbooks are simply implemented as follows:
#[derive(CRDT)]
struct Recipe<Time> { ... }
#[derive(CRDT)]
struct Cookbook<Time> { ... }
This is significantly shorter than the manual definitions. Furthermore, it helps guarantee that derived CRDTs satisfy their required mathematical properties by preventing bugs introduced by manual implementations.
Ossa allows developers to define CRDTs for free with stronger guarantees on their correctness. This is just one way Ossa helps developers create secure, decentralized applications. If you want to learn more about this approach, check out our OOPSLA paper, "Verifying replicated data types with typeclass refinements in Liquid Haskell". That's where we first used this technique to automatically compose CRDTs, and we even proved that the composed CRDTs satisfy their required convergence properties in Liquid Haskell.