siiky
2023/02/21
2023/03/04
2023/05/12
whitepaper,distributed,programming
Abstractly defines clocks where both process IDs and events organically change (grow or shrink) over time, and then gives one possible implementation based on trees.
IDs are formalized in a really neat way. Each node has "permission" to modify/increment only a subset of the events, based on its ID, which is a characteristic function -- i.e., given some event e, ID(e) = 1 if the node can modify/increment e, otherwise ID(e) = 0. To make notation easier on the eyes, bold 0/1 represent the constant functions that always return 0/1. Operations on IDs then use arithmetic notation: e.g. i1+i2 is the same as ∀e:(i1(e)+i2(e)).
Each tuple of an ID tree determines whether a certain node can modify/increment the left/right half of the events tree. For example, a node of ID (0, 1) can only modify/increment the right half, and a node of ID ((1, 0), 0) can only modify/increment the left-most quarter.
Here's are some of the operations that I hacked together (clearly incomplete). As detailed above, note that the numbers 0 and 1 are not integer but functions!
(define null-id 0) ; TODO
(define whole-id 1) ; TODO
(define make-id (constantly null-id))
(define (id-norm id)
(cond
((and (pair? id)
(= (car id) null-id)
(= (cdr id) null-id))
null-id)
((and (pair? id)
(= (car id) whole-id)
(= (cdr id) whole-id))
whole-id)
(else id)))
(define (id-split id)
(cond
((= id null-id) ; 0
(values null-id null-id))
((= id whole-id) ; 1
(values `(,whole-id . ,null-id) `(,null-id . ,whole-id)))
((= (car id) null-id) ; (0, i)
(receive (id1 id2) (id-split (cdr id))
(values `(,null-id . ,id1) `(,null-id . ,id2))))
((= (cdr id) null-id) ; (i, 0)
(receive (id1 id2) (id-split (car id))
(values `(,id1 . ,null-id) `(,id2 . ,null-id))))
(else ; (i1, i2)
(values `(,(car id) . ,null-id) `(,null-id . ,(cdr id))))))
(define (id-sum id1 id2)
(cond
((= id1 null-id) id2)
((= id2 null-id) id1)
(else
(receive (l1 r1) (values (car id1) (cdr id1))
(receive (l2 r2) (values (car id2) (cdr id2))
(id-norm `(,(id-sum l1 l2) . ,(id-sum r1 r2))))))))
(define (make-events-root n) n)
(define events-root? (o not events-node?))
(define (events-root-base n) n)
(define (make-events-node b l r) `(,b ,l . ,r))
(define events-node? pair?)
(define events-node-base car)
(define events-node-left cadr)
(define events-node-right cddr)
(define (events-node-values e)
(values (events-node-base e) (events-node-left e) (events-node-right e)))
(define ((events-tree-lift/sink op) e m)
(if (pair? e)
(receive (b l r) (events-node-values e)
(make-events-node (make-events-root (op (events-root-base b) m)) l r))
(make-events-root (op (events-root-base e) m))))
(define events-tree-lift (events-tree-lift/sink +))
(define events-tree-sink (events-tree-lift/sink -))
(define ((events-min/max op) e)
(if (events-node? e)
(receive (b l r) (events-node-values e)
; If e is normalized, no need to recurse for min, return b
(+ (events-root-base b)
(op ((events-min/max op) l) ((events-min/max op) r))))
(events-root-base e)))
(define events-min (events-min/max min))
(define events-max (events-min/max max))
(define (events-norm e)
(if (events-node? e)
(receive (b l r) (events-node-values e)
(let ((m (min (events-min l) (events-min r))))
(make-events-node (make-events-root (+ (events-root-base b) m))
(events-tree-sink l m)
(events-tree-sink r m))))
e))
(define (events-join e1 e2) #f) ; TODO
(define (events-increment e) #f) ; TODO
(define make-clock cons)
(define clock-id car)
(define clock-events cdr)
(define (clock-values c)
(values (clock-id c) (clock-events c)))
(define (fork c)
(receive (id e) (clock-values c)
(values (make-clock (id-fork/l id) e)
(make-clock (id-fork/r id) e))))
(define (peek c)
(values c (make-clock null-id (clock-events c))))
(define (event c)
(receive (id e) (clock-values c)
(make-clock id (events-increment e))))
(define (join c1 c2)
(receive (id1 e1) (clock-values c1)
(receive (id2 e2) (clock-values c2)
(make-clock (id-join id1 id2) (events-join e1 e2)))))
How can this apply to/work on a P2P network (e.g. IPFS, BitTorrent, ...) where nodes simply spawn out of thin air, not from other already existing nodes? At first sight I don't see how it can.
Couldn't understand from reading the paper how this is done... Maybe I didn't pay close enough attention.
Is an implementation based on a (finite) bitset/bitvector feasible? The invariant would be that the XOR of all the IDs is the whole ID: 0b11111...111. The biggest downside comparing with the tree-based implementation is that processese can't arbitrarily fork if they feel like it; they can only fork if the popcount (# of 1 bits) is greater than 1.
For example, with a cardinal of 4, a process of ID 0b1111 could fork into AT MOST 4 different (concurrent) processes:
1. 0b0001
2. 0b0010
3. 0b0100
4. 0b1000
Or 3 (concurrent) processes:
1. 0b0011
2. 0b1000
3. 0b0100