- Mar 2024
-
arxiv.org arxiv.org
-
if t ≠ ref(t)
Meaning, it's possible to ref to itself?
-
- Feb 2024
-
www.swirlds.com www.swirlds.com
-
The whitened signature is the signature XORed with thesignatures of all unique famous witnesses in the received round.
^
-
- Jan 2024
-
clojure.org clojure.org
-
If a pure function mutates some local data in order to produce an immutable return value, is that ok? It’s an interesting question. Clojure data structures use mutation every time you call, e.g. assoc, creating one or more arrays and mutating them, before returning them for immutable use thereafter. The reason is performance - you simply can’t get as fast using only pure functions and immutable data. Once constructed and shared however, being immutable and persistent is essential to robust programs. The things Clojure mutates internally are small, newly allocated arrays that constitute the internal nodes of its data structures. No one ever sees the arrays.
Functional interfaces are enough for FP. Inside you can mutate, noone will know, it does not affect in=>out result. And that's what FP cares about.
-
-
www.swirlds.com www.swirlds.com
-
Once round r is settled, thefuture rounds will reshuffle, and the calculations for round r + 1 famous witnesseswill be done using the new stake record.
Change to stake record that happened by concluding round r will not affect round r events, they will use the previous stake record. Events that are r+1 will use the new one.
-
- Dec 2023
-
www.semanticscholar.org www.semanticscholar.org
-
When wave 𝑤 completes (Line 34), we use the global perfectcoin to retrospectively elect some process and consider its vertexin the wave’s first round as the leader of wave 𝑤 (Line 35).
Can we use random selection of a concluding vertex based on hashes of possible concluding witnesses?
We can sprinkle it with more state known to all, e.g., round number.
-
In a nutshell, the idea is to interpret the DAG as a wave-by-wave protocol and try to commit a randomly chosen single leadervertex in every wave.
How much txes will not be received in a round?
This is an interesting approach. It seems that in each round received txes will be more than in a received round of Hashgraph, by
receivedTxesByEachMember difference intersection(receivedTxesByEachMember)
. -
Our validity property requiresthat all messages broadcast by correct processes are eventually or-dered (with probability 1), whereas most Byzantine SMR protocols(i.g., [ 17, 37 , 43 ] require that in an infinite run, an infinite number ofdecisions are made, but some proposals by correct processes can beignored.
BAB that forms a DAG retains conflicting txes by ordering them.
-
optimal qua-dratic communication
That is a not lowest there is for a single YES/NO answer.
Fast path of ConsensusOnDemand requires O(n), it's only used when many peers approve a tx as not conflicting with others. And this is often the case for cryptocurrency, where conflict occurs only when txes on the same account are issued in parallel, which is rare.
Hashgraph's YES/NO on famous witnesses, which totally orders many txes, is O(n * log n).
-
-
www.semanticscholar.org www.semanticscholar.org
-
The message complexity is herebyimproved to O(n).
Message complexity of fast path of ConsensusOnDemand is O(n).
-
-
www.swirlds.com www.swirlds.com
-
O(n log n)
Hashgraphs YES/NO about fame is O(n * log n)
-
set of each event z such that z isa self -ancestor of a round r unique famouswitness , and x is an ancestor of z but notof the self -parent of z
Many self-ancestors may be ancestors of z. We'd need to take the earliest one.
-
(d), one event (orange) is an ancestorof each of 4 intermediate events by different creators (red)
Red event of the rightmost member is not an ancestor of the orange event. Rather, orange event is the ancestor to itself, and so can see itself. So orange event should be also a red event.
-
Amuch faster approach is to pick just a few events (vertices in the hashgraph),to be called witnesses, and define a witness to be famous if the hashgraphshows that most members received it fairly soon after it was created
Why is it faster? What's the defenition of "faster"?
-
x.famous ← UNDECIDED
assigned to all x, not just witnesses
-
If two events have the same received round, then they are sorted by theirreceived time
Or a conflict resolution strategy. Such as CRDT or custom one.
-
If Alice and Bob both have the same hashgraph, then theycan calculate a total order on the events according to any deterministic function ofthat hashgraph, and they will both get the same answer. Therefore, consensus isachieved, even without sending vote messages
^
-
When Bob sends an event to Carol, that sequence number can be calculated byCarol, so there is no need for Bob to actually send it over the internet
I.e., akin to logical clock.
Also, if timestamps are required, they can be compressed as delta milliseconds from the previous timestamp.
-
Only the state itself needs to bekept. It might be possible to do this every few minutes, so there will never be ahuge number of transactions and events stored. Only the consensus state itself. Ofcourse, a member is free to preserve the old events, transactions, and states, perhapsfor archive or audit purposes. But the system is still immutable and secure, evenif everyone discards that old information.
These old events do not contribute towards processing of the new ones. So they are not required to be kept in "RAM" so to say, across all the participants. This history can be kept safe somewhere in cold storage.
-
To defend against attackers whocan control the internet, there are periodic coin rounds where witnesses can votepseudorandomly. This means that even if an attacker can control all the messagesgoing over the internet to keep the votes carefully split, there is still a chancethat the community will randomly cross the 2n/3 threshold. And so agreement iseventually reached, with probability one.
Pseudo-random voting occurs periodically to mess with adversaries, given they have control over communication layer.
-
The protocol might re-quire that Alice send those events in topological order, so Bob will always have anevent’s parents before receiving the event
Or send them all at once.
-
to a random member
Perhaps only to members that do not know it yet, as known by local hashgraph
-
-
www.semanticscholar.org www.semanticscholar.org
-
Issuing a Transaction: The client c1 composes and signs a transaction.Client c1 sends the transaction to the validators. If the inputs of the trans-action are not part of genesis, the client also sends the confirmations of theinputs to the validators.
Due to validators not talking to each other, may it be that they end up in different states? As client may share to a part of them, some 2/3.
Clients do send inputs though. So I guess that's the way the history of txes gets communicated.
Perhaps that's the whole point. Validators approve based on the inputs, given they weren't spent.
-
-
tik-db.ee.ethz.ch tik-db.ee.ethz.ch
-
Finally, the high communication complexity of the underlying protocols, i. e., consensus of shards, handling cross-shard transactions, and bootstrapping of participants to shards, can make the bandwidth become the bottleneck.
Bootstrapping new participants may be a bandwidth bottleneck, as they need to download previous history.
Perhaps compactien/snapshots to the resque.
-
Furthermore, we show that against a somewhat adaptive adversary, sharding protocolsmust periodically compact the state updates (e. g., via checkpoints [26], cryptographic accumulators [7], zero-knowledgeproofs [6, 31] or other techniques [12, 19–21]); otherwise, the necessary bandwidth resources of each participantincrease infinitely in time, eventually exceeding the resources needed in a non-sharded blockchain
Compacting history is necessary, as it's ever-growing, and given it grows faster than storage of peers it'll eventually reach the size > than available storage.
-
Additionally, we show that scalable sharding isimpossible if participants maintain (and verify) information on all other shards with which they execute cross-shardtransactions. This happens because the storage requirements become proportional to that of a non-sharded protocol.
If, in order to process a cross-shard tx, a shard needs to verify other shard's txes, then storage does not scale, as it needs to download txes. (and perhaps compute does not scale, as it needs to verify them)
-
-
www.semanticscholar.org www.semanticscholar.org
-
To this end,Bob’s representative replica collects and aggregates CREDITmessages for the same incoming payment into a dependencycertificate for Bob’s xlog.
Akin to aggregating delta of a tx as commits in corresponding subject repos.
-
has messagecomplexity of O(N 2)
This is a ton of complexity per message.
-
Astro guarantees total order within—but not across—xlogs
xlog is an index of txes of a sender by tx idx.
There can be another index by beneficiary.
And the total of an account is the sum across the two.
I guess the representative would need to aggregate the two.
This resembles an AST of txes. Where tx's parents are other txes it depends on. E.g., sending money would depend on all current heads: last spent money tx and all novel receive money txes.
-
Essentially, preventing Alice from double-spending meanspreventing her from reusing sequence numbers. To do so, thebroadcast layer in Astro provides Byzantine resilience. Thisensures that a malicious client cannot broadcast two differentpayments with the same sequence number.
Wouldn't broadcast layer become a peer that keeps all passed through it txes, filtering the ones with duplicate idx?
And all communication's required to go through that peer.
Getting us back to centralized architecture.
There can be a set of such peers, but then they'd need to have consensus on what txes they seen.
-
unlike consensus
Asynchronous BFT consensus is possible, as shown by Hashgraph.
-
Astro builds on the insight that consensus is unnecessary toprevent double-spending
* total order consensus
-
-
vegvisir.cs.cornell.edu vegvisir.cs.cornell.edu
-
While that prohibits applications thatneed a unique total ordering on conflicting transactions, likecryptocurrencies where double spending must be prevented
Vegvisir's consensus is suitable for CRDTs, however it's not suitable for conflicting txes.
-
-
arxiv.org arxiv.org
-
The gossip-about-gossip method allows for the history ofall the communications within an infrastructure to be reconstructed
More like acknowledgements about acknowledgements. Sent/received isn't captured.
-
None of these methods, though, includesdynamic membership, and this is a key differentiator in this paper
That's not true. The Hashgraph paper describes that there can be dynamic membership, by the exact same technique described in this paper - nodes vote for member addition.
-
- Nov 2023
-
www.swirlds.com www.swirlds.com
-
is proved in section 5 that if x and y are ondifferent branches of an illegal fork, then w can strongly see at most one of x andy, but not both.
But a solution of ignoring events issued by peers that created a fork means that
w
can strongly seex
but having learned of the forky
it would ignorex
. So, in presence of this solution, we can't be sure on finality of "strongly seen". Or, I guess, the solution can be applied only to seen, but not strongly seen ones. So strongly seen events stay final. -
If there are n members (n > 1), then an event w can strongly see an event x,if w can see more than 2n/3 events by different members, each of which can seex.
I.e., x is "strongly seen" when it's seen by super-majority.
-
In other words, w can see x if x is known to it, and no forksby that creator are known to it
I.e., forks indicate that a peer is missbehaving. A strategy is to ignore all his events.
Also peers that communicate with a malicious peer are endangered, so "excercise caution".
-
If Aliceis honest, she will calculate virtual votes for the virtual Bob that are honest. Evenif the real Bob is a cheater, he cannot attack Alice by making the virtual Bob voteincorrectly.
I.e., peers can't attack by sending incorrect vote(localState) results.
-
The otherancestor events (gray) are not stored in the red event, but theyare determined by all the hashes. The other self-ancestors (darkgray) are those reachable by sequences of self-parent links, and theothers (light gray) are not.
So they are a part of DAG, addressed by hashes, right?
-
Most Byzantine fault tolerance protocols without aleader depend on members sending each other votes. So for n members to agreeon a single YES/NO question might require O(n2) voting messages to be sent overthe network, as every member tells every other member their vote. Some of theseprotocols require receipts on votes sent to everyone, making them O(n3). And theymay require multiple rounds of voting, which further increases the number of votingmessages sent.
Naive leaderless BFT is damm costly on communication.
-
It is sufficient to tell Carol that this event is the next one by Alice, and that itsother-parent is the third one by Bob. With appropriate compression, this can besent in very few bytes, adding only a few percent to the size of the message beingsent.
I.e., events can be uniquely identified by peer+event index. So they can be compressed to [int, int].
-
Alice will not send Carol the red event until Carol alreadyhas all its earlier ancestors
Nice. Interesting, how communication history can be used to learn what a peer knows. It is a stale view, and the peer may learned some more on top. But at least it spares the need to re-transmit already known values.
-
Instead of Alice signing a transaction she creates, she will sign theevent she creates to contain that transaction. Either way, she is only sending onesignature
Having reified communication that is signed spares the need for signing the stuff that's being communicated.
-
Virtual voting - every member has a copy of the hashgraph, so Alice cancalculate what vote Bob would have sent her, if they had been runninga traditional Byzantine agreement protocol that involved sending votes.So Bob doesn’t need to actually her the vote.
Vote is a function of local state. So instead of computing it, you can send local state and let others compute it for themselves. This makes it resilient to malicious vote(localState) execution, where adversary could send intentianally wrong result.
-
Fairness - it should be difficult for a small group of attackers to unfairlyinfluence the order of transactions that is chosen as the consensus.
Fairness is stronger than Censorship Resistance. It's more like "Manipulation Resistance".
-
An algorithm for a single YES/NO decision can then be extended to decidinga total order on a set of transactions, which may further increase the vote traffic.
Umm, if we settled on partial order of txes - we have total order, no?
-
A proof-of-expired-timesystem [9] can avoid the wasted electricity (though perhaps not the cost of miningrigs) by using trusted hardware chips that delay for long periods, as if they weredoing proof-of-work computations.
Wow, that's a thing. Hindering performance at the HARDWARE level. This just takes it up a notch.
It's like limiting max typing speed on a type writter.
-
It also means that itis necessary to slow down how fast blocks are mined, so that the community canjointly prune branches faster than new branches sprout. That is the purpose of theproof-of-work. By requiring that the miners solve difficult computation problemsto mine a block, it can ensure that the entire network will have sufficiently longdelays between mining events, on average
I.e., PoW as a performance reduction to cater for the upstream problem.
Reminds reasoning behind the design of the QUERTY layout.
Instead of solving the upstream problem, e.g., by having convergence of branches, we bolt on a patch atop, hindering performance so branches can be dropped.
-
though at any time, if an honest memberrepeatedly sends messages to another member, the attackers must eventually allowone through
How nice of him
-
-
arxiv.org arxiv.org
-
Example 9.3 (Bait-and-Switch Attack).
^
-
In the permissioned setting there is the following inter-esting line of research. The aim is to construct an atomicbroadcast protocol based on a combined encoding of thedata transmission history and voting on “leader blocks”.Such protocols allow the network participants to reach aconsensus on a total ordering of the received transactions,and this linearised output forms the ledger.
Atomic broadcast is the technique to broadcast all observed communication and voting in order to reach consensus.
-
The main difference of our proposal to all the aforemen-tioned protocols is that consensus is found on the heaviestDAG without the need for a “linearisation” using any leaderselection. This reduces the purposes of blocks to data trans-mission and voting
Dope.
-
These protocolsremove the bottleneck of data dissemination of the classicalNakamoto consensus by decoupling the data disseminationfrom the consensus finding
By having a DAG of txes and a DAG of votes we have reduced data transmission. As DAG of txes is transmitted once, and so only DAG of votes needs to be extensively communicated with peers. And technique such as Bloom filters will make it pretty efficient.
-
Blockchain-based protocols relyon a chain or “linearisation” of blocks that create a totalorder of transactions. These blocks have three purposes:leader election, data transmission and voting for ancestorblocks through the chain structure, see [34]. Each of theseaspects can be, individually or combined, addressed througha DAG-based component. The various proposals differ inhow and which of these components are replaced by a DAG.
Usually blockchain blocks take on responsibility of: 1. leader election 2. data transmission 3. voting
These can be decoupled and catered for in DAG-based fashion.
-
Tangle 2.0Leaderless Nakamoto Consensus on the Heaviest DAG
.
-
However, as soon as a node from X changesits vote, the resulting situation is symmetric to the initialcondition
Perhaps being able to vote only once can mitigate.
-
Upon receipt of the missing parent block,the past cone is now complete (unless their parents aremissing - in which case the node has to repeat this procedurerecursively).
Or use Bloom filter for fetching all missing blocks at once.
-
Improvements are proposed,for example, in Hashgraph [41] and Aleph [42] and morerecently in Narwhal [43] based on the encoding of the “com-munication history” in the form of a DAG
Have acks as first-class citizen of a DAG to reach consensus.
-
-
-
Probabilistic. The units appended to the system are alwaysaccompanied by the risk of being reversed. The probability isinversely proportional to its cumulative confidence (includ-ing the forms of depth/score/weight etc). The confidence isobtained from the contributions of subsequent units.- Deterministic. The units, once attached to the system atsome point in time, will instantly and permanently get con-firmed so that they can never be removed from the system.
Consensus strategies can be classified into probabilistic and determenistic.
-
5.1 Properties Between BA and NC
^
-
Secondly, building smartcontracts or state transition applications rely on linear order
Perhaps order can be achieved ad-hoc, as required.
E.g., there are SIM-commuting txes addToBalance(Alice, 1B) and addToBalance(Bob, 1B). They are accepted without ordering. Later on there appeared smartcontract "give a billion to first billionare", which is not SIM-commuting with the previous txes. So order of 1st and 2nd tx needs to be decided upon. So we can delay ordering for when such tx3 occurs. And order determenistically, based on the baked-into blockchain rules. E.g., select one one pseudo-randomly based on previous hashes.
-
Each partici-pated node maintains a separate chain, and nodes mutually interactvia the gossip protocol. The node locally creates an event to recordthe history of received information
Each node in Hashgraph maintains a log of observed txes.
-
Graphchain, instead, introduces an incentivemechanism to maintain the graph. Transactions are required topost transaction fees as the offering for collection. That means eachtransaction must refer to a group of ancestor transactions to de-plete their deposit (fees) when verifying the validity. Meanwhile,ancestor transactions must have enough fees for collection. Feesare depleted by incoming transactions from the oldest ancestorsin paths to reach a prescribed total amount. High-fee transactionsattract powerful miners working in parallel for rapid confirmation,while low-fee transactions will finally be picked up by a small miner.Based on this design, Graphchain makes the current transactionsquickly become enshrined in the ancestry of all future transactions.
Graphchain requires tx to come with feels, that are depleted by acks that refer to it.
-
Specifically,the extension of the tangle follows the rule3 where one tip (pend-ing/unapproved transaction) is required to approve two ancestortransactions. Thus, users who issue a transaction will contributeto the security of the system.
Acks on a tx make it contribute to security.
-
several systems,both tie-breaking and conflict solving are integrated intothe extension rule. The consensus mechanism, by this way,reaches the agreement in one-step
Priority and tie-breaker as guidance for consensus.
-
-
Local file Local file
-
Unlike the alternatives, HoneyBadgerBFTsimply does not care about the underlying network
HoneyBadgerBFT is network-agnostic.
-
-
dsf.berkeley.edu dsf.berkeley.edu
-
A program withnon-monotonicity can be made consistent by including coordinationlogic at its points of order.
.
-
-
www.semanticscholar.org www.semanticscholar.org
-
The most telling example fromSection 3.2 is the use of tombstoning for low-latency deletion. Inpractice, memory for tombstoned items must be reclaimed, so even-tually all machines need to agree to delete some items.
Embracing temporality of data as opposed to deletion we eliminate the need for deletion.
Marking an entry with a "valid to" timestapm would make it valueable for future use, as in the future one can run queries as of some time and get the valid results of that time.
Not all info is practilally useful to keep around forever though, e.g., removal of a cart item. Although usefulness is use-case-dependent, and there may be future use-cases for that info. Even more, eventually there will be a use-case. E.g., shop can use removal as insight on shopping patterns. So if it's cheap to store all the data there is - let's store it.
-
A related pattern in storage systems is the use of tombstones: specialdata values that mark a data item as deleted. Instead of explicitlyallowing deletion (a non-monotonic construct), tombstones maskedimmutable values with corresponding immutable tombstone val-ues.
Logical deletion is a way to make non-monotonic deletion monotonic.
-
CALM provides the formal framework forthe widespread intuition that we can indeed “work around CAP”in many cases, even if we violate traditional systems-level notionsof storage consistency.
CAP is about storage consistency.
CALM is about result eventual consistency.
-
This is the conclusion of the Dynamo design.In later work [18 ], we go further to make checkout monotonic inthis setting as well. : the checkout operation is enhanced with amanifest from the client of all its update message IDs that precededthe checkout message: replicas can delay processing of the checkoutmessage until they have processed all updates in the manifest
I.e., By having in checkout baked in exact cart we can assure monotonicity. This exact cart will be checked out, with no regard to any outside adds and removes of items.
-
-
pdos.csail.mit.edu pdos.csail.mit.edu
-
A keyinvariant of these coherence protocols is thateither 1) a cache line is not present in anycache, 2) a mutable copyis present in a single cache,or 3) the line is present in anynumber of caches but is immutable
That looks like the design of Rust.
-
At the core of this dissertation’s approach is this scalable commutativityrule: In anysituation where several operations commute—meaning there’s no wayto distinguish theirexecution order using the interface—theyhave a implementation that is conflict-free duringthoseoperations—meaning no corewrites acachelinethat was reador written byanother core.Empirically, conflict-free operations scale, so this implementation scales. Or, more concisely,whenever interface operations commute, theycan be implemented in a waythat scales.
I.e., Operations commute if there's no way to distinguish order between them using the interface.
-
-
Local file Local file
-
Importantly, the external transaction determines valid modifiers of each state.
Managing users and what they can do, as well as effects of their actions, invariants that should hold, can be achieved by specifying it in yet another system-level tx.
-
It isinflexible for a node to join or leave a workflow at any time since the config-uration of channels is fixed.
This isn't true. Channel configuration, that includes MSP (Membership Service Provider), a definiiton of members, can be updated via special tx. The result will be a new block, solely containing the configuration change tx.
So it's possible to change members of a channel, but it seems to be costly, and only authorities can do that.
So rather, it's impractical for a node to join and leave a channel, since it would require coordination with the authorities of that channel.
UPD: in this paper they refer to 2016 verison, whereas I refered to 2018 version.
-
Moreover, interactions between channels rely on atrusted channel [3] or an atomic commit protocol [2], which either breaks thedecentralization or still treat the system as an entirety.
Some solutions to cross-shard tx include: - trusted channel, breaking decentralization - atomic commit, involving all the participants of cross-sharding tx
-
-
www.semanticscholar.org www.semanticscholar.org
-
The configuration of a channel may be updated using achannel configuration update transaction. This transactioncontains a representation of the changes to be made to theconfiguration, as well as a set of signatures.
In Fabric it's possible to update configuration of a channel by issuing a special tx.
-
-
dspace.mit.edu dspace.mit.edu
-
The relative ordering of aborted trans-actions is kept the same. As a result, every transaction cancommit eventually, since the first transaction in a batch al-ways commits and the position of an aborted transactionalways moves forward across batches
Aria eventually commits txes.
-
-
www.semanticscholar.org www.semanticscholar.org
-
However, as the orderingbetween transactions is determined by consensus and fixedacross all nodes, we leverage this to permit both transactionsto write to different copies of the object, but subsequentlycommit only the first as determined by the ordering.
Wasted effort in ordering and execution of a to-fail tx. Moreover, all the info info needed to execute it properly atop the other tx is there!
-
-
arxiv.org arxiv.org
-
Hot Key Theorem. Let l be the average time between atransaction’s execution and its state transition commitment.Then the average effective throughput for all transactionsoperating on the same key is at most 1l .
.
-
Compatibility with external oracles: To allow the useof external oracles in the deterministic second executionphase, we gather and save oracle inputs in the pre-orderexecution step.
I.e., external oracles are read deps as of pre-order execution step.
-
A dependency exists when two transactionsoverlap in some keys of their RW sets (read-only transactionsdo not even enter the orderer). In that case, we cannot processthem independently. Therefore, we need to keep track ofdependencies between transactions so we know which subsetsof transactions can be processed concurrently
Note: RW of dependee transaction overlaps with W of dependent transactions. As clarified later in the paper.
-
This requires theuse of a deterministic contract language, such as Ethereum’sSolidity, which must be learned by the application developercommunity
Or any other determenistic compute. Such as Docker, or Nix, Guix, IPVM.
-
-
syslab.cs.washington.edu syslab.cs.washington.edu
-
With this optimization, TAPIR can now accept: (1) readsof past versions, as long as the read timestamp precedes thewrite timestamp of the next version, and (2) writes in thepast (per the Thomas Write Rule [45]), as long as the writetimestamp follows the read timestamp of the previous versionand precedes the write timestamp of the next version.
I.e., write in the past, preserving serializability.
-
-
www.semanticscholar.org www.semanticscholar.org
-
A peer checks if a sufficient number of valid endorsingpeer signatures, based on the endorsement policy, have been col-lected (Validation System Chaincode (VSCC) validation)
Checks that are tx-based could be performed by the endorsing peers in order to fail-fast. They would not even need to execute tx if it's syntactically invalid.
-
Table 1: Notations for sets
Very nice to give legend. Spares the need to search text for definition of an abbreviation.
Perhaps even nicer would be to have wikipedia-like modals on hover.
-
Optionally, the clientmay check the validity of the endorsing peers signatures and theconsistency between the read/write set received from differentpeers.
I guess this check is required in order to fail-fast. And fail may be due to: 1) read-set inconsistency between endorsement peers (some peer has stale db) 2) tx is not deterministic (if it's a smartcontract - it's a bad smartcontract, if it's a plain tx issued by peer - who cares, as long as issuer is happy to tx one version of the results)
In 1) case, client been let down by that stale endorserer. He could pick endorsments that have the latest read-set and send them over, since including stale endorsment would result in failure of his tx. Alternatively, what about not fail tx with stale endorsements, but pick the ones that are latest (optionally checking that latest endorsments count > some threshold, to be sure in faithful compute of the results).
It makes sense, no? Those are valid results as of some state of db. Pick the latest one. And if you can't commit them due to a conflicting tx that happened prior - re-run atop it.
The main idea is - don't fail, it's a bad UX. Let the tx eventually commute. Given the tx is well-formed (determenistic).
The issuer may not be happy for a tx to be ru-run. In this case let him state that explicitly. If he's fine with it - eventually commit.
The issuer may wish for tx to commit within a certain timeframe, say 1 hour or until he explicitly aborts. Again, he can be empowered to express that.
-
The client collects the endorsing peers responses andsends them to the ordering service nodes
Delegating execution to endorsing peers allows to: 1) Spare the need for clients to be compute-able 2) Gives trust that tx been executed faithfully (if client to entrusted with executed - it can provide false results. Execution by trusted peers would be required to ensure trustful result of a tx. These peers would not necesserily be the system-owned-peers, they can be the stakeholders of blockchain. But here we're back to having endorsment peers, just of a different kind).
Also, is there harm in commiting "wrong" results that's been provided by the issuer? This tx won't be a smartcontract, but it's kinda ain't already, as tx is being issued by some peer, instead of being globally registered.
-
-
www.semanticscholar.org www.semanticscholar.org
-
First, wedeliberately do not send transaction read-write sets to theorderers, only transaction IDs. Second, we chose to keep toFabric’s design goal of allocating different tasks to differenttypes of nodes, so our orderers do not examine the contents ofthe read and write sets
Good call. Let it do its thing.
Conflicts can be identified by Fast Peer. And istead of dropping the rest of conflicting txes, it could delay their (re-execution) to the next block, in FIFO manner, one tx per block.
-
Arguably, the use of BFT protocols in permissionedblockchains is not as important as in permissionless sys-tems because all participants are known and incentivizedto keep the system running in an honest manner.
I.e., stakeholders are incentivized to maintain consensus, and BFT is less valuable, as they more inclined to well-behave.
-
(a) to achieve con-sensus on the transaction order
Why do we need tx order if txes that are conflicting will be rejected by the commiter? (except for XOX and FabricCRDT) Couldn't we drop them now?
Or rather, couldn't orderers order the conflicting txes and process them one-by-one, FIFO? So users don't need to resubmit, their tx will get processed at some time. Let user supply process time he's fine to wait for.
-
Transactions follow an execute-order-commit flowpattern instead of the common order-execute-commit pattern.Client transactions are first executed in a sandbox to determinetheir read-write sets, i.e., the set of keys read by and writtento by the transaction. Transactions are then ordered by anordering service, and finally validated and committed to theblockchain
In Fabric, Execution gives insight into read-write sets.
-
-
dl.acm.org dl.acm.org
-
Furthermore, it doesn’t say that every reorderinghas indistinguishable results on a given implementation;it requires instead that every reordering is allowed bythe specification to have indistinguishable results.
I.e., an implementation may accidentally compomise SIM-commutativity of its interface.
-
-
www.semanticscholar.org www.semanticscholar.org
-
But as noted above, a partial order like happens-before is iso-morphic to a join semi-lattice! That means that we can capture theCmRDT log as a CvRDT. One simple way to do this is to hold theDAG corresponding to the partial order: this is simply a “grow-only”set—whose only operation is add—which holds edges of a DAG.Each directed edge connects two operations, and indicates that theoperations happened in the order of the edge. (Another solutionwould be to use CvRDTs for vector clocks.) A natural query oversuch a CvRDT is to “play the log”: i.e. produce a topological sort ofthe log and apply the operations in the resulting order.
CvRDT can be used to capture causal DAG of ops (CmCRDT), as used by ipfs-log.
-
-
cacm.acm.org cacm.acm.org
-
Instead of explicitly allowing deletion (a non-monotonic construct), tombstones mask immutable values with corresponding immutable tombstone values.
Logical deletion is monotonic.
-
deletion (a non-monotonic construct)
Deletion is non-monotonic.
-
Unfortunately, neither the add nor delete operation commutes with checkout—if a checkout message arrives before some insertions into either the Added or Deleted sets, those insertions will be lost.
checkout is meant to be performed on some state of the cart. When it's issued the exact state is known. So checkout can refer to it. Making it concrete / context-free.
-
As a traditional approach to avoid such "race conditions," we might bracket every non-monotonic delete operation with a global coordination to agree on which add requests come before it. Can we do better?
Alternatively, DEL can refer to the exact ADD. As shown in SU-Set paper.
-
-
www.vldb.org www.vldb.org
-
However, this powerful abstractioncomes with a hefty cost: concurrent transactions must coordinate inorder to prevent read/write conflicts that could compromise equiv-alence to a serial execution.
ACID requires coordination of parallel transactions that intersect by their by their write-sets.
Using CRDT as result + I-confluence rules allow them to be executed in parallel.
However, when when a tx intersect by its read-set with write-set of another, its result may be affected. Such txes may be I-confluent, but application-level correctness is not guaranteed.
-
we only consider a singleset of invariants per database rather than per-operation invariants
I.e., this paper uses invariants on results, not invariants on txes themself.
-
we explic-itly require more information from the application in the form ofinvariants (Kung and Papadimitriou [38] suggest this is informationis required for general-purpose non-serializable yet safe execution.)
I.e., invariants that are I-confluent are required to know what can be safely executed coordination-free.
-
However, we are not aware of related work that dis-cusses when coordination is strictly required to enforce a given set ofinvariants
I.e., when result is not I-confluent and requires coordination has not been studied, this paper is the first.
-
or safety (e.g., the CRDT OR-Set does not—by itself—enforce invariants, such as ensuring that no employee belongs totwo departments), and are therefore not sufficient to guarantee cor-rectness for all applications
I.e., CRDTs allow results to converge. However, converged state is not guaranteed to be I-invariant.
-
This means that, inour payroll example, we can provide coordination-free support forconcurrent salary increments but not concurrent salary decrements
That's not true. Given an agreed-upon invariant (> or <) users are able to perform inc or dec. So I is an agreed upon contract that allows for some ops to be performed coordination-free. Moreover, this contract may be agreed upon to change.
So, I is a consensual contstraint that allows for coordination-free execution of some ops.
Even more, given we know collaborating parties. Say Alice and Bob are family members that have a shared balance. They may agree that account balance should not reach negative number. Then in order to be able to make coordination-free withdrawals within a given day they can agree that they won't withdraw more than 1/2 of money in that day. So they are safe to withdraw less that 1/2 of money within that day. At the end of the day they may coordinate to determine what's left in the account, And so are able to follow the same strategy the next day.
I.e., parties can agree to share resource and act coordination-free on their share. They may coordinate in order to reconsile shares as they see fit.
E.g., in the payroll example, managers that are in charge for salary adjustment would share an employee salary and would be able to adjust it to not less than their share.
Share holder could be malicious and issue parallel ops that are I-valid but are not I-confluent. E.g., manager could issue two decrements that would tank employee salaree beyound the agreed upon threshold. So there should be consensus within the share/shard.
-
To avoid variants of these anomalies, many optimistic, coordination-free database designs have proposed the use of abstract data types(ADTs), providing merge functions for a variety of uses such ascounters, sets, and maps [19, 44, 50, 58] that ensure that all updatesare reflected in final database state.
How ADTs are different from op-based/delta-based CRDTs?
Both capture deltas as datatype building block. Both converge to a snapshot.
-
However, the reliance on invariants also has drawbacks. I-confluence analysis only guards against violations of any providedinvariants. If invariants are incorrectly or incompletely specified, anI-confluent database system may violate application-level correct-ness.
I.e., make sure that use specify I-variants correctly, otherwise they won't give you correctness of coordination-free execution.
-
If two indepen-dent decisions to commit can result in invalid converged state, thenreplicas must coordinate in order to ensure that only one of the deci-sions is to commit.
I.e., I-valid?(merge(tx1, tx2) = false is not I-confluent, even if I-valid?(tx1) && I-valid?(tx2) = true.
And in such cases coordination is required to determine which tx to adopt first, and whether to adopt the second one on top.
-
Theorem 1 establishes I-confluence as a necessary and sufficientcondition for invariant-preserving, coordination-free execution
I.e., I-confluence is required to reach a convergent state that is I-valid in coordination-free manner.
-
A system is globally I-valid iff all replicas alwayscontain I-valid state
Meaning that I-valid state is preserved by tx results and by merge of tx results - tx results are I-confluent.
I.e., I-valid?(tx1) && I-valid?(tx2) && I-valid(merge(tx1, tx2)).
-
A system is convergent iff, for each pair of servers,in the absence of new writes to the servers and in the absence ofindefinite communication delays between the servers, the serverseventually contain the same versions for any item they both store
I.e., strong eventual consistency of tx results is required for convergence.
-
invariants, orpredicates over replica state: I : D →{true,f alse}
I.e., invariant is a predicate over db state.
-
On a wide-area network, delay is lower-bounded bythe speed of light (worst-case on Earth, around 75ms, or about 13operations per second [9]). Under network partitions [13], as delaytends towards infinity, these penalties lead to unavailability [9, 28]
I.e., coordination in presense of network partition leads unavailability.
Using quorums to acknowledge accepted txes seems to mitigate this. Given 50% of acks is required and tx issuer is able to propagate his tx to 50% of quorum members.
-
The classic answer to maintain-ing application-level invariants is to use serializable isolation: ex-ecute each user’s ordered sequence of operations, or transactions,such that the end result is equivalent to some sequential execu-tion [15, 30, 53].
I.e., serializability is a property of tx execution such that their result is the same as their sequential execution.
I.e., serialized?(someTxExecutionScheme, txes) => someTxExecutionScheme(txes) = serialTxExecutionScheme(txes)
-
Given knowledge of application transactions and correctness crite-ria (e.g., invariants), it is often possible to avoid this coordination(by executing some transactions without coordination, thus pro-viding availability, low latency, and excellent scalability) whilestill preserving those correctness criteria.3. Invariant confluence offers a necessary and sufficient conditionfor this correctness-preserving, coordination-free execution.
I.e., invariant preservation as a way to guarantee correctnes. It's damm efficient, but not all txes will be invariant-confluent.
-
Serializable transactions preserve application correctness at thecost of always coordinating between conflicting reads and writes.
I.e., serialization of txes as generic way to guarantee correctness. Albeit most costly way.
-
In this paper, we address the central question inherent in this trade-off: when is coordination strictly necessary to maintain application-level consistency? To do so, we enlist the aid of application pro-grammers to specify their correctness criteria in the form of invari-ants.
I.e., invariants as a way to specify database validity criteria, which grants knowledge about which txes are safe to run in parallel and which need to run on top of each other.
-
-
www.semanticscholar.org www.semanticscholar.org
-
Use cases that requiretransactional isolation of repeatable reads [ 5] are not a good fit,as FabricCRDT commits transactions even if their read-set is out-dated.
I.e., having tx results as CRDT allows to merge them. But this is only possible if one tx does not change read deps of the other tx. Otherwise the result of the other tx would be ivladit, and would be invalid to merge it. It would need to be re-run on top. E.g., for cases where double-spend can occur.
-
According to our findings, FabricCRDTsuccessfully merges all conflicting transactions without any failureswhen all transactions use CRDTs.
Note: when all txes use CRDTs. When there is a non-CRDT tx it may fail.
-
CRDT Transactions in a Block
Non-CRDT tx is akin to a snapshot. Multiple snapshots for the same cannot exist within a block if you don't have rules on how to merge them (approach described in XOX paper for post-order execution).
CRDT ops can be applied on top of a snapshot.
snapshot cannot be applied on top of CRDT.
So a block with non-CRDT and CRDT ops seems to only be fully valid if it has snapshot first and then CRDT deltas on top.
-
Once a transaction fails, the only option for clients is to create anew transaction and resubmit, which adds to the complexity ofFabric application development
As shown by XOX, it's possible to retain conflicting transactions and perform them on top of detected "happened before" ones.
-
-
arxiv.org arxiv.org
-
where there exists a happened-before relationVoteRegister1
Malicious voter may issue votes with no Vote1 "happened-before" Vote2. If we have 2 of 4 endorsment reuqirement, then he can send Vote1 to two orgs and Vote2 to the other two. Both sets of orgs will endorse the vote.
Seems it may be tackled by imposing 3 of 4 as endorsement threshold.
Or, in general case, > 1/2 orgs.
Given orgs may be malicious (faulty), we can take it into an account by setting fault threshold. Say to 1/4.
Then endorsement threshold will be: ((1/2 + 1/4) +1) * number of orgs.
Or, in general case, endorsement theroshld = ((1/2 + fault threshold) + 1) * number of orgs
-
-
martin.kleppmann.com martin.kleppmann.com
-
Rather than moving forward by a fixed number of predecessors in each round trip, this algorithm allows some commits to be skipped, so that it reaches the common predecessors faster.
Binary search as an alternative
-
-
vegvisir.cs.cornell.edu vegvisir.cs.cornell.edu
-
Instead of resolvingbranches, it permits them, resulting in a Directed AcyclicGraph (DAG) structure of the blockchain rather than a linearone.
Seems it makes possible to double-spend.
-
-
edoc.hu-berlin.de edoc.hu-berlin.de
-
In this chapter, we address theorthogonal problem of automatically suggesting an efficient set of MV for a given workloadof SPARQL queries.
Can we avoid the problem of selecting MVs by treating each query as AST, preserving it's results? And so a set of queries would form a forest of ASTs, where shared nodes would only be computed once.
Nodes become MVs.
Interestingly, a node may be only partially computed.
E.g., if there is a join node AB that joins A and B nodes, then A could be fully computed, and B could be computed only to find matching results for AB pivot.
Although here we mix logical representation of node A, node B, and a pivot node AB,
pivot(A, B)
, with it's physical representation ofjoin(A, B)
and it's optimisationjoin(A, B(A))
, where B(A) is B based on results of A, which resembles embedding A and AB, rather than A, B, AB.These are different ASTs.
B based on A
not= B.
All in all, out of logical queries physical ASTs can be derived that are optimized for those queries.
Perhaps having embeddings as joins is optimal. E.g.,
AB = join(A, B(A)) ABC = join(AB, C(AB))
-
Getting the complete set of solutions of q would require to query the residual pattern(a?,p4,?e) over the RDF dataset and join the results with the partial results of i1 andi2.
Residial patterns could have indexes created for them on-the-fly as overlapping embeddings. Making use of precomputed solutions of those embeddings.
So during query processing we don't seek matches for all residual patternss and join afterwards, dropping some of the matches. Rather we query for matches based on embedding matches, so we don't get "to be dropped" matches at all.
This is an approach on "how to efficiently compute join of embeddings". And it's about having a "join embedding" that joins based on most-restrcitive embedding first. I.e., that embedding that has less matches or, if embeddings haven't been computed yet - that embedding that is expected to be most restrictive (e.g., because it has most restrictive BGB).
-
We focus on covers with overlapping embeddings since they allow better estimations ofthe cost savings that can be achieved with them (see Section 2.3).
I.e., covers with overlapping embeddings allow better cost saving estimation than covers with non-overlapping embeddings.
-
Query
This query could have its own index. Making use of (c) and (d) to maintain it's results by joining their solutions.
I.e., there can be an index per query. Query patterns would be broken down to embeddings for the most efficient compute of queries.
I.e., query as an AST of embeddings.
-
However, for the rest of this thesis we only considerintensional overlaps to avoid the costly pre-computation and maintenance of extensionaloverlaps
An extensional overlap is the one not made intensional.
Extensional overlaps can be made intensional overlaps.
And maintained in the same way.
-
Let G be a data graph, I the set of indexes on G, andQ a query against G. We call an index I ∈ I eligible for Q iff P(I) v P(Q). The set ofall eligible indexes for query Q is denoted by IQ.
The notation is meh.
Alternative:
G - graph
P - pattern
S - solution
I - index
EI - eligible index
EIs = ei(P, I) - eligible indexes I for P
-
itis not possible to map variables ?y and z from i2 to any variable of q
?c would contain all ?y. Since ?y is a more restrictive pattern than ?c.
I.e., it may be possible to adopt some of the ?y matches for ?c.
-
each occurrence of the triple pattern (?a,p1,?b) of q must be contained inthe set of occurrences of i1
Not only be contained, but exactly match.
-
During index creation, indexes are computed andpersistently materialized and the results are made available for the query processor. Inour approach, indexes are defined offline, i.e., created by an administrator before queriesare executed.
Alternative approach is to create indexes at arbitrary time, e.g., after query execution.
Then, during query execution available indexes are used. And the result can always be indexed as one needs.
This would speed up query execution. As there is no cost of index creation and maintenance at this stage.
-
The frequency of I in G, written as #G(P), is thenumber of occurrences of P in G.
|O| would be more uniform in style, akin to |P|.
Or O(P(Q), G), |P(Q)| - size of query, |O(P(Q), G)| - size of solutions.
Or, in polish notation, (O (P Q) G).
-
O represents the set of all occurrences of P in G
I.e., O = solutions of P.
-
which is the set of triple patterns in the body of Q
I.e., a BGP.
-
Figure2.1 shows the graph representation of a triple pattern.
I.e., tripple pattern reified in RDF.
-
The query pattern must be divided into that part that is executed byusing materialized results and the rest of the query patterns that are not possible or notbeneficial to execute using them
Perhaps would be nice to have that as optimization at the query engine level?
-
1.2.6 Materialized View Selection Problem
A naive approach is to have MV per each intersection of queries.
-
The MV–selection problem can be viewed as a combinatorial optimization problemsince its solution has to be chosen from among a finite number of possible configurations.Therefore, explicit and complete enumeration and scoring of all possible MV subsets is,in principle, a possible way to solve it. This method is however impractical in most cases,since the computational effort grows exponentially with the number of candidate MV[25].
Enumirating all possible MV combinations in order to find the best one found impractical.
-
However, we deal with the fact that queries may notbe fully answered using only materialized results. Thus, the partial results need to beextended to answer the query.
Resembles "Partial Materialized Views" approach used by SageDB.
-
-
www.vldb.org www.vldb.org
-
Partial Materialized Views
.
-
-
arxiv.org arxiv.org
-
doc.get(“todo”).idx(1).get(“done”) := true;
A fine dot.notation interface for delta-based CRDT
-
In order to generate globally unique operation identifierswithout requiring synchronous coordination between repli-cas we use Lamport timestamps [42].
Ehh. Isn't author+deps enough?
-
“done”: true
Ehh, why?
This node has been marked as deleted and has"done" true field. Why is this field leaked out of the deleted node?
Data Laced with history gets it right by having data structure as AST. Where you'd have a node with this field, and that has been marked as deleted, resulting in node and all its fields "deleted".
-
-
Local file Local file
-
Collaboration = keep all edits and merge them
-
-
agregore.mauve.moe agregore.mauve.moeAgregore1
-
Agregore works by letting you load web pages and content from peer to peer protocols the same way you would load them from a website.
I.e., use Hypercore to maintain a personal website. Use Aggregore as browser.
-
-
docs.holepunch.to docs.holepunch.to
-
Append a new value to the autobase.
This makes autobase view not just a view.
Why isn't it done in one of the inputs?
-
-
docs.holepunch.to docs.holepunch.to
-
Hypercore is a secure, distributed append-only log built for sharing large datasets and streams of real-time data. It comes with a secure transport protocol, making it easy to build fast and scalable peer-to-peer applications.
Hypercore bundles CRDT with p2p.
It is akin to OrbitDB's db, if only it'd have one writter.
OrbitDB is a toolkit, where you can assemble your CRDT p2p solution. Making use of components: - ipfs-log as base for your CRDT - libp2p as base for your p2p - auth to manage readers and writters
Whereas Hypercore seems to bundle it all up.
-
-
dat-ecosystem-archive.github.io dat-ecosystem-archive.github.io
-
Multi-writer will allow Dats to be modified by multiple devices and multiple authors at the same time. Each author will have their own secret key and publish a Dat with their data in it. Multi-writer fuses all of these separate Dats into one “meta-Dat” that is a view of everyone’s data combined
-
-
-
multi-causal operations will prevent the structured log from being linearly interpretable and may severely affect performance, simplicity, and intent
Welp, what can we do when an op does express intent that depends on multiple others? E.g., transferm money from Bob to Alice
-
This won’t be convergent by default since the operations don’t have an inherent total order, but it’s easy to fix this by giving each one a globally-unique ID in the form of an owner UUID 3 plus a Lamport timestamp. (In the diagrams below, this is encoded as “SX@TY”, where “SX” represents the UUID for site X, and Y is just the timestamp.) With this scheme, no two operations can have the same ID: operations from the same owner will have different timestamps, while operations from different owners will have different UUIDs
Welp, would be nice to have author inside the operation.
Lamport timestamp doesn't seem to add any new information. It's an idx of op, which can be derived from the longest chain of its parent if needed.
Wall clock is useful. It's akin to "decision time".
Wall clock + author would be enough for uniqueness of operptons. And linking can be done by hash.
-
Nonetheless, I made one additional tweak to ensure that remapping only happens very rarely. Instead of storing just the UUID in the site map, I also store the wall clock time at which the UUID was added. In the site map, these tuples are sorted first by time, then by UUID. Assuming that modern connected devices tend to have relatively accurate clocks (but not relying on this fact for correctness), we can ensure that new sites almost always get appended to the end of the ordered array and thus avoid shifting any of the existing UUIDs out of their previous spots. The only exception is when multiple sites happen to be added concurrently or when the wall clock on a site is significantly off.
Perhaps an alternative way is to have a tx log on each devices - which will only be accreted with newly observed ops. Then it's order never changes and we can use idx as identifiers that stay evergreen.
-
I’ve solved this with the help of a secondary CRDT that’s stored and transferred along with the CT: an ordered, insert-only array of known UUIDs called the site map. The 16-bit site identifier corresponding to a UUID is simply its index in the array.
Reminds of the techinque WasmTree uses, where an internal representation of RDF nodes replace URLs with an integer ID and has a map to resolve URLs back.
Except that here we use idx of a log entry in place of its UUID. Which requires to update mapping on merge of new ops. Whereas by having integer ids in entries, in place of UUIDs, we would be spared from that. Con is that there is a storage cost to store those integer ids. Which is lower than cost of UUIDS, but more than of that idx approach.
-
-
martin.kleppmann.com martin.kleppmann.com
-
We see further interesting problems around types, schemamigrations, and compatibility. Different collaborators may beusing different versions of an application, potentially withdifferent features. As there is no central database server,there is no authoritative “current” schema for the data. Howcan we write software so that varying application versionscan safely interoperate, even as data formats evolve?
Events to the rescue?
-
CRDTs do not require a peer-to-peer networking layer;using a server for communication is fine for CRDTs. However,to fully realize the longevity goal of local-first software, wewant applications to outlive any backend services managedby their vendors, so a decentralized solution is the logicalend goal.
Should not bake a server into a lofi software as a mean to sync. It needs to outlive the server.
-
URLs are a good mechanism for sharing
URLs as collaboration sites been found handy.
They could be treated as "compilation sites" on a subject.
Where each user could have their own compiled view.
I.e., they are IDs of subjects. Note: ID not= content. Content is user-specific. A personal compilation / view on it.
-
Visualizing document history is important
Would it be valuable to have it generic?
Same time-travel interface for all the apps.
-
However, in all the prototypes we devel-oped, we found that the default merge semantics to be suffi-cient, and we have so far not identified any case requiringcustomised semantics
The mentioned "one user sets state on a task to one state and another to another" happened. As mentioned, it's not clear how to resolve. So they did encounter merge conflicts. I guess these conflicts are not app-specific and conflict resolution can be implemented in a generic way as well. Such as present users with choice which delta to adopt.
-
In practice, the CouchDB model has not been widelyadopted [42]. Various reasons have been cited for this: scala-bility problems when a separate database per user is required;difficulty embedding the JavaScript client in native apps oniOS and Android; the problem of conflict resolution; theunfamiliar MapReduce model for performing queries; andmore. All in all, while we agree with much of the philosophybehind CouchDB, we feel that the implementation has notbeen able to realize the local-first vision in practice.
CouchDB is a meh local-first backend (storage).
-
These thick-client apps have the advantage of being fastand working offline, because the server sync happens in thebackground.
Rather "thick-client local-first apps". As "thick-client" on it's own does not imply local-first, there may be a server.
-
web apps will never be ableto provide all the local-first properties we are looking for,due to the fundamental thin-client nature of the platform
Web apps can be thick clients.
Optionally they can be loaded gradually. E.g., with granularity down to components, as shown by ReactJS.
Having loaded the entirety - they can potentially work offline.
Making web app and its components content-addressable would allow to load them with no regard to location / some authoritative service.
-
In many web browsers,if the user clears their cookies, all data in local storage isalso deleted [121]; while this is not a problem for a cache,it makes the browser’s local storage unsuitable for storingdata of any long-term importance
Local Storage is not a good solution for persistence.
-
In principle it is possible to collaborate without a repository service, e.g.by sending patch files by email [48],
Sending git diffs as text over emails or other medius is the way it's been meant to use, if I'm not mistaken.
-
It’s interesting to note that most software engineers havebeen reluctant to embrace cloud software for their editors,IDEs, runtime environments, and build tools. In theory, wemight expect this demographic of sophisticated users to em-brace newer technologies sooner than other types of users.
I.e., software engineers may be best keen to adopt lofi concepts.
-
But Git has no capability forreal-time, fine-grained collaboration, such as the auto-matic, instantaneous merging that occurs in tools likeGoogle Docs, Trello, and Figma
Because it captures snapshots and not deltas.
-
Git is excellent for asynchronous collaboration, espe-cially using pull requests (Figure 6, [61]), which takea coarse-grained set of changes and allow them to bediscussed and amended before merging them into theshared master branch
There is no "shared master branch" in git. Subsequently, no authority that decides what's in it. Each git user is free to adopt any changes they see fit.
-
If your computer’s hard drive fails, you canrestore your work simply by installing the app and waitingfor it to sync.
In an event when the sync service is down and you lose local copy - you won't be able to restore.
For this case it's better to have multiple sync options. Such as your other devices and incentivized third-parties.
-
The flip side to this is a total loss of ownership and control:the data on the server is what counts, and any data on yourclient device is unimportant — it is merely a cache
Whereas it is the opposite in reality. Devices are the primary sites of intent they issue. And server is merely an aggregation site.
-
Attachments are easy to understandand trustworthy.
Given email is signed.
-
Using a version control system such as Git
Git alone does not dictate how one transfers stuff.
-
Consider a significant personal creation, such as aPhD thesis or the raw footage of a film. For these you mightbe willing to take responsibility for storage and backups inorder to be certain that your data is safe and fully under yourcontrol
Loosing a PHD paper, especially after it has been published and some other papers already referred to it / based their reasoning on top of it, would be a serious hit to humanity's knowledge. Sadly, this is the reality today, may happen.
-
With data ownership comes responsibility: maintainingbackups or other preventative measures against data loss,protecting against ransomware, and general organizing andmanaging of file archives
Would be nice for lofi software to make it as seemless as replicating across chosen parties automatically.
E.g., in your account you linked your devices, and chosen them as replication sites. In addition you chosen a third-party, e.g., Web3Storage.
So you can use apps and rest worry-free about persistence of your data.
-
so you have the freedom to process thisdata in arbitrary ways
Even better if software uses known data model, such as RDF, and so you're not locked onto the tool.
Perhaps even better to capture your intent as events and have data models derived out of it. This way you're not even locked at e.g., RDF ecosystem.
-
You should be able to copy and modifydata in any way, write down any thought,7 and no companyshould restrict what you are allowed to do
I.e., don't rely on third-party as a medium to express yourself. Otherwise they're in control what you can express.
E.g., Twitter/X, Google Docs and their bans.
-
Although there does not seem to be a great danger ofGoogle shutting down Google Docs anytime soon, popu-lar products (e.g. Google Reader) do sometimes get shutdown [106] or lose data [105], so we know to be careful
I have dread in my head every time I use cloud service that there is an issue about to happen. There will be a data breach, data loss, service becoming unavailable (e.g., they don't like you anymor, e.g., Github can block you from your private repos at their will), it's just a matter of time.
So I need to get my data out, back it up, if service been kind to allow for that. But that's a burden. It's a hidden cost you learn to recognize over time. May not be apparent from the start to everybody.
-
Cuneiform script on clay tablet, ca. 3000 BCE. Image from Wikimedia Commons [5].
This needs explanation.
-
Besides having severalpeople edit the same document in real-time, it is sometimesuseful for one person to tentatively propose changes thatcan be reviewed and selectively applied by someone else
This stems from the design that there is a mutable place and some authority over it. E.g., a Google Doc, Wikipedia page.
Would be nicer to embrace people as primary sites of their intent. And allow to compose them. This way a person who would previously need to request changes to a doc, would simply have a kind of forked doc locally, where he would make the change. And that delta may be shared with the original doc's authors, who may accept or reject it. With no regard to their decision, the forked version stays as intented, with the delta.
I.e., docs are sites that aggregated some deltas. Have a site per person. Allow sites to exhcange deltas. This allows for having personal views on a subject.
Akin to how it's done in Git. If only it tracked deltas and they can be arbitrary.
I.e., NextGraph gets it pretty damm right.
Also RhizomeDB is this spirit.
-
neverneeding to show you a spinner while you wait
For the cases when you do wish to send op to a remote party you'd still need a spinner. Given you embrace distributed compute.
-
Local-first software is different: because it keeps the pri-mary copy of the data on the local device, there is never aneed for the user to wait for a request to a server to complete.
Local-first has most responsive UX due to operations being performed locally, as opposed to sending them to the server.
-
The user interface may try to hide that latency by showingthe operation as if it were complete, even though the requestis still in progress — a pattern known as Optimistic UI [92] —but until the request is complete, there is always the possibil-ity that it may fail (for example, due to an unstable Internetconnection).
Optimistic UI / Optimistic Updates may actually fail at the source being updated (suprise-suprise, the word "optimistic" is in the definition).
-
-
martin.kleppmann.com martin.kleppmann.com
-
Online eventprocessing (OLEP) [36] is an approach for handling such multi-partition interactions by breaking them down into multiple streamprocessing stages.
This breaks down the original intent
transferMoney(Alice -> Bob)
into multiple one. Thus we don't have atomicity, as shown in the paper.Breaking down seems to be just one approach of how such events can be handled by OLEP though. An implementation detail.
It seems possible to achieve atomicity and isolation for such events, as described here.
-
then it is no longer sufficientto process the events in each partition independently
We can treat partitions as aggregates of events.
Given there is Alice and Bob, and Alice transfers Bob money.
Then we have 3 partitions, that aggregate their respective events.
Event
transferMoney(Alice -> Bob)
would end up in all three partitions: Alice, Bob, Alice+Bob. -
A challenge in many event-based systems is how to handle changesin the schema or data format [56]. The problem is especially pro-nounced in systems where we cannot guarantee that all replicasare running the same version of the software. For example, in ap-plications that run on end-users’ devices, it is up to the user todecide when to install a software update; thus, a user who is hes-itant to install updates may be running a much older version ofan application than a user who always installs the latest version.Nevertheless, those users should be able to interoperate as much aspossible, which means that any data format changes must be bothforward and backward compatible. The challenge becomes evengreater if users are able to customise the software they are running,e.g. through end-user programming [30].
Event-processing logic may be different across users.
-
In systems that are based on immutable events, one promisingapproach is to use bidirectional functions (lenses) to convert be-tween different versions of a data model, which allows differentreplicas to use different state representations while still being ableto interoperate [49].
Bidirectional lenses as a way to integrate different data models.
-
Partially ordered event-based systems are well placed to supportsuch branching-and-merging workflows, since they already makedata changes explicit in the form of events, and their support forconcurrent updates allows several versions of a dataset to coexistside-by-side.
Event-based systems allow for divergent views as first-class citizens.
-
in Git terms, one user can create a branch (a setof commits that are not yet part of the main document version),and another user can choose whether to merge it
Users are empowered to create composed views out of events of their choice.
I.e., collaboration as composition.
I.e., divergent views as first-class citizens.
-
Database transactions support a weak form of multi-version con-currency control by allowing an uncommitted transaction’s writesto be either committed or rolled back by aborting the transaction [8].However, most databases do not allow one user to share the state ofan uncommitted transaction with another user, and most databasesdo not allow the user to find out what data has changed in anuncommitted transaction (the equivalent of git diff). Moreover,in most database systems, an uncommitted transaction may holdlocks and thus prevent other transactions from making progress.
I.e., DB-first approach does not have divergent view as first-class citizen.
-
compute the differencesbetween versions of the document to visualise the change history
-
we can reconstruct the state of thedocument at any past moment in time
-
an initial event represents only the intention toperform a certain action
-
Moreover, if processing an event may have external side-effectsbesides updating a replica state – for example, if it may trigger anemail to be sent – then the time warp approach requires some wayof undoing or compensating for those side-effects in the case wherea previously processed event is affected by a late-arriving eventwith an earlier timestamp. It is not possible to un-send an emailonce it has been sent, but it is possible to send a follow-up emailwith a correction, if necessary. If the possibility of such correctionsis unacceptable, optimistic replication cannot be used, and SMR oranother strongly consistent approach must be used instead. In manybusiness systems, corrections or apologies arise from the regularcourse of business anyway [27], so maybe occasional correctionsdue to out-of-order events are also acceptable in practice.
-
-
Local file Local file
-
Disadvantages of the OLEP ap-proach. In the previous examples, log consumers update the state in data stores (the database and search index in Figure 2; the account balances and account statements in Figure 3). While the OLEP approach ensures every event in the log will eventually be processed by every consumer, even in the face of crashes, there is no upper bound on the time until an event is processed.This means if a client reads from two different data stores that are up-dated by two different consumers or log partitions, then the values read by the client may be inconsistent with each other. For example, reading the source and destination accounts of a payment may return the source ac-count after the payment has been pro-cessed, but the destination account before it has been processed. Thus, even though the accounts will even-tually converge toward a consistent state, they may be inconsistent when read at one particular point in time
There are two culprits here: 1. Original intent (transfer from Alice to Bob) has been split into different intents (withdraw from Alice, receive to Bob)
I.e., there is no atomicity at the level of original intent, and subsequently there is no isolation.
- Reads are executed on derived views (DBs) that are stale.
I.e., there is no consistency.
Instead we can have 1 event that describes the original intent. It can be complemented by withdrawal approval and receiving approval. This augmented event, in form of
(approvedReception (approvedWithdrawal (transaction Alice->Bob))
can be replicated across the corresponding source account and destination account logs. This way we have holistic view on what happens. This augmented event can be replicated across source account and destination account. Leading to an eventually consistent states there. Since we have requirement on atomicity of writes this event belongs in Alice+Bob log. Where each event relating to both Alice and Bob end up. Alice log and Bob log depend on Alice+Bob log. Thus ensuring that Alice and Bob are consistent with regard to Alice+Bob transactions.To cater for 2), we need to make sure that reads happen on top of the latest log entry. We can represent read as an event that depend on the latest log event. Thus, it'll be processed on DB state as of that dependent event.
-
Splitting a “transaction” into a mul-tistage pipeline of stream processors allows each stage to make progress based only on local data; it ensures that one partition is never blocked waiting for communication or coordination with another partition.
Event logs allow for gradual processing of events.
-
From a data analysis point of view, an event log is more valuable than the state in a database. For example, in an e-commerce setting, it is valuable for business analysts to see not only the final state of the cart at checkout, but also the full sequence of items added to and removed from the cart, since the removed items carry information, too (for example, one product is a sub-stitute for another, or the customer may return to buy a certain item on a later occasion).
Event logs represent deltas. Whereas DB's are snapshots. Deltas may be useful in order to get insight about intentions.
-
In contrast, in a database that sup-ports arbitrary insertions, updates, and deletes, it is much harder to re-cover from incorrect writes, poten-tially requiring the database to be re-stored from a backup.
DB-first approach makes harder to reconcile happened events.
-
Some log-based stream processors such as Apache Flink support so-called exactly-once semantics, which means that even though an event may be pro-cessed more than once, the effect of the processing will be the same as if it had been processed exactly once. This behavior is implemented by manag-ing side effects within the processing framework and atomically commit-ting these side effects together with the checkpoint that marks a section of the log as processed.
Only-once semantics for side-effects by linking side-effects to a checkpoint. Thus we have a kind of a commit with side-effects as of some checkpoint. They are processed only then. And having checkpoint guarantees that it'll be picked from this state, and so side-effects won't be issued twice.
This is not the same as idempotence of side-effects. It merely guarantees that side-effects is expressed only once.
It relies that event processing won't replay events before a checkpoint. This is not the case for all architectures. Some may use time-warp technique. Some may replay from the start.
-
Since it is possible for an event to be processed more than once when re-covering from a failure, state updates must also be idempotent.
Idempotency of side-effects as a solution for multiple invocation of the same side-effect (intent).
-
Thus, if a subscriber crashes and re-starts, it may append duplicate events to other logs.
A solution is to have events idempotent.
E.g., to have events describe intent, and be content-addressable.
Then, the same intent = the same content-based id = noop by others that already received it.
-
A subscriber periodically check-points the latest LSN it has processed to stable storage. When a subscriber crashes, upon recovery it resumes pro-cessing from the latest checkpointed LSN. Thus, a subscriber may process some events twice (those processed between the last checkpoint and the crash), but it never skips any events. Events in the log are processed at least once by each subscriber.
-