A fundamental tool in git's strategy for distributed management of source code is the concept of the diff and the patch. These foundational operations are what make git possible. Diff is used to construct a patch that can be applied to an object such that the final state makes sense, for some value of "makes sense".

Patches are applied because we want a certain before state to be lifted to a certain after state. A patch does not need to know the whole source: only what it expects to be true of the source, and what it expects to be true after the update. With this, it's possible to have distributed updates performed on different parts of the source text. Collisions require some remedial action, but if there are no collisions everything can be merged to obtain a final state which respects all updates, no matter when or where they came from. This is what allows git to be fully multi-master, without requiring or forcing synchronization through any complex protocol (like Raft).

The same idea carries over to structured data. Consider an object describing an item, with a name and a list of suppliers. If Alice opens the object in an application and changes the name of the item to "Retro Encabulator Mark II", it should be possible for Bob to update the suppliers list simultaneously without either stepping on the other's toes.

In applications, this sort of curation operation is often achieved with a lock on the object. And locks are a massive source of pain, not only because you can't achieve otherwise perfectly reasonable concurrent operations, but because you risk getting stale locks and having to figure out when to release them.

But what if Alice didn't submit her whole object for update, but only the part she wanted to be changed? And Bob did the same? Now we can perform the updates in three different places: locally for Alice, locally for Bob, and then finally at a shared server resource. The structured patch could be determined by looking at the object before Alice submitted it, and after, using diff. The patch constructed from Alice's diff might then record only the name field, with its value before and after her change.

If Alice and Bob both change the same field, however, we see immediately that the two patches are in conflict, and Alice can be asked to resolve the question by surfacing it. In the case of data curation, this is a perfectly reasonable workflow. And it is this problem of data curation that we can solve with the simplest version of JSON diff. The conflict can be surfaced to Alice, and Bob can be allowed to go about his business.

Could this particular problem be resolved in a purely automatic way with a CRDT? Definitely, but it probably will not result in what you want. Preferring the last write will work, of course, but then which version is more right might need human review, and even worse, an automatic merge might result in both results being interleaved (a likely outcome!).

We could, however, make the before and after a text-based patch using a textual diff. Git's line-based approach is probably not what we want here, but rather one that takes words as atoms. It will not solve this particular conflict, but it could make text fields much more flexible.

Which of these you want, however, requires semantic direction of the diff algorithm. While lots of structured diff problems will be solved by the simplest algorithm, ultimately we need a schema that helps to direct the meaning of our diffs. String fields might be best diffed line by line, word by word, or perhaps they must always be atomic (as with identifiers).

Patch application basically just checks that the read state matches, and then substitutes the writes. Diff, by contrast, has to calculate, and often in practice guess, a good transition from the read state to the write state.
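
To make the before/after idea concrete, here is a minimal Python sketch of a field-level diff and patch over JSON-like objects. Everything in it is an assumption made for illustration: the function names `diff` and `apply_patch`, the `Conflict` exception, the `{"before": ..., "after": ...}` patch layout, the starting name "Encabulator", and the supplier identifiers are all hypothetical and not taken from any particular library. It treats every field as atomic and a value of `None` as "field absent", which is roughly the simplest version of the problem.

```python
class Conflict(Exception):
    """Raised when the object no longer matches the state a patch expects to read."""


def diff(before: dict, after: dict) -> dict:
    """Naive diff: for each top-level field that changed, record before and after."""
    patch = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old != new:
            patch[key] = {"before": old, "after": new}
    return patch


def apply_patch(obj: dict, patch: dict) -> dict:
    """Check that the read state matches, then substitute the writes."""
    result = dict(obj)
    for key, change in patch.items():
        if result.get(key) != change["before"]:
            raise Conflict(f"field {key!r} is not in the expected state")
        if change["after"] is None:
            result.pop(key, None)  # None is treated as "field removed"
        else:
            result[key] = change["after"]
    return result


# Hypothetical shared object; only the new name comes from the text above.
original = {"name": "Encabulator", "suppliers": ["Supplier/1"]}

# Alice renames the item, Bob adds a supplier, each working on a local copy.
alice_patch = diff(original, {**original, "name": "Retro Encabulator Mark II"})
bob_patch = diff(original, {**original, "suppliers": ["Supplier/1", "Supplier/2"]})

# The patches touch different fields, so the server can apply both in any order.
merged = apply_patch(apply_patch(original, alice_patch), bob_patch)

# If Bob had also renamed the item, his patch would expect the old name ...
bob_rename = diff(original, {**original, "name": "Turbo Encabulator"})
try:
    apply_patch(merged, bob_rename)
    conflict_detected = False
except Conflict:
    conflict_detected = True  # ... and the mismatch is surfaced instead of merged
```

Because each patch carries only the fields it touches, no lock is needed for the non-overlapping case, and the overlapping case fails loudly rather than silently overwriting. A schema-directed diff could replace the atomic comparison of string fields with a word-based or line-based textual diff.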