Designing reliable distributed systems
Dec. 5th, 2024 09:23 pm
I appreciate that my job presents me with some interesting design challenges. Many years ago, the distributed systems I had to think about generally involved memory and network; there wasn't much of interest in persistent storage. Later, one of the enterprise webapps I worked on had a relational database holding metadata about files on disk, and those files could get out of sync with the metadata. At least the database and filesystem afforded reliable access and prompt updates.
In my current job, I have to create webapps that are backed by a database and a blockchain. The backend services may run in multiple replicas simultaneously, so they have to be careful about data that they hold in local memory. Accessing the blockchain is particularly tricky: I can submit a transaction to a node, and that transaction may then appear soon, later, or never, and a typical node is not a reliable authority on everything that is on the chain. The transactions on an account need consecutive sequence numbers, so my service instances have to coordinate just to create the submissions. Services and nodes may be terminated and restarted at any moment.
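To make the sequence number coordination concrete, here is a minimal sketch of replicas reserving consecutive numbers through a shared database. It uses SQLite from the Python standard library purely as a stand-in for whatever database the services really share; the table `account_seq` and the functions `open_db` and `reserve_sequence` are hypothetical names invented for this illustration, not anything from my actual code.

```python
import sqlite3


def open_db(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode so that
    # the explicit BEGIN/COMMIT statements below control the transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account_seq ("
        " account TEXT PRIMARY KEY,"
        " next_seq INTEGER NOT NULL)"
    )
    return conn


def reserve_sequence(conn, account, first_seq=0):
    """Atomically hand out the next sequence number for an account.

    Hypothetical helper: BEGIN IMMEDIATE takes SQLite's write lock up
    front, so two service replicas cannot read the same next_seq.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Create the counter row on first use; a no-op if it exists.
        conn.execute(
            "INSERT OR IGNORE INTO account_seq (account, next_seq)"
            " VALUES (?, ?)",
            (account, first_seq),
        )
        (seq,) = conn.execute(
            "SELECT next_seq FROM account_seq WHERE account = ?",
            (account,),
        ).fetchone()
        conn.execute(
            "UPDATE account_seq SET next_seq = ? WHERE account = ?",
            (seq + 1, account),
        )
        conn.execute("COMMIT")
        return seq
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

With a server database the same idea would more likely be a row lock than a whole-database write lock. The sketch also ignores the awkward case where a number is reserved but its transaction is never submitted, which would leave a gap in a sequence that has to be consecutive.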
I would like the backend to receive a web request ordering, say, a cryptocurrency transfer, then to have that transfer occur exactly once. As it is, my approach is to rely on the database for most of the state around what the application is doing with the blockchain, then to write code that largely goes: first, check whether I already noted that I would do the thing and, if so, pick up the pieces of it having been interrupted. Otherwise, note in the database that I plan to do the thing, then try to do it on the blockchain, note how that went, and watch for its outcome once its block is mined. This is complicated by details such as the note of what I plan to do probably being keyed on the transaction's hash, which changes according to the currently applicable sequence number.
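The check, note, do, note, watch pattern above can be sketched roughly as follows. This is an illustration under stated assumptions rather than my production code: SQLite stands in for the real database, the table `transfer_intent` is made up, and `build_tx`, `submit` and `is_mined` are hypothetical callables standing in for signing a transaction, sending it to a node, and asking whether its hash has been mined. The sketch also assumes that resubmitting the identical signed transaction is harmless because it carries the same sequence number.

```python
import sqlite3


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transfer_intent ("
        " request_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"  # planned, submitted or confirmed
        " tx_hash TEXT NOT NULL,"
        " raw_tx TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def set_status(conn, request_id, status):
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.execute(
            "UPDATE transfer_intent SET status = ? WHERE request_id = ?",
            (status, request_id),
        )


def process_transfer(conn, request_id, build_tx, submit, is_mined):
    """Advance one transfer as far as it can go; safe to call again.

    build_tx() -> (tx_hash, raw_tx), submit(raw_tx) and
    is_mined(tx_hash) -> bool are hypothetical stand-ins.
    """
    row = conn.execute(
        "SELECT status, tx_hash, raw_tx FROM transfer_intent"
        " WHERE request_id = ?",
        (request_id,),
    ).fetchone()
    if row is None:
        # Note the plan, including the exact signed transaction,
        # before anything is sent to the blockchain.
        tx_hash, raw_tx = build_tx()
        with conn:
            conn.execute(
                "INSERT INTO transfer_intent VALUES (?, 'planned', ?, ?)",
                (request_id, tx_hash, raw_tx),
            )
        status = "planned"
    else:
        # Already noted: pick up wherever the earlier attempt stopped.
        status, tx_hash, raw_tx = row
    if status == "planned":
        # After a crash this may resend the same stored transaction.
        submit(raw_tx)
        set_status(conn, request_id, "submitted")
        status = "submitted"
    if status == "submitted" and is_mined(tx_hash):
        set_status(conn, request_id, "confirmed")
        status = "confirmed"
    return status
```

Storing the signed transaction alongside its hash is what lets a restarted instance retry without building a new transaction under a different sequence number. The primary key on `request_id` means a second replica racing on the same request fails at the insert instead of recording a second plan. A transaction that never gets mined would need a further step this sketch leaves out.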
The system of services coordinating via the database is satisfying to figure out; the blockchain interaction is partly a matter of making a best guess given the nature of the beast. In my current project, I finally got to write some simple smart contract code and it worked as hoped. More broadly, it is nice to see things working well, but it takes some thought to get there.