archives

SPREAD: Authenticated reusable computations

I want to know exactly what my software is doing. Better still, I want cryptographic proof that my software executes each and every computation step correctly.

Wouldn’t you?

Currently, I don’t know what Windows 10 is doing (or it is very hard to find out) and I hate that.

That’s because most of Windows 10:

- is compiled to machine code that bears no resemblance to the original source code,
- hides intricate networks of mutable objects after abstract interfaces,
- and destroys precious machine state after machine state.

Welcome to SPREAD!

In SPREAD, all data, machines states, code and computations are cryptographically authenticated.

To put it mildly, some practical issues had to be resolved to make SPREAD a reality. First and foremost, SPREAD almost completely eradicates mutable state.

Obviously, keeping all the state for authentication purposes would require enormous amounts of storage. SPREAD solves that issue by cryptographically ‘signing’ states incrementally, while still allowing full user-control over which ones need to be signed, and at what level of granularity.

Alternatively, full machine states can also be stored incrementally by SPREAD. In turn, this allows the pervasive re-use of state that was calculated earlier.

So SPREAD kinda acts like a spreadsheet. Because spreadsheets also re-use the previous states of a workbook to optimally recalculate the next.

Unlike SPREAD however, spreadsheets are incapable to keep all their versions around. And typically, Excel plug-ins completely destroy the (otherwise) purely functional nature of spreadsheets. In contrast, SPREAD only allows referentially transparent functions as primitives.

SPREAD builds on top of a recent breakthrough in cryptography called SeqHash. Unfortunately, SeqHash has been unnoticed by most. But hopefully this post will stir some renewed interest. To honour SeqHash, my extension has been called SplitHash:

SplitHash is an immutable, uniquely represented Sequence ADT (Authenticated Data Structure):

- Like SeqHashes, SplitHashes can be concatenated in O(log(n)).
- But SplitHash extends SeqHash by allowing Hashes to also be split in O(log(n)).
- It also solves SeqHash's issue with repeating nodes by applying RLE (Run Length Encoding) compression.
- And to improve cache coherence and memory bandwidth, SplitHashes can be optionally chunked into n-ary trees.

SplitHash is the first known History-Independent(HI) ADT that holds all these properties.

Sorry about the obvious self interest, but I'm just so excited about what I've created that I need to share this.