SpaceCraft SDK: Storage Mappers v2

Background

The SC storage is organized as a simple map, with arbitrary-length keys and values. Note that a smart contract cannot iterate over keys: all keys must be known to the contract.

This abstraction is minimalist and works well, but it becomes unwieldy as projects grow larger. The evolution of the SC framework has mirrored this need for higher-level abstractions.

The first version of the framework only had the #[storage_get] and #[storage_set] annotations, for direct reading and writing at a known key. The only variation was that keys could also have parameters, which would be concatenated in order at the end of the key.
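
The key scheme described above can be pictured with a plain in-memory map. This is only a sketch: `Storage` and `build_key` are hypothetical names, arguments are assumed to be appended as raw bytes, and a missing key is assumed to read as an empty value.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for contract storage:
/// a flat map from byte keys to byte values.
#[derive(Default)]
pub struct Storage {
    entries: HashMap<Vec<u8>, Vec<u8>>,
}

/// Builds a storage key by appending each argument, in order, to the base key.
pub fn build_key(base: &[u8], args: &[&[u8]]) -> Vec<u8> {
    let mut key = base.to_vec();
    for arg in args {
        key.extend_from_slice(arg);
    }
    key
}

impl Storage {
    /// Direct write at a known key (the role of a setter annotation).
    pub fn set(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.entries.insert(key, value);
    }

    /// Direct read at a known key (the role of a getter annotation).
    /// Assumption of this sketch: a missing key reads as an empty value.
    pub fn get(&self, key: &[u8]) -> Vec<u8> {
        self.entries.get(key).cloned().unwrap_or_default()
    }
}
```

There is deliberately no method for listing keys, which mirrors the constraint that a contract must already know every key it touches.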

We soon figured out that more complex data structures were needed, for instance indexed arrays, with each item written under its own storage key. The first such structures were handled in code, which was quite cumbersome and error-prone. Thus came the idea of the more declarative concept of storage mappers, in which we describe the storage layout and then have the framework deal with it automatically.
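
To make the indexed-array idea concrete, here is a minimal sketch of a declarative layout in which the length and each item live under their own keys. The type `VecLayout`, the ".len" and ".item" suffixes, the big-endian u32 encoding and the 1-based indexing are all assumptions of this example, not a statement about the framework's actual layout.

```rust
use std::collections::HashMap;

/// Hypothetical flat storage: byte keys mapped to byte values.
pub type Storage = HashMap<Vec<u8>, Vec<u8>>;

/// Describes an indexed array stored under `base_key`.
/// Layout assumed for this sketch:
///   base_key + ".len"          -> item count (u32, big-endian)
///   base_key + ".item" + index -> item at 1-based index (u32, big-endian)
pub struct VecLayout {
    base_key: Vec<u8>,
}

impl VecLayout {
    pub fn new(base_key: &[u8]) -> Self {
        VecLayout { base_key: base_key.to_vec() }
    }

    fn key_with(&self, suffix: &[u8]) -> Vec<u8> {
        let mut key = self.base_key.clone();
        key.extend_from_slice(suffix);
        key
    }

    fn len_key(&self) -> Vec<u8> {
        self.key_with(b".len")
    }

    fn item_key(&self, index: u32) -> Vec<u8> {
        let mut key = self.key_with(b".item");
        key.extend_from_slice(&index.to_be_bytes());
        key
    }

    /// Number of items; a missing length entry counts as zero.
    pub fn len(&self, storage: &Storage) -> u32 {
        match storage.get(&self.len_key()) {
            Some(bytes) => {
                let mut buffer = [0u8; 4];
                buffer.copy_from_slice(bytes); // panics if the entry is not 4 bytes
                u32::from_be_bytes(buffer)
            }
            None => 0,
        }
    }

    /// Appends an item under a fresh key and updates the stored length.
    pub fn push(&self, storage: &mut Storage, item: &[u8]) {
        let new_len = self.len(storage) + 1;
        storage.insert(self.item_key(new_len), item.to_vec());
        storage.insert(self.len_key(), new_len.to_be_bytes().to_vec());
    }

    /// Reads the item at a 1-based index, if present.
    pub fn get(&self, storage: &Storage, index: u32) -> Option<Vec<u8>> {
        storage.get(&self.item_key(index)).cloned()
    }
}
```

The caller only names the base key; every derived key is computed by the layout type, which is the declarative part of the idea.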

Shortcomings of the current implementation

There are several problems that we are trying to address in storage mappers v2:

1. Storage ownership

Current storage mappers offer no ownership of the storage, in the Rust sense. They have no mechanism for denying access to a region of the storage that is currently in use. The best way to do this in Rust is via lifetimes and mutable references.

The current implementation does not use mutable references, always assuming internal (run-time) mutability instead. This is not wrong per se, since this is how the interaction with the VM actually works, but it is not helpful in higher-level logic.

Because of this lack of ownership, it is also very dangerous to cache anything storage-related in memory. The dangerous scenario is as follows: a contract caches some storage value in memory; while it is working on it, another part of the code writes to the same storage keys; finally, flushing the cache overwrites that write, leading to data loss. We need ownership rules to prevent mutably borrowing the same region of storage twice.

Note: the current mitigation is to only have stateless storage mappers.
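
The lost-update scenario above can be replayed in plain Rust. The sketch below is illustrative only: `DetachedCache` and `OwningCache` are hypothetical types over a toy in-memory map. The first reproduces the data loss; the second holds a mutable borrow of the storage for its whole life, so an interleaved write is rejected at compile time.

```rust
use std::collections::HashMap;

/// Toy storage for this sketch: named u64 cells.
pub type Storage = HashMap<&'static str, u64>;

/// Risky pattern: the cache copies the value and remembers only the key,
/// so nothing stops other code from writing to the same key meanwhile.
pub struct DetachedCache {
    key: &'static str,
    value: u64,
}

impl DetachedCache {
    pub fn load(storage: &Storage, key: &'static str) -> Self {
        DetachedCache { key, value: storage.get(key).copied().unwrap_or(0) }
    }

    pub fn flush(self, storage: &mut Storage) {
        storage.insert(self.key, self.value);
    }
}

/// Safer pattern: the cache mutably borrows the storage while it exists.
pub struct OwningCache<'a> {
    storage: &'a mut Storage,
    key: &'static str,
    value: u64,
}

impl<'a> OwningCache<'a> {
    pub fn load(storage: &'a mut Storage, key: &'static str) -> Self {
        let value = storage.get(key).copied().unwrap_or(0);
        OwningCache { storage, key, value }
    }

    pub fn add(&mut self, amount: u64) {
        self.value += amount;
    }

    /// Writes the cached value back and releases the borrow.
    pub fn flush(self) {
        let OwningCache { storage, key, value } = self;
        storage.insert(key, value);
    }
}

/// Replays the scenario from the text and returns the final stored value.
pub fn lost_update_demo() -> u64 {
    let mut storage = Storage::new();
    storage.insert("sum", 10);
    let mut cache = DetachedCache::load(&storage, "sum"); // cache holds 10
    cache.value += 1; // cached value is now 11
    *storage.get_mut("sum").unwrap() += 5; // direct write: storage holds 15
    cache.flush(&mut storage); // overwrites 15 with 11, the +5 is lost
    storage["sum"]
}
```

With `OwningCache`, the direct write in the middle of `lost_update_demo` would not compile, because `storage` is still mutably borrowed by the cache.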

2. Storage ABI

We want to be able to export a declarative representation of the storage, the same as we do for endpoints and logs. This will make it easier for other tools to interact with contract storage.

We can also view the storage mappers as the Rust representation of this storage layout. They should allow us to handle storage not only in contracts, but also in tests and interactors.

Just like in the case of endpoints, the storage ABI needs to include the storage ABIs of all contract modules.

3. Storage non-overlap

A long-requested feature: the framework should validate that no two storage fields can accidentally map to the same key. This should be easy once we have the storage ABI.

4. Full composability

Ideally, generic storage mappers should allow their items to not only be single items, but also other storage mappers. For example, a VecMapper should be allowed to contain a MapMapper, out of the box.

We do currently have some composable mappers, but they are the exception rather than the rule.
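
One way to picture full composability is a mapper trait that lets any mapper be instantiated at any base key, so a generic container can hand out nested mappers. The trait `Mapper` and the types `SingleValue` and `VecOf` below are hypothetical, and the ".item" suffix is an assumption of this sketch.

```rust
use std::marker::PhantomData;

/// Hypothetical trait: any mapper can be instantiated at a given base key.
pub trait Mapper {
    fn at_key(base_key: Vec<u8>) -> Self;
}

/// Leaf mapper: a single value stored directly under its key.
pub struct SingleValue {
    pub key: Vec<u8>,
}

impl Mapper for SingleValue {
    fn at_key(base_key: Vec<u8>) -> Self {
        SingleValue { key: base_key }
    }
}

/// Generic indexed mapper whose items are themselves mappers of type `M`.
pub struct VecOf<M: Mapper> {
    base_key: Vec<u8>,
    _item: PhantomData<M>,
}

impl<M: Mapper> Mapper for VecOf<M> {
    fn at_key(base_key: Vec<u8>) -> Self {
        VecOf { base_key, _item: PhantomData }
    }
}

impl<M: Mapper> VecOf<M> {
    /// Returns the nested mapper for `index`,
    /// rooted at base_key + ".item" + index (u32, big-endian).
    pub fn item(&self, index: u32) -> M {
        let mut key = self.base_key.clone();
        key.extend_from_slice(b".item");
        key.extend_from_slice(&index.to_be_bytes());
        M::at_key(key)
    }
}
```

Because `VecOf<M>` itself implements `Mapper`, a type such as `VecOf<VecOf<SingleValue>>` works without any special casing.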

5. Struct mappers

To attain full composability, it would be nice to have structure-like mappers, i.e. Rust structs, annotated somehow, which contain several mappers under different keys. At the top level, they would be equivalent to having several mappers, one for each field, but the nice addition would be the ability to embed them, for instance, into a VecMapper or another mapper. More about this later.

Implementation plan

There is a logical order of steps to implement the missing features. While it might be tempting to start with the most intuitive features (ABI, composability), that is not the most efficient order.

We need the new storage built from the foundation up. Failing to do so will result in an endless spiral of refactoring and reimplementation (we’ve been through such experiences, they are not fun).

So we need to figure out ownership before anything else.

This is a huge change in the framework, with the potential to impact everything. Ensuring backwards compatibility will be tricky.

Let’s talk about what this implies in greater detail:

1. Storage model & ownership

1a. Mutable self in contracts

Rust has mutable and immutable references. We are currently only using the immutable variety. It is in fact currently impossible to get a mutable reference to self in contracts, because the contract object is modeled as something stateless.

This is currently forbidden by our proc-macros:

   #[endpoint]
   fn add(&mut self, value: BigUint) {
       self.sum().update(|sum| *sum += value);
   }

It needs to be made legal.

This is also the point where, for the first time, we can make the difference between #[endpoint] and #[view] enforceable at compile time (only endpoints should allow mut).

In the initial phase, this is enough. It doesn’t need to have any other effect in code.

Most importantly, we cannot make the mut compulsory: the migration effort would be overwhelming for the entire community. Immutable self needs to remain allowed for many versions to come. People will change it to mutable whenever they start using the new mappers.
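
The endpoint/view distinction mentioned above is the ordinary Rust distinction between `&mut self` and `&self`. The sketch below uses a plain struct with a hypothetical `Adder` type and no framework annotations, only to show what the compiler would enforce.

```rust
/// Hypothetical contract state; a real contract would go through storage mappers.
pub struct Adder {
    sum: u64,
}

impl Adder {
    pub fn new() -> Self {
        Adder { sum: 0 }
    }

    /// View-like method: `&self` gives read-only access.
    pub fn get_sum(&self) -> u64 {
        // self.sum += 1; // rejected by the compiler: `self` is behind a `&` reference
        self.sum
    }

    /// Endpoint-like method: `&mut self` is required to change state.
    pub fn add(&mut self, value: u64) {
        self.sum += value;
    }
}
```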

1b. Figuring out partial borrow

This might not have been among the first things you thought of when bringing up storage v2, but I will argue it is critical and non-trivial.

What is partial borrow?

It’s taking a reference to something bigger and splitting it into references to disjoint pieces of it.

So, for example, this works:

struct MyStruct {
   a: String,
   b: String,
}

// ...

fn split(s: &mut MyStruct) -> (&mut String, &mut String) {
   (&mut s.a, &mut s.b)
}

Why does it work? On the face of it, it’s a double mutable borrow into the structure MyStruct. Well … the compiler can tell that a and b are disjoint, so it knows that borrowing them mutably at the same time is fine.

This, however, doesn’t work:

fn split(s: &mut [String]) -> (&mut String, &mut String) {
   (&mut s[0], &mut s[1])
}

error[E0499]: cannot borrow `s[_]` as mutable more than once at a time
  --> contracts/examples/adder/src/adder.rs:42:17
   |
41 | fn split(s: &mut [String]) -> (&mut String, &mut String) {
   |             - let's call the lifetime of this reference `'1`
42 |     (&mut s[0], &mut s[1])
   |     ------------^^^^^^^^^-
   |     ||          |
   |     ||          second mutable borrow occurs here
   |     |first mutable borrow occurs here
   |     returning this value requires that `s[_]` is borrowed for `'1`
   |
   = help: consider using `.split_at_mut(position)` or similar method to obtain two mutable non-overlapping sub-slices
   = help: consider using `.swap(index_1, index_2)` to swap elements at the specified indices

Why? The compiler is not smart enough to figure out they are disjoint.
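
For slices, the standard library already provides the missing primitive that the compiler hint points to. The sketch below rewrites the failing function with `split_at_mut`, which hands back two non-overlapping sub-slices; it assumes the slice has at least two elements and panics otherwise.

```rust
/// Returns mutable references to the first two elements of a slice.
/// `split_at_mut` divides the slice into two non-overlapping parts, so the
/// borrow checker accepts two simultaneous mutable borrows.
/// Panics if the slice has fewer than two elements.
pub fn split(s: &mut [String]) -> (&mut String, &mut String) {
    let (head, tail) = s.split_at_mut(1);
    (&mut head[0], &mut tail[0])
}
```

The disjointness proof lives inside `split_at_mut`, not in the borrow checker. Storage mappers need an analogous primitive of their own.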

Full example here:

Clearly the borrow checker will not be smart enough for our storage mappers! This will suffer from the same problem:

   #[storage_mapper_v2("key1")]
   fn field1(&mut self) -> SingleValueMapper<BigUint>;

   #[storage_mapper_v2("key2")]
   fn field2(&mut self) -> SingleValueMapper<BigUint>;

   #[endpoint]
   fn use_both(&mut self, value: BigUint) {
       // m1 keeps self mutably borrowed for as long as it is in use
       let m1 = self.field1();
       // second mutable borrow of self while m1 is still alive
       self.field2().update(|sum| *sum += &value);
       m1.update(|sum| *sum += value);
   }

It is not reasonable to expect developers to never use two mappers at the same time, and there’s no reason they shouldn’t: mappers will probably be disjoint via the storage ABI and the disjoint key checker. So we need to give them a method for partially borrowing a mapper, or for borrowing two or more at once.

Rust doesn’t necessarily offer the primitives to achieve this out of the box, but by using the type system to the maximum, it is possible to find a good ownership pattern and borrow syntax.

We ran several experiments last year; you can find them in this repo (warning: they are sketches):

https://github.com/multiversx/storage-v2-prototype-rs
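
As an illustration of what "borrowing two mappers at once" could look like, here is one possible pattern. It is not taken from the prototype repository: `Contract`, `Field` and `field1_and_field2` are hypothetical names. A single `&mut self` call returns both handles, so the borrow checker sees one mutable borrow of the contract, while the handles share the underlying map through run-time (interior) mutability. Key disjointness is assumed here rather than checked.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::marker::PhantomData;

type RawStorage = RefCell<HashMap<&'static str, u64>>;

/// Handle to one storage key. The lifetime `'a` ties it to the mutable
/// borrow of the contract that produced it.
pub struct Field<'a> {
    raw: &'a RawStorage,
    key: &'static str,
    _exclusive: PhantomData<&'a mut ()>,
}

impl<'a> Field<'a> {
    /// Applies `f` to the stored value (zero if the key is unset).
    pub fn update(&mut self, f: impl FnOnce(&mut u64)) {
        let mut map = self.raw.borrow_mut();
        f(map.entry(self.key).or_insert(0));
    }
}

pub struct Contract {
    raw: RawStorage,
}

impl Contract {
    pub fn new() -> Self {
        Contract { raw: RefCell::new(HashMap::new()) }
    }

    /// Borrows both fields in one call. While either handle is alive,
    /// the contract stays mutably borrowed, so no third party can touch it.
    pub fn field1_and_field2(&mut self) -> (Field<'_>, Field<'_>) {
        let raw = &self.raw;
        (
            Field { raw, key: "key1", _exclusive: PhantomData },
            Field { raw, key: "key2", _exclusive: PhantomData },
        )
    }

    /// Read-only access for inspection.
    pub fn get(&self, key: &'static str) -> u64 {
        self.raw.borrow().get(key).copied().unwrap_or(0)
    }
}
```

Used this way, the two handles can be updated in any order:

```rust
let mut contract = Contract::new();
let (mut m1, mut m2) = contract.field1_and_field2();
m2.update(|v| *v += 5);
m1.update(|v| *v += 7);
```

A caveat of this sketch: calling `update` on one handle from inside the closure of another would panic at run time, since both go through the same `RefCell`.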

1c. Creating a new storage mapper trait

The results at point 1b. should lead to the creation of a new storage mapper trait (and all related types). All new storage mappers will need to implement this trait.

2. Storage mapper ABI

2a. Take the new storage mapper trait, and see how we can extract ABI from it

From here on out everything should revolve around the new trait. Deriving a tree-like ABI structure should be straightforward, but what is more important is to find a way to embed it into the contract ABI.

The contract should collect all top-level storage mappers and call them in the ABI generator function. Special attention is needed for modules.
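
A tree-like storage ABI could look roughly like the sketch below. All names (`StorageAbiNode`, `StorageAbi`, `contract_storage_abi`, the toy mapper types) are hypothetical, and the shape of the node is an assumption, not a proposed format.

```rust
use std::marker::PhantomData;

/// Hypothetical ABI node describing one storage mapper.
#[derive(Debug, Clone, PartialEq)]
pub struct StorageAbiNode {
    pub key: String,
    pub mapper_type: String,
    pub children: Vec<StorageAbiNode>,
}

/// Hypothetical trait a mapper type would implement to describe itself.
pub trait StorageAbi {
    fn storage_abi(key: &str) -> StorageAbiNode;
}

/// Toy leaf mapper.
pub struct SingleValue;

/// Toy container mapper with nested items of type `M`.
pub struct VecOf<M>(PhantomData<M>);

impl StorageAbi for SingleValue {
    fn storage_abi(key: &str) -> StorageAbiNode {
        StorageAbiNode {
            key: key.to_string(),
            mapper_type: "SingleValue".to_string(),
            children: Vec::new(),
        }
    }
}

impl<M: StorageAbi> StorageAbi for VecOf<M> {
    fn storage_abi(key: &str) -> StorageAbiNode {
        StorageAbiNode {
            key: key.to_string(),
            mapper_type: "VecOf".to_string(),
            // The item mapper hangs below this node under an assumed ".item" suffix.
            children: vec![M::storage_abi(".item")],
        }
    }
}

/// Merges the top-level nodes of a contract with those of its modules.
pub fn contract_storage_abi(
    own: Vec<StorageAbiNode>,
    modules: Vec<Vec<StorageAbiNode>>,
) -> Vec<StorageAbiNode> {
    let mut all = own;
    for module in modules {
        all.extend(module);
    }
    all
}
```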

2b. Key overlap checker

An algorithm needs to traverse the ABI and check that keys cannot overlap.

It’s not just identical keys, but also keys that are prefixes of one another (e.g. “key” and “k”, since “k” + some suffix from the mapper below might overlap the “ey” from “key”).

This should be easy enough, and it can be done in parallel with other work.
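
A conservative version of the check can be sketched over a flat list of top-level keys: any pair where one key is a prefix of the other is reported. The function name `find_overlaps` is hypothetical, and a real checker working on the ABI tree could be more precise (for instance by taking the suffix format of each mapper into account).

```rust
/// Returns every pair of keys where one is a prefix of the other
/// (identical keys included), since such pairs may collide once
/// mappers append their own suffixes.
pub fn find_overlaps(keys: &[&str]) -> Vec<(String, String)> {
    let mut overlaps = Vec::new();
    for (i, a) in keys.iter().enumerate() {
        for b in &keys[i + 1..] {
            if a.starts_with(*b) || b.starts_with(*a) {
                overlaps.push((a.to_string(), b.to_string()));
            }
        }
    }
    overlaps
}
```

The pairwise comparison is quadratic in the number of keys, which should be acceptable for the handful of top-level keys a contract typically declares.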

3. Migrating existing storage mappers

We need to take the existing storage mappers and migrate them one by one.

A few of them need to be created as examples in parallel with the framework, but most will be migrated after the framework reaches a certain maturity.

4. Storage struct

We need a new annotation that is placed over a struct and transforms it into a storage mapper. Each field should be annotated with a key (or should we fall back to the field name if unspecified?). It should be similar to serde.

#[storage_mapper]
pub struct MyItem {
   a: SingleValueMapper<BigUint>,
   b: SingleValueMapper<ManagedBuffer>,
}

#[storage_mapper]
pub struct MyStorage {
   #[key("my_users")]
   users: UserMapper,

   x: SingleValueMapper<BigUint>,

   data: VecMapper<MyItem>,
}

Our experiments in the prototype indicate that the best solution might be to have a wrapper object with key, lifetime and a Deref implementation that produces a safe reference to the storage mapper field.
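
A rough picture of such a wrapper, under the assumption that the struct annotation would generate one accessor per field: `FieldRef`, `KeyedMapper` and the hand-written `users` accessor are hypothetical stand-ins, not the prototype's actual types.

```rust
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};

/// Toy mapper that only knows its own key.
pub struct KeyedMapper {
    key: Vec<u8>,
}

impl KeyedMapper {
    pub fn key(&self) -> &[u8] {
        &self.key
    }
}

/// Wrapper holding a field mapper together with a lifetime that ties it
/// to the mutable borrow of the parent storage struct.
pub struct FieldRef<'a, M> {
    mapper: M,
    _parent: PhantomData<&'a mut ()>,
}

impl<'a, M> Deref for FieldRef<'a, M> {
    type Target = M;
    fn deref(&self) -> &M {
        &self.mapper
    }
}

impl<'a, M> DerefMut for FieldRef<'a, M> {
    fn deref_mut(&mut self) -> &mut M {
        &mut self.mapper
    }
}

/// Stand-in for a struct annotated as a storage mapper.
pub struct MyStorage {
    base_key: Vec<u8>,
}

impl MyStorage {
    pub fn new(base_key: &[u8]) -> Self {
        MyStorage { base_key: base_key.to_vec() }
    }

    /// Accessor of the kind an annotation might generate for a field
    /// declared with the key "my_users": parent key + field key.
    pub fn users(&mut self) -> FieldRef<'_, KeyedMapper> {
        let mut key = self.base_key.clone();
        key.extend_from_slice(b"my_users");
        FieldRef { mapper: KeyedMapper { key }, _parent: PhantomData }
    }
}
```

Thanks to `Deref`, callers use the wrapper as if it were the mapper itself, while the lifetime keeps the parent borrowed for as long as the field is in use.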

5. Unified syntax

Just as we send transactions from several contexts, we also read storage directly in the following contexts:

  • in contracts (same shard),
  • in tests,
  • in interactors.

We need a simple syntax to take an address and query the storage from it. It should work everywhere.

Unsafe version: any storage mapper over any address.

Type-safe version: just like the endpoint proxies, we might want to consider adding the storage to the proxies.
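
The "unsafe version" can be sketched as a mapper that is generic over where the raw bytes come from. Everything here is hypothetical: the `StorageSource` trait, the `InMemoryChain` test backend, the `RemoteU64` mapper and the big-endian value encoding are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical 32-byte account address.
pub type Address = [u8; 32];

/// Anything that can read the raw storage of an arbitrary address:
/// a contract (same shard), a test environment, or an interactor.
pub trait StorageSource {
    fn read_raw(&self, address: &Address, key: &[u8]) -> Vec<u8>;
}

/// Test-style backend keeping all accounts in memory.
#[derive(Default)]
pub struct InMemoryChain {
    accounts: HashMap<Address, HashMap<Vec<u8>, Vec<u8>>>,
}

impl InMemoryChain {
    pub fn write_raw(&mut self, address: Address, key: &[u8], value: &[u8]) {
        self.accounts
            .entry(address)
            .or_default()
            .insert(key.to_vec(), value.to_vec());
    }
}

impl StorageSource for InMemoryChain {
    fn read_raw(&self, address: &Address, key: &[u8]) -> Vec<u8> {
        self.accounts
            .get(address)
            .and_then(|account| account.get(key))
            .cloned()
            .unwrap_or_default()
    }
}

/// A mapper pointed at any address and any key, with no type checking
/// against the target contract's actual layout.
pub struct RemoteU64<'s, S: StorageSource> {
    source: &'s S,
    address: Address,
    key: Vec<u8>,
}

impl<'s, S: StorageSource> RemoteU64<'s, S> {
    pub fn new(source: &'s S, address: Address, key: &[u8]) -> Self {
        RemoteU64 { source, address, key: key.to_vec() }
    }

    /// Decodes the stored bytes as a big-endian unsigned integer;
    /// an empty value decodes to zero.
    pub fn get(&self) -> u64 {
        let bytes = self.source.read_raw(&self.address, &self.key);
        bytes.iter().fold(0u64, |acc, b| (acc << 8) | u64::from(*b))
    }
}
```

The same mapper code would run against any backend that implements the trait. A type-safe variant would obtain the key and value type from a generated proxy instead of taking them as free parameters.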

Roadmap

Talks of the new storage mappers have been floating around for more than a year, but the difficulty of the design and the medium-low priority have caused it to be postponed several times. The prototyping has taken more than a month and it is arguably still not complete.

The current plan is to start working on it properly once we finish improvements and upgrades to the Rust VM and the memory model, some time in Q2 2025.

The goal of storage v2 is to keep storage safe even when the developer makes a mistake.