Storage

Bit Packing

State variables of contracts are stored in storage in a compact way such that multiple values may use the same storage slot (except for dynamically-sized arrays and mappings). This is known as "Bit Packing". If the previous state variable is less than 32 bytes and the next state variable can fit into the same 32-byte slot, then they will be grouped into the same slot. Otherwise, a new slot is going to be used.

The elements of structs and arrays are stored after each other, just as if they were given as individual values.

For an example, take a look at the contract from Ethernaut Privacy:

// SPDX-License-Identifier: MIT
pragma solidity ^0.6.0;

contract Privacy {

  bool public locked = true;
  uint256 public ID = block.timestamp;
  uint8 private flattening = 10;
  uint8 private denomination = 255;
  uint16 private awkwardness = uint16(now);
  bytes32[3] private data;

  constructor(bytes32[3] memory _data) public {
    data = _data;
  }
  
  function unlock(bytes16 _key) public {
    require(_key == bytes16(data[2]));
    locked = false;
  }

  /*
    A bunch of super advanced solidity algorithms...

      ,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`
      .,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,
      *.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^         ,---/V\
      `*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.    ~|__(o.o)
      ^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'^`*.,*'  UU  UU
  */
}

Analysis of the storage:

// slot 0
bool public locked = true;
// slot 1
uint256 public ID = block.timestamp;
// slot 2 part a
uint8 private flattening = 10;
// slot 2 part b
uint8 private denomination = 255;
// slot 2 part c
uint16 private awkwardness = uint16(now);
// slot 3, 4, and 5
bytes32[3] private data;

State Variables in Inheritance

For contracts that use inheritance, the ordering of state variables is determined by the C3-linearized order of contracts starting with the most base-ward contract. If allowed by the above rules, state variables from different contracts do share the same storage slot.

Dynamically-Sized Types

Using reserved slots works well for fixed-size state variables, but it doesn’t work for dynamically-sized arrays and mappings because there’s no way of knowing how many slots to reserve.

If you’re thinking of computer RAM or hard drive as an analogy, you might expect that there’s an “allocation” step to find free space to use and then a “release” step to put that space back into the pool of available storage.

This is unnecessary due to the astronomical scale of smart contract storage. There are 22562^{256} locations to choose from in storage, which is approximately the number of atoms in the known, observable universe. You could choose storage locations at random without ever experiencing a collision. The locations you chose would be so far apart that you could store as much data as you wanted at each location without running into the next one.

Of course, choosing locations at random wouldn’t be very helpful, because you would have no way to find the data again. Solidity instead uses a hash function to uniformly and repeatably compute locations for dynamically-sized values.

Dynamically-Sized Arrays

A dynamically-sized array needs a place to store its size as well as its elements.

contract StorageTest {
    uint256 a;     // slot 0
    uint256[2] b;  // slots 1-2

    struct Entry {
        uint256 id;
        uint256 value;
    }
    Entry c;       // slots 3-4
    Entry[] d;
}

In the above code, the dynamically-sized array d is at slot 5, but the only thing that’s stored there is the size of d. The values in the array are stored consecutively starting at the hash of the slot.

The following Solidity function computes the location of an element of a dynamically-sized array:

function arrLocation(uint256 slot, uint256 index, uint256 elementSize)
    public
    pure
    returns (uint256)
{
    return uint256(keccak256(slot)) + (index * elementSize);
}

Mappings

A mapping requires an efficient way to find the location corresponding to a given key. Hashing the key is a good start, but care must be taken to make sure different mappings generate different locations.

contract StorageTest {
    uint256 a;     // slot 0
    uint256[2] b;  // slots 1-2

    struct Entry {
        uint256 id;
        uint256 value;
    }
    Entry c;       // slots 3-4
    Entry[] d;     // slot 5 for length, keccak256(5)+ for data

    mapping(uint256 => uint256) e;
    mapping(uint256 => uint256) f;
}

In the above code, the “location” for e is slot 6, and the location for f is slot 7, but nothing is actually stored at those locations. (There’s no length to be stored, and individual values need to be located elsewhere.)

To find the location of a specific value within a mapping, the key and the mapping’s slot are hashed together.

The following Solidity function computes the location of a value:

function mapLocation(uint256 slot, uint256 key) public pure returns (uint256) {
    return uint256(keccak256(key, slot));
}

Note that when keccak256 is called with multiple parameters, the parameters are concatenated together before hashing. Because the slot and key are both inputs to the hash function, there aren’t collisions between different mappings.

Combinations of Complex Types

Dynamically-sized arrays and mappings can be nested within each other recursively. When that happens, the location of a value is found by recursively applying the calculations defined above. This sounds more complex than it is.

contract StorageTest {
    uint256 a;     // slot 0
    uint256[2] b;  // slots 1-2

    struct Entry {
        uint256 id;
        uint256 value;
    }
    Entry c;       // slots 3-4
    Entry[] d;     // slot 5 for length, keccak256(5)+ for data

    mapping(uint256 => uint256) e;    // slot 6, data at h(k . 6)
    mapping(uint256 => uint256) f;    // slot 7, data at h(k . 7)

    mapping(uint256 => uint256[]) g;  // slot 8
    mapping(uint256 => uint256)[] h;  // slot 9
}

To find items within these complex types, we can use the functions defined above. To find g[123][0]:

// first find arr = g[123]
arrLoc = mapLocation(8, 123);  // g is at slot 8

// then find arr[0]
itemLoc = arrLocation(arrLoc, 0, 1);

To find h[2][456]:

// first find map = h[2]
mapLoc = arrLocation(9, 2, 1);  // h is at slot 9

// then find map[456]
itemLoc = mapLocation(mapLoc, 456);

Storage Deep Dive

When to use storage vs. memory

When we first load a storage slot it's cold, meaning it’s more expensive at 2100 gas and whenever we call that newly used storage slot again it’s a warm storage slot, meaning it’ll be 100 gas but not as cheap as memory which is at least 3 gas (MLOAD); but can go higher if memory expansion occurs!

For an example, here is an unoptimised contract:

contract C {
    struct S {
        uint256 a;
        uint256 b;
        address c;
    }

    S public s;

    function foo(uint256 input) external {
        // `s.b` is loaded from storage once: warming up the storage!
        if (input < s.b) revert;
        // Second `s.b` SLOAD with warm storage.
        if (s.b > 50) revert;
    }
}

To optimize s.b load, we can store a copy of s.b in memory. This will cost 1 SLOAD and future operations will be MLOADs, therefore the total cost is cheaper than multiple SLOADs. The optimized contract:

function foo(uint256 input) external {
    // Initial storage load to store in memory.
    uint256 b = s.b;
    // Using MLOAD in comparison operations!
    if (input < b) revert;
    if (b > 50) revert;
}

Manually assigning storage

Let's play with inline assembly in order to manipulate storage. Here is a contract that uses struct:

contract C {
    struct S {
        uint16 a;  // 2 bytes,  2 bytes total
        uint24 b;  // 3 bytes,  5 bytes total
        address c; // 20 bytes, 25 bytes total + end of slot 0x01
        address d; // 20 bytes, slot 0x02
    }

    // I've noted the storage slots each state is located at.
    // A single slot is 32 bytes :)
    uint256 boring;              // 0x00
    S s_struct;                  // 0x01, 0x02
    S[] s_array;                 // 0x03
    mapping(uint256 => S) s_map; // 0x04

    constructor() {
        s_struct = S({
            a: 10,
            b: 20,
            c: 0x047b37ef4d76c2366f795fb557e3c15e0607b7d8,
            d: 0x047b37ef4d76c2366f795fb557e3c15e0607b7d8
        });
    }
}

Access uint256 boring at storage slot 0:

assembly {
    let x := sload(0x00);
}

Lets step it up a notch with a bitpacked struct! Bitpacked means storing multiple variables in a single slot (32 bytes) by ordering the byte size of the variables in a way that results the slot being equal or less to 32 bytes. In this case we pack a total of 25 bytes into a single slot at 0x01 using:

  • uint16 a (2 bytes) -> 0x000a

  • uint24 b (3 bytes) -> 0x000014

  • address c (20 bytes) -> 0x047b37ef4d76c2366f795fb557e3c15e0607b7d8

  • address d (20 bytes) -> 0x047b37ef4d76c2366f795fb557e3c15e0607b7d8

s_struct's slots would be:

// 0x01 00000000000000047b37ef4d76c2366f795fb557e3c15e0607b7d8000014000a
// 0x02 000000000000000000000000047b37ef4d76c2366f795fb557e3c15e0607b7d8

Notice how a, b, c are in the same slot. The way we grab any of these values by shifting the bits and using masking to grab a specific string of bits in a slot.

Lets go through a practical example of grabbing s_struct.b:

function view_b() external view returns (uint24) {
    assembly {
        // before: 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a
        //                                                                         ^
        // after:  0000 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014
        //          ^
        let v := shr(0x10, sload(0x01))

        // If both characters aren't 0, keep the bit (1). Otherwise, set to 0.
        // mask:   0000000000000000000000000000000000000000000000000000000000 FFFFFF
        // v:      000000000000000000047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014
        // result: 0000000000000000000000000000000000000000000000000000000000 000014
        v := and(0xffffff, v)

        // Store in memory bc return uses memory.
        mstore(0x40, v)

        // Return reads left to right.
        // Since our value is far right we can just return 32 bytes from the 64th byte in memory.
        return(0x40, 0x20)
    }
}

But what if we wanted to change the value of s_struct.b? A little complicated:

//          unused bytes                     c                        b    a
// before: 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a
//          unused bytes                     c                        b    a
// after:  00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 0001F4 000a
function set_b(uint24 b) external {
    assembly {
        // Removing the `uint16` from the right.
        // before: 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a
        //                                                                         ^
        // after:  0000 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014
        //          ^
        let new_v := shr(0x10, sload(0x01))

        // Create our mask.
        new_v := and(0xffffff, new_v)

        // Input our value into the mask.
        new_v := xor(b, new_v)

        // Add back the removed `a` value bits.
        new_v := shl(0x10, new_v)

        // Replace original 32 bytes' `000014` with `0001F4`.
        new_v := xor(new_v, sload(0x01))

        // Store our new value.
        sstore(0x01, new_v)
    }
}

Next is dynamic array. Accessing s_array[0].d:

// keccak256(array_slot) + var_slot
// keccak256(0x03) + 1
// Remember how `s_struct` takes up 2 slots?
// The `+ 1` indicates the second slot allocation in S
// For the bitpacked slot in S we use don't need the add
// The next element's slot would be `+ 2`
function get_element() external view returns(bytes32) {
    assembly {
        // Store array slot in memory.
        mstore(0x40, 0x03)
        // Keccak does the MLOAD internally so we give the memory location.
        let hash := add(keccak256(0x40, 0x20), 1)
        // Store the return value.
        mstore(0x40, sload(hash))
        // Return `d`.
        return(0x40, 0x20)
    }
}

Next is mapping. Accessing s_map[2].b:

// keccak256(mapping_key . mapping_slot)
// keccak256(0x02 . 0x04)
function v() external view returns(bytes32) {
    assembly{
        // Store map key (the element we want).
        mstore(0, 0x02)
        // Store map slot location.
        mstore(0x20, 0x04)
        // We want `b` in the first slot 0x00.
        let slot := keccak256(0, 0x40)
        // Store our value for return.
        mstore(0, sload(slot))
        return(0, 0x20)
    }
}

Accessing s_map[4].d:

// keccak256(mapping_key . mapping_slot) + i`
// keccak256(0x02 . 0x04) + 1
function v() external view returns(bytes32) {
    assembly{
        // Store map key (the element we want).
        mstore(0, 0x02)
        // Store map slot location.
        mstore(0x20, 0x04)
        // We want `d` which is 1 slot more.
        let slot := add(keccak256(0, 0x40), 0x01)
        // Store our value for return.
        mstore(0, sload(slot))
        return(0, 0x20)
    }
}

string and bytes have identical encoding types that are very annoying to deal with:

  1. When the length of the string is 31 bytes or less it’s stored in a single slot starting from the left side and the length * 2 is stored in the final byte on the right.

// For example, `string below31 = "reeeee";`:
//    string                       unused bytes                    length of string * 2
// 726565656565 00000000000000000000000000000000000000000000000000 0cCopy Me
  1. For anything larger than 31 bytes the storage process is similar to an array. Where the slot of the string stores length * 2 + 1 and the data is stored via keccak256(slot) + i

This way you can see what type of string it is from checking if lowest bit is set (the far right byte).

Last updated