04 - RevertWithError

Goal: Revert with a Solidity-style error message, the way require does with a reason string.

// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

import {Test, console} from "forge-std/Test.sol";
import {RevertWithError} from "../src/RevertWithError.sol";

contract RevertWithErrorTest is Test {
    RevertWithError public c;

    function setUp() public {
        c = new RevertWithError();
    }

    function test_RevertWithError() public {
        vm.expectRevert(bytes("RevertRevert"));
        c.main();
    }
}

Background knowledge: https://docs.soliditylang.org/en/latest/abi-spec.html

The ABI encoding for a string is a standardized way to pack dynamic data into a sequence of bytes. The ABI treats string as a dynamic type, and its data is encoded in two parts:

  1. Length Word:

    A 32‑byte word that indicates the length of the string in bytes.

  2. Data Words:

    The UTF‑8 encoded string data is stored immediately after the length word. If the string’s length is not a multiple of 32, the data is right‑padded with zeros up to the next multiple of 32 bytes.
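The padding rule above can be sketched as a small helper. The library name StringTailSize and its functions are hypothetical and are not part of the exercise; they only compute how many bytes the length word and the padded data words occupy for a string of a given byte length.

```solidity
// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

// Hypothetical helper for illustration only.
library StringTailSize {
    // Bytes occupied by the data words: the length rounded up to a multiple of 32.
    function paddedDataLength(uint256 len) internal pure returns (uint256) {
        return ((len + 31) / 32) * 32;
    }

    // One 32-byte length word plus the padded data words.
    function tailLength(uint256 len) internal pure returns (uint256) {
        return 32 + paddedDataLength(len);
    }
}
```

For "RevertRevert" (12 bytes) the data fits in one padded word, so the length word plus data take 64 bytes.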

When the string is an argument for a function call or part of an error payload (like with Error(string)), the encoding adds two more components in front of the length and data:

  1. Function Selector:

    For example, when reverting with an error that includes a string reason, the payload starts with a 4‑byte selector. For the built‑in error type Error(string), this selector is computed as the first 4 bytes of keccak256("Error(string)") (i.e. 0x08c379a0).

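    The selector occupies exactly 4 bytes of the payload. In Yul it is convenient to write it with a single mstore of a 32‑byte word whose high‑order (left‑aligned) 4 bytes hold the selector; the other 28 bytes of that word are overwritten by the field that follows.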

  2. Offset Pointer:

    After the selector, a 32‑byte word contains the offset (in bytes) to where the dynamic data (i.e. the string’s length and content) begins. The offset is measured from the start of the argument block, which is the byte right after the selector, not from the start of the whole payload. With a single dynamic parameter it is 32 (0x20).
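As a quick check of the selector value quoted above, the following sketch recomputes it and shows the left‑aligned word produced by the shift. The contract name ErrorSelectorSketch is hypothetical.

```solidity
// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

// Hypothetical contract for illustration only.
contract ErrorSelectorSketch {
    // First 4 bytes of keccak256("Error(string)").
    function selector() external pure returns (bytes4) {
        return bytes4(keccak256("Error(string)"));
    }

    // The same 4 bytes placed in the high-order end of a 32-byte word.
    function leftAligned() external pure returns (bytes32 word) {
        assembly {
            word := shl(224, 0x08c379a0)
        }
    }
}
```

selector() returns 0x08c379a0, and leftAligned() returns that value followed by 28 zero bytes.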

For example, if you want to encode the error Error("RevertRevert") you would structure your memory like this:

  • Bytes 0–3:

    The left‑aligned function selector for Error(string), i.e. shl(224, 0x08c379a0) (this shifts the 4‑byte selector to the highest order bytes of a 32‑byte word).

  • Bytes 4–35:

    A 32‑byte word with the offset to the string data, 0x20 here. The offset is counted from byte 4 (the start of the arguments) and the offset word itself occupies 32 bytes, so the length word starts at byte 4 + 32 = 36.

  • Bytes 36–67:

    A 32‑byte word representing the length of the string. For "RevertRevert", the length is 12 bytes.

  • Bytes 68–99:

    A 32‑byte word containing the UTF‑8 bytes of "RevertRevert" left‑aligned and padded with zeros to fill the remaining space.
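The byte ranges listed above can be checked against the compiler's own encoder. This is a minimal sketch, assuming a Solidity 0.8 compiler; the contract name ErrorLayoutSketch and its functions are hypothetical. It builds the payload with abi.encodeWithSignature and reads back the four fields at the offsets 0, 4, 36 and 68.

```solidity
// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

// Hypothetical contract for illustration only.
contract ErrorLayoutSketch {
    // Reference payload produced by the compiler's ABI encoder (100 bytes).
    function payload() public pure returns (bytes memory) {
        return abi.encodeWithSignature("Error(string)", "RevertRevert");
    }

    // Read the selector, offset, length and data word at their byte positions.
    function words()
        external
        pure
        returns (bytes4 sel, uint256 offset, uint256 len, bytes32 data)
    {
        bytes memory p = payload();
        assembly {
            // Skip the 32-byte length word of the bytes array itself.
            let start := add(p, 0x20)
            // Keep only the top 4 bytes so the bytes4 value is clean.
            sel := and(mload(start), shl(224, 0xffffffff))
            offset := mload(add(start, 4))
            len := mload(add(start, 36))
            data := mload(add(start, 68))
        }
    }
}
```

words() returns 0x08c379a0, 0x20, 12, and the bytes of "RevertRevert" followed by 20 zero bytes.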

// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

contract RevertWithError {
    function main() external pure {
        assembly {
            // Get the free memory pointer
            let ptr := mload(0x40)
            // Store the function selector for Error(string) left-aligned.
            // 0x08c379a0 << 224 places 0x08c379a0 in the high-order 4 bytes.
            mstore(ptr, shl(224, 0x08c379a0))
            // Store the offset to the string data. It is counted from the end of the selector,
            // so the length word sits 0x20 bytes after byte 4.
            mstore(add(ptr, 4), 0x20)
            // Store the string length: 12 bytes ("RevertRevert")
            mstore(add(ptr, 36), 12)
            // Store the string "RevertRevert" left-aligned.
            // Since the string is 12 bytes, shift it left by 160 bits (i.e. (32 - 12) * 8)
            mstore(add(ptr, 68), shl(160, 0x526576657274526576657274))
            // Total payload length: 4 + 32 + 32 + 32 = 100 bytes (0x64)
            revert(ptr, 0x64)
        }
    }
}
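To make the byte-for-byte comparison explicit, a test can capture the raw revert data with a low-level call and compare it with the compiler's encoding of Error("RevertRevert"). This is a sketch that assumes the same Foundry project layout as the test at the top of this page; the test contract name RevertWithErrorRawTest is hypothetical.

```solidity
// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

import {Test} from "forge-std/Test.sol";
import {RevertWithError} from "../src/RevertWithError.sol";

// Hypothetical extra test for illustration only.
contract RevertWithErrorRawTest is Test {
    function test_RawRevertData() public {
        RevertWithError c = new RevertWithError();

        // A low-level call returns the revert payload instead of bubbling it up.
        (bool ok, bytes memory data) =
            address(c).staticcall(abi.encodeWithSignature("main()"));

        assertFalse(ok);
        // The hand-built payload should match the compiler's Error(string) encoding.
        assertEq(data, abi.encodeWithSignature("Error(string)", "RevertRevert"));
    }
}
```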

Run test:

forge test --mp test/RevertWithError.t.sol -vvvv

Extra discussion: when does the “offset” part not exist in ABI encoding?

The offset pointer is part of the standard, tuple-based ABI encoding for dynamic types. It appears whenever a dynamic type is encoded as an element of a tuple (or a list of arguments). In more detail:

  1. ABI Encoding as a Tuple

    When you encode function arguments or error data, the encoding is done as if you were encoding a tuple. A tuple's encoding has two parts, and a dynamic element contributes to both, even when it is the only element:

    • A fixed "head" that is always 32 bytes per element.

      • For dynamic types, the head contains an offset (in bytes) pointing to where the actual data appears in the encoding.

    • A "tail" that contains the actual dynamic data (for a string, this “tail” starts with a 32‑byte word representing the length, followed by the actual data padded to a multiple of 32 bytes).

    For example, when encoding an error with Error(string), you have:

    • A 4‑byte function selector at the very start of the payload.

    • A 32‑byte “head” which is an offset pointer (usually 0x20) that tells you where the dynamic data begins.

    • The head is then followed by a 32‑byte word for the length (e.g., 12 for "RevertRevert") and the padded string data itself.

  2. When Offsets Are Not Used

    Offsets are introduced because the ABI encoding framework wraps parameters in a tuple. There are scenarios where you won’t have an offset pointer:

    • A string or bytes value held in memory (as opposed to ABI-encoded) is laid out as a 32‑byte length followed by the data, with no offset word. Note that abi.encode("hello") does not fall into this category: abi.encode always encodes its arguments as a tuple, so its output starts with the offset 0x20, followed by the length and the padded data (96 bytes in total).

    • For static types (like uint256, address, etc.), you do not have an offset pointer because their encoding is fixed-size and doesn’t require a “pointer” into another section.

    • Similarly, when you use non-standard (tightly packed) encodings such as with abi.encodePacked(), the data is concatenated directly with no offsets.
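Which encodings carry an offset word can be checked by measuring a few outputs. This is a minimal sketch, assuming a Solidity 0.8 compiler; the contract name OffsetPresenceSketch and its functions are hypothetical.

```solidity
// SPDX-License-Identifier: AGPL-3.0-or-later
pragma solidity ^0.8.13;

// Hypothetical contract for illustration only.
contract OffsetPresenceSketch {
    // abi.encode of a lone string: offset word, length word, one padded data word.
    function encodedString() external pure returns (uint256 total, uint256 firstWord) {
        bytes memory e = abi.encode("hello");
        total = e.length;
        assembly {
            firstWord := mload(add(e, 0x20))
        }
    }

    // A static type is encoded in place, with no offset word.
    function encodedUint() external pure returns (uint256 total) {
        return abi.encode(uint256(1)).length;
    }

    // Packed encoding concatenates the raw bytes: no offset, no length, no padding.
    function packedString() external pure returns (uint256 total) {
        return abi.encodePacked("hello").length;
    }

    // Head of (uint256, string): the value itself, then the offset to the string's tail.
    function headAndTail() external pure returns (uint256 value, uint256 offset) {
        bytes memory e = abi.encode(uint256(7), "hi");
        assembly {
            value := mload(add(e, 0x20))
            offset := mload(add(e, 0x40))
        }
    }
}
```

encodedString() returns (96, 0x20), encodedUint() returns 32, packedString() returns 5, and headAndTail() returns (7, 0x40): the string's tail starts after the two head words.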
