✅ EVM Through Huff

Intro

Huff

Huff is a low-level programming language designed for developing highly optimized smart contracts that run on the Ethereum Virtual Machine (EVM). Huff does not hide the inner workings of the EVM and instead exposes its programming stack to the developer for manual manipulation.

The Aztec Protocol team originally created Huff to write Weierstrudel, an on-chain elliptic curve arithmetic library that requires incredibly optimized code that neither Solidity nor Yul could provide.

While EVM experts can use Huff to write highly efficient smart contracts for production use, it can also serve as a way for beginners to learn more about the EVM.

If you're looking for an in-depth guide on how to write and understand Huff, check out the tutorials.

Add Two

Sample code:

Define function ABI:

Define the "main function" (macro), specifying that it takes 0 items from the stack and returns 0 items to the stack:

In other words, when entering the contract the stack will be empty. Upon completion we will not be leaving anything on the stack; therefore, takes() and returns() will both be 0.

Get function selector:

Here 0xE0 = 224 = 256 - 32 bits, which is 32 bytes minus 4 bytes expressed in bits. calldataload reads a full 32-byte word, so shifting it right by 224 bits discards everything except the first 4 bytes of the calldata, which are the function selector.

The expression 0x00 calldataload 0xE0 shr is Huff's standard way of extracting the function selector from calldata. You can just memorize it and use it in your projects as a convention.
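To see the arithmetic outside the EVM, here is a minimal Python sketch of the same two steps. It is only a model: the helper names `calldataload` and `selector` and the selector value `0xaabbccdd` are made up for illustration and are not part of Huff.

```python
def calldataload(calldata: bytes, offset: int) -> int:
    """Model of CALLDATALOAD: read 32 bytes at `offset`, zero-padded past the end."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")


def selector(calldata: bytes) -> int:
    """Model of `0x00 calldataload 0xE0 shr`."""
    return calldataload(calldata, 0x00) >> 0xE0  # drop the low 224 bits


# Hypothetical calldata: a made-up 4-byte selector followed by two 32-byte arguments.
example = bytes.fromhex("aabbccdd") + (1).to_bytes(32, "big") + (2).to_bytes(32, "big")
print(hex(selector(example)))  # 0xaabbccdd
```

The shift works because the loaded word is 256 bits wide and the selector sits in its top 32 bits.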

Jump to addTwo if the function selector matches:

__FUNC_SIG() is a Huff built-in function that computes the function selector (the first 4 bytes of the hash of the function signature) at compile time.

The actual implementation of function ADD_TWO():

calldataload starts from 0x04 because the first 4 bytes are the function selector, which is skipped. The actual parameters of ADD_TWO() start from the 5th byte.
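The offsets can be checked with a small Python model. This is a sketch under assumptions, not the Huff macro itself: it assumes the second argument is read at 0x24 (0x04 plus one 32-byte word) and that the sum is returned as a single 32-byte word. The selector bytes are made up.

```python
WORD_MASK = (1 << 256) - 1  # the EVM works on 256-bit words


def calldataload(calldata: bytes, offset: int) -> int:
    """Model of CALLDATALOAD: read 32 bytes at `offset`, zero-padded past the end."""
    return int.from_bytes(calldata[offset:offset + 32].ljust(32, b"\x00"), "big")


def add_two(calldata: bytes) -> bytes:
    """Model of an ADD_TWO()-style macro: load both arguments, add, return one word."""
    a = calldataload(calldata, 0x04)  # first argument, right after the selector
    b = calldataload(calldata, 0x24)  # second argument, one word later (assumed)
    total = (a + b) & WORD_MASK       # ADD wraps around modulo 2**256
    return total.to_bytes(32, "big")  # what `mstore` + `return` would hand back


# Hypothetical selector followed by the arguments 1 and 2.
calldata = bytes.fromhex("aabbccdd") + (1).to_bytes(32, "big") + (2).to_bytes(32, "big")
print(int.from_bytes(add_two(calldata), "big"))  # 3
```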

Hello World

Sample code:

Because strings are dynamic types, it is not as simple as returning the UTF-8 values for "Hello, world!" (0x48656c6c6f2c20776f726c6421). In the ABI standard, dynamic types are encoded in 3 parts, each of which takes a full word (32 bytes) of memory:

  1. Offset in memory (a pointer) -> left padded

  2. Length of the string -> left padded

  3. The actual content of the string -> right padded

Suppose we are working with the string "Hello, world!"; then the memory will look like this:
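The three words can be reproduced with a short Python sketch. It assumes a single string return value, so the offset word is 0x20, and it only handles strings that fit in one word. The function name `encode_string` is made up for illustration.

```python
def encode_string(text: str) -> bytes:
    """Build the 3-word ABI encoding of a short string (sketch, one-word strings only)."""
    data = text.encode("utf-8")
    assert len(data) <= 32, "this sketch only handles strings that fit in one word"
    offset = (0x20).to_bytes(32, "big")      # word 1: offset to the length word, left padded
    length = len(data).to_bytes(32, "big")   # word 2: length in bytes, left padded
    content = data.ljust(32, b"\x00")        # word 3: the string itself, right padded
    return offset + length + content


encoded = encode_string("Hello, world!")
for position in range(0, len(encoded), 32):
    # Print one 32-byte word per line, prefixed with its memory offset.
    print(f"0x{position:02x}: {encoded[position:position + 32].hex()}")
```

The first line ends in 20, the second in 0d (13, the length of the string), and the third starts with 48656c6c6f2c20776f726c6421 followed by zero padding.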

Once you understand this construction, the main macro code is self-explanatory.

Moving one step further, there is a way to merge this 3-part construction into 2. This is called the "Seaport method". Recall that in the "normal method" the three entries are left padded, left padded, and right padded. This means the non-zero data of the 2nd entry (the length) sits directly next to the non-zero data of the 3rd entry (the content). The Seaport method combines the 2nd and the 3rd entries into a single left-padded entry.

For example, suppose we are working with the string "TKN". Pictorially:

Seaport method
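A Python model of memory makes the trick easier to follow. This is a sketch under assumptions: the packed word (length byte followed by the string bytes) is written at offset 0x20 plus the string length, so that its last bytes land exactly where the normal method would put the length and the content. The function names are made up, and the sketch starts from zeroed memory.

```python
def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Model of MSTORE: write a 32-byte big-endian word at `offset`."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def normal_method(data: bytes) -> bytes:
    """Three stores: offset word, length word, right-padded content word."""
    memory = bytearray(0x60)
    mstore(memory, 0x00, 0x20)
    mstore(memory, 0x20, len(data))
    mstore(memory, 0x40, int.from_bytes(data.ljust(32, b"\x00"), "big"))
    return bytes(memory)


def seaport_method(data: bytes) -> bytes:
    """Two stores: one packed length+content word, then the offset word."""
    assert len(data) <= 31, "length byte plus content must fit in one word"
    memory = bytearray(0x60)
    packed = (len(data) << (8 * len(data))) | int.from_bytes(data, "big")
    mstore(memory, 0x20 + len(data), packed)  # shifted so the length ends at 0x3f
    mstore(memory, 0x00, 0x20)
    return bytes(memory)


print(seaport_method(b"TKN") == normal_method(b"TKN"))  # True
```

For "TKN" the packed word is 0x03544b4e and it is stored at 0x23, which saves one store compared with the normal method.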

Simple Storage

Sample code:

Huff provides the FREE_STORAGE_POINTER() keyword to keep track of storage slots for us: each use is assigned the next unused slot at compile time. For example:

Later on we can use STORAGE_SLOT0, STORAGE_SLOT1, and STORAGE_SLOT2 to refer to different storage slots.
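Conceptually this behaves like a counter that hands out slot numbers in the order the constants are defined. The Python sketch below only models that idea; `free_storage_pointer` is a hypothetical stand-in, and it assumes numbering starts at slot 0.

```python
from itertools import count

_next_slot = count(0)  # assumed: slots are numbered from 0 in definition order


def free_storage_pointer() -> int:
    """Hypothetical stand-in for FREE_STORAGE_POINTER(): return the next unused slot."""
    return next(_next_slot)


STORAGE_SLOT0 = free_storage_pointer()
STORAGE_SLOT1 = free_storage_pointer()
STORAGE_SLOT2 = free_storage_pointer()
print(STORAGE_SLOT0, STORAGE_SLOT1, STORAGE_SLOT2)  # 0 1 2
```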

SET_VALUE() macro:

Square brackets are the "reference" operator for constants. Here [VALUE] pushes the storage slot number assigned to VALUE onto the stack.

GET_VALUE() macro:

Here we load the value from storage, put it into memory, and return it.
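The two macros can be modelled in Python with a dictionary standing in for contract storage. This is a sketch under assumptions: VALUE is taken to be slot 0, the argument of the setter is read at calldata offset 0x04, and the selector bytes are made up.

```python
VALUE = 0  # assumed slot number for the VALUE constant


def set_value(storage: dict, calldata: bytes) -> None:
    """Model of a SET_VALUE()-style macro: read one word after the selector, then SSTORE."""
    word = calldata[0x04:0x24].ljust(32, b"\x00")
    storage[VALUE] = int.from_bytes(word, "big")


def get_value(storage: dict) -> bytes:
    """Model of a GET_VALUE()-style macro: SLOAD, MSTORE at 0x00, return 32 bytes."""
    loaded = storage.get(VALUE, 0)     # unset slots read as zero
    return loaded.to_bytes(32, "big")  # the word that would be returned from memory


storage = {}
set_value(storage, bytes.fromhex("aabbccdd") + (42).to_bytes(32, "big"))
print(int.from_bytes(get_value(storage), "big"))  # 42
```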

MAIN macro:

This is just a function dispatcher that tries to match SET_VALUE() or GET_VALUE() based on the calldata. If nothing matches it is going to revert.

Function Dispatching

Linear Dispatching

Sample code:

This is basically a large jump table that redirects the control flow to each function if the function selector in calldata matches one of the functions.

One important thing to note is the following line of code:

The idea is similar to switch statements in C: if you don't add break; after each case, execution falls through into the code that follows. Without 0x00 dup1 revert, execution would continue past the jump table and run the macros below it one after another until a return condition is found.
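The control flow can be sketched in Python. This is a simplified model, not Huff: the selectors are made up, `Revert` stands in for the revert opcode, and the `guard` flag switches the final revert on and off so the fall-through behaviour becomes visible.

```python
class Revert(Exception):
    """Stands in for the EVM REVERT opcode."""


def add_two():
    return "ADD_TWO ran"


def hello_world():
    return "HELLO_WORLD ran"


# Hypothetical selectors; in Huff the real ones come from __FUNC_SIG().
JUMP_TABLE = [(0x11111111, add_two), (0x22222222, hello_world)]


def dispatch(selector: int, guard: bool = True) -> str:
    """Compare the selector against each entry in order, like a chain of `eq ... jumpi`."""
    for known, handler in JUMP_TABLE:
        if selector == known:
            return handler()
    if guard:
        raise Revert("no matching selector")
    # Without the guard, execution falls into whatever code is laid out next,
    # modelled here as the first handler after the table.
    return JUMP_TABLE[0][1]()


print(dispatch(0x22222222))               # HELLO_WORLD ran
print(dispatch(0xdeadbeef, guard=False))  # ADD_TWO ran
```

An unknown selector with the guard in place raises `Revert` instead of silently running the first function.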

Linear dispatching seems naive, but this is exactly how Vyper and Solidity* dispatch functions. If you want a function to be cheaper to call, just move it higher up in the contract!

* Solidity only uses this method if there are fewer than 4 functions in a contract.

Binary Search Dispatching

Sample code:

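The idea is to divide the jump table into several parts using "pivots": the selector is first compared against a pivot to decide which part of the table to search, and only that part is then checked linearly. Binary search dispatching is great when you have a large number of functions in the contract. Otherwise, stick to linear dispatching since it is a lot easier to implement.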
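To get a feel for the saving, the Python sketch below counts comparisons for both strategies. It is a model under assumptions: the eight selectors are made up, the table is sorted by selector, and each pivot test or equality test counts as one comparison.

```python
def linear_lookup(selector: int, table: list) -> tuple:
    """Check every entry in order; return (name, number of comparisons)."""
    comparisons = 0
    for known, name in table:
        comparisons += 1
        if selector == known:
            return name, comparisons
    return None, comparisons


def pivot_lookup(selector: int, table: list, leaf_size: int = 2) -> tuple:
    """Halve the sorted table with pivots, then scan the small remaining part."""
    low, high = 0, len(table)
    comparisons = 0
    while high - low > leaf_size:
        middle = (low + high) // 2
        comparisons += 1                 # one less-than test against the pivot
        if selector < table[middle][0]:
            high = middle
        else:
            low = middle
    for known, name in table[low:high]:  # linear scan inside the leaf
        comparisons += 1
        if selector == known:
            return name, comparisons
    return None, comparisons


# Eight hypothetical selectors, sorted in ascending order.
TABLE = [(0x10000000 * index, f"fn{index}") for index in range(1, 9)]

print(linear_lookup(0x80000000, TABLE))  # ('fn8', 8)
print(pivot_lookup(0x80000000, TABLE))   # ('fn8', 4)
```

With only a handful of functions the difference is small, which is why linear dispatching is usually good enough.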

Fallback and Receive Functions

Fallback function

Suppose we are implementing a fallback function that returns 1:

In the MAIN macro, this fallback function should be inserted at the end of all lookups. For example:

Receive function

If you want to implement a receive function on top of the fallback function, do a callvalue check before FALLBACK():
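The order of the checks can be sketched in Python. This is a model under assumptions: the selector lookups come first, then the callvalue check, then the fallback. The fallback returns 1 as a 32-byte word as described above, while the empty return value of `receive` is an assumption made for illustration.

```python
def fallback() -> bytes:
    """Fallback that returns 1, encoded as one 32-byte word."""
    return (1).to_bytes(32, "big")


def receive() -> bytes:
    """Receive path: accept the ether and return nothing (assumed behaviour)."""
    return b""


def main(calldata: bytes, callvalue: int, table: list) -> bytes:
    """Model of MAIN: selector lookups, then the callvalue check, then the fallback."""
    selector = int.from_bytes(calldata[:32].ljust(32, b"\x00"), "big") >> 0xE0
    for known, handler in table:
        if selector == known:
            return handler()
    if callvalue > 0:  # the callvalue check placed before FALLBACK()
        return receive()
    return fallback()


print(main(b"", 0, []).hex())  # the 32-byte word for 1
print(main(b"", 5, []))        # b''
```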

foundry-huff

If you have an existing Foundry project, you can simply install the necessary dependencies by running:

You also must add the following line to your foundry.toml file to ensure that the foundry-huff library has access to your environment in order to compile the contract:

You can then use the HuffDeployer contract to compile and deploy your Huff contracts using its deploy function. Here's a quick example:

Extra Mile Challenge

In Devtooligan's presentation at Spearbit, he covered the "Collatz Puzzle" challenge in QuillCTF:

EVM Through HUFF: Devtooligan

Here is my writeup for this challenge:

Collatz Puzzle
