ABI

A call to a function with the signature f(uint256,uint32[],bytes10,bytes) with values (0x123, [0x456, 0x789], "1234567890", "Hello, world!") is encoded in the following way:

We take the first four bytes of keccak("f(uint256,uint32[],bytes10,bytes)"), i.e. 0x8be65246. Then we encode the head parts of all four arguments. For the static types uint256 and bytes10, these are directly the values we want to pass, whereas for the dynamic types uint32[] and bytes, we use the offset in bytes to the start of their data area, measured from the start of the value encoding (i.e. not counting the first four bytes containing the hash of the function signature). These are:

  • 0x0000000000000000000000000000000000000000000000000000000000000123 (0x123 padded to 32 bytes)

  • 0x0000000000000000000000000000000000000000000000000000000000000080 (offset to start of data part of second parameter, 4*32 bytes, exactly the size of the head part)

  • 0x3132333435363738393000000000000000000000000000000000000000000000 ("1234567890" padded to 32 bytes on the right)

  • 0x00000000000000000000000000000000000000000000000000000000000000e0 (offset to start of data part of fourth parameter = offset to start of data part of first dynamic parameter + size of data part of first dynamic parameter = 4*32 + 3*32 (see below))

After this, the data part of the first dynamic argument, [0x456, 0x789] follows:

  • 0x0000000000000000000000000000000000000000000000000000000000000002 (number of elements of the array, 2)

  • 0x0000000000000000000000000000000000000000000000000000000000000456 (first element)

  • 0x0000000000000000000000000000000000000000000000000000000000000789 (second element)

Finally, we encode the data part of the second dynamic argument, "Hello, world!":

  • 0x000000000000000000000000000000000000000000000000000000000000000d (number of elements (bytes in this case): 13)

  • 0x48656c6c6f2c20776f726c642100000000000000000000000000000000000000 ("Hello, world!" padded to 32 bytes on the right)

All together, the encoding is (newline after function selector and each 32-bytes for clarity):

0x8be65246
  0000000000000000000000000000000000000000000000000000000000000123
  0000000000000000000000000000000000000000000000000000000000000080
  3132333435363738393000000000000000000000000000000000000000000000
  00000000000000000000000000000000000000000000000000000000000000e0
  0000000000000000000000000000000000000000000000000000000000000002
  0000000000000000000000000000000000000000000000000000000000000456
  0000000000000000000000000000000000000000000000000000000000000789
  000000000000000000000000000000000000000000000000000000000000000d
  48656c6c6f2c20776f726c642100000000000000000000000000000000000000

Let us apply the same principle to encode the data for a function with a signature g(uint256[][],string[]) with values ([[1, 2], [3]], ["one", "two", "three"]) but start from the most atomic parts of the encoding:

First we encode the length and data of the first embedded dynamic array [1, 2] of the first root array [[1, 2], [3]]:

  • 0x0000000000000000000000000000000000000000000000000000000000000002 (number of elements in the first array, 2; the elements themselves are 1 and 2)

  • 0x0000000000000000000000000000000000000000000000000000000000000001 (first element)

  • 0x0000000000000000000000000000000000000000000000000000000000000002 (second element)

Then we encode the length and data of the second embedded dynamic array [3] of the first root array [[1, 2], [3]]:

  • 0x0000000000000000000000000000000000000000000000000000000000000001 (number of elements in the second array, 1; the element is 3)

  • 0x0000000000000000000000000000000000000000000000000000000000000003 (first element)

Then we need to find the offsets a and b for their respective dynamic arrays [1, 2] and [3]. To calculate the offsets we can take a look at the encoded data of the first root array [[1, 2], [3]] enumerating each line in the encoding:

0 - a                                                                - offset of [1, 2]
1 - b                                                                - offset of [3]
2 - 0000000000000000000000000000000000000000000000000000000000000002 - count for [1, 2]
3 - 0000000000000000000000000000000000000000000000000000000000000001 - encoding of 1
4 - 0000000000000000000000000000000000000000000000000000000000000002 - encoding of 2
5 - 0000000000000000000000000000000000000000000000000000000000000001 - count for [3]
6 - 0000000000000000000000000000000000000000000000000000000000000003 - encoding of 3

Offset a points to the start of the content of the array [1, 2] which is line 2 (64 bytes); thus a = 0x0000000000000000000000000000000000000000000000000000000000000040.

Offset b points to the start of the content of the array [3] which is line 5 (160 bytes); thus b = 0x00000000000000000000000000000000000000000000000000000000000000a0.

Then we encode the embedded strings of the second root array:

  • 0x0000000000000000000000000000000000000000000000000000000000000003 (number of characters in word "one")

  • 0x6f6e650000000000000000000000000000000000000000000000000000000000 (utf8 representation of word "one")

  • 0x0000000000000000000000000000000000000000000000000000000000000003 (number of characters in word "two")

  • 0x74776f0000000000000000000000000000000000000000000000000000000000 (utf8 representation of word "two")

  • 0x0000000000000000000000000000000000000000000000000000000000000005 (number of characters in word "three")

  • 0x7468726565000000000000000000000000000000000000000000000000000000 (utf8 representation of word "three")

In parallel to the first root array, since strings are dynamic elements we need to find their offsets c, d and e:

0 - c                                                                - offset for "one"
1 - d                                                                - offset for "two"
2 - e                                                                - offset for "three"
3 - 0000000000000000000000000000000000000000000000000000000000000003 - count for "one"
4 - 6f6e650000000000000000000000000000000000000000000000000000000000 - encoding of "one"
5 - 0000000000000000000000000000000000000000000000000000000000000003 - count for "two"
6 - 74776f0000000000000000000000000000000000000000000000000000000000 - encoding of "two"
7 - 0000000000000000000000000000000000000000000000000000000000000005 - count for "three"
8 - 7468726565000000000000000000000000000000000000000000000000000000 - encoding of "three"

Offset c points to the start of the content of the string "one" which is line 3 (96 bytes); thus c = 0x0000000000000000000000000000000000000000000000000000000000000060.

Offset d points to the start of the content of the string "two" which is line 5 (160 bytes); thus d = 0x00000000000000000000000000000000000000000000000000000000000000a0.

Offset e points to the start of the content of the string "three" which is line 7 (224 bytes); thus e = 0x00000000000000000000000000000000000000000000000000000000000000e0.

Note that the encodings of the embedded elements of the root arrays are not dependent on each other and have the same encodings for a function with a signature g(string[],uint256[][]).

Then we encode the length of the first root array:

  • 0x0000000000000000000000000000000000000000000000000000000000000002 (number of elements in the first root array, 2; the elements themselves are [1, 2] and [3])

Then we encode the length of the second root array:

  • 0x0000000000000000000000000000000000000000000000000000000000000003 (number of strings in the second root array, 3; the strings themselves are "one", "two" and "three")

Finally we find the offsets f and g for their respective root dynamic arrays [[1, 2], [3]] and ["one", "two", "three"], and assemble parts in the correct order:

0x2289b18c                                                            - function signature
 0 - f                                                                - offset of [[1, 2], [3]]
 1 - g                                                                - offset of ["one", "two", "three"]
 2 - 0000000000000000000000000000000000000000000000000000000000000002 - count for [[1, 2], [3]]
 3 - 0000000000000000000000000000000000000000000000000000000000000040 - offset of [1, 2]
 4 - 00000000000000000000000000000000000000000000000000000000000000a0 - offset of [3]
 5 - 0000000000000000000000000000000000000000000000000000000000000002 - count for [1, 2]
 6 - 0000000000000000000000000000000000000000000000000000000000000001 - encoding of 1
 7 - 0000000000000000000000000000000000000000000000000000000000000002 - encoding of 2
 8 - 0000000000000000000000000000000000000000000000000000000000000001 - count for [3]
 9 - 0000000000000000000000000000000000000000000000000000000000000003 - encoding of 3
10 - 0000000000000000000000000000000000000000000000000000000000000003 - count for ["one", "two", "three"]
11 - 0000000000000000000000000000000000000000000000000000000000000060 - offset for "one"
12 - 00000000000000000000000000000000000000000000000000000000000000a0 - offset for "two"
13 - 00000000000000000000000000000000000000000000000000000000000000e0 - offset for "three"
14 - 0000000000000000000000000000000000000000000000000000000000000003 - count for "one"
15 - 6f6e650000000000000000000000000000000000000000000000000000000000 - encoding of "one"
16 - 0000000000000000000000000000000000000000000000000000000000000003 - count for "two"
17 - 74776f0000000000000000000000000000000000000000000000000000000000 - encoding of "two"
18 - 0000000000000000000000000000000000000000000000000000000000000005 - count for "three"
19 - 7468726565000000000000000000000000000000000000000000000000000000 - encoding of "three"

Offset f points to the start of the content of the array [[1, 2], [3]] which is line 2 (64 bytes); thus f = 0x0000000000000000000000000000000000000000000000000000000000000040.

Offset g points to the start of the content of the array ["one", "two", "three"] which is line 10 (320 bytes); thus g = 0x0000000000000000000000000000000000000000000000000000000000000140.

Last updated