a tutorial by NeoResearch Community
(forked Nick Morgan's
"EasyForth!") and based on nvm-learn tutorial.
This tutorial intends to teach you the basics of Neo Blockchain Virtual Machine, usually called NeoVM (or NVM). NeoVM is a stack-based machine for processing Smart Contract applications (and also transaction verifications) on Neo Blockchain, inspired by many successful stack languages like Bitcoin Script, Microsoft CIL, Java Virtual Machine and, finally, the FORTH language.
FORTH is older of these languages, being proposed in 1970 by Chuck Moore, currently maintained by Forth Inc and many independent implementations, such as Gforth and also this nice website/javascript implementation by Nick Morgan called EasyForth. Stack languages have many practical applications, specially due to their simplicity, and easier verification for code correctness (specially on a deterministic environment such as a blockchain).
(*) If you are new on the Stack Programming world, we strongly recommend reading more on Stack Data Structure first. If you are curious on FORTH stack machine, we also recommend EasyForth tutorial.
(**) The NVM/FORTH implementation of this tutorial is taken from NeoResearch nvm-forth project.
NeoVM supports seven different types of stack items (UNDER UPDATE).
The (Big)Integers on NeoVM3 are represented as little-endian bytearrays with unlimited size (up to computer memory). The serialization is performed as follows:
Since MSB zeroes are always stripped, the representation size of value zero is in fact 0 (0x’’ empty bytearray).
A NeoVM program is called script (or NeoVM script), which is composed by several operations (called opcodes). Each opcode has a unique number and a unique name, in order to identify the operation.
For example: opcode named PUSH1
, number 81 (in hex, 81 = 0x51
), generates the number 1; opcode named ADD
, number 147 (in hex, 147 = 0x93
) adds two numbers.
All information needs to be on NeoVM stack in order to be processed.
For example, if you want to add two numbers, first you need to put them on the stack.
One difference from other languages is that you need to first put the operands, and then put the operation, e.g., operation 1 + 2
is done as 1 2 +
on a stack machine (put one, put two, then sum the top two elements).
Where is the result of the operation stored? Again, the result of the operation is also put back on the stack.
push1-push16
(opcodes 0x51-0x60
)Let’s try it on practice! Type (don’t copy-paste) the following into the
interpreter, typing Enter
after each line.
PUSH1
PUSH2
PUSH3
What happened? Every time you type a line followed by the Enter
key, the NVM opcode is executed (state HALT
means that no errors happened during NeoVM execution).
You should also notice that as you execute each line, the area at the
top fills up with numbers.
That area is our visualization of the stack. It
should look like this:
Now, into the same interpreter, try the opcode ADD
followed by the Enter
key. The top two
elements on the stack, 2
and 3
, have been replaced by 5
.
At this point, your editor window should look like this:
Type ADD
again and press Enter
, and the top two elements will be replaced by 6
.
If you try ADD
one more time NVM would abort execution, because it will try to pop the top two elements off the
stack, even though there’s only one element on the stack! This results in a
FAULT
state on NeoVM:
You can also write everything in a single line and press Enter
:
push10 push3 add
The main stack now looks like this:
push0
and pushm1
(opcodes 0x00
and 0x4f
)Other useful push opcodes are push0
, that pushes 0
to stack, and pushm1
, that pushes -1
to stack. Values over 16
(up to 32 bytes) can be pushed in other ways (see NVM pushdata
for more information).
This style, where the operator appears after the operands, is known as
Reverse-Polish
notation.
Let’s try to calculate 10 * (5 + 2)
.
You will need also need a multiplication opcode, which is called MUL
(opcode number 149 = 0x95
).
Try the following:
push5 push2 add push10 mul
The execution in NeoVM follows the order of opcodes in the given NVM script. This way, no parentheses is needed to perform any kind of arithmetic operation.
You can play with basic arithmetic operations:
0x93
): adds two numbers on stack0x94
): subtracts two numbers on stack0x95
): multiplies two numbers on stack0x96
): divides two numbers on stack (integer division)0x97
): divides two numbers on stack and puts the rest on stackExercise: How do you calculate (3 * (5 - 6) + 15 / 2) mod 3
on NeoVM? What is the expected result?
Solution: result should be 1
.
First step is: (5 - 6)
. It can be computed using opcodes: push5 push6 sub
.
Result -1
should be put on stack, after consuming operands 5
and 6
.
Second step is: 3 * (5 - 6)
. Since (5 - 6)
is already on stack, you just need to push number 3 and multiply: push3 mul
. Stack should contain -3
now.
Third step: let’s leave -3
there, because we need to compute 15 / 2
now.
How do we do that? Simple, just push15 push2 div
. Our stack must have two values now: -3
and 7
(integer division of 15 and 2).
Fourth step: expression (3 * (5 - 6) + 15 / 2)
can be computed now.
We already have the result of 3 * (5 - 6)
and 15 / 2
on stack, so we just add
.
Number 4
should be on top of the stack now.
Finally, we perform the division by 3 and take the rest: push3 mod
. Result should be 1
.
Let’s review the whole operations (step-by-step) and the result on stack:
push5 push6 sub push3 mul push15 push2 div add push3 mod
Exercise: How do you calculate 15 / (12 / 4)
using NeoVM opcodes?
Solution: result should be 5
.
Try to solve it and make sure this is well understood before proceeding to next section ;)
How to push a number bigger than 16? How to push a string to the stack?
On NeoVM, opcode PUSHBYTES1
(code 0x01
) pushes a single byte to the stack.
For example, PUSHBYTES1 21
(or 0x0121
) would push hex value 21
(decimal 33
) to the stack.
There are versions of this opcode until PUSHBYTES75
(code 0x4B
), that allows pushing 75 bytes to the stack. After that, one can use PUSHDATA1
(opcode 0x4C
), that pushes up to 255
bytes; PUSHDATA2
(opcode 0x4D
), that pushes up to 2^16
bytes; and PUSHDATA4
(opcode 0x4E
), up to 2^32
bytes.
In our web implementation of nvm-forth
, we can use FORTH string push (instead of pushbytes
or pushdata
), which works in the following way:
The text is pushed in ASCII, which is equivalent to a sequence of bytes on NeoVM.
(*) one difference not covered here is that on NeoVM all byte arrays can be converted to integers, but here they are only used as text.
Next section: this invites us to a next section, string operations.
Useful string operations on NeoVM are: cat
, left
, right
, substr
.
cat
(opcode 0x7e
)cat
concatenates top stack elements. Example:
s" hello" s" world" cat
Resulting stack:
(*) this opcode and other string operators are still not implemented in this web mode for testing. See neo-vm for more information.
NeoVM inherits several operations from Forth language, specially those connected to stack operations. Let’s take a look at some of them.
dup
(opcode 0x76
)dup
duplicates the top element of the stack. Example:
push1 push2 push3 dup
Resulting stack:
drop
(opcode 0x75
)drop
removes the top element of the stack. Example:
push1 push2 push3 drop
Resulting stack:
swap
(opcode 0x7c
)swap
simply swaps the top two elements of the stack. Example:
push1 push2 push3 push4 swap
Resulting stack:
over
(opcode 0x78
)over
takes the second element from the top of the
stack and duplicates it to the top of the stack. Example:
push1 push2 push3 over
Resulting stack:
rot
(opcode 0x7b
)rot
applied a “rotation” the top three elements of the stack. Example:
push1 push2 push3 push4 push5 push6 rot
Resulting stack:
Exercise: try to calculate 15 / (12 / 4)
, starting with push12
operation.
Exercise: practice stack opcodes a little bit.
In next sections, we will see stack opcodes that receive parameters from stack.
pick
(opcode 0x79
)pick
reads n
and copies n
-th element to top stack. Example:
push1 push2 push3 push4 push2 pick
Resulting stack:
roll
(opcode 0x7a
)roll
reads n
and moves n
-th element to top stack. Example:
push1 push2 push3 push4 push2 roll
Resulting stack:
Exercise: Consider stack 1 2 3 4 5 6 7 <- top
, how can you put 2
after 6
(and before 7
)? How many stack operations are needed?
Result should be 1 3 4 5 6 2 7 <- top
.
Solution: this can be done on two operations.
First, roll
the 5
-th element to the top stack.
Second, swap
top elements.
Exercise: Is there any other way to do it, even if it costs more operations? (challenge: try to use tuck
instead of swap
)
NeoVM includes a double-stack design, so we need operations also for both the Evaluation Stack (which we’ve seen already) and the Alternative Stack.
On FORTH language, this stack is called return stack, and it is only used for loops and other temporary storages, since it also stores the execution return pointer. This limits the return stack usage, since loops can interfere in stack operations, that’s why FORTH also includes a global storage map. Although, NeoVM doesn’t have a global storage map, it can successfully execute operations only using two stacks and by using arrays as temporary storage (will be explored in later sections).
Next sections will show important double-stack operations.
toaltstack
(opcode 0x6b
)toaltstack
moves top element (from Evaluation Stack) and moves it to the Alternative Stack. Example:
push1 push2 push3 toaltstack
Evaluation Stack:
Alternative Stack:
fromaltstack
(opcode 0x6c
)fromaltstack
moves top element from Alternative Stack and moves it to the Evaluation Stack (the Main Stack). Example:
push1 push2 toaltstack push3 fromaltstack
Evaluation Stack:
Alternative Stack:
dupfromaltstack
(opcode 0x6a
)dupfromaltstack
copies top element from Alternative Stack to the Evaluation Stack. Example:
push1 push2 toaltstack push3 dupfromaltstack
Evaluation Stack:
Alternative Stack:
Exercise: Consider an altstack with value 1
(just run push1 toaltstack
). What does the instructions fromaltstack dup toaltstack push5 add
do? Try to re-write them with a small number of opcodes.
Arrays are fundamental tools for complex NVM scripts. They allow you to store indexed access, which is very useful to implement the behavior of local and global variables.
How do you create an Array?
newarray
(opcode 0xc5
)newarray
gets top element n
(from Evaluation Stack) and creates a n
-sized array. Example:
push2 newarray
Wait, what is this number pushed on stack? Our example considers 1578
(but it may be different) as it represents a memory pointer where to find the array (in local memory).
arraysize
(opcode 0xc0
)What can you do with an Array/pointer? First, you can push its size to stack. Example:
push2 newarray arraysize
Note that the array is consumed during the process, so you may need to dup
it before using this.
Practice a little bit:
How do we store information on a array?
setitem
(opcode 0xc4
)Opcode setitem
consumes an Array pointer p
, an index i
and a value v
from stack, and performs the following attribution: (*p)[i] = v
.
So, after that, value v
is stored at index i
of p
, and nothing is put back on stack (note that array is consumed in this operation, remember to dup
it).
How do we get information from an Array position?
pickitem
(opcode 0xc3
)Opcode pickitem
consumes an Array pointer p
and an index i
stack, and puts back on stack the value stored at position i
of the Array.
Exercise: Try to create a 2-sized array, and store values 10
and 15
in its first positions.
Solution:
First step is push2 newarray
, so you get an array on stack.
Let’s verify its size, so perform dup arraysize
and you should get 2
on top stack.
You can drop
value 2
now.
Second step is to define the position and value to store on array for a setitem
.
Since array is consumed during setitem
, we need to dup it first.
So, to store 10
on position 0
(over existing array on stack) we do: dup push0 push10 setitem
.
Looks like nothing happened, but trust me, value 10
is somehow inside this array pointer ;)
Let’s repeat the process for value 15
on position 1
(try to do it by yourself before proceeding):
Ok, a summary of what we have done until now:
Now, let’s play a little bit, and extract values from the array.
For example, getting value at index 0
is easy using pickitem
: dup push0 pickitem
.
We should get a nice 10
on top of the stack:
Let’s drop the 10
and try to get the value at index 1
(try by yourself before proceeding):
Final solution (step by step):
Exercise: Try to put values 10
and 15
in the array (like previous example), and then swap both elements inside it (position 0
will have value 15
, and position 1
will have value 10
).
Next section: Let’s see how an Array can be useful for storing data on multiple stacks.
One interesting property of the Array stack item is its ability to allow side-effects, since it is a reference type (in constrast to the value types of Integer stack items).
When an Integer is copied on NeoVM, both become completely independent entities:
If you add one to the top integer, nothing will happen to the other one (in fact, a new integer is always created after any arithmetic operation):
However, when an array is copied, in fact, its pointer is copied:
So, this is called a reference type, because any change performed on top array will reflect automatically on the other array (remember last Array exercise on how pickitem
and setitem
work).
One pattern found in compilers is how they manage to pack and store parameters in local variables. Consider the following C# function:
void test(int x1, int x2);
The function has two parameters, and suppose it was invoked as test(10, 15)
, thus generating two local variables for x1
and x2
in a internal array.
One solution for this, is having the following opcodes: PUSH2 NEWARRAY TOALTSTACK DUPFROMALTSTACK PUSH0 PUSH2 ROLL SETITEM DUPFROMALTSTACK PUSH1 PUSH2 ROLL SETITEM
.
Feel free to try it (remeber to push15 push10
first):
Step by step, this is what happens:
evstack | pending commands | altstack
_______________|________________________________|____________
15 10 | push2 newarray ... |
15 10 [_,_] | toaltstack dupfromaltstack ... |
15 10 [_,_] | push0 push2 ... | [_,_]
15 10 [_,_] 0 2| roll ... | [_,_]
15 [_,_] 0 10 | setitem ... | [_,_]
15 | dupfromaltstack push1 push2 ...| [10,_]
15 [10,_] 1 2 | roll ... | [10,_]
[10,_] 1 15 | setitem ... | [10,_]
| ... | [10,15]
To inspect what happened, use the following commands to get array from altstack and pick inside it:
dupfromaltstack 0 pickitem
(gets first element), drop
, dupfromaltstack 1 pickitem
(gets second element).
Exercise: replace the code above (PUSH2 NEWARRAY TOALTSTACK DUPFROMALTSTACK PUSH0 PUSH2 ROLL SETITEM DUPFROMALTSTACK PUSH1 PUSH2 ROLL SETITEM
) by a much simpler operation.
Let’s see a practical application of having local parameters. Consider C# function:
void test(int x1, int x2)
{
x1 = 2*x1 + x2; // compute 2 times x1 plus x2 and store on x1
}
Suppose local variables (x1=10
, x2=15
) are already loaded on altstack via push15 push10 push2 pack toaltstack
.
The desired operation (x1=2*x1+x2
) could be performed using the following opcodes: PUSH2 DUPFROMALTSTACK PUSH0 PICKITEM MUL DUPFROMALTSTACK PUSH1 PICKITEM ADD DUPFROMALTSTACK PUSH0 PUSH2 ROLL SETITEM
.
Exercise: execute the following code, step by step (remember to load parameters on altstack first).
Solution:
First step, push2
pushed value 2
to evaluation stack (it will be used later on multiplication).
Then, DUPFROMALTSTACK PUSH0 PICKITEM
extract first parameter from the array at altstack, with value 10
.
Opcode mul
performs the multiplication:
Second step, after calculation of 2*x1
, we will calculate 2*x1 + x2
. We need to get x2
:
DUPFROMALTSTACK PUSH1 PICKITEM
(value of x2
is 15
).
Opcode add
performs the sum:
Finally, we need to store the result of 2*x1 + x2
back on x1
. Opcodes DUPFROMALTSTACK PUSH0 PUSH2 ROLL SETITEM
do the job.
Summary: to run the complete example, type this:
To check that result was actually stored on the array, just do:
Stack should contain 35
:
Challenge: This code can be actually compiled and tested on NeoCompiler Eco (neocompiler.io), generating the following opcodes (in hex): 52-c5-6b-6a-00-52-7a-c4-6a-51-52-7a-c4-52-6a-00-c3-95-6a-51-c3-93-6a-00-52-7a-c4-61-6c-75-66
.
Use the disassembly options from the website to inspect and understand how compilation process work for NeoVM.
NeoVM is a lightweight stack computing engine, used to interoperate with higher levels of Neo blockchain.
Interoperability can be achieved by means of syscall
opcode (code 0x68
), that receives a interop command as parameter. This is implemented here as a string literal syscall"
.
Neo blockchain defines several interop calls, which are not available here in this simplified tutorial (just the NeoVM part is covered here). However, we illustrate such capability with some interop commands (just available in NVM Learn platform):
syscall" Neo.Learn.Notify"
syscall" Neo.Learn.Log"
syscall" Neo.Learn.Sleep"
syscall" Neo.Learn.Random"
syscall" Neo.Learn.Call"
(see next section on Snake game)Neo.Learn.Notify"
takes one element from stack and prints it on the web browser via alert system.
A very interesting “Hello World” example can be done in the following way:
Note the space between s"
and Hello World!"
, this is necessary for string parsing on FORTH stack language (used in this tutorial).
If you have a desktop-based browser, you can also try Neo.Learn.Log
, which puts content on browser console log system.
Other interesting syscalls here are: Neo.Learn.Sleep
, that reads an input from stack and sleeps during this given time (in milliseconds); and Neo.Learn.Random
, that generates a random number between 0
and stack top value.
Example: try pushing 16
to the stack (via push16
) and counting how many times syscall" Neo.Learn.Random"
takes to make this value become zero ;)
A possible outcome is:
Time to have (even more) fun! Let’s use NeoVM to run a Snake game ;)
The code of the game is pre-loaded (which is somewhat large), and it contains instructions from both NeoVM and FORTH language (still an ongoing work to completely remove all FORTH opcodes). The game is originally made by Nick Morgan as part of the EasyForth tutorial.
The game works by binding some NVM local/global variables into graphics and key operations (in a similar way that Neo syscalls work).
To start the game you will need to call the start
function (a FORTH definition).
To achieve that, push string start
to stack using s" start"
and invoke it, via syscall" Neo.Learn.Call"
. You can use the arrow keys to move the snake ;)
Simply type below (don’t copy and paste):
This tutorial intends to teach only the basics of NeoVM, in a interactive way, but you can always dig more directly on NeoVM project. One important topic is not covered, but an explanation is given below.
Conditionals and loops are implemented on NeoVM via the use o jumps. Basically, a jump changes the current instruction counter and moves it to another position in the NVM script, affecting which operations will be read after that. Since this is not directly related to the stack execution model, it will only be covered in future tutorials.
This tutorial is a non-standard implementation of both NeoVM and also Forth language, which are both much more powerful than presented here. In special, the Snake game is a hybrid NVM/FORTH adaptation of the one proposed on EasyForth tutorial, and we are greatly grateful for this to be made available on GitHub. If you are a Forth or NeoVM expert, feel free to contribute!
Follow us on GitHub and Twitter ;)
NeoResearch Community