Stack Machines: Calls
fundamentals « rpn-calculator « shunting-yard « io « jumps « conditionals « comments « calls « variables « stack-frames « heap « compilers
Hey I just met you
And this is crazy
But here’s my number
So call me maybe— Carly Rae Jepsen
We have been talking about stack machines (this is a series of posts, go back and read the other posts if you feel lost right now) without ever talking about calls. This is rather odd, as most of the time when discussing stacks in programming, we only talk about the call stack.
Modularity
A call, sometimes refered to as procedure call, subroutine call or function application, is a means of transfering control to an other sub-program. This implies that we have sub-programs to begin with.
By breaking programs into procedures the execution context or scope can be reduced, which makes reasoning about a particular piece of code easier. In other words, you get modularity.
Once a procedure has completed its task, it is able to return control back to the caller.
Procedures
What is a procedure, really? A procedure is just a piece of code that follows a certain “interface” or pattern, allowing it to be called by other code.
Calling conventions
The anatomy of a procedure call will vary by language, operating system and CPU architecture. There will usually be standard conventions for the instructions involved in performing them, known as calling conventions.
This generally means having some sort of “call stack” that stores the execution context of the caller, so that control can be returned back later on. The most important part of program state representing the execution context is the instruction pointer.
In most actual architectures, the data stack and the call stack are combined. In this machine, we will afford ourselves the luxury of separating them. The calling convention will have two parts: argument passing and state storage.
Argument passing
Arguments are passed via the stack. The caller pushes the arguments for the procedure onto the stack, the procedure pops them off.
In this forth-like stack machine, the data stack holds arguments implicitly. In other words, procedures can consume as many values as they need from the stack, and they can also produce as many “output” values as they want. This means that unlike traditional calling conventions, a procedure can return many values.
State storage
The only state we will store for now will be the instruction pointer $ip
. It will be stored in the call stack which will be a separate stack next to the data stack. There are two instructions involved with procedure calls.
The first is call(procedure)
, which takes a procedure name to call. This call will simply push the instruction pointer onto the call stack, then jump to the memory address referenced by the label that is the procedure name. That’s right, a procedure is just a piece of code identified by a label.
The second instruction is ret
, which will allow returning from a procedure call. Ideally every procedure will have a ret
at the end. What this instruction does is just pop the previous instruction pointer from the call stack and jump back to it.
That's it!
Example
What this means in practice is that you can take some code containing jumps, replace the jmp
with a call
and put a ret
at the end of the labeled code, and get modular code that can be separated and shipped as a library!
Let’s take the output loop example from the conditionals post:
0 10 100 108 114 111 119 32 44 111 108 108 101 104
label(loop)
dup jz(end)
.
jmp(loop)
label(end)
The loop can be extracted out into a printstr
procedure, but we must also jump over the library code to the actual application code, which will then set up the data and call printstr
:
jmp(start)
label(printstr)
label(loop)
dup jz(end)
.
jmp(loop)
label(end)
ret
label(start)
0 10 100 108 114 111 119 32 44 111 108 108 101 104
call(printstr)
Of course, procedures can call other procedures. Or they can call themselves recursively.
Implementation
First we need a call stack to store the return addresses.
$calls = new SplStack();
Next, we need a call
instruction that stores $ip
on the call stack and jumps to the address of the procedure.
if (preg_match('/^call\((.+)\)$/', $op, $match)) {
$label = $match[1];
$calls->push($ip);
$ip = $labels[$label];
continue;
}
And finally, we need a ret
instruction that pops the return address from the call stack and jumps back to it.
switch ($op) {
// ...
case 'ret':
$ip = $calls->pop();
break;
}
Summary
Adding calls to an RPN calculator allows code to be made modular, reusable and independent from the caller.
fundamentals « rpn-calculator « shunting-yard « io « jumps « conditionals « comments « calls « variables « stack-frames « heap « compilers