Aragain falls
You are near the top of Aragain Falls. The only path leads north. A solid rainbow spans the falls to the west.
> go north
Sandy beach
You are on a large beach on the east shore of the river, which flows by quickly. A path runs south along the river, and a sand-filled passage leads northeast.
There is a shovel here.
> take the shovel
Taken.
> go northeast
Code for this and the next several episodes is at https://github.com/ericlippert/flathead/tree/blog9
All right, we have the opcode and we have a list containing the types of the operands - that is, whether each operand is a one byte integer, a two byte integer, or a one-byte code that refers to a variable. I want to make a list of the actual operands.
The time has come to say how one byte describes a variable. The convention is as follows:
- 0 means to pop the evaluation stack and use the popped value. (We'll see later that the "store" portion of the instruction uses the same convention to mean "push the created value onto the stack", so this is nicely symmetrical.)
- 1 through 15 identifies a local variable in the current routine activation. There can be no more than 15 local variables in any given routine.
- 16 through 255 identifies a global variable; the story maintains a block with 240 words that are reserved for use as global variables.
In all cases a variable stores a word.
Of course I want to capture this in the type system, rather than representing these things as naked integers. Each of these concepts gets its own type:
type local_variable = Local of inttype global_variable = Global of inttype variable_location = | Stack | Local_variable of local_variable | Global_variable of global_variable
And now we have enough gear to describe an operand:
type operand = | Large of int | Small of int | Variable of variable_location
Notice that I'm getting pretty deep in my wrapper types here. An operand could be a Variable Local_variable Local 10 for example. But I live in fear of accidentally mixing that up with the small number 10, the address 10, and so on. These wrapper types are cheap.
Anyways, we have a list of operand types, and we have the operands encoded in the instruction bytes. I want a list of operands.
A couple helper functions deal with turning integers into variables; later on I will need to go the other way, so I'll just implement them both now:
let decode_variable n = let maximum_local = 15 in if n = 0 then Stack else if n <= maximum_local then Local_variable (Local n) else Global_variable (Global n)let encode_variable variable = match variable with | Stack -> 0 | Local_variable Local n -> n | Global_variable Global n -> n
All right, now we can take that list of operand types and the address of the start of the operands in the instruction, and go to town.
This method is not tail recursive but the maximum number of operands is eight, so I really don't care.
let rec decode_operands operand_address operand_types =
Here for the first time we are doing pattern matching on lists. It's a bit surprising that it took this long to get here because this is a very normal thing to do in OCaml. We handle the list of types recursively.
match operand_types with
First the base case. If the type list is empty then the operand list is empty.
| [] -> []
Otherwise, there must be a first operand type, followed by a tail list with the remaining types. There are three possible cases for the head, and we cover each of them with a pattern. Either there is a large operand followed by a tail, or a small operand followed by a tail, or a variable operand followed by a tail.
| Large_operand :: remaining_types -> let w = read_word (byte_addr_to_word_addr operand_address) in let tail = decode_operands (inc_byte_addr_by operand_address word_size) remaining_types in (Large w) :: tail | Small_operand :: remaining_types -> let b = read_byte operand_address in let tail = decode_operands (inc_byte_addr operand_address) remaining_types in (Small b) :: tail | Variable_operand :: remaining_types -> let b = read_byte operand_address in let v = decode_variable b in let tail = decode_operands (inc_byte_addr operand_address) remaining_types in (Variable v) :: tail
If we omitted Omitted, ha ha, then OCaml would give a warning that the set of patterns do not 100% cover all the possibilities. So to keep it quiet, even though you and I know that we never pushed an omitted operand onto the operand type list, we handle the case with a failure.
An alternative approach would have been to push omitteds onto the list of types, and then deal with them here, by, say, bailing out when we got there. I can see the elegance of such an approach but I don't have a strong opinion one way or the other. I think this is fine.
| Omitted :: _ -> failwith "omitted operand type passed to decode operands"
And we're going to need to know how many bytes we consumed in operands. Two for every large operand, and one for every small or variable operand. (Again, this is not tail recursive, and again, I don't care. It's a list of 8 or fewer items.)
let rec get_operand_length operand_types = match operand_types with | [] -> 0 | Large_operand :: remaining_types -> word_size + (get_operand_length remaining_types) | _ :: remaining_types -> 1 + (get_operand_length remaining_types) in
All right, summing up: we now have the opcode, and we have all the operands tagged with whether they are large constants, small constants, local variables, global variables or stack pops. We know how many bytes the opcode, the operand types and the operands took up. Next time on FAIC: we'll decode the store, if there is one.