End of rainbow
You are on a small, rocky beach by the Frigid River, below the falls. A rainbow crosses over the falls to the east and a narrow path continues to the southwest.
> go east
Can you walk on water vapor?
> wave the sceptre
The rainbow solidifies and is now walkable (the stairs and bannister are the giveaway). A shimmering pot of gold appears at the end of the rainbow.
> take the pot of gold
Taken.
> go east
Code for this and the next several episodes is at https://github.com/ericlippert/flathead/tree/blog9
Last time we managed to get as far as extracting the form, operand count (0, 1, 2 or VAR) and opcode from the first byte of an instruction. (And second byte, in the case of the EXT opcodes.) Now we need to extract information about the type of each operand; there are between 0 and 8 operands.
I'll continue quoting from the specification and then showing the code; my goal is to make the code clearly implement the spec.
There are four 'types' of operand. These are often specified by a number stored in 2 binary digits:
- 00: Large constant, 2 bytes
- 01: Small constant, 1 byte
- 10: Variable, 1 byte
- 11: Omitted altogether, 0 bytes
To be clear here, what we're getting at is that an operand is a number, but that number can either be interpreted as a constant number, or it can be interpreted as a number which identifies a variable that contains a number. Which it is, we encode in the "type" of the operand. We'll see a little later how to identify which variable a given byte refers to.
Let's give each of those a type constructor:
type operand_type = | Large_operand | Small_operand | Variable_operand | Omitted
and now we can implement the logic above as:
let decode_types n = match n with | 0 -> Large_operand | 1 -> Small_operand | 2 -> Variable_operand | _ -> Omitted in
Back to the spec. How do we get the types of each of 0 to 8 operands?
Next, the types of the operands are specified.
- In short form, bits 4 and 5 of the opcode give the type.
- In long form, bit 6 of the opcode gives the type of the first operand,
bit 5 of the second. A value of 0 means a small constant and 1 means a variable.- In variable or extended forms, a byte of 4 operand types is given next. This contains 4 2-bit fields: bits 6 and 7 are the first field, bits 0 and 1 the fourth. The values are operand types as above. Once one type has been given as 'omitted', all subsequent ones must be.
- In the special case of the "double variable" VAR opcodes VAR:236 and VAR:250 a second byte of types is given, containing the types for the next four operands.
Again this is a tad confusing. The logic is actually more clearly expressed by first calling out the 0OP and 1OP cases:
- If the count is 0OP then there are no operand types.
- If the count is 1OP then bits 4 and 5 of the opcode give the type
- In long form the count is 2OP; bit 6 "
Anyway, let's first make a helper function to deal with the VAR case, as that's the tricky one.
let decode_variable_types type_byte = let rec aux i acc = if i > 3 then acc else let type_bits = fetch_bits (Bit_number (i * 2 + 1)) size2 type_byte in match decode_types type_bits with | Omitted -> aux (i + 1) acc | x -> aux (i + 1) (x :: acc) in aux 0 [] in
All right, we can take apart a byte into four bit pairs. We go from back to front, so when we hit "omitted" types we simply never push them onto the accumulator. We end up with a list of types corresponding to each operand.
That solves the problem for the one-byte VAR cases. The more general problem is solved as follows:
let decode_operand_types address form op_count opcode = match (form, op_count, opcode) with | (_, OP0, _) -> [] | (_, OP1, _) -> let b = read_byte address in [decode_types (fetch_bits bit5 size2 b)] | (Long_form, _, _) -> let b = read_byte address in (match fetch_bits bit6 size2 b with | 0 -> [ Small_operand; Small_operand ] | 1 -> [ Small_operand; Variable_operand ] | 2 -> [ Variable_operand; Small_operand ] | _ -> [ Variable_operand; Variable_operand ]) | (Variable_form, _, VAR_236) | (Variable_form, _, VAR_250) -> let opcode_length = get_opcode_length form in let type_byte_0 = read_byte (inc_byte_addr_by address opcode_length) in let type_byte_1 = read_byte (inc_byte_addr_by address (opcode_length + 1)) in (decode_variable_types type_byte_0) @ (decode_variable_types type_byte_1) | _ -> let opcode_length = get_opcode_length form in let type_byte = read_byte (inc_byte_addr_by address opcode_length) in decode_variable_types type_byte
We are really pulling out the stops here as far as OCaml features are concerned! Some interesting points:
The outer match uses a three-element tuple pattern, which we have not seen before. I note that in the patterns the parentheses around the tuple patterns are optional, but I think it reads more easily if they're left in.
We have a nested match; note that it is surrounded in parentheses, because otherwise OCaml would not know where the nested match ended. This might be a good candidate for a helper method, to make the code easier to read. I'm not super thrilled with this nesting.
We use the compact "bracket" syntax for creating fixed-size lists of various sizes.
And we have a new operator that we've not seen before in this series; @ is the list concatenation operator. List concatenation is not very efficient, but we know that neither list will have more than four items, so I am not concerned.
And one last thing: later on I'm going to need to know how many bytes past the opcode were consumed by type information:
let get_type_length form opcode = match (form, opcode) with | (Variable_form, VAR_236) | (Variable_form, VAR_250) -> 2 | (Variable_form, _) -> 1 | _ -> 0
Summing up: we now have a list of operand types, one for every operand. So we now know not only exactly how many operands there are in this instruction, we also know whether each one should be interpreter as a byte constant, word constant, or a reference to a variable. Next time on FAIC: we'll decode the operands.