Let’s Have Fun With Interpreters and Bytecode VMs - Chapter 1

Bytecode is the target representation for our emulator, but it is not the only representation.

Let’s write some code

Starting with the low-hanging fruit: some data structures to represent our opcodes.

Instead of converting opcodes to plain English (e.g. “draw sprite to…”), we’re going to use something called a mnemonic. To use the example from earlier,

    Set Register 1 to the value stored in Register 1 XORed with the value stored in Register 2

now becomes:

    XOR $1, $2

    0x8123 -> XOR $1, $2
    0x1234 -> JMP 234h
    // much easier to comprehend

Note: we’re going to use h suffixes to indicate that a number is in hexadecimal and $ prefixes for registers. This doesn’t matter now, but it will later.

We should also implement some structures to encapsulate this data as tokens. This will come in handy for reasons explained later, but in short: tokens are the smallest meaningful units of an input string. Given some raw input, we lex it into tokens and then parse those tokens into some other intermediate abstraction (eventually bytecode).

Remember: how we represent this data is flexible, as long as it represents the rules in a complete way. We could use any mnemonic in the world. For example, we could use SPRITE instead of DRAW (or even PINECONE for whatever reason).

Next, we’ll want to organize these structures together in a single class, which we can call Opcode.

Our Opcode class will take a number as its only constructor argument and parse the opcode into an OpcodeBytes object. It will look up and store the mnemonic associated with the opcode as an Instruction token, and then extract the encoded variables based on the instruction as Integer or Register tokens.

In order to elegantly extract these variables, we’ll have to look at the structure of the instruction set. We already know that there can be variables x, y, kk, nnn, but if you look carefully at the above instruction set, you’ll see that we can abstract all instructions into the following categories:

Args x, y, z: DRAW
Args x, kk: SKEQ, SKNE, MOV, RAND
Args x, y: SKEQ, SKNE, MOV, OR, AND, XOR, ADD, SUB, RSB
Args nnn: JMP, JSR, MVI, JMI

Finally, we’ll need to define a toString method that will output the opcode in its disassembled form.

With this knowledge, we’re two switch statements and some organization away from a complete implementation. The switch statements make the class a bit long and not especially pretty; you can see the full code here. But the most salient parts are:

We now have a nice (somewhat compact) class to disassemble any valid Chip-8 opcode. No more cross-referencing!

    const opcode = new Opcode(0xD123);
    console.log(opcode.toString()); // DRAW 1h, 2h, 3h
    const opcode2 = new Opcode(0x1ABC);
    console.log(opcode2.toString()); // JMP ABCh

That’s it for today.

Next time, we’ll look at what we need to disassemble an entire Chip-8 program. We’ll explore a naive solution first, look at why that solution doesn’t exactly fit the bill, and then move on to a more comprehensive solution. We’ll also go over the Chip-8 specification in greater detail and begin writing the emulator.
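As a rough companion to the "two switch statements" idea described above, here is a minimal sketch written as plain functions rather than the Opcode class itself. Everything in it is an assumption made for illustration: the names lookupMnemonic, extractArgs, disassemble, hex and reg are hypothetical, the opcode-to-mnemonic mapping follows the standard Chip-8 encoding, only the mnemonics listed in the four categories are covered, and the output uses the $ prefix for registers and the h suffix for numbers, which may not match what the full class prints.

```javascript
// Hypothetical sketch, not the article's Opcode class.
// Low-nibble lookup for the 8xy_ family (register-to-register operations).
const ALU = { 0x0: 'MOV', 0x1: 'OR', 0x2: 'AND', 0x3: 'XOR', 0x4: 'ADD', 0x5: 'SUB', 0x7: 'RSB' };

// Formatting helpers: h suffix for hex numbers, $ prefix for registers.
const hex = (n) => n.toString(16).toUpperCase() + 'h';
const reg = (n) => '$' + n.toString(16).toUpperCase();

// Switch 1: map the 16-bit opcode to a mnemonic (null if not covered here).
function lookupMnemonic(word) {
  const low = word & 0x000f;
  switch (word >> 12) {
    case 0x1: return 'JMP';
    case 0x2: return 'JSR';
    case 0x3: return 'SKEQ';
    case 0x4: return 'SKNE';
    case 0x5: return low === 0 ? 'SKEQ' : null;
    case 0x6: return 'MOV';
    case 0x8: return ALU[low] || null;
    case 0x9: return low === 0 ? 'SKNE' : null;
    case 0xa: return 'MVI';
    case 0xb: return 'JMI';
    case 0xc: return 'RAND';
    case 0xd: return 'DRAW';
    default: return null;
  }
}

// Switch 2: extract the encoded variables according to the argument category.
function extractArgs(word) {
  const x = (word >> 8) & 0xf;
  const y = (word >> 4) & 0xf;
  switch (word >> 12) {
    case 0x1: case 0x2: case 0xa: case 0xb: // Args nnn
      return [hex(word & 0x0fff)];
    case 0x3: case 0x4: case 0x6: case 0xc: // Args x, kk
      return [reg(x), hex(word & 0x00ff)];
    case 0x5: case 0x8: case 0x9:           // Args x, y
      return [reg(x), reg(y)];
    case 0xd:                               // Args x, y, z
      return [reg(x), reg(y), hex(word & 0x000f)];
    default:
      return [];
  }
}

// Combine both switches into the disassembled text form.
function disassemble(word) {
  const mnemonic = lookupMnemonic(word);
  if (mnemonic === null) return `??? ${hex(word)}`;
  return `${mnemonic} ${extractArgs(word).join(', ')}`;
}

console.log(disassemble(0x8123)); // XOR $1, $2
console.log(disassemble(0x1234)); // JMP 234h
```

Keeping the mnemonic lookup separate from the argument extraction means the second switch only has to know about the four argument categories, not about every individual instruction.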
