Archive for May, 2016

In the previous installment, I introduced a mystery class `NeuralNetwork` which is capable of calculating different results depending on the data that you train it with. In this article, we’ll take a closer look at this neural network and crunch a few numbers.

Neural networks overview

A neural network is a graph of neurons made of successive layers. The graph is typically split in three parts: the leftmost column of neurons is called the “input layer”, the rightmost columns of neurons is the “output layer” and all the neurons in-between are the “hidden” layer. This hidden layer is the most important part of your graph since it’s responsible for making the calculations. There can be any numbers of hidden layers and any number of neurons in each of them (note that the Kotlin class I wrote for this series of articles only uses one hidden layer).

Each edge that connects two neurons has a weight which is used to calculate the output of this neuron. The calculation is a simple addition of each input value multiplied by its weight. Let’s take a look at a quick example:

This network has two input values, one hidden layer of size two and one output. Our first calculation is therefore:

``````w11-output = 2 * 0.1 + (-3) * (-0.2) = 0.8
w12-output = 2 * (-0.4) + (-3) * 0.3 = -1.7``````

We’re not quite done: the actual outputs of neurons (also called “activations”) are typically passed to a normalization function first. To get a general intuition for this function, you can think of it as a way to constrain the outputs within the [-1, 1] range, which prevents the values flowing through the network from overflowing or underflowing. Also, it’s useful in practice for this function to have additional properties connected to its derivative but I’ll skip over this part for now. This function is called the “activation function” and the implementation I used in the `NeuralNetwork` class is the hyperbolic tangent, `tanh`.

In order to remain general, I’ll just refer to the activation function as `f()`. We therefore refine our first calculations as follows:

``````w11-output = f(2 * 0.1 + (-3) * (-0.2))
w12-output = f(2 * (-0.4) + (-3) * 0.3)``````

There are a few additional details to this calculation in actual neural networks but I’ll skip those for now.

Now that we have all our activations for the hidden layer, we are ready to move to the next layer, which happens to be the ouput layer, so we’re almost done:

``````output = f(0.1 * w11-output - 0.2 * w12-output
= 0.42``````

As you can see, calculating the output of a neural network is fairly straightforward and fast, much faster than actually training that network. Once you have created your networks and you are satisfied with its results, you can just pass around the characteristics of that network (weights, sizes, …) and any device (even phones) can then use that network.

Revisiting the xor network

Let’s go back to the `xor` network we created in the first episode. I created this network as follows:

``NeuralNetwork(inputSize = 2, hiddenSize = 2, outputSize = 1)``

We only need two inputs (the two bits) and one output (the result of `a xor b`). These two values are fixed. What is not fixed is the size of the hidden layer, and I decided to pick 2 here, for no particular reason. It’s interesting to tweak these values and see whether your neural network performs better of worse based on these values and there is actually a great deal of both intuition and arbitrary choices that go into these decisions. These values that you use to configure your network before you run it are called “hyper parameters”, in contrast to the other values which get updated while your network runs (e.g. the weights).

Let’s now take a look at the weights that our `xor` network came up with, which you can display by running the Kotlin application with --log 2:

``````Input weights:
-1.21 -3.36
-1.20 -3.34
1.12 1.67
Output weights:
3.31
-2.85``````

Let’s put these values on the visual representation of our graph to get a better idea:

You will notice that the network above contains a neuron called “bias” that I haven’t introduced yet. And I’m not going to just yet besides saying that this bias helps the network avoid edge cases and learn more rapidly. For now, just accept it as an additional neuron whose output is not influenced by the previous layers.

Let’s run the graph manually on the input (1,0), which should produce 1:

``````hidden1-1 = 1 * -1.21
hidden1-2 = 0 * -1.20
bias1     = 1 * 1.12
output1 = tanh(-1.21 + 1.12) = -0.09
hidden2-1 = 1 * -3.36
hidden2-2 = 0 * -3.34
bias2     = 1 * 1.67
output2 = tanh(-3.36 + 1.6) = -0.94
// Now that we have the outputs of the hidden layer, we can caculate
// our final result by combining them with the output weights:
finalOutput = tanh(output1 * 3.31 + output2 * (-2.85))
= 0.98``````

We have just verified that if we input `(0,1)` into the network, we’ll receive `0.98` in output. Feel free to calculate the other three inputs yourself or maybe just run the `NeuralNetwork` class with a log level of 2, which will show you all these calculations.

Revisiting the parity network

So the calculations hold up but it’s still a bit hard to understand where these weights come from and why they interact in the way they do. Elucidating this will be the topic of the next installment but before I wrap up this article, I’d like to take a quick look at the parity network because its content might look a bit more intuitive to the human eye, while the `xor` network detailed above still seems mysterious.

If you train the parity network and you ask the `NeuralNetwotk` class to dump its output, here are the weights that you’ll get:

``````Input weights:
0.36 -0.36
0.10 -0.09
0.30 -0.30
-2.23 -1.04
0.57 -0.58
Output weights:
-1.65
-1.64``````

If you pay attention, you will notice an interesting detail about these numbers: the weights of the first three neurons of our hidden layer cancel each other out while the two inputs of the fourth neuron reinforce each other. It’s pretty clear that the network has learned that when you are testing the parity of a number in binary format, the only bit that really matters is the least significant one (the last one). All the others can simply be ignored, so the network has learned from the training data that the best way to get close to the desired output is to only pay attention to that last bit and cancel all the other ones out so they don’t take part in the final result.

Wrapping up

This last result is quite remarkable if you think about it, because it really looks like the network learned how to test parity at the formal level (“The output of the network should be the value of the least significant bit, ignore all the others”), inferring that result just from the numeric data we trained it with. Understanding how the network learned how to modify itself to reach this level will be the topic of the next installment of the series.

It’s hard not to hear about machine learning and neural networks these days since the practice is being applied to an ever increasingly wide variety of problems. Neural networks can be intimidating and look downright magical to the untrained (ah!) eye, so I’m going to attempt to dispel these fears by demonstrating how these mysterious networks operate. And since there are already so many tutorials on the subject, I’m going to take a different approach and go from top to bottom.

Goal

In this first series of articles, I will start by running a very simple network on two simple problems, show you that they work and then walk through the network to explain what happened. Then I’ll backtrack to deconstruct the logic behind the network and why it works.

The neural network I’ll be using in this article is a simple one I wrote. No TensorFlow, no Torch, no Theano. Just some basic Kotlin code. The original version was about 230 lines but it’s a bit bigger now that I broke it up in separate classes and added comments. The whole project can be found on github under the temporary “nnk” name. In particular, here is the source of the neural network we’ll be using.

I will be glossing over a lot of technical terms in this introduction in order to focus on the numeric aspect but I’m hoping to be able to get into more details as we slowly peel the layers. For now, we’ll just look at the network as a black box that get fed input values and which outputs values.

The main characteristic of a neural network is that it starts completely empty but it can be taught to solve problems. We do this by feeding it values and telling it what the expected output is. We iterate over this approach many times, changing these inputs/expected parameters and as we do that, the network updates its knowledge to come up with answers that are as close to the expected answers as possible. This phase is called “training” the network. Once we think the network is trained enough, we can then feed it new values that it hasn’t seen yet and compare its answer to the one we’re expecting.

The problems

Let’s start with a very simple example: `xor`.

This is a trivial and fundamental binary arithmetic operation which returns 1 if the two inputs are different and 0 if they are equal. We will train the network by feeding it all four possible combinations and telling it what the expected outcome is. With the Kotlin implementation of the Neural Network, the code looks like this:

``````with(NeuralNetwork(inputSize = 2, hiddenSize = 2, outputSize = 1)) {
val trainingValues = listOf(
NetworkData.create(listOf(0, 0), listOf(0)),
NetworkData.create(listOf(0, 1), listOf(1)),
NetworkData.create(listOf(1, 0), listOf(1)),
NetworkData.create(listOf(1, 1), listOf(0)))
train(trainingValues)
test(trainingValues)
}``````

Let’s ignore the parameters given to `NeuralNetwork` for now and focus on the rest. Each line of `NetworkData` contains the inputs (each combination of 0 and 1: (0,0), (0,1), (1,0), (1,1)) and the expected output. In this example, the output is just a single value (the result of the operation) so it’s a list of one value, but networks can return an arbitrary number of outputs.

The next step is to test the network. Since there are only four different inputs here and we used them all for training, let’s just use that same list of inputs but this time, we’ll display the ouput produced by the network instead of the expected one. The result of this run is as follows:

Running neural network xor()

``````[0.0, 0.0] -> [0.013128957]
[0.0, 1.0] -> [0.9824073]
[1.0, 0.0] -> [0.9822749]
[1.0, 1.0] -> [-2.1314621E-4]``````

As you can see, these values are pretty decent for such a simple network and such a small training data set and you might rightfully wonder: is this just luck? Or did the network cheat and memorize the values we fed it while we were training it?

One way to find out is to see if we can train our network to learn something else, so let’s do that.

A harder problem

This time, we are going to teach our network to determine whether a number is odd or even. Because the implementation of the graph is pretty naïve and this is just an example, we are going to train our network with binary numbers. Also, we are going to learn a first important lesson in neural networks which is to choose your training and testing data wisely.

You probably noticed in the example above that I used the same data to train and test the network. This is not a good practice but it was necessary for `xor` since there are so few cases. For better results, you usually want to train your network on a certain population of the data and then test it on data that your network hasn’t seen yet. This will guarantee that you are not “overfitting” your network and also that it is able to generalize what you taught it to input values that it hasn’t seen yet. Overfitting means that your network does great on the data you trained it with but poorly on new data. When this happens, you usually want to tweak your network so that it will possibly perform less well on the training data but it will return better results for new data.

For our parity test network, let’s settle on four bits (integers 0 – 15) and we’ll train our network on about ten numbers and test it on the remaining six:

``````with(NeuralNetwork(inputSize = 4, hiddenSize = 2, outputSize = 1)) {
val trainingValues = listOf(
NetworkData.create(listOf(0, 0, 0, 0), listOf(0)),
NetworkData.create(listOf(0, 0, 0, 1), listOf(1)),
NetworkData.create(listOf(0, 0, 1, 0), listOf(0)),
NetworkData.create(listOf(0, 1, 1, 0), listOf(0)),
NetworkData.create(listOf(0, 1, 1, 1), listOf(1)),
NetworkData.create(listOf(1, 0, 1, 0), listOf(0)),
NetworkData.create(listOf(1, 0, 1, 1), listOf(1)),
NetworkData.create(listOf(1, 1, 0, 0), listOf(0)),
NetworkData.create(listOf(1, 1, 0, 1), listOf(1)),
NetworkData.create(listOf(1, 1, 1, 0), listOf(0)),
NetworkData.create(listOf(1, 1, 1, 1), listOf(1))
)
train(trainingValues)
val testValues = listOf(
NetworkData.create(listOf(0, 0, 1, 1), listOf(1)),
NetworkData.create(listOf(0, 1, 0, 0), listOf(0)),
NetworkData.create(listOf(0, 1, 0, 1), listOf(1)),
NetworkData.create(listOf(1, 0, 0, 0), listOf(0)),
NetworkData.create(listOf(1, 0, 0, 1), listOf(1))
)
test(testValues)
}``````

And here is the output of the test:

``````Running neural network isOdd()
[0.0, 0.0, 1.0, 1.0] -> [0.9948013]
[0.0, 1.0, 0.0, 0.0] -> [0.0019584869]
[0.0, 1.0, 0.0, 1.0] -> [0.9950419]
[1.0, 0.0, 0.0, 0.0] -> [0.0053276513]
[1.0, 0.0, 0.0, 1.0] -> [0.9947305]``````

Notice that the network is now outputting correct results for numbers that it hadn’t seen before, just because of the way it adapted itself to the training data it was initially fed. This gives us good confidence that the network has configured itself to classify numbers from any input values and not just the one it was trained for.

Wrapping up

I hope that this brief overview will have whetted your appetite or at least piqued your curiosity. In the next installment, I’ll dive a bit deeper into the `NeuralNetwork` class, explain the constructor parameters and we’ll walk through the inner working of the neural network that we created to demonstrate how it works.

I have a special connection to this game since I tried (unsuccessfully) to crack the original something like thirty years ago. My knowledge of the 6502 and of the Apple ][ ROM was not sufficient back then, but today, I have the Internet and a rabid hunger for revenge.

The game comes on two sides, of which only side A is protected. Side B is not a DOS disk but it follows the standard D5 AA 96 encoding, so it can be copied easily.

Side A is very non standard, using AD DA D0 which is then followed by something that’s nowhere like the sector content one would expect. Let’s get first stage in.

@a2_4am‘s AUTOTRACE captures BOOT0 and then stops there. No RWTS and no IOB. Not surprising.

The plan: get all the successive stages in memory, save them one by one and see if we can get a full crack this way. No DEMUFFIN will be used in this endeavor.

\$801 starts with some standard stuff and then reads stage 2 in:

``````; Look for AD DA DD
0833-   BD 8C C0    LDA   \$C08C,X
0836-   10 FB       BPL   \$0833
083A-   D0 F7       BNE   \$0833
083C-   BD 8C C0    LDA   \$C08C,X
083F-   10 FB       BPL   \$083C
0841-   C9 DA       CMP   #\$DA
0843-   D0 F3       BNE   \$0838
0845-   BD 8C C0    LDA   \$C08C,X
0848-   10 FB       BPL   \$0845
084A-   C9 DD       CMP   #\$DD
084C-   D0 EA       BNE   \$0838``````

What follows is a bit non standard in my view and was kindly elucidated to me by Qkumba, an Apple ][ expert.

``````; Performs a continuous read.  There is only one header and then one
; really big sector.  The value in \$FA is the number of pages in
; that sector.  The value in \$F8 is the high byte of the address.
0850-   BD 8C C0    LDA   \$C08C,X
0853-   10 FB       BPL   \$0850
0855-   38          SEC
0856-   2A          ROL
0857-   85 F6       STA   \$F6
0859-   B0 12       BCS   highBitSet
``````
``````; not a timing nibble, read again
085B-   BD 8C C0    LDA   \$C08C,X
085E-   10 FB       BPL   \$085B
0860-   38          SEC
0861-   2A          ROL
0862-   85 F6       STA   \$F6
0864-   C8          INY
0865-   D0 06       BNE   highBitSet
0867-   E6 F8       INC   \$F8
0869-   C6 FA       DEC   \$FA
086B-   F0 0F       BEQ   \$087C
``````

Going back to the track content with a nibble editor does shed some light on why this code is looking this way but I didn’t capture it so I’m not going to linger on this. If I get my way, the crack will end up discarding all this part anyway.

I just want to find out where the next stage starts so I can stop it. And here it is:

``````087C-   BD 8C C0    LDA   \$C08C,X
087F-   10 FB       BPL   \$087C
0881-   25 F6       AND   \$F6
0883-   45 F5       EOR   \$F5
0885-   D0 A4       BNE   \$082B
0887-   4C 0C A0    JMP   \$A00C
``````

So the next stage starts at \$A00C. Time to ruin the party.

Clone the boot procedure in \$9600 and force the disk to reboot instead of launching stage two:

``````*9600<C600.C6FFM
*96F8L
96F8-   A9 4C       LDA   #\$4C
96FA-   8D 87 08    STA   \$0887
96FD-   A9 0A       LDA   #\$0A
96FF-   8D 88 08    STA   \$0888
9702-   A9 97       LDA   #\$97
9704-   8D 89 08    STA   \$0889
9707-   4C 01 08    JMP   \$0801
970A-   A9 C5       LDA   #\$C5
970C-   8D DC A0    STA   \$A0DC
970F-   4C 0C A0    JMP   \$A00C
*BSAVE REBOOT2,A\$9600,L\$110
\$9600G
``````

I simply replace the JMP \$A00C with a jump to my own code, which instead, reboots to my work disk.

Main disk boots, grinds a bit then my work disk boots. Let’s welcome our new guest.

``````*A00CL
; Standard reset hijacking: the badlands for this stage is \$A353
A00C-   A0 00       LDY   #\$00
A00E-   84 FD       STY   \$FD
A010-   84 F1       STY   \$F1
A012-   84 F2       STY   \$F2
A014-   AD FF 7F    LDA   \$7FFF
A017-   8D 88 A3    STA   \$A388
A01A-   A9 53       LDA   #\$53
A01C-   8D F0 03    STA   \$03F0
A01F-   8D F2 03    STA   \$03F2
A022-   8D FC 03    STA   \$03FC
A025-   8D FE 03    STA   \$03FE
A028-   A9 A3       LDA   #\$A3
A02A-   8D F1 03    STA   \$03F1
A02D-   8D F3 03    STA   \$03F3
A030-   8D FD 03    STA   \$03FD
A033-   8D FF 03    STA   \$03FF
A036-   49 A5       EOR   #\$A5
A038-   8D F4 03    STA   \$03F4
; This next section is mysterious. Store some values in \$200 (the keyboard
; buffer). Not sure what this is for but let's make a note of it
; for later.
A03B-   A9 5D       LDA   #\$5D
A03D-   8D 00 02    STA   \$0200
A040-   A9 43       LDA   #\$43
A042-   8D 01 02    STA   \$0201
A045-   A9 A2       LDA   #\$A2
A047-   85 22       STA   \$22
A049-   A9 4C       LDA   #\$4C
A04B-   85 23       STA   \$23
; Display @ signs on the screen, this is what we see during the boot
A04D-   A9 A0       LDA   #\$A0
A04F-   99 00 04    STA   \$0400,Y
A052-   99 00 05    STA   \$0500,Y
A055-   99 00 06    STA   \$0600,Y
A058-   99 00 07    STA   \$0700,Y
A05B-   C8          INY
A05C-   D0 F1       BNE   \$A04F
; Switch graphic mode
A05E-   AD 51 C0    LDA   \$C051
A061-   AD 54 C0    LDA   \$C054
A064-   AD 52 C0    LDA   \$C052
A067-   AD 57 C0    LDA   \$C057
; And now some real work starts, loading more tracks for stage 3
A06A-   A9 02       LDA   #\$02
A06C-   A6 FB       LDX   \$FB
A06E-   8E F8 05    STX   \$05F8
A06E-   8E F8 05    STX   \$05F8
A071-   A0 05       LDY   #\$05
A073-   8D 01 A0    STA   \$A001
A076-   8C 00 A0    STY   \$A000
``````

Not going to bore you with the details. I followed the logic here and figured out most of it until we reach the place where stage 2 continues into stage 3:

``````*A0B9L
A0B9-   A9 E3       LDA   #\$E3
A0BB-   8D F0 03    STA   \$03F0
A0BE-   8D F2 03    STA   \$03F2
A0C1-   8D FC 03    STA   \$0
A0C4-   8D FE 03    STA   \$03FE
A0C7-   A9 80       LDA   #\$80 ;; bad land: \$80a5
A0C9-   8D F1 03    STA   \$03F1
A0CC-   8D F3 03    STA   \$03F3
A0CF-   8D FD 03    STA   \$03FD
A0D2-   8D FF 03    STA   \$03FF
A0D5-   49 A5       EOR   #\$A5
A0D7-   8D F4 03    STA   \$03F4
A0DA-   4C 00 40    JMP   \$4000
``````

This code does a new hijacking of the reset vectors (probably because the previous one might have been overwritten by this stage). That new one is not doing anything surprising: wiping everything, etc…

But the important part is, of course:

``````A0DA-   4C 00 40    JMP   \$4000
``````

Back to my boot loader and let’s prevent this code from calling into \$4000. I can
do this simply by replacing the value in \$A0DC with \$C5 so that it becomes:

``````A0DA-   4C 00 C5    JMP   \$C500
``````

which will reboot into my work disk:

``````*BLOAD REBOOT2,A\$9600
``````

``````96F8-   A9 4C       LDA   #\$4C
96FA-   8D 87 08    STA   \$0887
96FD-   A9 0A       LDA   #\$0A
96FF-   8D 88 08    STA   \$0888
9702-   A9 97       LDA   #\$97
9704-   8D 89 08    STA   \$0889
9707-   4C 01 08    JMP   \$0801
970A-   A9 C5       LDA   #\$C5
970C-   8D DC A0    STA   \$A0DC
970F-   4C 0C A0    JMP   \$A00C
``````

The new code is in \$970A.

``````*BSAVE REBOOT3,A\$9600,L\$120
*9600G
``````

Boot, grind, grind, grind. The title screen appears, then the screen turns to white, asks me to flip to side B and… boom, reboot to my work disk. Everything is going according to plan.

Hello stage 3:

``````*4000L
4000-   A9 BA       LDA   #\$BA
4002-   85 C2       STA   \$C2
4004-   A2 00       LDX   #\$00
4006-   86 C0       STX   \$C0
4008-   20 AA 51    JSR   \$51AA
400B-   A9 0D       LDA   #\$0D
400D-   85 C1       STA   \$C1
400F-   20 F5 40    JSR   \$40F5
4012-   D0 CC       BNE   \$3FE0
4014-   C5 C1       CMP   \$C1
4016-   D3          ???
4017-   C5 A0       CMP   \$A0
4019-   C9 CE       CMP   #\$CE
401B-   D3          ???
401C-   C5 D2       CMP   \$D2
401E-   D4          ???
401F-   A0 D9       LDY   #\$D9
4021-   CF          ???
4022-   D5 D2       CMP   \$D2,X
4024-   A0 C3       LDY   #\$C3
*
``````

Ok, now I’m getting a bit worried. This starts as genuine code but it soon becomes garbage. Not good. I predict some nasty stack swizzling happening downstream.

Looking more carefully at the garbage, I actually recognize ASCII letters. Could it be… “PLEASE INSERT YOUR COPY”… Yup, that’s it. That’s encouraging.

But let’s not get ahead of ourselves. First, save this portion at \$4000 (by the way, I’ve been saving all the previous stages as well) and who knows, maybe we’ll get lucky?

Switch to graphics, then

``*4000G``

The screen switches to white (yay!) but instead of the message, I get garbage. And the code soon drops me off in the monitor with a beep. Would have been too good to be true.

Time to roll my sleeves and disassemble this mess.

I’ll spare you the details but it took a while to go through this, and the code doesn’t pull any punches with multiple indirections and obfuscations. My attention is first drawn to code that stores stuff in \$(00),Y which ends up storing garbage in what is otherwise regular code (which ends up being called, hence the abrupt exit).

Cause identified, now we need to find the culprit.

Next step: find out what code stores addresses in \$(00) and after additional (and painful) disassembling, I stumble upon:

``````51FF-   BD C0 09    LDA   \$09C0,X
5202-   85 00       STA   \$00
5204-   BD 00 09    LDA   \$0900,X
5207-   85 01       STA   \$01
5209-   A0 00       LDY   #\$00
520B-   C0 08       CPY   #\$08
520D-   B0 10       BCS   \$521F
520F-   B1 04       LDA   (\$04),Y
5211-   49 FF       EOR   #\$FF
5213-   29 7F       AND   #\$7F
5215-   A4 C0       LDY   \$C0
5217-   91 00       STA   (\$00),Y
5219-   EE 0A 52    INC   \$520A
521C-   4C F9 51    JMP   \$51F9``````

Well hello there, \$(00) stomper. But wait… these values are coming from \$900 and \$9C0. So loaded from a previous stage. Looking there is not reassuring, it’s mostly garbage:

``````0900-   00          BRK
0901-   AD 43 4D    LDA   \$4D43
0904-   44          ???
0905-   D0 31       BNE   \$0938
0907-   32          ???
0908-   39 CD 43    AND   \$43CD,Y
090B-   48          PHA
090C-   45 43       EOR   \$43
090E-   4B          ???
090F-   53          ???``````

I must have missed something.

Taking a hard look at this garbage, and again, it looks familiar. It’s actually BASIC. Wait… what? Pretty sure there’s no BASIC in sight here, so the only other option is that I polluted it myself. And then I captured this pollution when I saved the \$800 stage and it’s been following me ever since.

And then it hits me: this is the HELLO of the working disk! It’s @a2_4am’s own startup, which is very useful to get things started but it then stomps all over what could be useful stuff from stage 2.

Time for the nuclear option:

``*DELETE HELLO``

I don’t need it any more anyway. Let’s back track to stage 2 and see what we get this time.

``````*BLOAD REBOOT2,A\$9600
*9600G``````

Boot, grind, boot to work disk, exit right away this time (no more HELLO).

``````*900L
0900-   20 24 28    JSR   \$2824
0903-   2C 30 34    BIT   \$3430
0906-   38          SEC
0907-   3C          ???
0908-   20 24 28    JSR   \$2824
090B-   2C 30 34    BIT   \$3430
090E-   38          SEC
090F-   3C          ???``````

Well… that’s different garbage, but it’s still garbage. Unless…

I was getting garbage on the screen because the routine in charge of displaying each letter one by one was apparently hitting the wrong regions of memory. Do you notice anything about the code above? It looks like locations in the interleaved graphic memory, each incremented by 4. This is very promising.

Let’s save this precious new addition to our puzzle (up to \$2000 for good measure):

``````*BSAVE A900,A\$900,L\$16FF
``````

``````*BLOAD A4000,A\$4000
``````

Fingers crossed:

``````\$4000G
``````

Yes! The screen turns to white and the message “PLEASE INSERT…” is correctly displayed. I feverishly insert side B, press a key and… I get the first image of the game.

Victory!

Or… is it?

I go “N”, get the second image along with its description and then… BEEEP! And back to the monitor. Kicked out of my own party. Rats. Still not quite there yet.

Time for more disassembly. I trace the tortuous and contrived code (but this time, we’re in the heart of the game engine) until I come across something that is obviously going to crash:

``````43F2-   6C 00 02    JMP   (\$0200)
43F5-   00          BRK``````

\$200 is the keyboard buffer, and of course, it contains everything but a valid address where to jump (actually, it contains “4000G” since these are the latest keys I pressed).

Okay, I’ve identified the problem, now I need to find out which stage puts a valid address at that location. And of course, I’m forbidden to use the keyboard from now on or my keys will destroy its content.

So I go back to the drawing board and go through all the past stages, one by one, for a second, hard look at all the stages, until… eureka! Remember this from stage 2?

``````A03B-   A9 5D       LDA   #\$5D
A03D-   8D 00 02    STA   \$0200
A040-   A9 43       LDA   #\$43
A042-   8D 01 02    STA   \$0201
A045-   A9 A2       LDA   #\$A2
A047-   85 22       STA   \$22
A049-   A9 4C       LDA   #\$4C
A04B-   85 23       STA   \$23``````

It didn’t make sense then but it does now. This store \$435D in \$200 (and checking it confirms it’s valid code). This snippet also stores a couple more values in \$22,\$23. Not sure what these are for but obviously, I’ll have to preserve them just in case.

So things are shaping up now: I need to load the chunks at \$900 and \$4000 and then preserve these values before calling into \$4000. I’m not in the mood to do it the hard way so let’s go BASIC on this one:

``````10  PRINT  CHR\$ (4)"BLOAD A900,A\$900"
40  POKE 49232,0: POKE 49236,0:POKE 49234,0:POKE 49239,0
50  POKE 512,93: POKE 513,67: POKE 34,162: POKE 35,76
100  CALL 16384
``````

I’m loading \$A0000 too just to be safe but we might not need it. We do need \$900, though (the scan lines) and, obviously, \$4000. Then switch the correct graphic mode, put the right values in \$200, \$22 and \$23 and launch.

I insert side B of the game and this time, I can successfully move from location to location without any crash.

Looks like we’re done this time.

The only thing missing to a proper crack is to restore the title screen but that’s pretty trivial.

It’s good to feel closure after thirty years.