Tuesday, 20 September 2016

The I76 Mission File - The Single Player Scenarios

The Mission File - Scripts

First off, a huge thanks to Karl Meissner and Kurt Arnlund, of the original I76 development team, for suggestions and hints as to what we're looking at in this section, and how to understand it.

Also a warning: this information is still (mostly) at the level of guesswork. Unlike the earlier file formats, where I could extract and modify complete sets, I'm still at the stage of tweaking around the edges here. Hopefully I'll be able to refine a few guesses later, but for now here is where I am.

The Data Wrapper

The single player mission script implementations are held in the "ADEF" section of this mission file - this is another nested BWD2 structure, and the first element in it is a revision tag "AREV", which is 12 bytes long, and always reports a revision number of "1".

Following this is the FSM tag, which starts with "FSM " and a 4 byte size, and occupies the remaining content of the ADEF field: its size is always the length of the ADEF field minus 28 bytes (the 12 byte revision header plus two trailing 8 byte EXIT tags).
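As a rough sketch of the wrapper described above (not the game's own loader), here is one way to walk a flat run of tag/size headers and to check the 28 byte rule. The names `Chunk`, `listChunks` and `expectedFsmSize` are made up for illustration, and two things are assumptions on my part: that values are little-endian, and that each size field counts its own 8 byte header (which is what the 12 byte AREV suggests).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One tag/size header as it appears in the file.
struct Chunk {
    std::string tag;     // four characters, e.g. "AREV", "FSM ", "EXIT"
    std::uint32_t size;  // ASSUMED to include the 8 byte header itself
    std::size_t offset;  // where the header starts in the buffer
};

// Read a 32 bit value; little-endian byte order is an assumption.
static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t at) {
    return static_cast<std::uint32_t>(b[at]) |
           (static_cast<std::uint32_t>(b[at + 1]) << 8) |
           (static_cast<std::uint32_t>(b[at + 2]) << 16) |
           (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

// Walk consecutive chunks, stopping at the first header that does not fit.
std::vector<Chunk> listChunks(const std::vector<std::uint8_t>& b) {
    std::vector<Chunk> out;
    std::size_t pos = 0;
    while (pos + 8 <= b.size()) {
        Chunk c;
        c.tag.assign(reinterpret_cast<const char*>(b.data() + pos), 4);
        c.size = readU32(b, pos + 4);
        c.offset = pos;
        if (c.size < 8 || c.size > b.size() - pos) break;  // malformed or truncated
        out.push_back(c);
        pos += c.size;
    }
    return out;
}

// FSM size expected for a given ADEF length, per the rule quoted above:
// 12 bytes of AREV plus two EXIT tags of 8 bytes each.
std::uint32_t expectedFsmSize(std::uint32_t adefLength) {
    return adefLength - 28;
}
```

Comparing `expectedFsmSize` against the size actually stored after "FSM " is a cheap sanity check before trusting anything further in.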

The data in the FSM region gives us the key information about the mission state, and is arranged in chunks, each a 4 byte size followed by data. The size is given as a number of entries rather than a number of bytes.

The FSM section contains:
  • Action Table: An array of 40 byte strings, defining actions that the game engine can perform.
  • Entity Table: An array of 48 byte entries, each of which contains
    • a 40 byte label
    • a corresponding 8 byte object name
  • Clip Table: An array of 40 byte file names, each of which is a sound file
  • Path Table: An array of paths, where each path is:
    • a 40 byte path label
    • a four byte count of points, followed by
      • a 12 byte point, of 3 four byte floats, for each point
  • Machine Table: a number of 42 byte machine definitions
  • Variable Table: a number of 4 byte integers
  • ByteCode: a number of 8 byte entries
The last three entries are particularly involved, so we'll deal with those separately.
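To make the layout above a little more concrete, here is a minimal sketch of reading two of the simpler tables: the entity table and the path table. `Reader`, `readEntityTable` and `readPathTable` are illustrative names of my own, little-endian storage is assumed, and the 8 byte object name is left in its masked on-disk form. There is no protection against absurd counts, so treat it as a sketch only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct Entity { std::string label; std::string object; };
struct Point  { float x, y, z; };  // stored order; the middle value is 0 in the Trip 1 paths
struct Path   { std::string label; std::vector<Point> points; };

// Minimal cursor over a byte buffer (little-endian is an assumption).
struct Reader {
    const std::vector<std::uint8_t>& buf;
    std::size_t pos;
    explicit Reader(const std::vector<std::uint8_t>& b) : buf(b), pos(0) {}

    std::uint32_t u32() {
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(buf.at(pos + i)) << (8 * i);
        pos += 4;
        return v;
    }
    float f32() {
        std::uint32_t bits = u32();
        float f;
        std::memcpy(&f, &bits, sizeof f);
        return f;
    }
    // Fixed-width text field, cut at the first NUL byte.
    std::string text(std::size_t width) {
        std::string s;
        for (std::size_t i = 0; i < width; ++i) {
            char c = static_cast<char>(buf.at(pos + i));
            if (c == '\0') break;
            s += c;
        }
        pos += width;
        return s;
    }
};

// Entity table: entry count, then 48 byte entries (40 byte label + 8 byte name).
std::vector<Entity> readEntityTable(Reader& r) {
    std::vector<Entity> out(r.u32());
    for (Entity& e : out) {
        e.label = r.text(40);
        e.object = r.text(8);  // still masked at this point
    }
    return out;
}

// Path table: entry count, then per path a 40 byte label, a point count
// and that many 12 byte points.
std::vector<Path> readPathTable(Reader& r) {
    std::vector<Path> out(r.u32());
    for (Path& p : out) {
        p.label = r.text(40);
        p.points.resize(r.u32());
        for (Point& pt : p.points) {
            pt.x = r.f32();
            pt.y = r.f32();
            pt.z = r.f32();
        }
    }
    return out;
}
```

The action and clip tables would read the same way, as a count followed by 40 byte strings.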

The action table is always the same (same size, same entries, same order). There are 95 actions, and this is a list of instructions shared with the executable, also viewable as readable strings in the binary. So there are entries like:
Action 65:isWithinEnemy
Action 66:isAttacked
Action 67:isShot
Action 68:isRammed
Action 69:hpLesser
Action 70:isDead
The Entity table has a number of descriptive names associated with the compiled (i.e. masked) versions of the object names, so this has entries like (e.g. from Trip 1):
Entity 0: speedsign -> Object sl50x_1
Entity 1: groove -> Object vppirna1
Entity 2: taurus -> Object t01js01
Entity 3: enemy1 -> Object t01al01
The clip table contains sound clips used during the level. The first 105 clips (0-104) are always the same set of common sounds (same files, same order), and the later entries vary between levels, e.g. from Trip 1 these are the clips used in the opening communications between Taurus and Groove with annotations added by me to define which is which:
Clip 105 :01sta66.wav (T: Ok, follow me in formation. That means...)
Clip 106 :01sta09.wav (T: Ok Baby, here's the school ... this car's been Mod-i-fied [First clip])
Clip 107 :01sgc06.wav (G: Yeah, I got it. [1st groove])
Clip 108 :01sta10.wav (T: Follow me to seagraves [clip after first pan to sign])
Clip 109 :01sgc07.wav (G: Got it.)
The path table is a handily labelled set of paths and points which trigger events, cars follow, etc. These are in world co-ords in meters. Again, from Trip 01:
Path 9: loop 3535,0,40215 3525,0,40135 3545,0,40075 3625,0,40085 3640,0,40115 3700,0,40160 3690,0,40195 3670,0,40245 3590,0,40245
Path 10: theway 4545,0,38675 4490,0,38680 4250,0,38680 4200,0,38680 4065,0,38680 3895,0,38680 3895,0,38685 4200,0,38685 4200,0,38510 4190,0,38365 4580,0,38350 4665,0,38615
Path 11: tohome 3730,0,40230 3720,0,40290
Path 12: torise 3860,0,38830 3790,0,38905 3670,0,39045 3640,0,39165 3675,0,39245 3825,0,39325 3910,0,39345 4030,0,39435 4050,0,39565 4035,0,39675 3995,0,39810 3930,0,39980 3905,0,40035
Now, for those big tables.

The Compiled Script Tables

Again a warning: the remainder here is mostly guesswork - I'm not at the stage where I fully understand the execution of the bytecode itself, and modifications have a tendency to crash.

This section of the file is used to run a number of tasks on a set of basic Virtual Machines; the game scripts were compiled to bytecode and these bytecode blocks are run on the internal VM. 

Of the three tables, the third is the actual bytecode, the first table is a set of execution hooks into this bytecode, and the second is some form of global constant/variable array.

My first guess was that due to the limited number of opcodes this would be a stack machine. For the moment I'm assuming that each running "machine" has an incoming argument list and a local stack used to marshal arguments which go to the game engine actions.

The first table, which I'm calling the machine table, has a start point (i.e. a location inside the third table, the bytecode block), then a count of arguments, then the arguments themselves. Giving different arguments lets the same script section apply to different objects, so for the tutorial level, a01.msn, this table has the entries:
Block1/1 Start Address: 2238 Initial args Sz: 6 args: [1 7 8 9 10 14 ]
Block1/2 Start Address: 2238 Initial args Sz: 6 args: [2 8 9 10 7 14 ]
Block1/3 Start Address: 2238 Initial args Sz: 6 args: [3 9 10 7 8 14 ]
Block1/4 Start Address: 2238 Initial args Sz: 6 args: [4 10 7 8 9 14 ]
So, this is the control of the first four drones (entities 1,2,3 & 4) which follow the loop path set out by the four points 7, 8, 9 & 10. Each drone is given the points with a different start in order to ensure they are neatly queued.
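The pattern in those four entries is just a rotation of the same point list. As a small illustration (the function name `droneArgs` is mine, and I still don't know what the trailing 14 means), the argument lists can be regenerated like this:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Rebuild the argument list of the n-th drone (0-based), assuming the layout
// seen in a01.msn: entity id, the loop points rotated left by n, then one
// trailing value whose purpose is unknown.
std::vector<int> droneArgs(int entityId, std::vector<int> loopPoints,
                           std::size_t n, int trailing) {
    if (!loopPoints.empty()) {
        const std::ptrdiff_t shift =
            static_cast<std::ptrdiff_t>(n % loopPoints.size());
        std::rotate(loopPoints.begin(), loopPoints.begin() + shift,
                    loopPoints.end());
    }
    std::vector<int> args;
    args.push_back(entityId);
    args.insert(args.end(), loopPoints.begin(), loopPoints.end());
    args.push_back(trailing);
    return args;
}
```

For example, `droneArgs(3, {7, 8, 9, 10}, 2, 14)` gives `1 + 4 + 1` values matching the Block1/3 line above.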

ByteCode and operations Guesswork

Looking initially at the compiled bytecode: each 8 byte entry appears to form a pair of two 4 byte numbers. The first number is always a fairly low value (<16), and I'm guessing that this means we're looking at a 4 byte operation and a 4 byte argument pair. The operation codes run from "1" up to "14", and opcodes 2,3 and 11 are never used in the default i76, but 3 & 11 are used in the Nitro pack. I'll focus on the original i76 missions to simplify things.
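A minimal sketch of that split, assuming little-endian storage and that the buffer passed in starts at the first 8 byte entry (i.e. after the table's entry count). `Instruction` and `decodeByteCode` are names I've made up; treating the argument as signed is also a guess on my part.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Instruction {
    std::uint32_t opcode;   // observed range is 1..14
    std::int32_t argument;  // ASSUMED signed
};

// Read a 32 bit value; little-endian byte order is an assumption.
static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t at) {
    return static_cast<std::uint32_t>(b[at]) |
           (static_cast<std::uint32_t>(b[at + 1]) << 8) |
           (static_cast<std::uint32_t>(b[at + 2]) << 16) |
           (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

// Split raw bytecode into (opcode, argument) pairs, 8 bytes at a time.
// Any trailing partial entry is ignored.
std::vector<Instruction> decodeByteCode(const std::vector<std::uint8_t>& raw) {
    std::vector<Instruction> out;
    for (std::size_t pos = 0; pos + 8 <= raw.size(); pos += 8) {
        Instruction ins;
        ins.opcode = readU32(raw, pos);
        ins.argument = static_cast<std::int32_t>(readU32(raw, pos + 4));
        out.push_back(ins);
    }
    return out;
}
```

A histogram of `opcode` over the result is enough to reproduce the "which opcodes are never used" observation.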

Looking at the range of values associated with the opcodes we can make some initial guesses.

OpCodes 8, 9 and 10 are Jump style instructions which have a valid destination address as an argument. Given the last entry in all the bytecode tables is opcode 10 this is almost certainly the unconditional branch (and we can split up the bytecode using the ranges in the machine table and see this idea is confirmed).

OpCode 13 has the correct range of values to cover the table of actions, and substituting in we get some "sensible" patterns.

Opcode 5 appears to line up with passed arguments: I view this as a stack copy from an input argument stack. However, the value appears to be one deeper than would be expected from the initial values, indicating a possible addition.

As a result, and looking at the way functions are entered, I currently believe that leaves 8 as a "Call with arguments", and 9 as a "Jump if Zero". The significance of 8 as a call is that it potentially modifies the stack in line with our expectations.

Also by looking at the audio clip list and correlating the clips we know about with the actions in the game it becomes clear that opcode "1" is used to push audio clips which are then used by functions like "cbPrior" to play sounds. More generally this opcode is used to supply function parameters, so we treat opcode 1 as a stack push.
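Putting the guesses so far into one place, a disassembler only needs a small lookup. This is a sketch of the working names used in the listings that follow, not anything confirmed; `describe` is a hypothetical helper, and opcodes I haven't named are printed with a "(?)" prefix. Actions are shown by number here rather than looked up in the action table.

```cpp
#include <cstdint>
#include <string>

// Working names for the opcodes discussed so far. All of these are guesses.
std::string describe(std::uint32_t opcode, std::int32_t arg) {
    const std::string a = std::to_string(arg);
    switch (opcode) {
        case 1:  return "PUSH: " + a;         // push a parameter value
        case 5:  return "COPY: " + a;         // copy from the incoming arguments
        case 8:  return "CALL: " + a;         // call with arguments
        case 9:  return "JZERO: " + a;        // jump if zero
        case 10: return "JMP: " + a;          // unconditional branch
        case 13: return "ACTION(#" + a + ")"; // index into the action table
        default: return "(?)" + std::to_string(opcode) + ": " + a;
    }
}
```

With the action table loaded, the `case 13` branch would print the action name (e.g. entry 70 is isDead) instead of the index.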

The Bytecode Proper

The first 2180 entries of the bytecode table match between the files, and presumably form a common set of operations and conditions used in the single player game.

Looking at one of the unique blocks in Trip 3, the machine table has an entry which invokes this logical flow:
Block1/20 Start Address: 3368 Initial arg Sz: 2 arg: [0 1 ]
Block3/3368 CALL: 3369
Block3/3369 COPY: -3
Block3/3370 ACTION(isDead)
Block3/3371 JZERO: 3385
Block3/3372 PUSH: 0x75:[117]
Block3/3373 (?)4: 1
Block3/3374 PUSH: 0x1:[1]
Block3/3375 (?)4: 2
Block3/3376 ACTION(cbPrior)
Block3/3377 (?)7: 0x2:[2]
Block3/3378 PUSH: 0x5:[5]
Block3/3379 (?)4: 1
Block3/3380 PUSH: 0x1:[1]
Block3/3381 (?)4: 2
Block3/3382 ACTION(failAllObj)
Block3/3383 (?)7: 0x2:[2]
Block3/3384 (?)12: 0x2:[2]
Block3/3385 COPY: -2
Block3/3386 ACTION(isDead)
Block3/3387 JZERO: 3407
Block3/3388 PUSH: 0x6:[6]
Block3/3389 (?)4: 1
Block3/3390 PUSH: 0x3:[3]
Block3/3391 (?)4: 2
Block3/3392 ACTION(failAllObj)
Block3/3393 (?)7: 0x2:[2]
Block3/3394 PUSH: 0x76:[118]
Block3/3395 (?)4: 1
Block3/3396 PUSH: 0x1:[1]
Block3/3397 (?)4: 2
Block3/3398 ACTION(cbPrior)
Block3/3399 (?)7: 0x2:[2]
Block3/3400 PUSH: 0x77:[119]
Block3/3401 (?)4: 1
Block3/3402 PUSH: 0x3:[3]
Block3/3403 (?)4: 2
Block3/3404 ACTION(cbPrior)
Block3/3405 (?)7: 0x2:[2]
Block3/3406 (?)12: 0x2:[2]
Block3/3407 ACTION(null)
Block3/3408 JMP: 3369
The Trip 3 entity table has:
Entity 0: user -> Object vppirna1
Entity 1: taurus -> Object t03js01
And looking at the bytecode we can see the "obvious" flow of:
Block3/3368 CALL: 3369
So now the argument stack list may have three values "0 1 <return address>"
Block3/3369 COPY: -3
Block3/3370 ACTION(isDead)
Block3/3371 JZERO: 3385
i.e. copy the "0" value from the argument list to the stack and call "isDead" to check if entity 0 (the player) is alive. Assuming isDead returns false then we'd follow the jump and get to
Block3/3385 COPY: -2
Block3/3386 ACTION(isDead)
Block3/3387 JZERO: 3407
Which is essentially the same check for Entity #1 (Taurus). Assuming isDead returns false again then:
Block3/3407 ACTION(null)
Block3/3408 JMP: 3369
Which is a "do nothing" then a return to the entry point to repeat this check. So we would loop around checking that the player and Taurus are alive.

If, however, the isDead on the player came back non-zero then we'd instead follow this branch:
Block3/3372 PUSH: 0x75:[117]
Block3/3373 (?)4: 1
Block3/3374 PUSH: 0x1:[1]
Block3/3375 (?)4: 2
Block3/3376 ACTION(cbPrior)
Block3/3377 (?)7: 0x2:[2]
Block3/3378 PUSH: 0x5:[5]
Block3/3379 (?)4: 1
Block3/3380 PUSH: 0x1:[1]
Block3/3381 (?)4: 2
Block3/3382 ACTION(failAllObj)
Block3/3383 (?)7: 0x2:[2]
Block3/3384 (?)12: 0x2:[2]
This looks to push audio clip #117 for the player death, and instructs the engine to play the clip using "cbPrior" (CB Audio, Priority?). Then it calls into failAllObj. Opcode 12 probably halts this execution flow at this point.

The case where Taurus dies is similar, but this plays two audio clips (#118 & #119) before failAllObj.

Looking up the clip table references, clip #117 is a "Groove Dies" clip and clip #118 is a "Taurus Dies" clip, while in clip #119 Groove goes "Uh Oh". Everything ties up there, and we can change the audio clip reference in the mission file to confirm our ideas so far. We can also modify the arguments to failAllObj and get different reports from the mission NPT file, or change Groove dies to Taurus dies cases and vice versa.
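Written out as ordinary code, my current reading of block 3368 to 3408 looks something like the sketch below. This is an interpretation, not recovered source: `FakeEngine` is a stand-in that just logs calls, the argument order for cbPrior and failAllObj is taken from the push order, and opcode 12 is assumed to halt the machine.

```cpp
#include <string>
#include <vector>

// Stand-in for the game engine: answers isDead and records other actions.
struct FakeEngine {
    bool userDead = false;
    bool taurusDead = false;
    std::vector<std::string> log;

    bool isDead(int entity) const { return entity == 0 ? userDead : taurusDead; }
    void cbPrior(int clip, int arg) {
        log.push_back("cbPrior(" + std::to_string(clip) + "," + std::to_string(arg) + ")");
    }
    void failAllObj(int a, int b) {
        log.push_back("failAllObj(" + std::to_string(a) + "," + std::to_string(b) + ")");
    }
};

// One reading of the Trip 3 block. Returns true if the script halted
// (opcode 12, assumed), false if it was still looping after maxTicks.
bool runDeathWatch(FakeEngine& e, int user, int taurus, int maxTicks) {
    for (int tick = 0; tick < maxTicks; ++tick) {
        if (e.isDead(user)) {      // 3369..3371
            e.cbPrior(117, 1);     // 3372..3377
            e.failAllObj(5, 1);    // 3378..3383
            return true;           // 3384
        }
        if (e.isDead(taurus)) {    // 3385..3387
            e.failAllObj(6, 3);    // 3388..3393
            e.cbPrior(118, 1);     // 3394..3399
            e.cbPrior(119, 3);     // 3400..3405
            return true;           // 3406
        }
        // 3407 ACTION(null), then 3408 JMP back to 3369
    }
    return false;
}
```

Called as `runDeathWatch(engine, 0, 1, n)`, the two entity arguments correspond to the machine table's initial arguments `[0 1]`.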

From looking at this instruction flow, at this point opcode 4 looks to be setting the stack pointer manually after each push, and opcode 7 could be stack cleanup (POP), removing the earlier arguments. Although the stack set isn't necessary, since it's clear from other blocks there's an auto increment on PUSH, it's not unreasonable to think that marshalling up the arguments here might take care to force explicit values.

However this isn't consistent enough to be certain. In particular, the way the COPY and PUSH counts line up with the POP count looks odd when we generalise this over other functions and actions. In addition I've been (deliberately) vague about how the return value from actions is actually stored, and the whole machine table call stack thing is a little thin.

Right now I've only convinced myself and the cat that this is close to true.
Yeah. This is my "Convinced Face". No, Really.
Maybe not the cat then, who seems particularly unconvinced by the call stack stuff.

Clearly some more work on reverse engineering this section is required, but that's all for now....

Saturday, 3 September 2016

And yet more I76 - Roads, Objects and Related Guesswork

Early on I mentioned that we would need to decode the RDEF section of the mission file to get more detail on the roads.

This turns out to be fairly straightforward for the basics. There are a couple of loose ends around the details which I've yet to decode, however here's the progress so far.

Parsing the RDEF

The RDEF section is another nested BWD2-style tag/length/data section. It opens with the size of the field and a revision tag, RREV, which for the I76 I have here is always “1”.

Following this there are a number of “RSEG” declarations, each of which contains a declaration for a piece of road, and has the data:
  • 4 byte “RSEG” label
  • 4 byte Data Length (uint32_t)
  • 4 byte Segment Type
  • 4 byte Segment Pieces Count
  • Followed by a number of 24 byte segment pieces.

The segment type tells you the type of road, 0 is “paved highway”, 1 is “dirt track” and 2 is (I believe) the rarer “river bed” type. T05 also has a segment of type 3072, but that has no actual pieces associated with it (so I'm being optimistic and assuming that's an artefact of some four lane highway stuff that was cut from the game, as opposed to a bug on my part).

The segment count tells you the number of 24 byte segment pieces remaining in this block of data, which describe this section of road. (So the overall RSEG data size will be (24 * 'segment pieces count') + 16 bytes.)

Each of these 24 byte entries is 6 floats, arranged as two vertex co-ordinates of 3 floats (12 bytes) each, and these outline the road edges. One thing to note is that these values are in absolute game co-ordinates, in meters (so they run from 0,0 to 51995,51995). We can treat these values as “XZY” triplets (for the convention of X vs Y on a plane and Z as vertical), and by placing vertices we end up with a road path which we can simply mesh directly.
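Here is a minimal sketch of reading one RSEG block as laid out above, using the size rule as a consistency check. The type and function names (`RoadSegment`, `parseRseg` and so on) are illustrative, little-endian floats are assumed, and I've called the two vertices `edgeA` and `edgeB` because I haven't pinned down which edge is which.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stored order is X, then the vertical value, then Y ("XZY").
struct RoadVertex { float x, z, y; };
struct RoadPiece { RoadVertex edgeA, edgeB; };
struct RoadSegment {
    std::uint32_t type = 0;  // 0 paved highway, 1 dirt track, 2 river bed (probably)
    std::vector<RoadPiece> pieces;
};

static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t at) {
    return static_cast<std::uint32_t>(b[at]) |
           (static_cast<std::uint32_t>(b[at + 1]) << 8) |
           (static_cast<std::uint32_t>(b[at + 2]) << 16) |
           (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

static float readF32(const std::vector<std::uint8_t>& b, std::size_t at) {
    std::uint32_t bits = readU32(b, at);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Parse one RSEG block whose tag starts at `pos`.
// Returns false if the tag or the sizes don't line up.
bool parseRseg(const std::vector<std::uint8_t>& b, std::size_t pos, RoadSegment& out) {
    if (b.size() < 16 || pos > b.size() - 16) return false;
    if (std::memcmp(b.data() + pos, "RSEG", 4) != 0) return false;
    const std::uint32_t length = readU32(b, pos + 4);
    const std::uint32_t type = readU32(b, pos + 8);
    const std::uint32_t count = readU32(b, pos + 12);
    const std::uint64_t expected = 24ull * count + 16ull;  // the size rule above
    if (length != expected || expected > b.size() - pos) return false;

    out.type = type;
    out.pieces.clear();
    std::size_t at = pos + 16;
    for (std::uint32_t i = 0; i < count; ++i, at += 24) {
        RoadPiece p;
        p.edgeA = { readF32(b, at), readF32(b, at + 4), readF32(b, at + 8) };
        p.edgeB = { readF32(b, at + 12), readF32(b, at + 16), readF32(b, at + 20) };
        out.pieces.push_back(p);
    }
    return true;
}
```

The type 3072 segment in T05 would parse here as a segment with zero pieces.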

One oddity is the 'Z' values – they are almost all zero or near zero, which makes sense since the road will follow the terrain height map for vertical values, however the end of some paths, particularly those at junctions have high values.

However I have no idea quite why these values are high – there may be some hint to the render engine, and the actual Z value groups around particular junctions and junction objects in a suspiciously deliberate way, but I have no idea why the values do what they do. It may be they're masked values, or not actually floats, but for looking at imported road meshes I simply ignore them for now.
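For reference, this is roughly what "mesh directly" amounts to when the vertical values are ignored: emit both edge vertices of every piece, then join each consecutive pair of pieces with a quad. It's a sketch rather than my actual exporter; `EdgePair` and `roadToObj` are illustrative names, and the output is Wavefront .obj text with the height forced to 0.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Plan-view positions of the two road edges for one segment piece.
struct EdgePair { float ax, ay, bx, by; };

// Build .obj text for one road segment: two vertices per piece, one quad
// between each pair of neighbouring pieces. Vertical values are dropped.
std::string roadToObj(const std::vector<EdgePair>& pieces) {
    std::ostringstream obj;
    obj.precision(9);  // enough digits for float co-ordinates up to ~52000
    for (const EdgePair& p : pieces) {
        obj << "v " << p.ax << ' ' << p.ay << " 0\n";
        obj << "v " << p.bx << ' ' << p.by << " 0\n";
    }
    for (std::size_t i = 0; i + 1 < pieces.size(); ++i) {
        // .obj vertex indices are 1-based
        const std::size_t a0 = 2 * i + 1, b0 = a0 + 1, a1 = a0 + 2, b1 = a0 + 3;
        obj << "f " << a0 << ' ' << b0 << ' ' << b1 << ' ' << a1 << '\n';
    }
    return obj.str();
}
```

When several segments go into one file, the face indices need offsetting by the number of vertices already written.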

Also I think the road definition is just a texturing cue to the render engine, since the actual behaviour of the roads depends on the values in the terrain height field higher order bits, but that's also something of a guess at the moment.


Just a quick note on the objects: these are in the ODEF section of the mission file – again this is a nested BWD2-style section of the file, and has an OREV revision tag of “3” in my case.

Each object starts with the tag “OBJ ” and a 4 byte field length, and is always 108 bytes long. The data is:
  • 8 byte raw label
  • 2 byte integer
  • 2 byte integer
  • Followed by “at least” 11 four byte floating point values
The object label is an odd mash of the Class Name string (as per the asset bible) masked with a unique id value to prevent collisions – if you mask out the high bit with a piece of code like
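    for (int i = 0; i < 8; i++) {
      unsigned char v = object.rawlabel.at(i);
      // Shift this byte's top bit into the ID value
      if (v > 0x7f)
        labelhigh = (labelhigh << 1) | 0x01;
      else
        labelhigh = (labelhigh << 1) & 0xfe;
      // Keep the low 7 bits as the ASCII character, skipping padding
      v = v & 0x7f;
      if (v != 0)
        object.label += v;
    }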
Then you wind up with a set of ASCII strings and associated ID values, i.e. from M01 separating the labels and labelhigh values of the spawn points gets:
Object Label "spawn" High Bits "0"  Of length  "108"
Object Label "spawn" High Bits "16"  Of length  "108"
Object Label "spawn" High Bits "15"  Of length  "108"
Object Label "spawn" High Bits "14"  Of length  "108"
Object Label "spawn" High Bits "13"  Of length  "108"
etc. Quite why this wasn't just split down into a label with a trailing ID number originally isn't entirely clear, although it could be that parts of the engine had to work inside the 8 byte label restriction (another guess).

Although I haven't gone into too much detail on what the object fields actually do (i.e. how ClassID, size and rotation are encoded) the X & Y co-ordinates appear to be the 9th and 11th floating point values, and we can place them directly on the road render – just dropping these on as simple planes around the target point we get:
[Figure: t01 roads and objects]
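A sketch of pulling the position out of one object record, under several assumptions: values are little-endian, `dataPos` points just past the “OBJ ” tag and length, the float block is eleven 4 byte floats, and "9th and 11th" count from one. `ObjectRecord` and `parseObject` are names I've made up for this example, and the rest of the 108 bytes is simply skipped.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct ObjectRecord {
    std::array<std::uint8_t, 8> rawLabel;  // masked label, before the high bits are stripped
    std::uint16_t shortA, shortB;          // the two 2 byte integers, purpose unknown
    std::array<float, 11> values;          // only the first eleven floats are read
    float x() const { return values[8]; }   // 9th value, counting from one
    float y() const { return values[10]; }  // 11th value
};

// Read the start of one object's data. Returns false if there isn't room.
bool parseObject(const std::vector<std::uint8_t>& b, std::size_t dataPos,
                 ObjectRecord& out) {
    const std::size_t needed = 8 + 2 + 2 + 11 * 4;
    if (b.size() < needed || dataPos > b.size() - needed) return false;

    std::memcpy(out.rawLabel.data(), b.data() + dataPos, 8);
    out.shortA = static_cast<std::uint16_t>(b[dataPos + 8] | (b[dataPos + 9] << 8));
    out.shortB = static_cast<std::uint16_t>(b[dataPos + 10] | (b[dataPos + 11] << 8));
    for (std::size_t i = 0; i < 11; ++i) {
        const std::size_t at = dataPos + 12 + 4 * i;
        const std::uint32_t bits =
            static_cast<std::uint32_t>(b[at]) |
            (static_cast<std::uint32_t>(b[at + 1]) << 8) |
            (static_cast<std::uint32_t>(b[at + 2]) << 16) |
            (static_cast<std::uint32_t>(b[at + 3]) << 24);
        std::memcpy(&out.values[i], &bits, sizeof bits);
    }
    return true;
}
```

`x()` and `y()` are then the centre of the simple plane dropped onto the road render.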
To help decode what's being displayed we can use the .obj format to add names to the objects – before declaring the face simply add a line beginning with “o” and followed by the name string, which we can form from the base label and the ID value we extracted above e.g.:
  s = "o " + obj.label +"_"+ QString::number(obj.id);
And you get this:
[Figure: Highlighting the Red Deacon Fireworks stand from t01]
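For completeness, a standard C++ sketch of the same idea without Qt: one named square plane per object, written as .obj text. `namedPlane` is a hypothetical helper; `firstIndex` is the 1-based index of the first vertex this plane adds to the file, and `halfSize` is an arbitrary display size rather than anything read from the object data.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Emit an "o" name line followed by a flat square centred on (x, y).
std::string namedPlane(const std::string& label, unsigned id, float x, float y,
                       float halfSize, std::size_t firstIndex) {
    std::ostringstream obj;
    obj.precision(9);
    obj << "o " << label << '_' << id << '\n';  // the name shown in the viewer
    obj << "v " << x - halfSize << ' ' << y - halfSize << " 0\n";
    obj << "v " << x + halfSize << ' ' << y - halfSize << " 0\n";
    obj << "v " << x + halfSize << ' ' << y + halfSize << " 0\n";
    obj << "v " << x - halfSize << ' ' << y + halfSize << " 0\n";
    obj << "f " << firstIndex << ' ' << firstIndex + 1 << ' '
        << firstIndex + 2 << ' ' << firstIndex + 3 << '\n';
    return obj.str();
}
```

Each plane adds four vertices, so the next call should pass `firstIndex + 4`.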