Shenmue Animation Debug Thread

uNDE8oE.jpg


It feels like we've made a little bit of progress but that was the easy part. Using the image above as a reference, we start with Ryo standing. When we push up on the controller, it looks like the game initializes the animation where it reads the instructions and block 2 value. The next part we need to get to is the actual animation where transformation values are read for each node on each frame. And when running the animation the game bounces back and forth between blocks 3 and 4 before magically jumping to a key frame value.

Code:
Reading 1 bytes @ 0x0075 // block 03 0x05
Reading 1 bytes @ 0x0135 // block 04 0x00
Reading 1 bytes @ 0x0135 // block 04 0x00
Reading 1 bytes @ 0x0075 // block 03 0x05
Reading 1 bytes @ 0x0136 // block 04 0xff
Reading 1 bytes @ 0x0136 // block 04 0xff
Reading 1 bytes @ 0x0075 // block 03 0x05
Reading 1 bytes @ 0x0136 // block 04 0xff
Reading 1 bytes @ 0x0076 // block 03 0x07
Reading 1 bytes @ 0x0136 // block 04 0xff
Reading 1 bytes @ 0x0077 // block 03 0x09
Reading 1 bytes @ 0x0137 // block 04 0xff
Reading 1 bytes @ 0x0078 // block 03 0x10
Reading 1 bytes @ 0x0137 // block 04 0xff
Reading 1 bytes @ 0x0079 // block 03 0x11
Reading 1 bytes @ 0x0137 // block 04 0xff
Reading 1 bytes @ 0x007a // block 03 0x13
Reading 1 bytes @ 0x0137 // block 04 0xff
Reading 1 bytes @ 0x007b // block 03 0x18
Reading 1 bytes @ 0x0138 // block 04 0xff
Reading 1 bytes @ 0x007c // block 03 0x1a
Reading 1 bytes @ 0x0138 // block 04 0xff
Reading 1 bytes @ 0x007d // block 03 0x1b
Reading 1 bytes @ 0x0138 // block 04 0xff
Reading 1 bytes @ 0x007e // block 03 0x1e
Reading 2 bytes @ 0x01d6 // block 05 -0.22144
Reading 2 bytes @ 0x01d8 // block 05 -0.22144
Reading 2 bytes @ 0x01da // block 05 1.12891
Reading 2 bytes @ 0x01dc // block 05 -0.14490
Reading 2 bytes @ 0x01de // block 05 -0.14490
Reading 2 bytes @ 0x01e0 // block 05 1.11230

Above is the root bone position. We can see that the game ping-pongs back and forth between blocks 3 and 4 before knowing where to read values from in block 5. What we can do now is try making a list of possibilities as to how this works. In the initialize state of the animation we read a single value from Block 2, and then we read a few values from Block 4; we still don't know exactly what those do or whether they have an impact on anything. For Block 3 it looks like we start reading from the start of the block, but it doesn't look like we read in order, as the location being read from Block 3 tends to jump around. And Block 4 actually refers to a lookup table, so I should probably try adding the lookup table values into the log to see if that provides any hints for how far things are moving around.
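
To make the ping-pong pattern easier to eyeball, the read log can be grouped by block. This is only a rough sketch: it assumes every log line has exactly the "Reading N bytes @ 0x.... // block NN value" shape shown above, and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines like: Reading 1 bytes @ 0x0075 // block 03 0x05
LOG_LINE = re.compile(
    r"Reading (\d+) bytes @ 0x([0-9a-fA-F]+) // block (\d+) (\S+)"
)

def parse_read_log(text):
    """Return a list of (size, offset, block, raw_value) tuples."""
    reads = []
    for line in text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            size, offset, block, value = match.groups()
            reads.append((int(size), int(offset, 16), int(block), value))
    return reads

def offsets_by_block(reads):
    """Group the distinct offsets touched in each block, in first-seen order."""
    grouped = defaultdict(list)
    for _size, offset, block, _value in reads:
        if offset not in grouped[block]:
            grouped[block].append(offset)
    return dict(grouped)

# A few lines copied from the log above
sample = """\
Reading 1 bytes @ 0x0075 // block 03 0x05
Reading 1 bytes @ 0x0135 // block 04 0x00
Reading 1 bytes @ 0x0135 // block 04 0x00
Reading 1 bytes @ 0x0075 // block 03 0x05
Reading 1 bytes @ 0x0136 // block 04 0xff
Reading 2 bytes @ 0x01d6 // block 05 -0.22144
"""
reads = parse_read_log(sample)
print(offsets_by_block(reads))
```

Running it over the whole log should show how far each block's read position advances before the first block 5 read.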
 
Nice work you have done so far, really interesting ways of reversing the format.

Here is some of my private stash from last year if you are interested...
Latest code for reading the MOTN format:

Overview:
shenmue_mot_overview.jpg

Sequence Data:
shenmue_mot_data.jpg

Sequence Data as Animation Data (Colors match the graph above):
shenmue_mot_anim.jpg

Example animation graphs of the walking animation (No Bezier):
root.jpghip.jpg
right_foot_ik.jpgright_foot.jpg

Hope it helps and keep going!
 

From some of those C files found on the original disk image for DC of Shenmue I, I was able to piece together the original animation names, which are the exact same names as stored in MOTN files themselves, but with a MN_ prefix. Then I figured out where the game decides which animations to use, specifically for Ryo: the code that blends all of Ryo's locomotion. Some animation IDs:

1594576940487.png

What's weirder though is that for any animations outside of MOTION.BIN, seemingly including the other coldboot MOTN file M_MBAS.BIN (motion base? seems to only contain base poses and basic walk animations and door stuff, similar to MOTION.BIN), the game uses IDs based on the last animation found in MOTION.BIN. There are a total of 1560 animations found in MOTION.BIN, but some IDs have been found to be around the 65000 range.

I'm pretty sure this means that all of the animations were once in one big motion database, and when it came to shipping, the MOTN files were split up and housed in their own directories, but are still identified by their index in the motion database itself.
 

So, I know when we were both looking at this together that I previously said that the foot positions and especially limb bends in MOTN were most likely IK, purely due to the fact that Ryo has foot planting and that there are some keyframes which are position-only on end bones in a chain, for example the nodes we named "LeftFootIKTarget" and "LeftHandIKTarget".

I did some tests in-game since we last looked, and realised that rotations for a chain (Hip -> Upper Leg -> Lower Leg -> Foot) are more common than translation keyframes for bones in a chain, which leads us to think that the animations are more than likely FK. IK would have fewer rotation keyframes and more translation keyframes, because the output pose is derived from a single point in space (an effector or goal position) which would bend the bones we see being rotated in MOTN, without any rotation keyframes. FK on the other hand seems a lot more likely, because in the walk animation every bone in what would be an IK chain (from Hip -> Upper Leg -> Lower Leg) is rotated, with only the end bone being translated. Furthermore, I was able to confirm that NPCs don't seem to have foot planting, which points to FK being the animation type in MOTN.

FK or forward kinematics is exactly what this looks like. You rotate all of the individual bones from beginning to end in order to create the pose. This means that in a given animation you would need a rotation keyframe for each bone in the chain to get the pose you want (which is exactly what we're seeing in MOTN).

IK or inverse kinematics is the exact reverse of FK, whereby you only need to give an end effector a position value and then the pose is calculated for you iteratively by walking the IK chain in reverse (from Foot to Lower Leg to Upper Leg to Hip).

We can see just by the output of the animation that it all points to FK, because as previously stated, any bones which would be in an IK chain are being rotated in the walk animation.
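
To make the FK/IK distinction concrete, here is a toy planar example. It is only an illustration of the two ideas and has nothing to do with how the game actually solves its rig: the function names, the 2D simplification and the bone lengths are all made up. FK needs an angle for every bone, while the two-bone IK solver only needs the target position.

```python
import math

def fk_chain(lengths, angles):
    """Planar FK: accumulate each joint's rotation while walking down the chain."""
    x = y = 0.0
    total = 0.0
    points = [(x, y)]
    for length, angle in zip(lengths, angles):
        total += angle                      # child inherits the parent's rotation
        x += length * math.cos(total)
        y += length * math.sin(total)
        points.append((x, y))
    return points

def ik_two_bone(l1, l2, tx, ty):
    """Planar two-bone IK: derive both joint angles from a target position."""
    dist = math.hypot(tx, ty)
    dist = max(abs(l1 - l2), min(l1 + l2, dist))    # clamp to the reachable range
    cos_knee = (dist * dist - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    knee = math.acos(max(-1.0, min(1.0, cos_knee)))
    hip = math.atan2(ty, tx) - math.atan2(l2 * math.sin(knee),
                                          l1 + l2 * math.cos(knee))
    return hip, knee

# Hypothetical leg: two 0.4 unit bones, foot target below and in front of the hip
hip, knee = ik_two_bone(0.4, 0.4, 0.3, -0.6)
foot = fk_chain([0.4, 0.4], [hip, knee])[-1]
print(hip, knee, foot)
```

Feeding the IK result back through the FK function lands the foot on the target, which is why an IK-driven file could get away with storing only end-bone positions.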

I wonder, were you ever able to get a complete 1:1 output of keyframe values in the walk animation? Just having that alone would help tremendously, because we can try to use that data outside of the game and attempt to replicate the animation first before looking deeper into exactly what is going on.
 
Only used IK animation data for it.

Maybe play around with the rig yourself to get a feeling of how basic rigs with IK work.
You can still rotate the leg to change the direction of the knee bending.
They didn't use a pole target which would have done that job instead.

The rig is still wonky af but that's my best guess of how it works
and it could still be complete bullshit so take it all with a grain of salt.

That should be all i had in my archives sooo... peace out and good luck!
 

Attachments

  • shenmue_rig_walk.zip
    86.2 KB · Views: 4

Very interesting to see you got that result for sure just from the end bones of a chain, as always I appreciate the work. I still need to get a true complete keyframe output for all frames in the walk animation from the last stuff you sent. Do you know if there are any remaining issues in the parser logic? I tried dumping some keyframe values last night, but got some weird values:

Code:
boneIdx 36 
keyframe val = 0.0 
keyframe val = -0.01064300537109375 
keyframe val = -0.01064300537109375 
keyframe val = -0.01064300537109375 
keyframe val = 0.0 
keyframe val = 0.016998291015625 
keyframe val = 0.016998291015625 
keyframe val = 3.0517578125e-05 
keyframe val = 3.0517578125e-05

You can still change the direction of the knee bending for example, but there'd be almost no point in rotations on the Upper and Lower Leg bones, especially given the actual walk animation in-game (Ryo's legs stay pretty straight). But I also learned recently that a foot bone can still bend a knee, even in FK. There still could be IK for sure though. It's still a WIP :) Again, appreciate all the work you've done so far!
 
My brain needed about a week to take a break and come back. One of the things that kept confusing me is that blocks 1 and 5 made sense, but block 2, 3 and 4 were a complete mystery. And a lot of that has been helped by the log that LemonHaze provided on discord.

Walk Log File (Raw):
I went ahead and converted the output to JSON which can be found here:

Walk Log File (JSON) :
And what I did was try and convert the data 1:1 to be able to get a better idea of the structure and which bones have which key values. One thing I noticed was reads like this:

Code:
Frame: 0 (RAW)
Keyframe 1: [3.0517578125e-05, 0.2464599609375] @ 956474
Keyframe 1: 1.11328125 @ 956478

Code:
        "Pos Y": [
            {
                "frame": 0,
                "time": 0,
                "vals": [
                    [ 0.000030517578125, 0.2464599609375 ], <-- this
                    1.11328125

In terms of reading the bone Y position, the value should be around 1.1. That's probably the actual key frame value; I'm really not sure what the other two values are being read for. Right now my best guess is that they adjust the bezier curve interpolation between key frames, or something like that. We can come back to minor details like that, so I made a third JSON file that takes out these values to make it easier to see which key values are assigned to which frames.

Walk File Log (JSON clean)
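
The clean-up itself is simple. As a rough sketch, assuming the JSON has the shape shown above (a list of entries per axis, each with "frame", "time" and "vals"), dropping the paired values looks something like this. The function name is made up.

```python
import json

def strip_curve_pairs(axis_keys):
    """Drop list-valued entries (the unexplained value pairs) from each key's 'vals'."""
    cleaned = []
    for key in axis_keys:
        values = [v for v in key["vals"] if not isinstance(v, list)]
        cleaned.append({"frame": key["frame"], "time": key["time"], "vals": values})
    return cleaned

# The frame 0 entry for "Pos Y" from the log above
raw = {
    "Pos Y": [
        {"frame": 0, "time": 0,
         "vals": [[0.000030517578125, 0.2464599609375], 1.11328125]}
    ]
}
clean = {axis: strip_curve_pairs(keys) for axis, keys in raw.items()}
print(json.dumps(clean, indent=2))
```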

After cleaning up the JSON a lot of things that didn't make sense about the file finally started to click.

Screenshot from 2020-07-17 04-02-01.png

I threw in some numbers to make it easier to see.

1) Frames always end on 37 or 0x25 in hex which matches up with the first byte in the animation. I had speculated this was the animation length in frames, but it's nice to see some confirmation.

2) Bone Y position has 14 frames overall, and there is a 12 listed in Block 2. It looks like Frames 0 and 37 are implicit and the 0x0c stands for the number of intermediate key frames. One issue is that when a 0 is declared in Block 2, sometimes there are no frames, as with Root Bone Pos X, and other times there are two values, as with Root Bone Pos Z. But otherwise this pattern mostly seems to hold, so it seems highly plausible.

3) What adds to the plausibility is that the frame numbers for Root Bone Pos Y match up with Block 3. We have 5, 7, 9, 16 (0x10), and it goes up to 35 (0x23). So this really seems to fit.

And that kind of takes care of the meanings of Blocks 2 and 3. Block 2 is the number of intermediate frames per bone-axis and Block 3 are the specific key frames. What I'm still in the dark on is a simple explanation for how Block 4 acts as a lookup table for the key values in Block 5, but we have the log output so we're not going to think too hard about that right now.
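
Put as code, the Block 2 / Block 3 relationship would be something like the sketch below. It is only a guess at the logic based on the pattern above: the function name and arguments are made up, and it does not handle the odd case where a 0 in Block 2 sometimes means no frames at all.

```python
def key_frame_indexes(intermediate_count, block3, offset, last_frame):
    """Build the full key frame list for one bone-axis.

    intermediate_count: the Block 2 byte for this bone-axis
    block3:             Block 3 as bytes
    offset:             where this bone-axis starts inside Block 3
    last_frame:         animation length in frames (37 for the walk)
    Returns (frames, next_offset).
    """
    middle = list(block3[offset:offset + intermediate_count])
    # Frames 0 and last_frame are implicit, Block 3 only stores the ones in between
    return [0] + middle + [last_frame], offset + intermediate_count

# Intermediate frames taken from a Pos X axis in the walk log (3, 7, 13, 18, 35)
block3 = bytes([3, 7, 13, 18, 35])
frames, next_offset = key_frame_indexes(5, block3, 0, 37)
print(frames)  # [0, 3, 7, 13, 18, 35, 37]
```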

One thing that I was really caught up on was thinking that the game would be reading an XYZ value for each bone, each frame, and that's not how it works at all. Each bone-axis will have a start and end key frame, and then a different number of key frames declared in between. And then the transformation matrix for each bone is probably calculated by using interpolation to get the axis value for each frame.
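
As a sketch of what "interpolate to get the axis value for each frame" could look like, here is plain linear interpolation over one bone-axis. Linear is a stand-in chosen for simplicity; the game may well use a curve instead. The key values below are invented for the example and the function name is made up.

```python
from bisect import bisect_right

def sample_axis(keys, frame):
    """Linearly interpolate one bone-axis at `frame`.

    keys: list of (frame_index, value) tuples sorted by frame_index.
    """
    frames = [f for f, _ in keys]
    if frame <= frames[0]:
        return keys[0][1]
    if frame >= frames[-1]:
        return keys[-1][1]
    i = bisect_right(frames, frame)         # index of the first key after `frame`
    (f0, v0), (f1, v1) = keys[i - 1], keys[i]
    t = (frame - f0) / (f1 - f0)
    return v0 + (v1 - v0) * t

# Invented key values, only the frame indexes resemble the walk animation
keys = [(0, 1.0), (5, 1.5), (37, 1.0)]
print([sample_axis(keys, f) for f in range(6)])
```

Each of the up to six axes per bone would be sampled like this for a given frame, and the results combined into that bone's transformation.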

For right now I guess I'll start with the rotation values to see how close to the animation we can get.
 
So I took a look at the new MOTN parser code Phil kindly open-sourced, there seems to be only a single bug present whereby some keyframe values aren't read:

Code:
IKBoneID.HandIKTarget_L_Dupe @ 957710 
Pos X: 
Frame: 0 
Frame: 3 
Frame: 7 
Frame: 13 
Frame: 18 
Frame: 35 
Frame: 37 
Val: 0.2474365234375 @ 957734 
Pos Y: 
Frame: 0 
Frame: 6 
Frame: 11 
Frame: 16 
Frame: 17 
Frame: 20 
Frame: 25 
Frame: 29 
Frame: 34 
Frame: 37 
Pos Z: 
Frame: 0 
Frame: 9 
Frame: 16 
Frame: 22 
Frame: 35 
Frame: 37

From here you can do some simple math and do 957734 - 957710 = 24. Another test on the keyframe_block_count variable shows:-

Code:
IKBoneID.HandIKTarget_L_Dupe @ 957710
Pos X:
Keyframe BlkCnt: 2
Frame: 0
Frame: 3
Frame: 7
Frame: 13
Keyframe BlkCnt: 1
Frame: 18
Frame: 35
Frame: 37
Val: 0.2474365234375 @ 957734
Pos Y:
Keyframe BlkCnt: 3
Frame: 0
Frame: 6
Frame: 11
Frame: 16
Keyframe BlkCnt: 2
Frame: 17
Frame: 20
Frame: 25
Frame: 29
Keyframe BlkCnt: 1
Frame: 34
Frame: 37
Pos Z:
Keyframe BlkCnt: 2
Frame: 0
Frame: 9
Frame: 16
Frame: 22
Keyframe BlkCnt: 1
Frame: 35
Frame: 37

Again, that 24 can be worked out by doing (2+1+3+2+1+2+1) * 2 = 12 * 2 = 24. These values all seem to be for this node (LeftHandIKTarget_Dupe), so they're definitely being 'missed' in the new code. I also checked the old code and the values were missing there, too.

1595085887117.png
 
I also did a couple small tests on some rotations, namely the Hip rotation:


(important to note that it seems like there's some missing keyframe values here, too.)

Other than that, seems like good stuff! I'm still hesitant to completely rule out FK. This is mainly due to the fact that a lot of rotations seem to be 'creating the pose' as opposed to altering a pole target or even an end effector/goal position in terms of IK (which would typically happen after a keyframe which sets an end effector's position, and not the other way around). A good example is this:

1595086285479.png

This rotation is clearly moving the left leg backwards.. indicative of FK. I'm not ruling out either FK/IK at this point, but I'd say it's definitely more than likely FK considering the sheer number of rotations involved.
 

If it were pure FK, then we probably would only see position defined except for the root bone. What I think is going on is you have hip rotation, foot rotation, and then I think IK being used to solve the knee. And possibly a similar story with the arms: set shoulder rotation, hand rotation and then IK solve for the elbow. Though I'm not saying this definitively; I was completely wrong on my assumption that the game was reading x, y, z values for each bone for each frame. I'm always happy to be wrong as it's a sign of progress.

Right now I'm generally happy that I have a better idea of what's going on with Blocks 2 and 3 now. I'm going to try to go back to reicast and add in writing out the contents of the registers to the log. If the game has the bone id and frame number stored in the registers then a log might help with tracing through block 4 so we have the correct key frame values to work from.
 
If it were pure FK, then we probably would only see position defined except for the root bone. What I think is going on if you have hip rotation, foot rotation, and then I think IK being used to solve the knee.

I'm not so sure about that statement. The whole idea of forward kinematics is that you walk the bone hierarchy forwards, from Hip down to Foot, rotating each one as necessary, until you get to the end. You still might make use of position keyframes in FK, in the same way that root motion is used pretty much across the board in 3D games as a means of moving a skeleton through the world.
 
Discord_E2kslw6AEF.png

IKBoneID.HandIKTarget_L_Dupe @ 957710 Pos X: Frame: 0: 0.247437 Pos Y: Frame: 0: 0.952637 Pos Z: Frame: 0: -0.270752

3dsmax_XrVIWZsplZ.png

Now I think it's just FK. It seems like there's this big subroutine which loops through all of the nodes and rotates each bone on the way with the values from MOTN, until it reaches the end bone where it might transform it with these position values we see in MOTN. This seems to apply for all chains, so Hip -> Foot and Shoulder -> Hand.
 
I've been kind of quiet for a little while. I guess I'll take the opportunity to write out my thoughts and get collected on where we are, and provide some suggestions on what the next steps should be.

In terms of where we are I would say that the biggest difference is that we now know what blocks 2 and 3 are. It was really frustrating before when blocks 2 and 3 were a seemingly random jumble of bytes; they can now be easily described as the number of intermediate frames (block 2) and the specific frame indexes (block 3). Now what remains is block 4, which seems to act as a mechanism for finding a specific value for a key frame. But I still don't have a simple explanation for how it works. LemonHaze has definitely been on the vanguard with this one in terms of reading through the game's disassembled code.

Right now we have two issues before us. The first is to log and replicate the key frame values that the game reads, and the second is to figure out how to interpret those values outside the game in order to recreate the animations. In terms of trying to get better log values, I went on a tangent trying to print out the register values in reicast, thinking that the node id and frame id might be stored in the registers, which would allow for higher certainty when replicating how the game looks up key values.

I wasn't able to figure out how to reference the registers from the C++ code and eventually skmp took pity on me and created a branch with how to print out the registers (link). Not my proudest moment, but luckily for me pride is like brain cells, it's in short supply. In terms of getting key frame values from the game, it looks like the reference to use is PhilYeahz's Python code.

Code:
                last_keyframe_value = 0.0
                keyframe_index = 0

                while keyframe_block_count:

                    binary_stream.seek(self.keyframe_block_types_offset)
                    keyframe_block_type = sread_byte(binary_stream)
                    keyframe_block_size = self.calc_keyframe_block_size(keyframe_block_type)

                    binary_stream.seek(self.float_data_offset)

                    if keyframe_block_type & 0xFF:

                        # keyframe 1 of block
                        if keyframe_block_type & 0xC0:
                            frame = keyframe_frames[keyframe_index]
                            keyframe = self.KeyFrame(frame, frame * self.seconds_per_frame,
                                                     self.bone_index, last_keyframe_value)
                            keyframe_index += 1

                            if keyframe_block_type & 0x80:
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[0] = val
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[1] = val
                                dbg_print("Keyframe 1: ", keyframe.linear, " @ ", binary_stream.tell() - 4)

                            if keyframe_block_type & 0x40:
                                val = sread_hfloat(binary_stream)
                                last_keyframe_value = val
                                keyframe.set_value(val)
                                dbg_print("Keyframe 1: ", keyframe.value, " @ ", binary_stream.tell() - 2)

                            if ignore_linear and keyframe.has_value or not ignore_linear:
                                keyframes.append(keyframe)

                        # keyframe 2 of block
                        if keyframe_block_type & 0x30:
                            frame = keyframe_frames[keyframe_index]
                            keyframe = self.KeyFrame(frame, frame * self.seconds_per_frame,
                                                     self.bone_index, last_keyframe_value)
                            keyframe_index += 1

                            if keyframe_block_type & 0x20:
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[0] = val
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[1] = val
                                dbg_print("Keyframe 2: ", keyframe.linear, " @ ", binary_stream.tell() - 4)

                            if keyframe_block_type & 0x10:
                                val = sread_hfloat(binary_stream)
                                last_keyframe_value = val
                                keyframe.set_value(val)
                                dbg_print("Keyframe 2: ", keyframe.value, " @ ", binary_stream.tell() - 2)

                            if ignore_linear and keyframe.has_value or not ignore_linear:
                                keyframes.append(keyframe)

                        # keyframe 3 of block
                        if keyframe_block_type & 0x0C:
                            frame = keyframe_frames[keyframe_index]
                            keyframe = self.KeyFrame(frame, frame * self.seconds_per_frame,
                                                     self.bone_index, last_keyframe_value)
                            keyframe_index += 1

                            if keyframe_block_type & 0x08:
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[0] = val
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[1] = val
                                dbg_print("Keyframe 3: ", keyframe.linear, " @ ", binary_stream.tell() - 4)

                            if keyframe_block_type & 0x04:
                                val = sread_hfloat(binary_stream)
                                last_keyframe_value = val
                                keyframe.set_value(val)
                                dbg_print("Keyframe 3: ", keyframe.value, " @ ", binary_stream.tell() - 2)

                            if ignore_linear and keyframe.has_value or not ignore_linear:
                                keyframes.append(keyframe)

                        # keyframe 4 of block
                        if keyframe_block_type & 0x03:
                            frame = keyframe_frames[keyframe_index]
                            keyframe = self.KeyFrame(frame, frame * self.seconds_per_frame,
                                                     self.bone_index, last_keyframe_value)
                            keyframe_index += 1

                            if keyframe_block_type & 0x02:
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[0] = val
                                val = sread_hfloat(binary_stream)
                                keyframe.linear[1] = val
                                dbg_print("Keyframe 4: ", keyframe.linear, " @ ", binary_stream.tell() - 4)

                            if keyframe_block_type & 0x01:
                                val = sread_hfloat(binary_stream)
                                last_keyframe_value = val
                                keyframe.set_value(val)
                                dbg_print("Keyframe 4: ", keyframe.value, " @ ", binary_stream.tell() - 2)

                            if ignore_linear and keyframe.has_value or not ignore_linear:
                                keyframes.append(keyframe)
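
The four branches above all follow the same pattern, so for replicating it elsewhere they can be folded into one loop over the flag pairs. This is a standalone sketch and not PhilYeahz's code: it reads from a bytes buffer instead of the stream, and it assumes sread_hfloat reads a little-endian IEEE half float, which is a guess on my part that happens to fit the logged values.

```python
import struct

# (curve flag, value flag) for the four key frame slots in one block-type byte
SLOT_FLAGS = [(0x80, 0x40), (0x20, 0x10), (0x08, 0x04), (0x02, 0x01)]

def read_hfloat(data, pos):
    """Read one little-endian half float (assumed format); return (value, new_pos)."""
    return struct.unpack_from("<e", data, pos)[0], pos + 2

def decode_block(block_type, data, pos):
    """Decode up to four key frame slots described by one block-type byte.

    Returns (slots, new_pos). Each slot is a dict with an optional
    'curve' (two half floats) and an optional 'val' (one half float).
    """
    slots = []
    for curve_flag, value_flag in SLOT_FLAGS:
        if not block_type & (curve_flag | value_flag):
            continue                        # neither flag set, slot is unused
        slot = {}
        if block_type & curve_flag:
            a, pos = read_hfloat(data, pos)
            b, pos = read_hfloat(data, pos)
            slot["curve"] = [a, b]
        if block_type & value_flag:
            slot["val"], pos = read_hfloat(data, pos)
        slots.append(slot)
    return slots, pos

# Rebuild the frame 0 "Pos Y" read from the walk log: a pair followed by a value
data = struct.pack("<3e", 3.0517578125e-05, 0.2464599609375, 1.11328125)
slots, end = decode_block(0xC0, data, 0)
print(slots, end)
```

With 0xC0 the first slot reads a pair and then a value, 6 bytes in total, which lines up with the value being logged 4 bytes after the pair.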

I should probably try replicating this code to see if it will produce the same output as the game, and if it does that means we're on to the next step of trying to implement the key frame values to figure out how they're actually being interpreted, so we could possibly (dare I say it) start exporting these animations.

When it comes to implementing animations, human animations are often the most complicated and it's generally not something that you can do all at once. In general with animations it's preferable to start with things like a chest or simple enemies to gain an understanding of how the programmers encoded the animations and work your way up, but it's still possible to implement human animations if you break the process down into smaller steps.

One approach is to start with the root bone and work your way up. In Shenmue, the game will stop reading from block 1 when it comes across a 0000 value. What that means is you can effectively put in break points for which bones you want the game to interpret. You can start with the root bone and read position Y, which should be slightly bouncing up and down, and root bone position z which should start at 0 and move a small increment forward to translate Ryo during his walking animation.

Once the values have been interpreted and implemented you can move the instruction to the next word, and implement the first two bones and work your way up so you're only ever working with one set of values at a time.
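
The break point trick can be scripted. This is a sketch under the assumption that Block 1 is a flat list of 16-bit instruction words ending in 0x0000; the function names are made up and the instruction words in the example are placeholders, not real values from the file.

```python
import struct

def truncate_instructions(block1, keep):
    """Return a copy of Block 1 with a 0x0000 terminator after `keep` words."""
    patched = bytearray(block1)
    offset = keep * 2                       # assumes 2 bytes per instruction
    if offset + 2 > len(patched):
        raise ValueError("no room for a terminator after that many instructions")
    patched[offset:offset + 2] = b"\x00\x00"
    return bytes(patched)

def count_instructions(block1):
    """Count the words before the first 0x0000 terminator."""
    count = 0
    for (word,) in struct.iter_unpack("<H", block1):
        if word == 0:
            break
        count += 1
    return count

# Placeholder instruction words followed by the terminator
block1 = struct.pack("<4H", 0x1111, 0x2222, 0x3333, 0x0000)
root_only = truncate_instructions(block1, 1)
print(count_instructions(block1), count_instructions(root_only))
```

Since the patch overwrites a word in place instead of removing it, the block keeps its size and the offsets of the later blocks don't move.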

A debug method that works really well is abusing the idle animation. In Shenmue the game likely has an animation fade effect that will cross fade one animation into another. That means that when switching animations, some of one animation might bleed into another, making it difficult to figure out what exactly is causing the model to behave in a certain manner. But with the idle animation you can create a pretty simple environment for testing where you can put in set values for a specific bone axis and see how the game reacts.

5df6efa3d46be.jpg

One way to calibrate is by taking a bone like the right shoulder, zeroing out the z and y values, and then setting a single x value for all of the frames. This should cause the arm to stay fixed in a single location, like the positions on a clock, to get a frame of reference for which coordinates are being used. If 90 degrees causes the right hand to stand out straight, then check to see if 270 degrees causes the left arm to stand out straight. Comment out the hand position at first so that it doesn't interfere, and once you can get the arm to stick out straight in a given direction, enable the hand position and make notes of how the game reacts to different positions.

In the case the game has two idle poses, where Ryo shifts from one foot to the other, the approach for this is to copy one of the animations into its own file, edit that and then write a small script that will copy the edited animation to overwrite both animations in memory. That way you don't have to worry about which one is playing, both of them should be using your edited debug animation.
 
A lot of the heavy lifting has been handled by LemonHaze who has managed to find the location in memory where matrices for the virtual rig are calculated. We can use this to try and produce an animation by capturing the matrices for each bone for each frame and converting that into an animation.

This is the animation JSON file where all of the matrices for each pose have been decomposed into key frames. The next step will be trying to apply it to the rig for the MT5. Admittedly this is something I've been lazy about, thinking "I can do that when we move past reading files and get into animation", and it looks like we've gotten into animation, so I guess I'll have to get on that.

In terms of acting as the verification party, I've also ported motn.py to Javascript. I think I have a better grasp of how blocks 4 and 5 work now. Before it looked needlessly complicated, but thankfully in retrospect everything generally ends up being simple. To explain I'll provide a quick review.

Block 1 contains a list of instructions for which bones have which transformations. Each instruction has 9 bits for the node id, 3 bits for translation and 3 bits for rotation. Each bit for translation or rotation corresponds to a different axis: x, y, z. But so far in the animations I've seen, when either translation or rotation is set, all three axes will be set for that transformation.

Block 2 contains the number of intermediate key frames for each node-transformation-axis (ie. node 2 / translation / y). Each node-transformation-axis will automatically have key frames for the first and last frame, block 2 provides the number of key frames in-between. The total number of key frames will be this number plus 2. For example bone 0 translation x has 0 set, so the total number of frames is 2. Node 0 translation Y has 12 set, so there are 14 frames.

Block 3 provides the specific key frame indexes that are defined for each node-transformation-axis. Which brings us to Block 4. I had assumed this block would provide some mechanism for looking up which key frames to use in Block 5, but it turns out that Block 5 contains all of the key frames in order. The aspect that made reading Block 5 difficult is that it also contains values for bezier curve adjustment.

In Block 5, it's not just the key frames, but for almost every keyframe and often even when key frames aren't declared, the game will read two half-float values from Block 5. These are likely the tangents for the key frames.

Tangents2.PNG

From the image above, we can say that the two values are likely the lSlope and rSlope, while the single values are the actual key frame values. When the pair is read without a key frame value, it is likely an adjustment to the curve on that specific frame. So to go back to Block 4: it is an encoded description of the order of key frame and tangent values in Block 5. Block 4 is a list of bitflags that describe how to read Block 5. Each byte can describe up to 4 key frames. The bit 0x80 indicates to read a curve pair and 0x40 indicates to read a key frame value. 0x80 and 0x40 are linked to a single frame.

In the same fashion 0x20 and 0x10 form a pair, then 0x08 and 0x04, and last 0x02 and 0x01. Each node-transformation-axis starts by reading a byte, and the number of bytes it reads is the total number of key frames for that axis divided by 4, which appears to round up going by the block counts in the log. And then you simply go through and read the values to generate the dope sheet.
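
The bookkeeping for this can be sketched in a few lines. Two assumptions to flag: the rounding up of the byte count is inferred from the block counts in the earlier log (7, 10 and 6 key frames gave 2, 3 and 1 more than 1 bytes respectively), and the sizes assume half floats, so 4 bytes for a curve pair and 2 bytes for a key frame value. The function names are made up.

```python
import math

# (curve flag, value flag) for each of the four slots in a Block 4 byte
SLOT_FLAGS = ((0x80, 0x40), (0x20, 0x10), (0x08, 0x04), (0x02, 0x01))

def type_byte_count(total_key_frames):
    """Block 4 bytes for one node-transformation-axis (4 key frames per byte)."""
    return math.ceil(total_key_frames / 4)

def block5_size(block_type):
    """Bytes of Block 5 data described by one Block 4 byte."""
    size = 0
    for curve_flag, value_flag in SLOT_FLAGS:
        if block_type & curve_flag:
            size += 4                       # two half floats for the curve pair
        if block_type & value_flag:
            size += 2                       # one half float for the key value
    return size

# Key frame counts for Pos X, Pos Y and Pos Z of the hand target in the log
print([type_byte_count(n) for n in (7, 10, 6)])  # [2, 3, 2]
print(block5_size(0xC0), block5_size(0xFF))      # 6 24
```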

Now we're at the point where we have the values from the matrices, and the values from the dope sheet, so the next steps will be to find how the virtual skeleton is mapped onto the mt5 model skeleton and then start breaking down the values and see how to interpret the values correctly, so that they can be replicated outside of the game.
 
Now that we're at the point where we can read the key values from the game, next step is to start working our way through the bones to start replicating the mechanics from the game. I guess we're going to need a big stupid table for this. I'm not too crazy about the idea of managing a big table in a forum post. I might have to make a page and link to it.

Normally with animation it's better to start out with something simple like a chest opening, or simple enemies, and work your way up from there. In the case of Shenmue, pretty much everything is human. And even things that aren't human share the same rig. Which means there probably isn't an animation that's both easy to start with and easy to get to. So the alternative is to work up one bone at a time.

Screenshot from 2020-07-28 17-03-27.png

In Block 01 the game uses 0x0000 as a terminator to stop instructions for the skeleton. That means we can put in 0x0000 after an instruction so that we can start isolating the bone movement at a given point and work our way up the skeleton. If we start at the first instruction and comment out everything else, then we get the following.


And since the Root Bone movement is generally the easiest to work with, we can replicate this pretty easily.


Which means we can move up an instruction to the next instruction.


We get the same movement of the root bone with the addition of the hips swinging back and forth. The node Id is number 1. With this information, we need to do two things. 1) Figure out the bone id for the MT5 rig, and 2) figure out how to interpret the rotations so that the legs swing back and forth. To figure out the MT5 bone id we parse the MT5 skeleton and tell the parser to stop on a specific id number until we find the right one.

As for the values, it looks like X and Z rotation are fixed, and the movement is made by setting a single middle frame in the Y-axis.

Code:
{
  "id": 1,
  "rot": {
    "x": [
      {
        "frame_index": 0,
        "curve": [
          0,
          -0.020233154296875
        ],
        "val": 0
      },
      {
        "frame_index": 22,
        "curve": [
          -0.0020465850830078125,
          0.0027713775634765625
        ]
      },
      {
        "frame_index": 37,
        "curve": [
          0,
          -0.020233154296875
        ],
        "val": 0
      }
    ],
    "y": [
      {
        "frame_index": 0,
        "curve": [
          0,
          0.00576019287109375
        ],
        "val": 15623.761596679688
      },
      {
        "frame_index": 18,
        "val": 17023.740234375
      },
      {
        "frame_index": 37,
        "curve": [
          0,
          0.00576019287109375
        ],
        "val": 15623.761596679688
      }
    ],
    "z": [
      {
        "frame_index": 0,
        "curve": [
          0,
          0.026092529296875
        ],
        "val": 0
      },
      {
        "frame_index": 22,
        "curve": [
          0.00160980224609375,
          -0.0021800994873046875
        ]
      },
      {
        "frame_index": 37,
        "curve": [
          0,
          0.026092529296875
        ],
        "val": 0
      }
    ]
  }
}
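A quick way to check which axes actually move is to compare the "val" entries on each axis. Below is a small sketch using an abridged copy of the node 1 data above (curve entries left out). The helper name is made up, and it only looks at key frame values, so a curve-only entry like the one on frame 22 could still bend an axis that this reports as static.

```python
def axis_is_static(keys):
    """True when every declared key frame value on this axis is the same."""
    vals = [k["val"] for k in keys if "val" in k]
    return len(set(vals)) <= 1


# Abridged from the dope sheet above: only frame_index and val are kept
node = {
    "id": 1,
    "rot": {
        "x": [{"frame_index": 0, "val": 0},
              {"frame_index": 22},
              {"frame_index": 37, "val": 0}],
        "y": [{"frame_index": 0, "val": 15623.761596679688},
              {"frame_index": 18, "val": 17023.740234375},
              {"frame_index": 37, "val": 15623.761596679688}],
        "z": [{"frame_index": 0, "val": 0},
              {"frame_index": 22},
              {"frame_index": 37, "val": 0}],
    },
}

static = {axis: axis_is_static(keys) for axis, keys in node["rot"].items()}
print(static)
```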

Edit:

To find the MT5 bone number, I went through and stopped the bone parser at specific bone numbers and checked the parent bone. It looks like the upper thighs for both legs have bone 029 as their parent. So I guess that could count as the hips (even though it looks like the belly button).

Screenshot_2020-07-28 Shenmoo.png

And this is what it looks like with the animation applied.


It looks like bone 29 has an inherent z rotation of 180. The animation value for z rotation is 0, which causes the legs to spin around in the opposite direction. In the case of the Y rotation for the legs swinging back and forth, it looks a little overstated, but otherwise the effect looks really similar.

Edit 2:

Still not exact. I cheated by adding 180 degrees to the z and y axes, but it looks close. So we can continue on to the next bone for more data points.

Next bone. It looks like either the right or left leg sticks out straight back depending on which foot Ryo has his weight shifted onto.


This is pretty interesting, since the values for the animation are all the same. That means we need to compare against the transformation of the original bone.

Code:
{
  "id": 5,
  "rot": {
    "x": [
      {
        "frame_index": 0,
        "val": 1358.979263305664
      },
      {
        "frame_index": 37,
        "val": 1358.979263305664
      }
    ],
    "y": [
      {
        "frame_index": 0,
        "val": -127.37305641174316
      },
      {
        "frame_index": 37,
        "val": -127.37305641174316
      }
    ],
    "z": [
      {
        "frame_index": 0,
        "val": -363.99444580078125
      },
      {
        "frame_index": 37,
        "val": -363.99444580078125
      }
    ]
  }
}

Edit:
Only managed to write part of this post and then had to be away from the computer for a while. What's confusing about this is the position of the foot. Normally with any model rotation, the orientation of all of the child bones should remain the same relative to the parent bone. Which means the foot would normally be pointing down.

For a moment, I thought that it might be the foot position moving up to the back and then affecting the parent bones up to a certain node, but I don't think that's the case. It looks like it's probably the thigh rotation, and then the foot is being adjusted after the animation to make it parallel to the ground.

What's still confusing is why the walk animation causes one leg to move backwards like this for the duration of the animation. This is not something I would expect or have really seen from other animations. We'll have to break down and analyze the animation relative to the source bone to see if there are any hints.

shenmue_position.png

Edit 2:

Decided to try not to overthink it and see how my parser handled the information. And it looks like that's how the bone's encoded. So far it looks like the hips are the only thing I've had to adjust manually to replicate the game's result. I guess we'll have to move on to the next bone to see what the game does with one leg sticking out behind.

Looks like we have some shenanigans going on here.


Okay, got a chance to come back and write some thoughts about this. First thought: "waaaaaaaaaaaaaaaaaat?" This is definitely new for me. We're only four bones into this and we have one foot walking. Not only do we have one foot walking, but it seems to be "undoing" the rotation of the previous thigh. With an animation normally what I would be expecting is for the thigh rotation to go back and forth, and then the knee to be bent, and then for foot rotation. This looks like the game is managing the position of the foot, and then going back and solving for the knee.

I'm a little reluctant to say this, but IK confirmed? What I conventionally know about animation is that if the thigh bone is the parent of the lower bones, then transformations on the lower bones shouldn't override the transformation of a parent bone. Assuming that we have the thigh rotation, no knee rotation, and then foot position, the normal transformation for foot position would look like this:

stretched_foot.png

Basically the thigh would still be out back, the knee would still be straight, and then the foot would be stretched out to whatever position is defined. Either way, we'll try to work with the information available to us and see what it takes to replicate this functionality outside of the game. I can try passing it into my parser as-is and see what happens. And when that breaks horribly we can go cry in a corner and try to rethink the approach.
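For reference, "managing the position of the foot and then solving for the knee" is what a basic two-bone IK solver does. Here is a minimal 2D sketch using the law of cosines. This is the textbook technique and not something recovered from the game. The function names, the planar setup with the hip at the origin, and the bend direction are all assumptions for illustration.

```python
import math


def two_bone_ik(thigh_len, shin_len, target_x, target_y):
    """Return (hip_angle, knee_angle) in radians so the foot reaches the target.

    hip_angle is measured from the +x axis, knee_angle is relative to the thigh.
    Targets out of reach are clamped to the nearest reachable distance.
    """
    dist = math.hypot(target_x, target_y)
    lo = max(abs(thigh_len - shin_len), 1e-9)    # avoid dividing by zero
    dist = min(max(dist, lo), thigh_len + shin_len)

    # Angle between the thigh and the hip-to-target line
    cos_a = (thigh_len**2 + dist**2 - shin_len**2) / (2 * thigh_len * dist)
    # Interior angle at the knee
    cos_k = (thigh_len**2 + shin_len**2 - dist**2) / (2 * thigh_len * shin_len)
    a = math.acos(max(-1.0, min(1.0, cos_a)))
    k = math.acos(max(-1.0, min(1.0, cos_k)))

    hip_angle = math.atan2(target_y, target_x) - a
    knee_angle = math.pi - k
    return hip_angle, knee_angle


def foot_position(thigh_len, shin_len, hip_angle, knee_angle):
    """Forward kinematics, used to check the IK result."""
    knee_x = thigh_len * math.cos(hip_angle)
    knee_y = thigh_len * math.sin(hip_angle)
    foot_x = knee_x + shin_len * math.cos(hip_angle + knee_angle)
    foot_y = knee_y + shin_len * math.sin(hip_angle + knee_angle)
    return foot_x, foot_y


hip, knee = two_bone_ik(1.0, 1.0, 0.5, -1.2)
print(foot_position(1.0, 1.0, hip, knee))
```

If the game really does something like this, the foot values in the animation would be targets and the thigh and knee rotations would be derived from them, which would explain why the foot appears to "undo" the thigh rotation.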


And this is what the foot looks like in action with normal animations. One thing that I definitely find concerning is that the foot seems to be going back and forth in the x direction, and not in the y direction as I would normally expect. I might take some more time to disable one axis at a time in the game, and further break down how this animation plays in game.