Author Topic: FF7 Crisis Core - File Format and Data Investigation  (Read 200355 times)

Karlislie

  • Posts: 11
In 00004 there is a pattern,

At the beginning, at column 9, there is an increase when scrolling down
at 00006320g to 00006350g there is a sequence from efgh to u

I use Ultraedit...

By the way, in the wiki there is no information on the ATEL and MBD docs, could anyone post them?

Edit:

Also in the 8703 there is this pattern;

      0 @      4@      @@
    1@       5@      A@
  2@       6@      B@
3@      7@      C@

koral

  • Guest
Karlislie check the first post, that is where I had initially dumped all my findings before I had received access to the Wiki.
Not everything there is in the wiki, and vice versa.
(in other words, all the known ATEL and MBD findings should still be there)


Sorry for bumping, but in case my previous post got ignored  :lol:

I have released a new version of RINOA (v0.5beta) which can show fully textured map-models.
For more info, check link: http://forums.qhimm.com/index.php?topic=8163.msg101036#msg101036

BlitzNCS

  • Posts: 889
  • Master of nothing in particular
Sweet, but I can't load my own maps, though oddly I can load my own characters. The only map I can load is 02109, Banora, the one that came with the RAR...

Karlislie

  • Posts: 11
As mentioned in my earlier post, you can see in 00004 there is a pattern

And in this screenshot you can see another alphanumeric pattern

By the way koral, would you mind sharing RINOA's source code? I know I sure would love to develop it further.
« Last Edit: 2009-04-19 16:50:36 by Karlislie »

koral

  • Guest
Sorry Karlislie, but RINOA will remain closed source as my own personal project  :wink:
But all the information (as we have discovered) is out in the open, so you shouldn't have difficulty writing your own tool(s) for specific purposes.

Those screens look like [MBD] scripts, ignitz posted the codes for them here: http://forums.qhimm.com/index.php?topic=8163.msg98499#msg98499

But they could just as easily be numbered indices or offsets or something else. :-P

Karlislie

  • Posts: 11
hello koral  :lol:

Thanks for the link and for answering, never hurts to ask   :-)

With that post almost all of the files are documented except for [ATEL] itself....

Could you post how to open [ATEL], or how those bytes and dwords in [ATEL] should be read?  :?
Since some of the game's interesting content seems to be inside them :-P
« Last Edit: 2009-04-20 00:14:28 by Karlislie »

MrAdults

  • Guest
*Exhibition.

Ahh... every time I come through Tech-Related I get a bit sad about my inability to rip things like some of the rest of you... I have so many games I want investigated, but no time, or skill, necessary.

Koral and MrAdults specifically, you two are my model decoding heroes. I've never seen anyone else get so much done so fast. Is there a secret you have? Like being aliens?
I used to wonder how people did it, too, back in my early programming days. :) There's no real trick, it's just a matter of having a wide base of knowledge to narrow down the possibilities. If you know what you're looking for before you start looking, you'll find it faster. On the other hand, you can't be too focussed on a single possibility, or you'll waste a lot of time looking for the wrong thing.

Some practical examples on what to look for in terms of vertex or triangle data (assuming your data set is not compressed/encrypted to start out with, which is another different set of issues):
Usually you can find triangle lists by searching a file in 1-byte-offsets for shorts or ints backed up against each other, all within a specific sanity range (where the maximum value would be the highest vertex index). If that fails (triangle indices may not be back-to-back) you can try searching for specific geometry bits in different pattern orders, like groups of 3/4 indices. The list goes on into all kinds of different and weird/more obscure indexing modes.
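
As a rough illustration of the triangle-list search described above, here is a minimal C++ sketch. It assumes 16-bit little-endian indices and a file that has already been loaded into memory. The names IndexRun, findIndexRuns, maxVertexCount and minCount are made up for this example and do not come from any existing tool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A candidate run of back-to-back 16-bit indices.
struct IndexRun {
    std::size_t offset; // byte offset where the run starts
    std::size_t count;  // number of consecutive in-range shorts
};

// Reads a little-endian 16-bit value (assumption: the data is little-endian).
static std::uint16_t readU16LE(const std::vector<std::uint8_t>& d, std::size_t pos) {
    return static_cast<std::uint16_t>(d[pos] | (d[pos + 1] << 8));
}

// Scans at every 1-byte offset for runs of shorts that are all below
// maxVertexCount. Runs shorter than minCount are discarded as noise.
std::vector<IndexRun> findIndexRuns(const std::vector<std::uint8_t>& data,
                                    std::uint16_t maxVertexCount,
                                    std::size_t minCount) {
    assert(minCount > 0);
    std::vector<IndexRun> runs;
    std::size_t start = 0;
    while (start + 2 <= data.size()) {
        std::size_t pos = start, count = 0;
        while (pos + 2 <= data.size() && readU16LE(data, pos) < maxVertexCount) {
            ++count;
            pos += 2;
        }
        if (count >= minCount) {
            runs.push_back({start, count});
            start = pos; // skip past the run we just reported
        } else {
            ++start;     // try the next 1-byte offset
        }
    }
    return runs;
}
```

Zero-filled padding also passes the range test, so the results are only candidates to inspect in a hex editor, not confirmed triangle lists.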

Vertices are usually trickier. If UV's or positions are in floating point, it's pretty easy. Just search for valid floating point value ranges in a specific stride, or back-to-back in the case that each vertex component is in its own array. Bytes, shorts, and ints, on the other hand, can blend in with all kinds of data when utilizing their full ranges. In that case, you can try some patterns based on known index values, or if you don't have known index values, you can check value ranges and failures to comply with those value ranges outside of a given stride. If all of that fails, you usually have to start manually looking through the hex and interpreting values, and just trying to find patterns and/or consistently recognizable values in the interpreted data. The biggest thing to know, regarding the whole picture, is that the more pieces of a file you document, the easier it becomes to narrow down the rest. Since if you already know what some piece of data is, it can't be the other piece of data you're looking for. :)
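
The floating-point case above can be sketched along these lines. This is only an illustration: the "plausible" bounds are arbitrary assumptions, the function names are invented, and it assumes the file and the machine running the scan share the same byte order and 32-bit IEEE floats.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Interprets 4 bytes at pos as a 32-bit float (assumes matching byte order).
static float readFloat(const std::vector<std::uint8_t>& d, std::size_t pos) {
    float f;
    std::memcpy(&f, &d[pos], sizeof f);
    return f;
}

// A float "looks like" a coordinate if it is finite and either zero or of
// moderate magnitude. The bounds are guesses; tune them per game.
static bool isPlausible(float f) {
    if (!std::isfinite(f)) return false;
    float a = std::fabs(f);
    return a == 0.0f || (a >= 1e-4f && a <= 1e5f);
}

// Counts how many consecutive vertices starting at `offset` have
// `components` plausible floats each, stepping `stride` bytes per vertex.
std::size_t countPlausibleVertices(const std::vector<std::uint8_t>& data,
                                   std::size_t offset, std::size_t stride,
                                   std::size_t components) {
    assert(stride >= components * sizeof(float));
    std::size_t count = 0;
    while (offset + components * sizeof(float) <= data.size()) {
        for (std::size_t i = 0; i < components; ++i)
            if (!isPlausible(readFloat(data, offset + i * sizeof(float))))
                return count;
        ++count;
        offset += stride;
    }
    return count;
}
```

Calling this for a range of start offsets and a handful of likely strides (12, 16, 20, 24, 32...) and keeping the combinations with the longest counts gives a short list of places to look at by hand.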

Then outside of scanning for raw values, you have cases where there is some obvious header data with offsets to specific pieces of data (like the Crisis Core models), or cases where the file is broken up into chunks with a common chunk header, like the Dissidia/GMO models. Those things are pretty easily visible by interpreting the data as raw shorts/ints/etc. and just taking a look at the kind of data that's at the offsets (finding offsets is, I guess, mostly a matter of intuition and trial-and-error - if it's in a storage type that can hold up to the filesize, and the value is less than the number of bytes in the file, then maybe it's an offset :)). If you're lucky enough to have that kind of data at your disposal, it's much easier to interpret each specific chunk/offset of the file for a specific type of data, rather than having to bruteforce through the whole thing. On the other hand, there are also much more complicated methods of breaking files up into sections, and games will often have complete sub-filesystems within files for given types of files, or bits of files that are compressed, or all kinds of other things. Dealing with all of the scenarios just requires a comprehensive base of knowledge, so it's one of those things that you just get better at doing with the more you learn about storage techniques and other areas of development (for mesh data specifically, knowledge of rendering, more-so of rendering on the model's target platform(s), is most useful).
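
The "maybe it's an offset" test from the paragraph above could look something like this. It is a sketch under assumptions: 32-bit little-endian values on 4-byte boundaries, and the extra rule that an offset should point past the header region being scanned is my own addition. OffsetCandidate and findOffsetCandidates are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A header dword that could be a file offset.
struct OffsetCandidate {
    std::size_t position; // where the dword sits in the header
    std::uint32_t value;  // the offset it might be pointing at
};

// Reads a little-endian 32-bit value.
static std::uint32_t readU32LE(const std::vector<std::uint8_t>& d, std::size_t pos) {
    return static_cast<std::uint32_t>(d[pos]) |
           (static_cast<std::uint32_t>(d[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(d[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(d[pos + 3]) << 24);
}

// Lists every dword in the first headerBytes bytes whose value lies inside
// the file but beyond the header region itself.
std::vector<OffsetCandidate> findOffsetCandidates(const std::vector<std::uint8_t>& data,
                                                  std::size_t headerBytes) {
    assert(headerBytes % 4 == 0);
    std::vector<OffsetCandidate> out;
    std::size_t limit = std::min(headerBytes, data.size());
    for (std::size_t pos = 0; pos + 4 <= limit; pos += 4) {
        std::uint32_t v = readU32LE(data, pos);
        if (v >= limit && v < data.size())
            out.push_back({pos, v});
    }
    return out;
}
```

Once there are a few candidates, the next step is the one described above: jump to each value in the hex view and see whether the data there looks like the start of something.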

I guess I'm getting a bit off-topic there. I just wanted to take the opportunity to de-mystify the process a little bit for anyone interested. :)

koral

  • Guest
Mr Adults
Your reply is surely worth wikifying somewhere!  :lol:
I could never explain such things as clearly and precisely as you do, I tend to just speak (type) my mind, never becoming too formal when posting stuff.

My mantra is more along the lines of "trial and error", logically filtering through all possibilities of the data (as I understand it) until either I reach a conclusion as to its purpose, or I ignore it altogether.
So far I have been lucky, but with Ehrgeiz I realised my methodology isn't failsafe  :-D


Karlislie
I don't know anything about the [ATEL] files really  :-P
All I found out was that if I checked the first 8 bytes of every 16-byte line, I would eventually come across known file types (such as MBDs and SSCFs).

The MBDs are the most interesting so far, because they contain the game scripts and are organised in such a way that we know what kind of data might be near it.
For example, during a cut-scene event, an NPC might say something and perform a specific animation, so we can find the precise MBD which shows us the file and the location where that dialogue was said, and we can look around to find the other information related to that event.

There would be camera information somewhere too!

Come to think of it, maybe those [&&] files were actually Animation data?!
There were hundreds of them mixed in with the ATEL files!!

Karlislie

  • Posts: 11
Well, I meant the code for ATEL: RINOA must have some sort of code that allows it to understand and see what the ATEL contains, and I was hoping that you could post that snippet...

koral

  • Guest
I was talking about [ATEL] files, but I don't know anything about them so I ended up talking about MBDs and animations :lol:

I don't know how much you know about C++, but a minimalistic program to scan an ATEL file for MBD chunks would be something like this:

Code: [Select]
#include <cstdio>
#include <fstream>

int main()
{
    // Open the file in binary mode
    std::ifstream file( "ATEL_File.raw", std::ios::binary );
    if( !file )
    {
        std::printf( "could not open ATEL_File.raw\n" );
        return 1;
    }

    // Read 16 bytes (one "line") at a time until a full line can no longer be read
    unsigned char c[16] = {0};
    unsigned long offset = 0;
    while( file.read( reinterpret_cast<char *>( c ), sizeof( c ) ) )
    {
        // check for [MBD] file start by comparing the first three bytes ("MBD")
        if( (c[0] == 0x4d) && (c[1] == 0x42) && (c[2] == 0x44) )
        {
            std::printf( "found [MBD] at offset: %lu\n", offset );
        }
        offset += sizeof( c );
    }

    // The stream closes itself when it goes out of scope
    return 0;
}

// Job done!!  ^_^

I just wrote this quickly right now from the top of my head, so it may not compile or work as expected.  :oops:

But all the essential functions are there, from reading in a file, looping through its data, then closing it again.
What you do with that data (and how you show it in a meaningful way to the person running it) is something I can't really help you with.  :-)

Karlislie

  • Posts: 11
Thanks koral! :-D

By the way I think the && files are models+animations, they can be viewed in RINOA.

koral

  • Guest
I thought so!  :-o
Thank you for the news, those && files are now number one on my hitlist.

Karlislie

  • Posts: 11
hey koral,

A good && to try is 1482 chunk# 10 (remember that counting starts at 0)
1482 is a treasure chest.

Since it's only 37KB it might be easy, plus it has to have an animation since Zack opens them.

Edit: if you don't want to try that one use 1995, it's a Cactuar, the cactus. :-P
« Last Edit: 2009-04-22 00:39:56 by Karlislie »

MrAdults

  • Guest
Oh, the && files are basically !! with animations, then? That's quite convenient! :) Let me know if you'd like me to have a look at any particular file/piece of data, koral. I'll otherwise probably just continue doing what I'm doing. ;) I do still plan to go back and add location model support, and now perhaps "proper" skeleton support, to mesh2rdm, though, once all the findings are wikified.

koral

  • Guest
I seem to have gotten distracted again  :lol:

this time with FF8 models!

So far I have only got as far as getting mch (field) models to render correctly with RINOA. The skeletal hierarchy is intact, but the joint positions are still pretty much AWOL.
Worst-comes-to-worst, I will just guess the positions for some of the "high-poly" field model joints and use text-files to parse them in. That would also make it easy for anyone to fix (or even manually pose) those models before exporting them out.

Battle-models are still iffy, nobody has yet gone the full distance to analyse and document their structures.
I hope I will be able to do it someday! The wiki could do with a new FF8 section too  :-D



Karlislie: I had a look at the files you mentioned, and there seems to be nothing "out of the ordinary".
I have added ATEL file parsing support back into RINOA, and I realised that I did know more about ATEL files which I forgot to post about:

The complete list of ATEL sub-chunks (embedded-files) which I currently know of are:
Code: [Select]
TEX
GT
!
SSCF
MBD
FEP
MDL
VTL
ANM
&&

TEX, GT, !, SSCF and MBD are perfectly viewable and understood, but the other 5 are still open to investigation.

And if the names are anything to go by, then it is highly probable that the ANM chunks would contain some sort of Animation data, whether it be for simple Treasure-chest opening, or for skeletal NPC animations.
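
For anyone who wants to look for all of these chunk types in one pass, here is a minimal sketch. It assumes each tag sits at the start of a 16-byte line. The optional check that the byte after the tag is 0x00 is a guess on my part about how the tags are padded, so switch it off if it hides real chunks. kTags and findChunks are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Tag names as listed above; how they are padded on disk is an assumption.
static const char* const kTags[] = {"TEX", "GT", "!", "SSCF", "MBD",
                                    "FEP", "MDL", "VTL", "ANM", "&&"};

// Returns (offset, tag) for every 16-byte line that starts with a known tag.
// If requireNul is true, the byte after the tag must be 0x00, which cuts down
// false positives from short tags such as "!".
std::vector<std::pair<std::size_t, std::string>>
findChunks(const std::vector<std::uint8_t>& data, bool requireNul) {
    std::vector<std::pair<std::size_t, std::string>> found;
    for (std::size_t off = 0; off + 16 <= data.size(); off += 16) {
        for (const char* tag : kTags) {
            std::size_t len = std::strlen(tag);
            assert(len < 16);
            if (std::memcmp(&data[off], tag, len) != 0) continue;
            if (requireNul && data[off + len] != 0) continue;
            found.emplace_back(off, tag);
            break; // no tag is a prefix of another, so one match per line
        }
    }
    return found;
}
```

Counting how often each tag turns up across all the ATEL files would show quickly which of the unknown types (FEP, MDL, VTL, ANM, &&) are common enough to be worth attacking first.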

Which puts && files onto the back-seat for now  :wink:



Mr Adults
You don't have to get involved just yet, we need to find those animations first  :lol:
Which reminds me, I have yet to complete the wiki entry for the exp-models... yet another thing stuck somewhere in my todo list  :-P

Karlislie

  • Posts: 11
I thought the && files were different since they are used in battles and use animations. How different are they from the ! models?
By the way, if you were to experiment and try to get the animations, wouldn't it be easier to use a very small ! or && file, and compare 1 or 2 other small files from there?

And have you noticed that in the ATEL files there is a GT image at the top? Perhaps that is the map file for the specific place where the event is happening? (You have to chunk it, then restart RINOA to see it; it looks a bit garbled.)
And if you take a look at an event, have you noticed that the actual event data is ALWAYS the last MBD, not the 1st or 2nd?

Anyway, not to get sidetracked, let's take 01473 as an example,

<TEXT>
Enemy attack! It's SOLDIER!
Don't let him get through!
<END>


What happened to the part that goes here? What calls the battles? What calls the camera position?
What is defining that? Is there some part of the MBD that isn't being properly displayed, or is missing?

<TEXT>
Were you able to get inside the fortress?
<END>
<TEXT>
Piece of cake! I could have done it blindfolded!
<END>
<TEXT>
Obtained <VAR> gil!
<END>


But yes there are lots of unanswered questions, I have a number on my list.
Perhaps FEP could be the camera data?

Or MDL is for MODEL?
And ANM is ANIMATION?

Ah maybe MDL + ANM are used at the same time???

Could there be something to match and compare patterns in these files against all the other files? Not something like TRID.

By the way, I was toying with Xpert 2 under the relinker option. Has anyone noticed the columns?
DecStartLba, DecFileSize, LBAPos, FileID, HexStartLBA, HexFileSize, HashString

Maybe there is a way to search for a file by one of the above columns and see how it is being used. A good place to start would be a memory dump of CC running.
I just made a memory dump, here: http://www.mediafire.com/download.php?ny1kmmzywnd

Its Zack in the church, right after meeting Aerith.


Lucleonhart

  • Posts: 40
this time with FF8 models!
Jeha! FF8 FTW!!! *ggg*
Keep up your grrrrrrrrrrrrreat work! :)

squallff8

  • Posts: 222
Sorry for offtop

2 koral
If you are still interested, I will upload the FF7 Bonus Disc files for you :-)
Here is an image of the original Bike model. The file name is bwfd in field/char.lgp

« Last Edit: 2009-05-05 12:24:38 by squallff8 »

deadlyxvalentine

  • Guest
Well, my first post! I read early on about how someone wanted these for JK3 and I'm wondering if anyone knows how to rig these to Valve's .smd format and to rig it for Half Life, more specifically, The Specialists 3.0 which is a Half Life 1 mod. If anyone would know and could help me I'd very much appreciate it.

Landarma

  • Posts: 152
Did anyone test it with data from Japanese version?  I wonder.

MrAdults

  • Guest
Well, my first post! I read early on about how someone wanted these for JK3 and I'm wondering if anyone knows how to rig these to Valve's .smd format and to rig it for Half Life, more specifically, The Specialists 3.0 which is a Half Life 1 mod. If anyone would know and could help me I'd very much appreciate it.
Well, mesh2rdm.exe exports the models straight to SMD, but it sounds like you want to use them with a different skeleton. I can tell you that you will need to import your source skeleton into the modeling program of choice, then import the FF7:CC model (in either smd or obj format, doesn't matter if it has weighting data) and manually re-weight it to the Half-Life skeleton. If you don't know how to do that, you'll have to find out in the documentation for your modeling app of choice. But be warned that re-distributing the models with Half-Life mods is a good way to draw unwanted attention to yourself and get a C&D, and it is illegal in most regions (depending on how you pull it off).

Did anyone test it with data from Japanese version?  I wonder.
Should in theory work fine, the model data sets would be no different, only offsets to the files (and presumably G's extractor handles this fine).

deadlyxvalentine

  • Guest
Well, when most people release models they have ripped and re-rigged, they usually give credits with it. For example, someone ripped and rigged DMC4 Nero for TS and gave credit to the DMC4 modeling team, and listed anything else they did themselves and rightfully own. They listed the work they did as well, and if the anims weren't theirs, they'd say who the original animator was. So, I'm guessing this is how people get around the whole getting-into-trouble thing?

MrAdults

  • Guest
You're distributing copyrighted content, so giving credit to the authors of the work (who may or may not even own the copyright(s)) makes no difference in terms of legalities. Whether you are in violation of actual copyright law depends on where you are located. If you were to construct a likeness yourself, that's another, different set of legal issues that can be sidestepped with some twists and turns. But the bottom line is, if you draw attention to your project using content/characters from a popular IP, you risk getting a C&D regardless of whether you're within your legal rights. Just something to consider. If only a group of 20 people are going to play your mod, practicality dictates you have nothing to worry about. But before going ahead with anything like that, you should be well aware of the legal constraints.

deadlyxvalentine

  • Guest
Woah woah woah. This isn't MY mod. This mod was made years ago buddy. Here's a link: http://www.specialistsmod.net/

Landarma

  • Posts: 152
In fact, I happened to get the Japanese CC data, and most of the extracted files were recognized by neither CC viewer nor mesh2rdm.  I don't think it's the fault of G's extractor, because I could recognize some files, like 00000.raw which has an [ATEL] header, and the movie and sound data (identified by header).  Maybe the offsets matter, but it's somewhat confusing: in 00001.raw (which would be a GT image), [GT] is at 80h, while in 00002.raw (which could also be a GT image), I can see the [GT] header at 40h.