Here are foreseeable problems and issues:
1) The new engine should support both the PSX and PC datasets. The PC version is simply too rare, while you can pick up the PSX version new as a "greatest hits". The PSX discs, however, are not in ISO-9660 format and will not mount in Linux. (At least not on my computer; I don't know how to mount a "Mode 2" disc.) This means we *MAY* have to write a direct filesystem accessor within the new kernel. (Which, now that I think about it, we will have to do anyway, as most of the datasets have to be extracted/decompressed/decoded to even be looked at.) If not dual format right away, at least make the system modular enough that support can be added later.
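For what it's worth, a direct accessor wouldn't need to understand the filesystem at all to get started: rip the disc to a raw .bin image and every sector is 2352 bytes, with the 2048 bytes of actual payload (for Mode 2 Form 1) sitting after a 12-byte sync, 4-byte header and 8-byte subheader. Here's a minimal sketch of pulling one sector's payload out of such an image -- the layout is just the standard CD-ROM XA one, and whether a given FF7 file uses Form 1 or Form 2 sectors I honestly don't know, so treat the offsets as assumptions:

#include <fstream>
using namespace std;

//assumed CD-ROM XA Mode 2 Form 1 layout (not verified against the FF7 disc):
//12-byte sync + 4-byte header + 8-byte subheader + 2048 bytes of user data + EDC/ECC
const long RAW_SECTOR_SIZE  = 2352;
const long USER_DATA_OFFSET = 24;   //sync + header + subheader
const long USER_DATA_SIZE   = 2048;

//pull the 2048 bytes of user data out of one raw sector of a .bin rip;
//returns false if the read fails (e.g. sector index past the end of the image)
bool readSectorUserData(ifstream &image, long sectorIndex, char *out)
{
    image.seekg(sectorIndex * RAW_SECTOR_SIZE + USER_DATA_OFFSET, ios::beg);
    image.read(out, USER_DATA_SIZE);
    return image.good();
}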
I'm guessing that when you are referring to having to extract/decompress/decode the data files, you're referring to what both the PC and PSX versions of the game have to do to access their files, since they're both using compressed formats, right?
If so, and what I'm about to say is old ground that was already covered, feel free to beat me with the "OLD!" stick -- mainly because I haven't been very attentive to this board's progress on the reverse-engineering of FF7pc. But hear me out anyway, in case I wind up giving any of you ideas, even if this is a bunch of old ground... heck, for all we know, this might let us get at the specific models inside the LGP without needing to decompress them to separate files first.

What if there is no actual large-scale decompression going on in the PC version (and possibly the PSX version)? I mean, we know that the PC version was coded in C++... so what if they decided to use C++ stream member functions like "seekp" and "seekg" to do non-sequential read access?
Now think about this for a moment. If they tried to model on the PC how the PSX does its data lookups (direct sector reading, not using the CDFS format) -- which really only works when you can be positive that the data on the disc isn't going to move around -- then by using seekp and seekg to point at specific bytes of whatever LGP file they're in, they can essentially copy that portion of the file into a struct array and decompress it without needing to decompress the whole LGP file.
Now, this would also explain why things can go wacky when you alter a file and come up with a different size -- for example, let's say the game tries to pull up Cloud's battle model, so it goes to the appropriate LGP file and uses the "seekg" function... I'll do some code off the top of my head to show what I mean:
(please note that all the data types, byte offsets, and file sizes have been grabbed out of thin air and do not accurately reflect the real ones... this is more half-pseudocode, basically)
#include <iostream>
#include <fstream>
using namespace std;
//placeholder array size -- grabbed out of thin air, like everything else here
const int InsertArraySizeHere = 256;
//define model data structure
struct model
{
    int Polys[InsertArraySizeHere];
    int Vertices[InsertArraySizeHere];
    int Textures[InsertArraySizeHere];
};
//function prototypes
void getPolyData(model *);
void getVertexData(model *);
void getTextureData(model *);
void decompressData(model *);
int main()
{
    model *ptrCloud = new model;
    //grab Cloud's poly, vertex and texture data
    getPolyData(ptrCloud);
    getVertexData(ptrCloud);
    getTextureData(ptrCloud);
    //now decompress everything into memory -- and pray the
    //struct that you've set up to hold the decompressed
    //data is large enough :)
    decompressData(ptrCloud);
    delete ptrCloud;
    return 0;
}
void getPolyData(model *cloud)
{
    //open the archive for random access of binary data
    fstream file("battle.lgp", ios::in | ios::binary);
    //we know that Cloud's compressed poly data occupies bytes 20
    //through 40, so flag where the block ends...
    streamoff end_struct = 40L;
    //...and jump straight to byte 20
    streamoff position = 20L;
    file.seekg(position, ios::beg);
    //read byte by byte 'till we hit the end of this model's compressed
    //poly data, or the end of the LGP
    char ch;
    for (int i = 0; position < end_struct && file.get(ch); i++, position++)
    {
        cloud->Polys[i] = ch;
    }
    file.close();
}
void getVertexData(model *cloud)
{
    //blah blah blah...
}
void getTextureData(model *cloud)
{
    //yada yada yada.....
}
void decompressData(model *cloud)
{
    //*insert file bloating sounds here*
}
....using numbered offsets to set the read position to the correct part of the file -- or what it THOUGHT was the correct part of the file, if we're feeding it a drastically changed file whose size is totally different, which would make the game convulse unless it has some way of dynamically recalculating the byte offsets in the changed file.
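One way around that would be for the archive (or a rebuilt version of it) to carry its own offset table at the front, so offsets get looked up at runtime instead of hard-coded. Rough sketch of the idea -- the entry layout below is made up purely for illustration, it is NOT the real LGP header:

#include <fstream>
#include <cstring>
using namespace std;

//hypothetical table-of-contents entry -- invented for this example,
//not the actual LGP format
struct TocEntry
{
    char name[20];   //name of the packed file
    long offset;     //where its data starts inside the archive
    long size;       //how many bytes it occupies
};

//scan the table at the front of the archive for the named entry;
//returns true and fills 'out' if it was found
bool findEntry(fstream &archive, int entryCount, const char *wanted, TocEntry &out)
{
    archive.seekg(0L, ios::beg);
    for (int i = 0; i < entryCount; i++)
    {
        archive.read((char *)&out, sizeof(TocEntry));
        if (!archive)
            return false;
        if (strncmp(out.name, wanted, sizeof(out.name)) == 0)
            return true;   //found it -- out.offset and out.size are now valid
    }
    return false;
}

With something like that, getPolyData() would do a seekg(entry.offset, ios::beg) instead of seeking to a magic byte 20, and a resized file stops being a problem as long as whoever repacks the archive also rebuilds the table.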
Anyway...thoughts on my code? Oldness, or am I on to something?

-edit-
Ack, the comments got all out of line when it resized the text borders. Fixed it -- I think....

-2nd Edit-
Did general code cleanup and formatting that I felt like doing.
