Qhimm.com Forums

Miscellaneous Forums => Scripting and Reverse Engineering => Topic started by: halkun on 2004-12-27 02:42:47

Title: Gears updated ^_^
Post by: halkun on 2004-12-27 02:42:47
Using my Modly powers to help keep my own spamming clean. This is just a refresh of an old post. I like to keep the necromancy down, but sometimes there is fresh data to add to an old post.

However, if I had posted to the old thread, I would have been guilty of both necromancy and double-posting. The last thing I want to do is look like a hypocrite.

Anyway, there is a new version of Gears. Here's a rundown of the additions.

1) I wanted to really flesh out the PC field file format, but it was getting really difficult to follow what you guys were saying. I tried to "table-ize" some of the headers and formats in the field file, but I might have the offsets wrong. It's kind of tricky when you say "integer" and "word", which are very platform-dependent. For example, you might describe a PSX data unit in the context of a Windows environment, while all my hacking tools are Linux-based. ^_^  It gets a little confusing. One of the janitorial jobs that will be surfacing soon is a uniform notation and data structure. All the tables will need to be uniform in their descriptions as well.

2) I added the PSX battle model format in the battle section. I cleaned it up a little and made it somewhat easier to read.

3) I had a goal to work on Kernel.bin, but ran out of steam trying to decipher Kero's camera matrix section. (WOW! That was a doozy.) I didn't touch the walkmesh, as my head was already spinning. I was just wondering: in section 4 of kernel.bin (the character starting data), is that copied directly from kernel.bin into the character record savemap? That would make the format much easier to decipher.

Also I like Qhimm's Field Script command layout. I might be stealing^H^H^H^H^H using a format like that for Gears as well.

You can get the new Gears at the same old place.

http://the-afterm.ath.cx/mailorder.png

naaa, I'm just kidding, it's really here... ^_^

http://the-afterm.ath.cx/gears/
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-27 09:38:19
About the data type notation... how about we all try to agree on a naming convention while working? Most programming languages let you define custom data type names, and when describing file data using structs you really should be using fixed data type sizes instead of relying on names like "long" (shame on me too). I was compiling some Nvidia SDKs on an AMD64 and it freaked out because they had used longs to describe file structures, but of course on 64-bit architectures "long" became 64-bit as well...

Anyway, I propose a simple data type naming scheme such as this: data_types.h (http://www.qhimm.com/data_types.h)
(this file is MS compiler style, but it should be easy to rewrite for various compilers and languages)

Again, these names should primarily be used when describing file data (where exact size is important). It's possible to write custom classes to handle the non-standard data types (binary data and fixed-point data), which would make sense... The fixed-point data might need some additional notation to describe the radix point position, sign etc. (by default fixed-point is PSX-style, with 2-complement sign and 12 bits below the radix point).
Title: Gears updated ^_^
Post by: halkun on 2004-12-27 13:10:51
you know, I just realized that Gears is pushing 198 pages now

Go us
Title: Gears updated ^_^
Post by: Cyberman on 2004-12-27 14:15:28
Quote from: Qhimm
About the data type notation... how about we all try to agree on a naming convention while working? Most programming languages let you define custom data type names, and when describing file data using structs you really should be using fixed data type sizes instead of relying on names like "long" (shame on me too). I was compiling some Nvidia SDKs on an AMD64 and it freaked out because they had used longs to describe file structures, but of course on 64-bit architectures "long" became 64-bit as well...

Well, this is more a matter of leaving a value's size up to the compiler than to the architecture.  Programmers are lazy by nature, I've heard, and instead of having a reliable, deterministic method they used long.  Just like int on a 32-bit architecture is also a long? Hmmmm. It's poor programming style in any case.

Quote from: Qhimm
Anyway, I propose a simple data type naming scheme such as this: data_types.h (http://www.qhimm.com/data_types.h)
(this file is MS compiler style, but it should be easy to rewrite for various compilers and languages)
I defined that type of information before I described any structures.  It just made more sense.
I prefer UINT8/UINT16/UINT32/UINT64 and INT8/INT16/INT32/INT64.  Type char is not an unsigned 8-bit integer on all architectures. This was insanely left up to the compiler/architecture.  On the PC it can be either signed or unsigned; most often it's SIGNED, however (believe it or not).  This does cause problems when you are dealing with U V coordinates in FF7 models :)

Quote from: Qhimm
Again, these names should primarily be used when describing file data (where exact size is important). It's possible to write custom classes to handle the non-standard data types (binary data and fixed-point data), which would make sense... The fixed-point data might need some additional notation to describe the radix point position, sign etc. (by default fixed-point is PSX-style, with 2-complement sign and 12 bits below the radix point).

The animation data is packed using bit streams, so hmmm, structures are going to need extra care.  From L Spiro's work the values can be 12, 11, 10, 9, or 8 bits in size, which could make for difficulty presenting them in, say, C++ or C.  It was hard enough for me to decode the 12-bit values correctly.  Obviously this is the animation data (which is the only thing left in the battle models to get right).

Cyb
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-27 14:42:18
Quote from: Cyberman
I defined that type of information before I described any structures.  It just made more sense.
I prefer UINT8/UINT16/UINT32/UINT64 and INT8/INT16/INT32/INT64.  Type char is not an unsigned 8-bit integer on all architectures. This was insanely left up to the compiler/architecture.  On the PC it can be either signed or unsigned; most often it's SIGNED, however (believe it or not).  This does cause problems when you are dealing with U V coordinates in FF7 models :)

Well as for the details of the names; whatever floats your boat, the point was just to have the size as part of the name to make it clear to whoever reads the code. :)

The char8/16/32 types would (in my vision) be used with string data, and as I subtly indicated with my Microsoft compiler reference, the definition file would have to be hand-customized for different compilers and architectures. To the actual program, a char8 should always be 8 bits long whereas the signedness would be up to the system (maybe for compatibility with RTL string routines). For anything besides strings (like U V coordinates) you would use the integer types instead, thereby entirely avoiding the signedness issue of chars.

Quote from: Cyberman
The animation data is packed using bit streams, so hmmm, structures are going to need extra care. From L Spiro's work the values can be 12, 11, 10, 9, or 8 bits in size, which could make for difficulty presenting them in, say, C++ or C. It was hard enough for me to decode the 12-bit values correctly. Obviously this is the animation data (which is the only thing left in the battle models to get right).

Well yeah, the thing about bitstreams is that they're usually dynamic (fixed bit data can be stored in bitfield structs). As such, the binary type is simply a placeholder to declare and provide access to the data without assuming anything about its contents. Just a way to say "complex and/or dynamic data that won't readily fit into a C struct". On top of that, it's easy to write a stream reader or derive one from a 'binary' base class (and quite nice in code, too).
Title: Gears updated ^_^
Post by: sfx1999 on 2004-12-27 15:02:56
Qhimm, try using just a plain int to get 32-bit values.

See, I always thought that ANSI standards say that longs should be 32-bit, and long longs are 64-bit.
Title: Gears updated ^_^
Post by: Alhexx on 2004-12-27 15:05:38
That means that "long" isn't machine-independent?
I always thought that the "int" type in C++ was machine-independent, and that "char", "short" and "long" were fixed-size...

 - Alhexx

 - edit -
Whaa... this is driving me insane...
Title: Gears updated ^_^
Post by: sfx1999 on 2004-12-27 15:14:52
Upon further review of my C++ book, unsigned ints are dependent on the architecture of the CPU. Ints too. It seems that an unsigned long int is always 32-bit.

Anyway, on some C/C++ compilers, there is a type called a long long int which is 64 bit.
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-27 15:49:50
See, my point here was to define those types yourself with regards to your compiler to ensure the result was exactly n bits, instead of the reader having to memorize compilers/architectures. I know exactly how to get values of n bits on my compiler, but that does not mean poor halkun automatically knows a plain int is 32 bits on msvc/x86 yet 16 bits on some other platform. Once the data type definition file has been written, you hide it away and never again care what the actual definitions are, perfectly content with using uint32 to get a 32-bit unsigned integer. Thus, if and when your code needs to be ported, all you have to change is the data type definition file (to make sure the sizes match), and you know all the structs work. Not to mention it's easier to read and makes more sense as a file format description (data serialization).

Of course this isn't a good method to handle serialization (mumble endianness mumble), but it at the very least makes data declarations more readable and understandable.
Title: Gears updated ^_^
Post by: Kislinskiy on 2004-12-27 16:13:21
Quote from: sfx1999
Upon further review of my C++ book, unsigned ints are dependent on the architecture of the CPU. Ints too. It seems that an unsigned long int is always 32-bit.

Anyway, on some C/C++ compilers, there is a type called a long long int which is 64 bit.


No, an unsigned long is not always 32-bit. Also, char is not always 8 bits, short is not always 16 bits, and int is not always 32 bits. They all depend on the architecture of the CPU. BTW, long long is not ISO/ANSI conformant. If you use Visual C++, search for __w64 on MSDN.

The safest way to indicate the size of a variable in a doc would be the C# way (e.g. UInt32, Int16, Byte, SByte, ...). It is assumed that a Byte consists of 8 bits.
Title: Compiler Dependencies just like mom used to make!
Post by: Cyberman on 2004-12-27 17:06:50
Then I suggest the open-source approach to nomenclature, which is what I use.
UINT8 etc. etc.
INT8 etc. etc.
are all defined as a TYPE in an include file.

This makes each value explicit and platform/compiler-independent. Why? Because for each compiler (and platform) you have to define them.

Even with GNU C this is the case.  Long ints are 64 bits on the A64 architecture, for example, or MIPS6K+ architectures in this compiler.  SO I recommend we use something that 'snoops' the compiler and architecture it's being compiled for and defines these types.

You can easily identify how BIG a long int or an int is in your compiler, by the way.

Code: [Select]
void main(void)
{
 printf("char size is %2d bytes", sizeof(char));
 printf("short size is %2d bytes", sizeof(short));
 printf("int size is %2d bytes", sizeof(int));
 printf("long size is %2d bytes", sizeof(long));
 printf("long long size is %2d bytes", sizeof(long long));
}


Very simple.  You can also use this like a configure script I suppose. :)


Cyb
Title: Gears updated ^_^
Post by: mirex on 2004-12-27 17:13:38
Yup, laziness rules the programmer's world (well, it rules mine :)
Quote
UINT8/UINT16/UINT32/UINT64 and INT8/INT16/INT32/INT64
I like those definitions, i would use those.

I was working with Microsoft's VC++ and Borland C++, and their type lengths differed although they were both running on the same CPU. DOS Borland C++: int is 2 bytes long; Windows VC++: int is 4 bytes long. So I started using: char - 1 byte, short - 2 bytes, long - 4 bytes; but it looks like I'll have to use those more specific declarations (UINT8, for example) - they look better.
Title: Gears updated ^_^
Post by: halkun on 2004-12-28 03:48:22
Let's throw a little more grease onto the fire, shall we?

The PSX doesn't use floats ^_^. Actually, it has no FPU at all. Now, as you can see, this makes translating things a little tricky.

I'm sure that you are also asking "If the PSX can't do floats, then how on earth does it do 3D?"

The GTE, which hangs out as the second coprocessor (COP2), is a *fixed point* math coprocessor that only works with matrix calculations and color manipulation. That's the whole banana.

To dig deeper, let's throw some types around, shall we ^_^

Let's start out easy.

8 bits = one byte

Now that we have that defined, let's look at some platforms. The core system of the PSX is a RISC chip with the following definitions. The R3000A is preconfigured on powerup for little-endian byte order (the endianness can be changed), and defines a word as 32 bits, a half-word as 16 bits, and a byte as 8 bits.

The x86 is a little-endian CISC chip that runs in multiple memory addressing and register access modes. To confuse matters further, there are two "warring" compiler sets (POSIX vs Microsoft) that define types differently.

Under "Real mode", a double is 32 bits, a word is 16 bits, and a byte is 8 bits. Under "Protected mode", a double is 64 bits, a word is 32 bits, a half-word is 16 bits, and a byte is 8 bits.

Under MSVC, a long is 4 bytes, an int is 2 bytes, a short is 2 bytes and a char is 1 byte.

Using gcc and g++ on Linux, I get the following data back using a debugged version of Cyb's code:

char size is  1 bytes
short size is  2 bytes
int size is  4 bytes
long size is  4 bytes
long long size is  8 bytes

Let's just keep everything platform-independent, shall we?

I like Qhimm's data_types.h, but I don't quite grok why we need a char that's over 8 bits.  I've always read "char" as "an 8-bit type you use for strings... don't play with the sign bit, as it's really an ASCII toggle in this case" (i.e. 0-127 ASCII, 128-255 extended characters).

For giggles, here's a more correct version of Cyb's little program, what kind of data do you guys get on your platforms?

Code: [Select]

#include <stdio.h>

int main(void)
{
 printf("char size is %2d bytes\n", (int)sizeof(char));
 printf("short size is %2d bytes\n", (int)sizeof(short));
 printf("int size is %2d bytes\n", (int)sizeof(int));
 printf("long size is %2d bytes\n", (int)sizeof(long));
 printf("long long size is %2d bytes\n", (int)sizeof(long long));
 return 0;
}



Do we all agree that data_types.h is the best way to go? You should really avoid types like "int", "long", and "word" (and to a small extent "char"), as these have some vague definitions. I'll have to mark this in my "things to clean" for Gears.

While we are at it, let's define a little something on notation too.

-Decimal numbers are written plain (1, 2, 4, 15, 7), but it's much better to write these out in English (one, two, four, fifteen, seven).

-Hex numbers should use 0xABCD notation. $ABCD notation insinuates that it's running on an 8-bit platform and looks archaic as hell. On really long types, place an underscore between words; this makes reading much easier. (A trick I picked up doing my PSX doc --> 0xFFFF_FFFF)

-On the subject of language, I speak American English and will be converting some of the more "Colourful" language over to something a little nicer to my spell checker. On the same side of the coin, I'm probably going to be altering some of the more informal language in some of your docs to bring a more uniform feel to the text (dropping some of the "then you... I think this is garbage but... I don't know that this..."). This doesn't require much brain power and gives me the ability to work on Gears while at school. (As opposed to getting out my hacking toolset to verify some of the sections, I can simply proofread for about an hour between study sessions.)

Also, one more thing....

The reason why I export a copy of gears in DOC format is so you guys can load it up in Microsoft Word and edit my mistakes/add your own content. I don't mind if you guys copy it and make changes. Email me the section you have changed, and I'll import it into my SDW file, and then re-export the changes back into the public /gears directory.

The reason why I post pictures of cute Japanese girls in boxes is because they are pretty and fun to look at.

I think that's it...
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-28 09:58:13
Well, to respond innocently, my data_types.h was just an example. It just illustrates how things are easier to read if you include the bit size as part of the type name. Naturally, when PSX hacking you'd never need 16-bit or 32-bit chars (typically you'd only use these for Unicode processing), but I included them as a further example that even simple types like char can benefit from size specifiers. Working with PSX data you don't need floats either, but they're used in the PC version.

As for PSX fixed-points, well, that's why I added the "custom class?" note in the file... typically when working with data like this it helps sooo much if you have a data handler defined that lets you use fixed-point values with the same semantics as floating-points.

To be utterly specific when working with data serialization (storing in files), you'd normally include an endian specifier as well (add a 'be' or 'le' to the end of the type). However, since both PSX and PC typically work with little-endian, I thought it would only clutter up the code in this case. Again, handy classes can be used to work with reverse-endian data transparently.

Oh, and since we're on the topic of the MSVC compiler, the current int size is 4 bytes (much to the chagrin of Borland/MS cross-coders). I think a 32-bit int (what used to be a "long int") is becoming pretty much the standard on today's 32-bit x86 compilers (not that you can count on it, mind you). Most compilers have special keywords or extensions to directly specify the data size, such as Microsoft's __intxx types, which makes writing a data_types.h a very simple matter.

I think most of the people are in agreement that using type names which include the size is good, while the specific formatting of the type name varies by personal preference (UINT32 or uint32, sint16 or int16?). Still, I think the result will be that everyone's code will be easy to bring to a common appearance afterwards with a simple search&replace. Personally I'll probably start using the data_types.h I showed earlier, perhaps with the variation of dropping the 's' for signed data (it looks a bit silly since you usually don't see a signed specifier in the language).

On your points of notation:

- I assume you mean in written English here, and not in code... It would raise five kinds of hell to use spelled-out numbers in code. ;)

- Agreed that 0xABCD is the better notation (while this should be simple enough to S&R for those poor souls who use programming languages with the $ABCD notation). As for the trick of inserting an underscore, again I assume you mean only written documentation (though I might prefer a simple space or ~, non-breaking space).

- Since I personally use a bastardized flavour of American/British English (ooh subtle one there), I realize any documentation coming from me will need adjustment. Sorry about that. :)
Title: Grease Fire!
Post by: Cyberman on 2004-12-28 16:02:19
I started a small fire... whoops.

For me it helps to have a point of reference for the documentation.

Perhaps we could borrow some of the notation the embedded community uses?

i.e. U8 U16 U32 U64 (U who!) ;)
S8 S16 S32 S64 etc.

The playstation has some odd formats. I'm uncertain how to technically deal with bit fields as well. Perhaps we could deal with them in an assumed manner (unlike C).

Unfortunately the PSX version of FF7 uses LOTS of bitfields.  

Perhaps we can assume the ordering of the bit fields 'bit-wise', as a bit stream packed MSB to LSB.

The endianness of the FF7 PSX data is ALL little-endian.  The animation data has to be treated as a bitstream, following what I mentioned about bitfields.  Unfortunately, bit fields that span integer-sized sections don't fit this (boggle), so it's not perfect and has been inconsistently applied by the programmers.  This means we should probably treat a bit stream and a bit field in a U8/U16/U32/U64 size separately.  Maybe put LE and BE before the type?

Hex numbers as 0xXXXXX <--- fine with me.  However, I suggest we format code in italics or something if we supply it, so that people aren't confused between formatting used for clarity and real code.

People are passionate about source code formatting! Why not documentation formatting too? ;) :)

Cyb
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-28 16:12:44
Heh, I thought code was usually formatted in a fixed-width, Courier-style font for exactly that reason? ;)

Well I'm fine as long as you can deduce the necessary information from the name. Maybe just writing 'U32' is a little lacking... someone might ask "an unsigned what?", especially if we're in a situation where various number formats are used (integers, fixed-points, whatever).

As for bitfields, well... I would LOVE to see solid documentation on how MS compilers generate and pad bitfields; I swear sometimes I think the algorithm is based on
Code: [Select]
pad_to_next_boundary = random() % 2 ? true : false;
For this reason, when documenting it's necessary to be very specific with binary specifications. Include the obvious, such as byte and bit ordering, as there are quite a few ways to store a 13-bit number in 2-3 bytes. If C bitfields actually filled up the data in a consistent bytewise LSB to MSB manner (imagine the LSB as stored "first" in a byte) I would have no problem. Then again, stuff like MPEG-2 streams store bits in a bytewise MSB to LSB manner (imagine the MSB as stored "first" in a byte). From what I've seen, FF7 bitstreams seem to all be bytewise LSB to MSB. *headache*

The world needs more :o PASSION :o !
Title: Gears updated ^_^
Post by: sfx1999 on 2004-12-28 20:09:21
Halkun, here is my compiler's answer (MinGW):

Quote from: test.exe
char size is  1 bytes
short size is  2 bytes
int size is  4 bytes
long size is  4 bytes
long long size is  8 bytes
Press any key to continue . . .
Title: Gears updated ^_^
Post by: sfx1999 on 2004-12-29 21:51:51
I modified the code even more. It checks to see whether something is signed or unsigned by default. Here is the new code:

Code: [Select]
#include <stdio.h>

int main(void)
{
    printf("char size is %2d bytes\n", sizeof(char));
    printf("short size is %2d bytes\n", sizeof(short));
    printf("int size is %2d bytes\n", sizeof(int));
    printf("long size is %2d bytes\n", sizeof(long));
    printf("long long size is %2d bytes\n", sizeof(long long));

    char thechar = 0;
    short theshort = 0;
    int theint = 0;
    long thelong = 0;
    long long thelonglong = 0;
   
    thechar--;
    if (thechar > 0)
        printf("char defaults to unsigned\n");
    else
        printf("char defaults to signed\n");
       
    theshort--;
    if (theshort > 0)
        printf("short defaults to unsigned\n");
    else
        printf("short defaults to signed\n");

    theint--;
    if (theint > 0)
        printf("int defaults to unsigned\n");
    else
        printf("int defaults to signed\n");

    thelong--;
    if (thelong > 0)
        printf("long defaults to unsigned\n");
    else
        printf("long defaults to signed\n");

    thelonglong--;
    if (thelonglong > 0)
        printf("long long defaults to unsigned\n");
    else
        printf("long long defaults to signed\n");
}


Quote from: test.exe
char size is  1 bytes
short size is  2 bytes
int size is  4 bytes
long size is  4 bytes
long long size is  8 bytes
char defaults to signed
short defaults to signed
int defaults to signed
long defaults to signed
long long defaults to signed
Press any key to continue . . .


Ignore the "Press any key to continue". I had my program pause at the end, and I don't think the method I used would work on all platforms, so I removed it.

You know, couldn't we create a makefile to do all this during the compile process? I've seen configure scripts do it.
Title: Gears updated ^_^
Post by: Cyberman on 2004-12-30 00:34:10
Yes, I did mention that; however, not all compiler suites use makefiles (VC does not, for example, unless you set that option, and BCB does not - you have to export one from the project file).
I suppose I understand why they don't use a makefile, but it doesn't make things any better not using one.  The biggest trouble most people have with make is redirecting the stream of data from the compiler.  Sigh... nothing is ever so simple, huh?

Configure scripts actually set up the makefile and create a list of options based on the system you run them on.  For example, let's say you have a Linux box without the GTK libraries and you want to compile crossclient on it. The script will notice that your GTK libraries are nonexistent, along with your compiler type, endianness, various type sizes, etc., and even whether you have libXML installed (used to store options for the program in the user directory).  Unfortunately, MS doesn't use this because they assume you are compiling on a Windows box. They make their own standards, basically.

Cyb
Title: Gears updated ^_^
Post by: halkun on 2004-12-30 01:20:30
For giggles I just wanted to play with my compiler.

Quote

halkun@naru:~> gcc -pedantic -ansi test.c
test.c: In function `main':
test.c:9: warning: ISO C90 does not support `long long'
test.c:11: warning: ISO C89 forbids mixed declarations and code
test.c:15: warning: ISO C90 does not support `long long'


Keep in mind, I'm using the "-pedantic -ansi" switches, which force the compiler into strict ANSI compliance. It compiles anyway, and I get the same results as test.exe above. I'm too lazy to download/compile gcc for MIPS and run the executable in my PSX emulator to see what responses I get.
Title: Gears updated ^_^
Post by: Alhexx on 2004-12-30 10:47:05
I just typed halkun's code into MS Visual C++ .NET and Dev-C++ 4.9.9.0. This was tested on my laptop, an Intel Pentium III 700MHz (Coppermine-T).

Quote from: Ms Compiler
error C2632: 'long' followed by 'long' is illegal


Quote from: Ms Visual C++ NET
char size is 1 bytes
short size is 2 bytes
int size is 4 bytes
long size is 4 bytes
Press any key to continue :D


Quote from: Dev-C++ 4.9.9.0
char size is 1 bytes
short size is 2 bytes
int size is 4 bytes
long size is 4 bytes
long long size is 8 bytes
Drücken Sie eine beliebige Taste . . . :D


So much for that.
Oh, and I didn't want to know how you specify integers in your programs; I wanted to know how the simple int, short and long types are defined.
Okay, thanks anyway

 - Alhexx
Title: Gears updated ^_^
Post by: Cyberman on 2004-12-30 16:10:01
Quote from: Alhexx
Oh, and I didn't want to know how you specify integers in your programs; I wanted to know how the simple int, short and long types are defined.
Okay, thanks anyway

 - Alhexx

Hmmm, well, it's compiler/architecture-dependent. No standard has been given for the format and size of the data save FLOAT and DOUBLE. Those are already worldwide-accepted IEEE standards and people can't twiddle them (although MS seems to have tried several times with extended floating-point types <rolls eyes>).

Cyb
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-30 17:03:07
MS did? The only thing I know of the current MS compiler is the lack of support for extended precision (80-bit floats) as used in intel processors...
Title: Gears updated ^_^
Post by: sfx1999 on 2004-12-31 01:07:16
Isn't a float 64 bits and a double 80?
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-31 03:28:51
Single precision floating-point numbers (float) are 32 bits.
Double precision floating-point numbers (double) are 64 bits (hence "double").
Extended precision floating-point numbers (no standard type name) are 80-bit. This is the size used internally for floating-point processing in x86 CPUs.

The CPUs have support for directly using 80-bit floating-point numbers in your programs, but few higher-level languages have explicit support for it. In Microsoft's case, it's actually explicit non-support; they won't implement an extended data type because there's not enough demand to rationalize the work of exposing a third floating-point precision.
Title: Gears updated ^_^
Post by: halkun on 2004-12-31 04:04:30
Wait, lemme get this straight....
The x87 80-bit coprocessor functions, which have been around since 1980 on the 8087 add-on chip, have no support from Microsoft?

Even though Windows is the dominant OS on the x86?

Wow, the mind boggles at that one.
Title: Gears updated ^_^
Post by: Cyberman on 2004-12-31 04:31:26
It's too bad there is no support for 128-bit floats. I guess the demand for those is highly limited ;)

Yes, Halkun, support for extended precision is a pain. Although MS DID have support for it, they don't any longer; namely, it's not needed. I do think they support BCD numbers, however, and those are 80 bits: a 15 1/2-digit mantissa with sign and a 3 1/2-digit exponent with sign. Good for spreadsheets :)

Cyb
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-31 04:36:41
Well the x87 does all its calculations in 80-bit, it just converts floats and doubles automatically as it reads from/writes to memory. The C library has special functions for the "long double" type which is basically the full 80-bit precision, but ever since the Microsoft compiler moved to Win32, "long double" is resolved to a 64-bit precision instead. The type still exists, but it has the same precision as a normal double. So, using a Microsoft compiler, there is no easy way to work with extended precision numbers. You pretty much have to write inline assembler code to do it.

Speaking of which, once the compiler moves to Win64, support for inline assembler will also be removed (God knows why). Microsoft intends to replace it with compiler intrinsics, i.e. "functions" that correspond to certain instructions but still let the compiler handle most of the details. Anyone who has ever tried using the MMX or SSE intrinsics knows, however, that the code is wildly inefficient (it often creates stack variables for things that could be done using only registers, etc.).
Title: Gears updated ^_^
Post by: sfx1999 on 2004-12-31 04:55:34
I've heard that Microsoft's inline assembler sucked. I heard that it ran slower than C code. NOP NOP I guess.

I have just tested MinGW's size of a long double. It turns out that it is 12 bytes, or 96 bits. 96 bits? WTF? Something's not right; it should be 10.
Title: Gears updated ^_^
Post by: Qhimm on 2004-12-31 12:31:12
I quite like MS's inline assembler, mostly because it's WYCIWYG. Except for when you access C variables from within the inline assembler, you pretty much know exactly what you're getting. The problem with any inline assembler is that the seam between the high-level language and the assembler is tricky and often unoptimized (the compiler has to automatically preserve registers and such), so to get the best of it you're better off writing larger chunks of code (like an entire iteration loop, instead of only the inside of the loop).

I wrote an MMX-enhanced alpha blending loop using MS inline assembler, which, without being particularly biased towards my own code, runs faster than any other software alpha blending routine I've seen on the web. So, certainly not slower than C code, unless you code badly.
Title: Gears updated ^_^
Post by: Cyberman on 2004-12-31 15:50:04
This reminds me of the fact that a number of people developing CODECs no longer use MS's compiler.  It seems MS thinks they know more than the programmer, or something like that.  It's really weird when someone compiles their once-VC-based application with 2 different compilers because MS 'assumed' they knew more about integer SSE, SSE, SSE2, and MMX than the person using their compiler.  There is a big RANT about it on VirtualDub's page; apparently he now writes the codec in C, exports it as assembly, then cleans out the garbage the compiler puts in and cleans up the MMX/SSE/SSE2, etc.  It's a bit of a headache, to say the least.

All I can say is 'Go Microsoft, alienate more people' (sarcastic tone ;) ).

Seriously... I really do not understand MS's bullying techniques on optimization. What's with the 'We'll do it our way' attitude they have?  It's just, well, dismaying.

All right, I've almost got the first conversion layer finished for exporting FF7 models into POV with rotation angles and variables.  However, I'm pretty sure I've got the information wonky. (Cloud is still doing the splits!)

Cyb
Title: Gears updated ^_^
Post by: sfx1999 on 2005-01-01 03:59:40
Does anyone know why my long double was 96-bit?

Also, how would we implement a long long? Would it be a template or something?
Title: Gears updated ^_^
Post by: Micky on 2005-01-01 09:38:14
Quote from: sfx1999
Does anyone know why my long double was 96-bit?

Maybe to align it to the next 32-bit boundary?
Quote
Also, how would we implement a long long? Would it be a template or something?

Visual C has the __int64 type. My personal "types" file uses __int8/16/32/64 for visual C, and char/short/long/longlong on gcc.
Title: Gears updated ^_^
Post by: sfx1999 on 2005-01-02 22:22:34
Quote from: Qhimm
I wrote an MMX-enhanced alpha blending loop using MS inline assembler, which, without being particularly biased towards my own code, runs faster than any other software alpha blending routine I've seen on the web. So, certainly not slower than C code, unless you code badly.


OK, I was thinking about this and it was killing me. Do you use a radix sort?

Then the other question I had was: if you use it to sort polygons, but use MMX, then how would you be able to sort floats? MMX doesn't have floating-point support, and mixing floating point and MMX requires a couple of extra operations and some register unloading.
Title: Gears updated ^_^
Post by: Qhimm on 2005-01-02 22:26:46
I feel there's been some slight miscommunication here. My code only blends two pixels together based on their alpha (opacity) values. It also correctly calculates the opacity of the resulting pixel. It has nothing to do with polygons or sorting, just simple low-level color channel processing (for which MMX-style SIMD instructions are ideal).
Title: Gears updated ^_^
Post by: sfx1999 on 2005-01-02 22:58:00
Oh, I see. I thought it sorted them so you could draw them front to back. I was wrong.