Topic: I need some help with the lzs compress >_<

Gromtar

« on: 2010-08-29 00:55:54 »
Hiho,

Sorry for my bad English, but I'll try my best and hope you guys understand me ;)

I have a little problem with the LZS compression.
I am currently learning C++ (2 days in). I found the QhimmWiki and it seemed interesting, so I put some code together to load the LGPs into memory and dump files out of them to the HD.
So far it has worked well, but now I want to unpack the LZS files and it seems I am too stupid  :'(

To start with, I wanted to walk through all the control bytes and the length byte of each reference to get the total size of the decoded data, but I end up at the wrong input position and don't know why.

Code: [Select]
    int lgp_tools::decodeLZS(int fileID)
    {
        ll_obj->SelectElement(fileID);

        char *data = ll_obj->curr->memPtr;

        int inputPos = 4;
        char controlByte;
        int wSize = 0;  // must start at zero, otherwise the total is garbage

        while(inputPos <= ll_obj->curr->filesize)
        {
            controlByte = data[inputPos];
            inputPos = inputPos + 1;

            for (int i = 0; i <= 7;i++)
            {
                if ((controlByte >> i) & 1)
                {
                    inputPos = inputPos + 1;
                    wSize = wSize + 1;
                }
                else
                {
                    int len = (data[inputPos + 1] & 0xF) + 3;

                    wSize = wSize + len;
                    inputPos = inputPos + 2;
                }
            }

        }

        cout <<  wSize <<"\n";
        return wSize;  // the function is declared int, so it has to return a value
    }

When I dump "ll_obj->curr->memPtr" to the HD with this
Code: [Select]
    void lgp_tools::dumpFile(string path, int fileID)
    {
        ll_obj->SelectElement(fileID);
        ofstream outputFile (path.c_str(), ios::out | ios::binary);
        outputFile.write((char *) ll_obj->curr->memPtr, ll_obj->curr->filesize);
        outputFile.close();
    }

the file has the same content and file size as the file I extracted with highwind, so the input data can't be wrong.

Does anyone see the error? I don't get it  (like I said, it seems I am too stupid :cry: )
« Last Edit: 2010-08-30 02:49:42 by Gromtar »

nfitc1

« Reply #1 on: 2010-08-30 05:05:10 »
Two things right off:

1. Are you sure your compressed data starts at position 4? Maybe the file you're reading doesn't have that long a header.

2. This line: while(inputPos <= ll_obj->curr->filesize)  is bad practice. Don't use "<=" when comparing an index against a size. If inputPos is equal to filesize, reading data[inputPos] is an error, since the size is always one more than the index of the last element. If your program isn't crashing, it's likely reading beyond the end of the data and picking up random garbage. Since data is just a raw pointer with no bounds checking, nothing stops it from doing that. Change it to:
while(inputPos < ll_obj->curr->filesize).

It's possible that your problem is a combination of these two.
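
One way to check the first point is to print a few bytes around the offset where you think the compressed data starts and compare them with what a hex editor shows. A minimal sketch, assuming the file is already in memory; hexPreview is a hypothetical helper, not part of the code above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats up to `count` bytes starting at `offset` as
// space-separated hex, and never reads past the end of the buffer.
std::string hexPreview(const unsigned char* data, std::size_t size,
                       std::size_t offset, std::size_t count)
{
    assert(data != nullptr || size == 0);
    std::string text;
    for (std::size_t i = offset; i < size && i - offset < count; ++i) {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(data[i]));
        if (!text.empty()) text += ' ';
        text += buf;
    }
    return text;
}
```

If the bytes at the chosen offset don't look like a control byte followed by plausible data, the header length is probably off.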

Other than that, your for loop looks fine. It passes some test cases and ends on the next control byte:

FC 12 34 56 78 9A BC CD EF 01 23 DF
7F 12 34 56 78 9A BC CD EF 01 EF
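
Those two sequences can be checked mechanically. A small sketch (the name bytesAfterControl is made up for illustration) that counts how many input bytes follow a control byte when all eight items are present, with a set bit meaning one literal byte and a clear bit meaning a two-byte reference, as in the loop above:

```cpp
#include <cassert>
#include <cstdint>

// Counts the input bytes that follow a control byte when the block is complete:
// a set bit is one literal byte, a clear bit is a two-byte reference.
int bytesAfterControl(std::uint8_t control)
{
    int total = 0;
    for (int bit = 0; bit < 8; ++bit) {
        total += ((control >> bit) & 1) ? 1 : 2;
    }
    return total;
}
```

With this, 0xFC gives 10 and 0x7F gives 9, which matches the number of bytes between the control bytes in the two lines above.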

Micky

« Reply #2 on: 2010-08-30 16:05:10 »
I recommend taking either Akari's or my implementation and comparing the output. I found the problems in my algorithm by comparing it with Akari's code (I was interpreting the length word as compressed data).

Gromtar

« Reply #3 on: 2010-08-30 20:50:20 »
Thanks, you two, now it seems to work.
The problem was that I got the first control byte wrong  :-[
The file header inside the LGP is 24 bytes and I used 20 + 4 for the LZS, so I had it 4 bytes too short  :roll:
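
For reference, the offset arithmetic can be written down explicitly. This is only a sketch: the 24-byte entry header comes from the numbers above, and the assumption that the 4-byte LZS header is a little-endian length of the compressed data comes from the QhimmWiki description, so check it against your own files. readLE32 and firstControlByteOffset are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLgpEntryHeaderSize = 24;  // per-file header inside the LGP
constexpr std::size_t kLzsHeaderSize = 4;        // assumed: length of the compressed data

// Reads a 32-bit little-endian value from four consecutive bytes.
std::uint32_t readLE32(const unsigned char* p)
{
    assert(p != nullptr);
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Offset of the first control byte, counted from the start of the LGP entry.
std::size_t firstControlByteOffset(std::size_t entryStart)
{
    return entryStart + kLgpEntryHeaderSize + kLzsHeaderSize;
}
```

Whether the 24 bytes have to be skipped at all depends on whether the buffer already starts after the LGP entry header; if it does, only the 4-byte LZS header remains.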
« Last Edit: 2010-08-30 23:31:45 by Gromtar »

Bosola

« Reply #4 on: 2010-08-30 23:16:09 »
Quote from: NFITC1
2. This line: while(inputPos <= ll_obj->curr->filesize)  is bad practice. Don't use "<=" when comparing an index against a size. If inputPos is equal to filesize, reading data[inputPos] is an error, since the size is always one more than the index of the last element. If your program isn't crashing, it's likely reading beyond the end of the data and picking up random garbage. Since data is just a raw pointer with no bounds checking, nothing stops it from doing that. Change it to:
while(inputPos < ll_obj->curr->filesize).

What (I assume) NFITC1 is getting at here is that inputPos is zero-based, so byte one is element zero, and therefore length = element size * (no of elements + 1). Consider:

Byte: FF / FF / FF
Pos: 00 / 01 / 02

If inputPos reaches the number of bytes (3), it'll overshoot.

What happens then? Well, if you're working with raw pointers, it depends on what sits next in memory. Reading past the end is undefined behaviour: you might get zeros, you might get leftover garbage (malloc does not clear the block it hands you; calloc does), or the program might crash.
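
A minimal sketch of guarding the read, assuming the data lives in a std::vector rather than behind a raw pointer; readByte is a made-up helper for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reads data[pos] into `value` only if pos is a valid index.
// Valid indices run from 0 to size() - 1, so pos == size() is already too far.
bool readByte(const std::vector<unsigned char>& data, std::size_t pos,
              unsigned char& value)
{
    if (pos >= data.size()) {
        return false;
    }
    value = data[pos];
    return true;
}
```

For the three-byte example above, positions 0 to 2 succeed and position 3 is rejected instead of reading whatever happens to follow the buffer.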

Also, something you appear not to have seen yet in C++ - you don't need to use

Code: [Select]
nVar = nVar + 1;

You could use

Code: [Select]
nVar += 1;

Or even better

Code: [Select]
nVar++;
++nVar;

That last form is the increment operator. The position of the pluses matters: if you put them after (post-increment), the expression returns the current value of nVar and then adds one; if you put them before (pre-increment), it adds one first and then returns the result.

Consider

Code: [Select]
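#include <iostream>
using namespace std;

void somecrummyfunction()
{
    int nVar = 3;
    cout << nVar << endl;    // prints the starting value
    cout << ++nVar << endl;  // pre-increment: adds one, then prints
    cout << nVar++ << endl;  // post-increment: prints, then adds one
    cout << nVar << endl;    // prints the value after both increments
}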

This will output

Code: [Select]
3
4
4
5
« Last Edit: 2010-08-30 23:17:44 by Bosola »

Gromtar

« Reply #5 on: 2010-08-31 02:11:12 »
Morning  ;),

OK, now it works fine, only a little thing is off  ???

Code: [Select]
    int lgp_tools::decodeLZS(int fileID)
    {
        ll_obj->SelectElement(fileID);

        unsigned char *data = ll_obj->curr->memPtr;
        unsigned char controlByte  = 0;
        unsigned int input_offset = 4;
        unsigned int output_offset = 0;


        unsigned char* output_buffer = (unsigned char*) malloc(ll_obj->curr->filesize * 9);

        while (input_offset < ll_obj->curr->filesize)
        {
            controlByte = data[input_offset++];

            for (int i = 0; i < 8;i++)
            {
                if((controlByte >> i) & 1)
                {
                    output_buffer[output_offset++] = data[input_offset++];
                }
                else
                {
                    unsigned char ref1 = data[input_offset++];
                    unsigned char ref2 = data[input_offset++];

                    unsigned short RefLength = (ref2 & 0xF) + 3;
                    unsigned short ref_offset = ref1 + ((ref2 & 0xF0) << 4);

                    int real_offset = output_offset - ((output_offset - 18 - ref_offset) & 0xFFF);

                    for(int x = 0; x < RefLength; x++)
                    {
                        if (real_offset  + x < 0)
                        {
                             output_buffer[output_offset++] = 0;
                        }
                        else
                        {
                             output_buffer[output_offset++] = output_buffer[real_offset + x];
                        }
                    }
                }
            }
        }

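        ll_obj->curr->filesize = output_offset;
        // The release must match the original allocation: delete[] for new[], free() for malloc().
        delete(ll_obj->curr->memPtr);
        // Shrink the output buffer to the decoded size and hand it over.
        ll_obj->curr->memPtr = (unsigned char*)realloc(output_buffer, output_offset);
        return output_offset;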
    }

The decoded data is the same as the data I got with highwind, except that I get 20-30 bytes too many after the "FINAL FANTASY7" tag  :-o
It seems it runs past the last control byte  :(

« Last Edit: 2010-08-31 08:35:41 by Gromtar »

Gemini

« Reply #6 on: 2010-08-31 11:48:09 »
I'm not exactly sure what your problem with this algorithm is, but I'd suggest you implement a proper ring buffer, just in case. Or you could take Haruhiko Okumura's LZSS implementation (the one FF7's LZS is based on) and alter it to use buffers (like I did for my tools).
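
A rough sketch of what the ring buffer variant could look like. This is not the code from my tools; it assumes the usual parameters (4096-byte window, first write position 0xFEE, window initially zero-filled, reference length in the low nibble of the second byte plus 3), and decodeLzsRing is a made-up name:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes LZS data starting at `start` (the first control byte), keeping the
// last 4096 output bytes in a ring buffer that references are read from.
std::vector<std::uint8_t> decodeLzsRing(const std::vector<std::uint8_t>& in,
                                        std::size_t start)
{
    assert(start <= in.size());
    std::array<std::uint8_t, 4096> ring{};  // zero-filled window
    std::size_t ringPos = 0xFEE;            // assumed initial write position
    std::vector<std::uint8_t> out;
    std::size_t pos = start;

    while (pos < in.size()) {
        const std::uint8_t control = in[pos++];
        // Stop early if the final block has fewer than eight items.
        for (int bit = 0; bit < 8 && pos < in.size(); ++bit) {
            if ((control >> bit) & 1) {
                const std::uint8_t value = in[pos++];  // literal byte
                out.push_back(value);
                ring[ringPos] = value;
                ringPos = (ringPos + 1) & 0xFFF;
            } else {
                if (in.size() - pos < 2) {  // truncated reference
                    pos = in.size();
                    break;
                }
                const std::uint8_t ref1 = in[pos++];
                const std::uint8_t ref2 = in[pos++];
                const std::size_t offset = ref1 | ((ref2 & 0xF0u) << 4);
                const std::size_t length = (ref2 & 0x0Fu) + 3;
                for (std::size_t k = 0; k < length; ++k) {
                    const std::uint8_t value = ring[(offset + k) & 0xFFF];
                    out.push_back(value);
                    ring[ringPos] = value;
                    ringPos = (ringPos + 1) & 0xFFF;
                }
            }
        }
    }
    return out;
}
```

The point of the ring buffer is that the 12-bit offset from the file can be used as an index directly, so there is no offset arithmetic to get wrong and no special case for references that point before the start of the output.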

nfitc1

« Reply #7 on: 2010-08-31 13:19:52 »
Quote from: Bosola
What (I assume) NFITC1 is getting at here is that inputPos is zero-based, so byte one is element zero, and therefore length = element size * (no of elements + 1). Consider:

Close, but not quite.
size = element_size * no_of_elements
length = no_of_elements
last_index = no_of_elements - 1

last_index = length;
some_operation( data[last_index] );

will always read one element past the end of the data, which is an out-of-bounds access.
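
The three quantities can be spelled out for a concrete array. A small sketch; Extent and describe are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size in bytes, number of elements, and highest valid index of an array.
struct Extent {
    std::size_t sizeInBytes;
    std::size_t length;
    std::size_t lastIndex;
};

template <typename T, std::size_t N>
Extent describe(const T (&)[N])
{
    static_assert(N > 0, "an empty array has no last index");
    return Extent{sizeof(T) * N, N, N - 1};
}
```

For an array of three 16-bit values this gives a size of 6 bytes, a length of 3 and a last index of 2, so indexing with the length is always one too far.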

ADDENDUM:
Be advised that the final control block in an LZS file may not be complete. Files rarely end exactly on a block boundary. So the decompressed file might end with

... 12 58 47 69 31 35 77 85

but the final block in the compressed file might be

F9 12 34 50 40 90 85 EOF

Then your control byte is still "expecting" four more bytes, but you've reached the end of the file. You'll have to stop counting once you get to this point.

You might want to change your initial for loop line to
for (int i = 0; i < 8 && inputPos < ll_obj->curr->filesize; i++)

Then the for loop exits when inputPos == ll_obj->curr->filesize, i.e. when you've reached the end of the compressed file. The while loop condition then fails too and kicks you out of that loop as well.
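
The example block above can be run through a size-counting loop with that extra condition. This is a sketch under the same assumptions as the original code (set bit = one literal byte, clear bit = two-byte reference whose length is the low nibble of the second byte plus 3); decodedSize is a made-up name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the decoded size of LZS data whose first control byte is at `start`,
// stopping cleanly when the final block has fewer than eight items.
std::size_t decodedSize(const std::vector<std::uint8_t>& in, std::size_t start)
{
    assert(start <= in.size());
    std::size_t pos = start;
    std::size_t total = 0;

    while (pos < in.size()) {
        const std::uint8_t control = in[pos++];
        for (int bit = 0; bit < 8 && pos < in.size(); ++bit) {
            if ((control >> bit) & 1) {
                ++pos;    // literal: one input byte, one output byte
                ++total;
            } else {
                if (in.size() - pos < 2) {  // truncated reference
                    pos = in.size();
                    break;
                }
                total += (in[pos + 1] & 0x0F) + 3;
                pos += 2;
            }
        }
    }
    return total;
}
```

For F9 12 34 50 40 90 85 this counts one literal, two references of three bytes each and one more literal, i.e. 8 bytes, and then stops instead of consuming four bytes that are not there.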


PS -

Even though you're not using it now:

Code: [Select]
int len = (data[inputPos + 1] & 0xF) + 3;
wSize = wSize + len;

Some developers (like me) consider it bad practice to declare a single-use variable inside a loop like that. Since len was only used in the line following, it's simpler to just do

Code: [Select]
wSize = wSize + (data[inputPos + 1] & 0xF) + 3;
then there is no extra name to keep track of. len is only an automatic local variable, so nothing actually needs cleaning up here, but good practices on simple things will carry over to larger projects.

Bosola

« Reply #8 on: 2010-08-31 15:10:06 »
Quote from: NFITC1
Quote from: Bosola
What (I assume) NFITC1 is getting at here is that inputPos is zero-based, so byte one is element zero, and therefore length = element size * (no of elements + 1). Consider:

Close, but not quite.
size = element_size * no_of_elements
length = no_of_elements
last_index = no_of_elements - 1

last_index = length;
some_operation( data[last_index] );

will always read one element past the end of the data, which is an out-of-bounds access.

Oh. I see. You mean that because of the mixup, the last read lands on memory that isn't part of the data.

Gromtar

« Reply #9 on: 2010-08-31 18:29:43 »
Quote from: NFITC1
for (int i = 0; i < 8 && inputPos < ll_obj->curr->filesize; i++)

That did it  :-D now it works fine, thanks.

And thanks to all for the useful tips :)

Omzy

« Reply #10 on: 2010-09-02 18:28:33 »
Gemini, I was looking through your Package Beta 2 Release for your lzss algorithm and I found this file: main_org.asm.  :wink:

Also, does anyone have a standalone LZSS decompression algorithm that works? I was looking at yours, Gemini, and it has so many dependencies that it made my head spin.
« Last Edit: 2010-09-02 18:38:51 by Omzy »

nfitc1

« Reply #11 on: 2010-09-02 19:02:24 »
Quote from: Omzy
Also, does anyone have a standalone LZSS decompression algorithm that works? I was looking at yours, Gemini, and it has so many dependencies that it made my head spin.

I think I have one. I'll have to dig it out and it's a .NET library file. That cool? I could give you the source (in VB.NET) if you needed that instead.

Omzy

« Reply #12 on: 2010-09-02 19:03:08 »
I'm using Visual C++, could I use that to call the function?

Edit: oh, yes, I was looking for the source code. I just want to drop it into my program and see how it works so I can debug mine better.
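
For comparison purposes, something along these lines is roughly what I'm after: a single function with no dependencies beyond the standard library. This is only a sketch pieced together from the offset formula posted earlier in this thread, not anyone's released code. It assumes the input is a whole LZS file (a 4-byte little-endian length of the compressed data, then the data), and decodeLzsFile is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone decoder that copies references straight out of the output
// written so far, instead of keeping a separate ring buffer.
std::vector<std::uint8_t> decodeLzsFile(const std::vector<std::uint8_t>& file)
{
    std::vector<std::uint8_t> out;
    if (file.size() < 4) return out;

    const std::uint32_t declared = static_cast<std::uint32_t>(file[0])
                                 | (static_cast<std::uint32_t>(file[1]) << 8)
                                 | (static_cast<std::uint32_t>(file[2]) << 16)
                                 | (static_cast<std::uint32_t>(file[3]) << 24);
    // Never trust the header beyond what is actually in memory.
    const std::size_t end = 4 + std::min<std::size_t>(declared, file.size() - 4);
    std::size_t pos = 4;

    while (pos < end) {
        const std::uint8_t control = file[pos++];
        for (int bit = 0; bit < 8 && pos < end; ++bit) {
            if ((control >> bit) & 1) {          // literal byte
                out.push_back(file[pos++]);
                continue;
            }
            if (end - pos < 2) {                 // truncated reference
                pos = end;
                break;
            }
            const std::uint8_t ref1 = file[pos++];
            const std::uint8_t ref2 = file[pos++];
            const std::int64_t refOffset = ref1 | ((ref2 & 0xF0) << 4);
            const int length = (ref2 & 0x0F) + 3;

            const std::int64_t outPos = static_cast<std::int64_t>(out.size());
            std::int64_t distance = (outPos - 18 - refOffset) & 0xFFF;
            // Assumption: a distance of zero means a full window (4096) back.
            if (distance == 0) distance = 4096;
            const std::int64_t source = outPos - distance;

            for (int k = 0; k < length; ++k) {
                const std::int64_t index = source + k;
                assert(index < static_cast<std::int64_t>(out.size()));
                // Positions before the start of the output read as zero.
                const std::uint8_t value =
                    index < 0 ? std::uint8_t{0} : out[static_cast<std::size_t>(index)];
                out.push_back(value);
            }
        }
    }
    return out;
}
```

Running the same input through this and through a known-good tool, then diffing the output, should show the first byte where the two disagree.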

Bosola

« Reply #13 on: 2010-09-02 23:05:12 »
You can find out more about LZS here, too: http://wiki.qhimm.com/FF7/LZS_format

Omzy

« Reply #14 on: 2010-09-02 23:40:32 »
Yeah, that's the basis for everything, lol.