Author Topic: LZ48/LZ49 Z80 cruncher/decruncher  (Read 3242 times)

0 Members and 1 Guest are viewing this topic.

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
LZ48/LZ49 Z80 cruncher/decruncher
« on: 18:38, 18 September 16 »


update!

new cruncher variant LZ49, crunch better (average 10%) when used with more data
bugfix both crunchers when match/literal length equal to 270+255*n
bugfix Z80 cruncher: wrong size displayed when using more than one bank
bugfix Z80 cruncher: almost infinite loop when last key end in the last 3 bytes, or end the data

-----------



Hi!
I wrote a new compression format, freely inspired by LZ4 format (thanks to Yann Collet again)
The few modifications suits well to the Z80
The format was designed and modified to fit the decompression routine
Now it's optimised for size and speed
The decompression routine fit in 83 bytes and there is an ultra-short version (you lose some Nops)
Crunching routine is a little slow as matchkey may have an unlimited length
I will release in a few weeks max a C source code if people want to cross-crunch


Here is the format


 Token definition: contains both literal and match length information
4 high-bits: literal length 0-15
if length=15 then read one Literal length extended byte (value 0-255) and add to current literal length
if literal length extended byte=255 then read another one, etc.

 
4 low-bits: match length 0-15
if length=15 then read one Match length extended byte (value 0-255) and add to current match length
if match length extended byte=255 then read another one, etc.

 
THEN add 3 to the final match length, because it is the minimal length

 

 
Offset definition: absolute distance to match value -1
if you want to copy from previous byte, then offset value is 0
if you want to copy 5 bytes far, then offset value is 4
if you want to copy 255 bytes far, then offset value is 254
if you want to END the file, then offset value is 255

 
I will release soon (don't know when) a C source file with an brutforce key search, design to run in a computer with moar memory ;)

You may play with crunching/decrunching routine

The crunching routine read data to compress from central memory and put compressed data in extension C4/C5/C6/C7


---------------------



Changes:

1st high-bit of token is an extra offset range (set when offset>255)
« Last Edit: 16:13, 23 May 17 by roudoudou »
use RASM, the best assembler ever made :p

I will survive

Offline SRS

  • Supporter
  • 464 Plus
  • *
  • Posts: 494
  • Country: de
  • Schneider CPC464 - what else ?
  • Liked: 471
Re: LZ48 cruncher/decruncher
« Reply #1 on: 00:27, 19 September 16 »
Now this will be cool - a x86 cruncher (to work with sdcc etc.)  and and small and fast decruncher on CPC.

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48 cruncher/decruncher
« Reply #2 on: 22:23, 19 September 16 »
Now this will be cool - a x86 cruncher (to work with sdcc etc.)  and and small and fast decruncher on CPC.


Here is the C source file, i tested it a little, should work (hope so)


« Last Edit: 22:42, 19 September 16 by roudoudou »
use RASM, the best assembler ever made :p

I will survive

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48 cruncher/decruncher
« Reply #3 on: 12:31, 20 September 16 »
I noticed a bug when literal or match length is 270+255*n with crunchers. Will fix this.
« Last Edit: 12:33, 20 September 16 by roudoudou »
use RASM, the best assembler ever made :p

I will survive

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48 cruncher/decruncher
« Reply #4 on: 14:42, 20 September 16 »
I noticed a bug when literal or match length is 270+255*n with crunchers. Will fix this.


First post updated with fixed versions of Z80 & C cruncher


I also found a bug with displayed output size (Z80 cruncher) when output size uses more than one BANK. Fixed too.


Happy crunching!
use RASM, the best assembler ever made :p

I will survive

Offline Gryzor

  • Administrator
  • 6128 Plus
  • *****
  • Posts: 14.387
  • Country: gr
  • CPC-Wiki maintainer
    • CPCWiki
  • Liked: 2593
Re: LZ48 cruncher/decruncher
« Reply #5 on: 14:58, 20 September 16 »
Spoiler: show
Screenshot?

Offline SRS

  • Supporter
  • 464 Plus
  • *
  • Posts: 494
  • Country: de
  • Schneider CPC464 - what else ?
  • Liked: 471
Re: LZ48 cruncher/decruncher
« Reply #6 on: 21:54, 20 September 16 »
This works with dev-cpp and win64:

Code: [Select]
/* LZ48 simple C source file / crappy version by roudoudou - Flower Corp. 2016 */

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<malloc.h>

int LZ48_encode_extended_length(unsigned char *odata, int length) {
    int ioutput=0;

    while (length>=255) {
        odata[ioutput++]=0xFF;
        length-=255;
    }
    /* if the last value is 255 we must encode 0 to end extended length */
    odata[ioutput++]=length;
    return ioutput;
}

int LZ48_encode_block(unsigned char *odata,unsigned char *data, int literaloffset,int literalcpt,int offset,int maxlength) {
    int ioutput=1;
    int token=0;
    int i;

    if (offset<0 || offset>255) {
        fprintf(stderr,"internal offset error!");
        exit(-2);
    }

    if (literalcpt<15) {
        token=literalcpt<<4;
    } else {
        token=0xF0;
        ioutput+=LZ48_encode_extended_length(odata+ioutput,literalcpt-15);
    }

    for (i=0; i<literalcpt; i++) odata[ioutput++]=data[literaloffset++];

    if (maxlength<18) {
        if (maxlength>2) {
            token|=(maxlength-3);
        } else {
            /* endoffset has no length */
        }
    } else {
        token|=0xF;
        ioutput+=LZ48_encode_extended_length(odata+ioutput,maxlength-18);
    }

    odata[ioutput++]=offset-1;

    odata[0]=token;
    return ioutput;
}

unsigned char *LZ48_encode(unsigned char *data, int length, int *retlength) {
    int i,startscan,current=1,token,ioutput=1,curscan;
    int maxoffset,maxlength,matchlength,literal=0,literaloffset=1;
    unsigned char *odata=NULL;

    if (!length) return NULL;

    odata=(unsigned char*) malloc(length*1.1);
    if (!odata) {
        fprintf(stderr,"memory full");
        exit(-1);
    }

    /* first byte always literal */
    odata[0]=data[0];

    /* force short data encoding */
    if (length<5) {
        token=(length-1)<<4;
        odata[ioutput++]=token;
        for (i=1; i<length; i++) odata[ioutput++]=data[current++];
        odata[ioutput++]=0xFF;
        *retlength=ioutput;
        return odata;
    }

    while (current<length) {
        maxlength=0;
        startscan=current-255;
        if (startscan<0) startscan=0;
        while (startscan<current) {
            matchlength=0;
            curscan=current;
            for (i=startscan; curscan<length; i++) {
                if (data[i]==data[curscan++]) matchlength++;
                else break;
            }
            if (matchlength>=3 && matchlength>maxlength) {
                maxoffset=startscan;
                maxlength=matchlength;
            }
            startscan++;
        }
        if (maxlength) {
            ioutput+=LZ48_encode_block(odata+ioutput,data,literaloffset,literal,current-maxoffset,maxlength);
            current+=maxlength;
            literaloffset=current;
            literal=0;
        } else {
            literal++;
            current++;
        }
    }
    ioutput+=LZ48_encode_block(odata+ioutput,data,literaloffset,literal,0,0);
    *retlength=ioutput;
    return odata;
}

void LZ48_decode(unsigned char *data, unsigned char *odata) {
    int HL=0,DE=0;
    int literallength,matchlength,HLmatch;

    odata[DE++]=data[HL++];

    while (1) {
        literallength=(data[HL] & 0xF0)>>4;
        matchlength=(data[HL++] & 0xF);

        if (literallength==15) {
            while (data[HL]==255) {
                literallength+=255;
                HL++;
            }
            literallength+=data[HL++];
        }

        while (literallength>0) {
            odata[DE++]=data[HL++];
            literallength--;
        }

        /* matchkey */
        if (matchlength==15) {
            while (data[HL]==255) {
                matchlength+=255;
                HL++;
            }
            matchlength+=data[HL++];
        }
        matchlength+=3;
        if (data[HL]==0xFF) return;
        else HLmatch=DE-(data[HL++]+1);

        while (matchlength) {
            odata[DE++]=odata[HLmatch++];
            matchlength--;
        }
    }
}

unsigned char * LZ48_decrunch(unsigned char *data, int *osize) {
    int literallength,matchlength;
    int HL=1,DE=1;
    unsigned char *odata=NULL;

    while (1) {
        literallength=(data[HL] & 0xF0)>>4;
        matchlength=(data[HL++] & 0xF);
        if (literallength==15) {
            while (data[HL]==255) {
                literallength+=255;
                HL++;
            }
            literallength+=data[HL++];
        }

        DE+=literallength;
        HL+=literallength;

        /* matchkey */
        if (matchlength==15) {
            while (data[HL]==255) {
                matchlength+=255;
                HL++;
            }
            matchlength+=data[HL++];
        }
        matchlength+=3;
        if (data[HL]==0xFF) break;
        else HL++;

        DE+=matchlength;
    }
    *osize=DE;
    odata=(unsigned char*)malloc(*osize);
    if (!odata) {
        fprintf(stderr,"memory full\n");
        exit(-1);
    }
    LZ48_decode(data,odata);
    return odata;
}

void Usage() {
    printf("usage: lz48 <options>\n");
    printf("options are:\n");
    printf("-i <inputfile>\n");
    printf("-o <outputfile>\n");
    printf("-b hexatext output (enable if you forget outputfile)\n");
    printf("-d decrunch\n");
    printf("\n");
    exit(0);
}

int ParseOptions(char **argv,int argc, char **inputfilename, char **outputfilename, int *hexoutput, int *crunch) {
    int i=0;

    if (argv[i][0]=='-') {
        switch(argv[i][1]) {
            case 'O':
            case 'o':
                if (i+1<argc) *outputfilename=argv[i+1];
                i++;
                break;
            case 'I':
            case 'i':
                if (i+1<argc) *inputfilename=argv[i+1];
                i++;
                break;
            case 'D':
            case 'd':
                *crunch=0;
                break;
            case 'B':
            case 'b':
                *hexoutput=1;
                break;
            default:
                Usage();
        }
    } else
        Usage();
    return i;
}

/*
 *     GetParametersFromCommandLine
 *         retrieve parameters from command line and fill pointers to file names
 *         */
void GetParametersFromCommandLine(int argc, char **argv, char **inputfilename, char **outputfilename, int *hexoutput, int *crunch) {
#undef FUNC
#define FUNC "GetParametersFromCommandLine"
    int i;

    for (i=1; i<argc; i++)
        i+=ParseOptions(&argv[i],argc-i,inputfilename,outputfilename,hexoutput,crunch);

    if (!*inputfilename) Usage();
    if (!*outputfilename) *hexoutput=1;
}



int main(int argc, char **argv) {
    unsigned char *data=NULL,*newdata=NULL;
    char *inputfilename=NULL,*outputfilename=NULL;
    int hexoutput=0,isize,crunch=1;
    int newsize,i,cr;
    FILE *fin=NULL,*fout=stdout;

    fprintf(stderr,"LZ48 cruncher / roudoudou - Flower Corp. 2016\n");

    GetParametersFromCommandLine(argc,argv,&inputfilename,&outputfilename,&hexoutput,&crunch);

    fin=fopen(inputfilename,"rb");
    if (!fin) Usage();
    fseek(fin,0,SEEK_END);
    isize=ftell(fin);
    fseek(fin,0,SEEK_SET);
    data=(unsigned char*)                                                                                                                                                                                                                                               malloc(isize);
    if (!data) {
        fprintf(stderr,"memory full\n");
        exit(-1);
    }
    if (fread(data,1,isize,fin)!=isize) {
        fprintf(stderr,"read error\n");
        exit(-1);
    }
    fclose(fin);
    switch (crunch) {
        case 0:
            newdata=LZ48_decrunch(data,&newsize);
            break;
        case 1:
            newdata=LZ48_encode(data,isize,&newsize);
            break;
    }
    printf("input: %d output: %d\n",isize,newsize);
    if (hexoutput) {
        for (i=cr=0; i<newsize; i++) {
            if (!cr) {
                printf("DEFB %02X",newdata[i]);
                cr=1;
            } else {
                if (cr==15) {
                    printf(",%02X\n",newdata[i]);
                    cr=0;
                } else {
                    printf(",%02X",newdata[i]);
                    cr++;
                }
            }
        }
        if (cr) printf("\n");
    } else {
        fout=fopen(outputfilename,"wb");
        if (!fout) Usage;
        if (fwrite(newdata,1,newsize,fout)!=newsize) {
            fprintf(stderr,"write error\n");
            exit(-1);
        }
        fclose(fout);
    }
   return 0;
}


Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48 cruncher/decruncher
« Reply #7 on: 21:45, 26 September 16 »
bugfix Z80 cruncher: almost infinite loop when last key end in the last 3 bytes, or end the data

new release in the first post v017
use RASM, the best assembler ever made :p

I will survive

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48/LZ49 cruncher/decruncher
« Reply #8 on: 22:55, 26 October 16 »
first post updated


here is a variant of LZ48 -> LZ49 with 9bits offset


decrunch is a very little nops slower and there is an average 10% compression gain


two new sources: lz49crunch & lz49decrunch in Z80 assembly


C version will follow






I'm currently working on evolutions for LZ48 & LZ49 (it's alread programmed, i have to test it first!)


This is:
- constant time execution for partial decrunching (nop precise)
- maximum time execution for partial decrunching (maximum in nops)

- constant time execution for stream decrunching
- maximum time execution for stream decrunching
- constant data size for stream decrunching
- constant crunched size for stream decrunching


Have fun!
use RASM, the best assembler ever made :p

I will survive

Offline Arnaud

  • Supporter
  • 464 Plus
  • *
  • Posts: 385
  • Country: fr
  • Liked: 268
Re: LZ48/LZ49 cruncher/decruncher
« Reply #9 on: 22:07, 11 November 16 »
Here a version compatible with SDCC / CPCTelera :
* lz48_bin.zip
(6.13 kB - downloaded 133 times)


And exe Compressor / Decompressor for Windows :
 [ Invalid Attachment ]

@roudoudou : Can i adapt for WinCPCTelera the C decruncher version of your LZ248 (under GitHub) ?
« Last Edit: 09:48, 12 November 16 by Arnaud »

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48/LZ49 cruncher/decruncher
« Reply #10 on: 17:40, 12 November 16 »
@Arnaud it will be a pleasure!
use RASM, the best assembler ever made :p

I will survive

Offline keith56

  • ちび悪魔!
  • Supporter
  • 464 Plus
  • *
  • Posts: 403
  • Country: jp
  • Part Ma, Part Aku... All Chibi!
    • Chibi Akuma(s)
  • Liked: 685
Re: LZ48/LZ49 cruncher/decruncher
« Reply #11 on: 05:30, 18 January 17 »
Just did a quick test on this! it looks excellent! it compressed the 17k loading screen of my game down to 9k! Thanks for making such a great project!
Chibi Akuma(s) Comedy-Horror 8-bit Bullet Hell shooter for CPC - http://www.chibiakumas.com
「チビ悪魔」可笑しいゴシックSTG: http://www.chibiakuma.com
Learn Z80 Assembly programming for CPC,Speccy,MSX + More with my Text+Videos Tutorials:http://www.chibiakumas.com/z80/

Offline GUNHED

  • 464 Plus
  • *****
  • Posts: 477
  • Country: de
  • Reincarnation of TFM
  • Liked: 218
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #12 on: 23:55, 25 April 18 »
Very nice! Will use it soon and report how it works :-)

http://futureos.de --> Get the revolutionary FutureOS
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-)

Offline GUNHED

  • 464 Plus
  • *****
  • Posts: 477
  • Country: de
  • Reincarnation of TFM
  • Liked: 218
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #13 on: 16:28, 09 August 18 »
Hi I compressed a file with:


- Exomizer: 16,1 KB
- CPCT......: 21 KB
- BitBuster: 21 KB
- LZ49 .....: 22,7 KB


Well, I just need some cruncher to get under 16 KB. Hmpf!

http://futureos.de --> Get the revolutionary FutureOS
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-)

Online ComSoft6128

  • ..................................
  • Supporter
  • 6128 Plus
  • *
  • Posts: 523
  • Country: scotland
  • CPC THEN CPC NOW
    • index.php?action=treasury
  • Liked: 380
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #14 on: 16:40, 09 August 18 »
Hi GUNHED,

This is from a long time ago but once or twice I used a CPM utility called "DU" or "Crunch" (?) to compress files, don't know how efficient it was/is, like I say it was a long long time ago.

Cheers,

Peter
« Last Edit: 17:05, 09 August 18 by ComSoft6128 »

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #15 on: 09:48, 10 August 18 »
Hi I compressed a file

Well, I just need some cruncher to get under 16 KB. Hmpf!
What kind of file?
If you want to beat a generic cruncher you must use dedicated features. Or use the best cruncher which is very slow to decrunch sadly...
use RASM, the best assembler ever made :p

I will survive

Offline GUNHED

  • 464 Plus
  • *****
  • Posts: 477
  • Country: de
  • Reincarnation of TFM
  • Liked: 218
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #16 on: 17:00, 10 August 18 »
Well, I just need to get this file inside 16 KB... So I tried what it probably the best four choices for the CPC. However it's different for any file (I guess).  :)
It's also good to see that some CPC crunchers, still beat some PC programs  ;D


I'll attach the file I did use, if there is interest...

http://futureos.de --> Get the revolutionary FutureOS
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-)

Offline reidrac

  • Supporter
  • 6128 Plus
  • *
  • Posts: 587
  • Country: gb
  • Trying to gamedev!
    • index.php?action=treasury
    • usebox.net
  • Liked: 1036
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #17 on: 12:34, 11 August 18 »
Well, I just need to get this file inside 16 KB... So I tried what it probably the best four choices for the CPC. However it's different for any file (I guess).  :)
It's also good to see that some CPC crunchers, still beat some PC programs  ;D


I'll attach the file I did use, if there is interest...

If you haven't checked it, look at Einar's zx7. It has the best compression on that family and it has very efficient decompressors supporting Z80.
Released The Return of Traxtor, Golden Tail and Magica for the CPC.

Offline GUNHED

  • 464 Plus
  • *****
  • Posts: 477
  • Country: de
  • Reincarnation of TFM
  • Liked: 218
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #18 on: 17:37, 12 August 18 »
Well, it just gets down to 20 KB. So Exomzer with 16,1 KB ist still the best, but only WinRAN can get it under 16 KB. Well, I'm afraid there is no WinRAR decompressor for the CPC.

http://futureos.de --> Get the revolutionary FutureOS
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-)

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #19 on: 18:37, 12 August 18 »
Well, it just gets down to 20 KB. So Exomzer with 16,1 KB ist still the best, but only WinRAN can get it under 16 KB. Well, I'm afraid there is no WinRAR decompressor for the CPC.
15240 bytes with Shrinkler but you should consider the very slow decrunching time
use RASM, the best assembler ever made :p

I will survive

Offline GUNHED

  • 464 Plus
  • *****
  • Posts: 477
  • Country: de
  • Reincarnation of TFM
  • Liked: 218
Re: LZ48/LZ49 Z80 cruncher/decruncher
« Reply #20 on: Yesterday at 12:54 »
15240 bytes with Shrinkler but you should consider the very slow decrunching time


Oh, thanks for the information - I will try immediately. Yes decrunching is slow, also very slow for Exomizer. LZ49 is far better (also CPCT is a well tool).  :) :) :)


EDIT: Sorry for getting offtopic here, anybody got a link for Shrinkler newest version?

« Last Edit: Yesterday at 12:56 by GUNHED »
http://futureos.de --> Get the revolutionary FutureOS
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-)

Offline roudoudou

  • 6128 Plus
  • ******
  • Posts: 502
  • Country: fr
    • urban exploration
  • Liked: 655
use RASM, the best assembler ever made :p

I will survive