LZ48/LZ49 Z80 cruncher/decruncher

Started by roudoudou, 18:38, 18 September 16


uniabis



Binary of LZ49 Windows compressor & fixed LZ48 Windows compressor.


lz48_bin.zip and lz48_c_source.zip

int LZ48_encode_extended_length(unsigned char *odata, int length)
{
   int ioutput=0;
   while (length>255) {



lz48_v002.zip

   while (length>=255) {
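
The change from `>` to `>=` matters if the extended length is stored LZ4-style, as a run of 255-valued bytes closed by one byte below 255. That layout is an assumption here, since the full function body is not shown in this post, and `encode_ext_len` / `decode_ext_len` are hypothetical names used only for illustration. Under that assumption, a length of exactly 255 has to be written as 255 followed by 0; with `>` it would be written as a lone 255, which a decoder reads as "more length bytes follow". A minimal sketch:

```c
/* Hypothetical sketch, NOT the code shipped in lz48_v002.zip.
   Assumes an LZ4-style extended length: a run of 0xFF bytes,
   terminated by one byte in the range 0..254. */
int encode_ext_len(unsigned char *odata, int length)
{
    int ioutput = 0;
    while (length >= 255) {       /* '>=' so that exactly 255 gets a terminator */
        odata[ioutput++] = 0xFF;
        length -= 255;
    }
    odata[ioutput++] = (unsigned char)length;   /* terminator, may be 0 */
    return ioutput;               /* number of bytes written */
}

/* Matching decoder: adds 255 per 0xFF byte, then the terminating byte.
   Stores the number of bytes consumed in *used. */
int decode_ext_len(const unsigned char *idata, int *used)
{
    int length = 0;
    int i = 0;
    while (idata[i] == 0xFF) {
        length += 255;
        i++;
    }
    length += idata[i++];
    *used = i;
    return length;
}
```

With this encoder, 255 becomes the two bytes FF 00, and the decoder stops cleanly on the 00.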


ervin

Quote from: uniabis on 09:56, 02 April 19

Binary of LZ49 Windows compressor & fixed LZ48 Windows compressor.

lz48_bin.zip and lz48_c_source.zip

int LZ48_encode_extended_length(unsigned char *odata, int length)
{
   int ioutput=0;
   while (length>255) {


lz48_v002.zip

   while (length>=255) {


Hi there.

Ah, your first post to the forums.
:)

Thanks for the new versions.
Is the greater-than check the only thing that has changed?

roudoudou

I realise I never released the constant-time version...
Frames with a maximum nop count for decrunching
But also...
Frames with an exact, configurable nop count for decrunching
use RASM, the best assembler ever made :p

I will survive

Shining

I'm really interested in a version with constant data size for stream decrunching...
TGS is back

Download my productions at:
cpc.scifinet.org

TotO

"You make one mistake in your life and the internet will never let you live it down" (Keith Goodyer)

Zik

Thank you Roudoudou for this work and for sharing! I am very interested in the LZ48/LZ49 variant you made from LZ4. I may use it in my software because I need relatively fast compression and very fast decompression.

So, here is a little contribution.
First remark: you ended up putting the offset byte at the end of the block (after the match length), which is not what the diagrams in your first post show. Anyway.

Then, I propose some small optimizations to the decrunching routine (see attachment). In short:

  • gain 6 nops when there is no literal, lose 2 otherwise (code around cp #10 when decoding the token)
  • faster literal and match size calculation (8-bit add, then simply inc b if carry was set)
  • small size optimization on the offset calculation (instead of src=dst-oft with an 8-bit subtract, perform src=-oft+dst with one 16-bit add) (hint: neg might be avoided here)
  • small size and speed optimization by moving ld c,a around
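
The two arithmetic tricks in the list above can be modelled in C. This is only a sketch of the idea, not the attached Z80 routine: `add8_to_16` and `match_source` are hypothetical names, and the register roles (B as high byte, C as low byte) are an assumption made for illustration.

```c
#include <stdint.h>

/* Add an 8-bit value to a 16-bit counter kept as two bytes
   (b = high byte, c = low byte): do an 8-bit add on the low byte
   and bump the high byte only when the add carried. */
uint16_t add8_to_16(uint16_t bc, uint8_t a)
{
    uint8_t b = (uint8_t)(bc >> 8);
    uint8_t c = (uint8_t)(bc & 0xFF);
    uint8_t sum = (uint8_t)(c + a);
    if (sum < c)            /* carry out of the low byte */
        b++;
    return (uint16_t)(((uint16_t)b << 8) | sum);
}

/* src = dst - oft, computed as (-oft) + dst with a single 16-bit add.
   16-bit wraparound makes both forms give the same address. */
uint16_t match_source(uint16_t dst, uint16_t oft)
{
    uint16_t neg_oft = (uint16_t)(0u - oft);
    return (uint16_t)(neg_oft + dst);
}
```

For example, `match_source(0x4000, 0x0010)` yields 0x3FF0, the same as a plain subtraction.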
I am also looking at the crunching routine, but it is too soon to share my findings. So far, I have cut crunching time by more than 30% on my reference file (the one with the worst compression ratio).
A first simple optimization is to move the ld bc,3 a few lines away in LZ48_get_match_rescan...

LTronic

Hello,


I am having trouble getting the cruncher/decruncher working.
I am using the lz48decrunch_v006.asm posted in this thread for decrunching on CPC, assembled with RASM and translated with DisArk for use with SDCC.
I don't think the C glue code is the source of the issue (the C source code was found on this forum).

I am using the lz48.c cruncher, which I compiled on Linux (I am not sure which version I picked; the command line says it is from 2016).

Crunching and decrunching on the Linux command line works, but not on CPC.
I think I might have a version mismatch here.

Which files should I get?

Thanks for your help!

[EDIT]: Please disregard. I finally got it working thanks to Arnaud6128's SDCC glue code. Source and target were reversed in my code...

roudoudou

The latest versions are all in this thread.
I can't help you more without the data.
If it is not a secret, send me an email with the data.
Regards
use RASM, the best assembler ever made :p

I will survive

elpekos

Hi Roudoudou,


I'm studying your LZ49 cruncher, and I don't fully understand this part (line 107). Could you give me more comments about it? I think there is one useless JR, and I don't understand why you only raise curadr when there are 1 or 2 bytes left.


Thanks.



; we are in the last 3 bytes, and maybe there is no more byte to crunch!
encode_block21
endadrcpy3:ld de,#1234
    ld hl,(curadr)
encode_block21_loop
    sbc hl,de
    jr z,encode_block
    ld hl,(curadr)
    inc hl
    ld (curadr),hl
    jr encode_block21_loop

    jr encode_block21

encode_block3
    ld hl,(literal)
    inc hl
    inc hl
    inc hl
    ld (literal),hl

roudoudou

Quote from: elpekos on 09:33, 14 December 20
Hi Roudoudou,
I'm studying your LZ49 cruncher, and I don't fully understand this part (line 107). Could you give me more comments about it? I think there is one useless JR, and I don't understand why you only raise curadr when there are 1 or 2 bytes left.


The cruncher may not be fully optimised (I no longer use the Z80 version to crunch, since I use Rasm for this).
As far as I remember, when we are in the last 3 bytes it is useless to search for a key match, so I raise curadr AND literal in order to make a "literal block" and end the crunched file with that block.
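
The intent described above can be sketched in C. This models the idea only, not the actual Z80 code: the `crunch_state` struct, the `absorb_tail` function and the choice to treat `literal` as a pending byte count are all assumptions for illustration, and the 3-byte threshold is taken from the comment in the quoted source.

```c
#include <stddef.h>

#define TAIL_BYTES 3   /* assumption: taken from the "last 3 bytes" comment */

/* Hypothetical cruncher state: 'cur' is the next input position,
   'literal' counts bytes waiting to be emitted as literals. */
struct crunch_state {
    size_t cur;
    size_t literal;
};

/* If at most TAIL_BYTES remain, skip the match search and move the
   whole tail into the pending literal run.
   Returns 1 when the tail was absorbed, 0 otherwise. */
int absorb_tail(struct crunch_state *s, size_t end)
{
    size_t remaining = end - s->cur;
    if (remaining > TAIL_BYTES)
        return 0;
    s->literal += remaining;   /* raise literal ... */
    s->cur = end;              /* ... and curadr together */
    return 1;
}
```

After a successful call the caller would emit one final literal block of `literal` bytes and stop.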
use RASM, the best assembler ever made :p

I will survive

elpekos

OK. If I'm not wrong, the code seems to only raise literal if 3 bytes are left, and only raise curadr if 2 or 1 byte(s) are left.
And what about this strange 'jr encode_block21'?
Is it just a cut-and-paste remnant?

roudoudou

Quote from: elpekos on 09:53, 14 December 20
OK. If I'm not wrong, the code seems to only raise literal if 3 bytes are left, and only raise curadr if 2 or 1 byte(s) are left.
And what about this strange 'jr encode_block21'?
Is it just a cut-and-paste remnant?

I hope this is only some useless dead code  ;D
use RASM, the best assembler ever made :p

I will survive

elpekos

Hi,


I adapted your code for the Z88 OS. It works great... I compared the results with the files produced by the C version. Thanks. I'll now focus on some optimisations and on using a small buffer instead of a large memory area.

Zik

Here is my optimized version of the LZ48 cruncher. People usually focus on decompression time only, but for the Soundtracker DMA (released last year) I also needed fast compression, and LZ4x algorithms are great at this. With those optimizations, compression time on my reference data goes from 1.65s down to 585ms.

I focused on the match scan routine and took advantage of the cpir and cpi assembly instructions. The loop itself is now faster, but the loop setup is longer. So the actual gain will depend on what your data looks like.
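
For readers who do not know the Z80, the shape of such a scan can be approximated in C. This is a rough analogy under stated assumptions, not the attached cruncher: `find_longest_match` is a hypothetical name, `memchr` stands in for cpir (jump to the next occurrence of the first byte), the inner compare loop stands in for cpi, and the sketch ignores any window or maximum-offset limit the real format imposes.

```c
#include <stddef.h>
#include <string.h>

/* Finds the longest match for data[pos..end) that starts before pos.
   Returns the match length (0 if none) and stores the match start
   in *match_pos when a match is found. */
size_t find_longest_match(const unsigned char *data, size_t pos, size_t end,
                          size_t *match_pos)
{
    size_t best = 0;
    const unsigned char *p = data;

    if (pos >= end)
        return 0;

    /* cpir analogue: skip straight to the next candidate first byte */
    while ((p = (const unsigned char *)memchr(p, data[pos],
                                              (size_t)(data + pos - p))) != NULL) {
        size_t cand = (size_t)(p - data);
        size_t len = 1;

        /* cpi analogue: extend the match one byte at a time */
        while (pos + len < end && data[cand + len] == data[pos + len])
            len++;

        if (len > best) {
            best = len;
            *match_pos = cand;
        }
        p++;   /* resume the scan after this candidate */
    }
    return best;
}
```

The trade-off mentioned above shows up here too: each `memchr` call has a fixed setup cost, so data with many false first-byte hits benefits less than data with few.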
