
UCL compressor

Started by reidrac, 13:37, 11 August 19


reidrac

I've been using UCL compression for a while in my projects and I'd like to share some tools in case this is useful to somebody.

IMHO the main problem with UCL is that the algorithm is a bit obscure. There's a GPL library to compress but it is hard to compile (I managed to compile it thanks to Debian!), and getting it to work in Windows can be a bit of a pain. Also there's basically *one* decompressor in Z80 assembler, so altogether I think ZX7 is a better supported option and you should be using that one already. If ZX7 is too complicated for you, I'm not sure you'll get UCL working.

I'm sharing it as-is and without support; you can ask questions but I don't want to troubleshoot or solve 3rd party problems, so don't make me regret this :P

I've included a tool with source code and a Windows binary (I cross-compiled it and tested it with Wine; I hope it works for everybody).

Enjoy!
Released The Return of Traxtor, Golden Tail, Magica, The Dawn of Kernel, Kitsune's Curse, Brick Rick and Hyperdrive for the CPC.

If you like my games and want to show some appreciation, you can always buy me a coffee.

introspec

OK, I've added this compressor to my battery of tests and this is what its results look like:



Of course, all the usual caveats regarding compressor testing apply here: these results are specific to my testing corpus (1.2Mb of ZX Spectrum related files). I am actually not entirely happy with my current corpus and am going to move to another one in the future, but I've got so many results for the old corpus that I cannot bring myself to switch just yet. The problem with the old corpus is that it does not have much graphics data, whereas in a lot of relevant scenarios (games, demos) graphics data tends to dominate. So, for example, this plot shows LZSA2 performing better "on average" than MegaLZ. However, this is only because the corpus does not have much graphical data, which is where LZSA2 is not nearly as good. In reality their compression ratios should more or less tie.

All decompressors shown in coloured rectangles are new; I wrote them over the course of the last two years. Only some of them have been published (I hope to publish most of them later this year).

So, the UCL compressor is the red circle in the centre of the diagram. It decompresses slower than LDIR*5 (the actual number on my corpus is ~120 cycles per decompressed byte). The compression ratio turns out to be very similar to MegaLZ. Of course, with a better NRV2b compressor the ratio would be slightly higher. Now, the utility of a Pareto frontier diagram of this kind is that it shows the alternatives very clearly. In your case, you can switch to using Hrust1 or ApLib: you'd get basically the same decompression speed and a higher compression ratio (esp. since the ApLib decompressor is smaller than the decompressor for NRV2b). Alternatively, you can potentially switch to MegaLZ and this would give you up to twice the decompression speed (assuming you use the as yet unpublished speed-optimized decompressor).
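For readers who have not met Pareto frontiers before, here is a minimal Python sketch of the idea. It is not how the diagram in this thread was produced: the packer names and numbers are made up, and it assumes "ratio" means packed size divided by original size, so that lower is better on both axes. A result is on the frontier when no other result is at least as good on both axes and strictly better on one.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str               # hypothetical packer name
    ratio: float            # packed size / original size (assumed: lower is better)
    cycles_per_byte: float  # decompression cost per output byte (lower is better)


def pareto_frontier(results):
    """Return the results that no other result dominates on both axes."""
    frontier = []
    for r in results:
        dominated = any(
            o.ratio <= r.ratio
            and o.cycles_per_byte <= r.cycles_per_byte
            and (o.ratio < r.ratio or o.cycles_per_byte < r.cycles_per_byte)
            for o in results
        )
        if not dominated:
            frontier.append(r)
    # Sort fastest first, which is how such a plot is usually read.
    return sorted(frontier, key=lambda r: r.cycles_per_byte)


# Made-up measurements, for illustration only.
results = [
    Result("packer_a", 0.60, 40.0),
    Result("packer_b", 0.52, 120.0),
    Result("packer_c", 0.50, 118.0),
    Result("packer_d", 0.55, 60.0),
    Result("packer_e", 0.58, 90.0),
]

frontier_names = [r.name for r in pareto_frontier(results)]
print(frontier_names)
```

In this toy data set packer_b is off the frontier because packer_c is both slightly faster and slightly smaller, which is the same kind of argument made above for moving away from a dominated packer.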

Of course, this state of things is not constant. If someone were to take a closer look at the NRV2 decompressor, it could very likely be re-optimized so that it becomes a lot more competitive with MegaLZ. Also, I know for a fact that better compressors for NRV2 exist, so maybe you won't get Hrust1-level compression, but you'd potentially at least get clear of MegaLZ.

However, as things stand, I'd recommend having a good look at ApLib with one of the fast decompressors. This would represent an immediate improvement for you in every possible use case.

introspec

Sorry, I forgot to mention one more thing. These results had to be slightly "cooked" for your UCL tool, because it outputs zero-length files when the files turn out to be incompressible and this is something that is massively annoying to work around in the context of automated testing. If, instead of outputting zero-length files for incompressible data, the UCL tool were to output true expanded data, the compression ratio would get a bit worse than shown on the diagram (I used the uncompressed length as an indication of the likely compressed length) and decompression speed would stay very similar (I simply used the average for all similar files). However, this only affects 2 files out of 77 files in the corpus, so I do not expect this to lead to any meaningful changes.
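The size substitution described above can be sketched in a few lines of Python. This is only an illustration, not the actual test harness: the function name, the input format (pairs of original and packed lengths) and the sample sizes are all assumptions, and the ratio is taken as total packed size over total original size. A zero-length output for a non-empty input is counted at the original length, which is slightly optimistic compared to truly expanded data.

```python
def corpus_ratio(sizes):
    """Overall packed/original ratio for (original_len, packed_len) pairs.

    A zero-length packed file for a non-empty input is treated as
    "incompressible" and counted at its original length instead.
    """
    total_original = 0
    total_packed = 0
    for original_len, packed_len in sizes:
        if original_len > 0 and packed_len == 0:
            packed_len = original_len  # stand-in for the missing output
        total_original += original_len
        total_packed += packed_len
    if total_original == 0:
        raise ValueError("corpus is empty")
    return total_packed / total_original


# Made-up sizes: the third file came back as a zero-length output.
sizes = [(1000, 600), (2000, 1000), (500, 0)]
ratio = corpus_ratio(sizes)
print(f"{ratio:.3f}")
```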

reidrac

Quote from: introspec on 21:53, 12 August 19
Sorry, I forgot to mention one more thing. These results had to be slightly "cooked" for your UCL tool, because it outputs zero-length files when the files turn out to be incompressible and this is something that is massively annoying to work around in the context of automated testing. If, instead of outputting zero-length files for incompressible data, the UCL tool were to output true expanded data, the compression ratio would get a bit worse than shown on the diagram (I used the uncompressed length as an indication of the likely compressed length) and decompression speed would stay very similar (I simply used the average for all similar files). However, this only affects 2 files out of 77 files in the corpus, so I do not expect this to lead to any meaningful changes.

Oh, sorry about that. The tool was meant to be used and I'm not interested in making the data larger :)

UCL is supported by UPX, though I'm not sure which algorithms it uses or whether it would make sense to look at its source code. Also, I think it is probably not worth looking at NRV further because the author recommends LZO as an open source alternative (also written and maintained by him; it looks more like LZ4), so UCL looks final to me.

I know these types of tests aren't perfect. In my current project, changing from UCL to ZX7 saved me around 600 bytes. That was when compressing the internal data (tilesets, maps, etc.); but when I used ZX7 to compress the whole binary to generate the tape, ZX7 gave me worse compression (consistent with your results).

It all depends on the data. I may give LZSA a go to see how it performs in my use case compared with ZX7.
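A quick way to check this on your own data is to pack each asset separately and then pack everything as one blob, and compare the totals per packer. The Python sketch below shows the shape of such a check. None of the packers discussed in this thread ship with Python, so zlib, bz2 and lzma from the standard library are used purely as stand-ins, and the asset names and contents are made up.

```python
import bz2
import lzma
import zlib

# Stand-in packers from the standard library (NOT UCL, ZX7 or LZSA).
PACKERS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data),
}


def compare(assets):
    """Return {packer: (sum of per-asset packed sizes, packed size of one blob)}."""
    blob = b"".join(assets.values())
    report = {}
    for name, pack in PACKERS.items():
        separate = sum(len(pack(data)) for data in assets.values())
        together = len(pack(blob))
        report[name] = (separate, together)
    return report


# Made-up assets standing in for tilesets, maps, etc.
assets = {
    "tileset": bytes(range(16)) * 64,
    "map": b"\x00\x01\x01\x02" * 256,
}

report = compare(assets)
for name, (separate, together) in report.items():
    print(f"{name}: per-asset total {separate} bytes, single blob {together} bytes")
```

The ranking of packers can differ between the two columns, which is the effect described above.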

Thanks for this, it is very interesting!

SyX

Quote from: introspec on 21:49, 12 August 19
However, as things stand, I'd recommend having a good look at ApLib with one of the fast decompressors. This would represent an immediate improvement for you in every possible use case.
My biggest problem with ApLib (and I am one of the guys that optimized it for Z80 and ported it to 68000 for my Amiga/Megadrive productions) is that the compressor is not open source (you need to link against a binary library), and a long time ago when I asked the author about it, he said that he was not interested in opening it.
Do you know if there is an alternative open source implementation of the ApLib compressor?

introspec

Quote from: reidrac on 22:23, 12 August 19
UCL is supported by UPX, not sure which algorithms or if it would make sense to look at its source code. Also I think is probably not worth looking at NRV further because the author recommends LZO as open source alternative (also written and maintained by him; and looks more like LZ4), so UCL looks final to me.
The UCL library contains the algorithms NRV2b, NRV2d and NRV2e. Out of the three, NRV2b generally tends to compress best. As we've seen, it is approximately in the same league as Hrum, MegaLZ or Pletter. It would usually beat ZX7 and LZSA2 on compression, and it would usually be beaten on compression by Hrust, ApLib or Exomizer.


LZO, on the other hand, is a pure byte-packer, which makes it fast, but also makes it lose a lot of compression ratio. In my tests LZOP (a compressor based on the LZO library) has a very slightly better top compression ratio compared to LZSA1. LZSA2 uses bytes and nibbles, so it would beat LZO on ratio pretty much any time. Add to this that I do not think that a Z80 port of LZO exists. Overall, I'd be interested to test LZO on Z80, but frankly I would not expect it to be all that special, esp. since on Z80 it would now face competition from LZSA1, which is targeting a similar niche.

introspec

Quote from: SyX on 22:25, 12 August 19
My biggest problem with ApLib (and I am one of the guys that optimized it for Z80 and ported it to 68000 for my Amiga/Megadrive productions) is that the compressor is not open source (you need to link against a binary library), and a long time ago when I asked the author about it, he said that he was not interested in opening it.
Do you know if there is an alternative open source implementation of the ApLib compressor?
I just wrote up a summary here.

SyX

