News:

Printed Amstrad Addict magazine announced, check it out here!

Main Menu
avatar_cpcitor

stream decompression for big memory savings in some cases

Started by cpcitor, 20:20, 09 January 13

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

cpcitor

Hello,

This question is in the context of 16KB ROM game compo - technical thread, but can be useful in any CPC programming context.

In such a context, it may be interesting to minimize the memory footprint of a program. If you load a compressed package (program+data) and decompress it all before running, you have saved on size before decompression but not after. If your program was in ROM, you have fully lost the benefit of saving RAM.

A solution would be a stream-based decompressor. It would be a routine that does not uncompress the whole data at once, but only processes a small fraction of it, writes the result and return control to the caller. The caller will then use the data, and when it needs more, it call again a function "decompress_some_more", until there is nothing more to decompress. In cases where decompressed data from the beginning of the stream is no longer used, that memory can be reclaimed, and this is where we win.

People who have piped gzip output to another program on Linux know what I mean. But here we've got no underlying OS for such a generic tool.

Another example is video decoding : a movie in any MPEG standard (fortunately) does not need to keep all the old frames in memory to be fully played. In this context, stream-based decompression is the rule.

Another good example actually useful for the CPC is music. Music data often compresses very well because it is highly redundant.
One could imagine, for example, decompressing music data just as needed. If you decompress it from a ROM, you can for example uncompress one pattern (see http://en.wikipedia.org/wiki/MOD_%28file_format%29), play it while decompressing the next one, and when playing the second, overwrite memory used by the first to write the third one, etc. It will take CPU time *while* playing instead of *before* (but not more in total than fully decompressing before), and it definitely saves memory because at all times you consume memory for only two patterns.

Of course, any data can be artificially split into small compressed blocks, but you lose a lot in compression efficiency due to inter-block redundancy, so a decompressor that is stream-aware is definitely a win.

Intuitively any decompressing routine actually implements a loop with a "decompress_some_more" function internally. Calling that function only when needed allows what we need, but there are some constraints to resolve, like : does the decompressor need to access recently decompressed data (to how long in the past), or only data from the stream ? Only a person that has written a decompressor can answer, for that particular algorithm.

Does anyone know about a stream-based decompressor written for Z80, or has hint on adjusting one already written ?
Had a CPC since 1985, currently software dev professional, including embedded systems.

I made in 2013 the first CPC cross-dev environment that auto-installs C compiler and tools: cpc-dev-tool-chain: a portable toolchain for C/ASM development targetting CPC, later forked into CPCTelera.

fano

It is possible to do that with exomizer, i used this one time to display a decompression bar.As the source code is avaible it is not very difficult.
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

TFM

Is there a ressource where one can find informations about Exomizer? All I got is homepages with some foggy informations, but no clean way how to use it. An stupid-proof handbook would be good - especially for the game compo.
TFM of FutureSoft
Also visit the CPC and Plus users favorite OS: FutureOS - The Revolution on CPC6128 and 6128Plus

fano

I must say it is so easy to use that it does not need docs  :laugh:
What do you need to know about it ?

btw: exomizer is not that slow as i used it to decompress boss music without pausing the game in R-type , no one noticed it =)
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

Prodatron

Here is the actual Z80-based decruncher by Metalbrain:
http://www.speccy.org/metalbrain/exo_v4.zip
(and it's really easy to use  :laugh: )

This should be the official website, where you can download the cruncher for DOS and Windows:
Exomizer 2 website

Here is the CPC-Wiki article:
Exomizer - CPCWiki

CU,
Prodatron

PS: I also found this on MSX.org: New compression method: Exomizer 2 | MSX Resource Center(Page 1/2)

GRAPHICAL Z80 MULTITASKING OPERATING SYSTEM

TFM

Quote from: fano on 21:29, 09 January 13
I must say it is so easy to use that it does not need docs  :laugh:
What do you need to know about it ?

Not me, people participating in the game contest. I just mentioned my troubles in finding docs back that day. I don't want to see people get scared off, just because they are not familiar with compression. And compression will be an important part of this contest, right? (Personally I still prefer MadRams Turbo-Cruncher, CPCT).

@Prodi: Thanks' for the links!
TFM of FutureSoft
Also visit the CPC and Plus users favorite OS: FutureOS - The Revolution on CPC6128 and 6128Plus

fano

I am speaking of exomizer simply because i am using it regulary and it is known to have the best compression ratio (and it is very reliable)
I'll see with Syx if he can add some info about exomizer in the challenge info like he does for rsx.
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

Sykobee (Briggsy)

Quote from: fano on 07:47, 10 January 13
I am speaking of exomizer simply because i am using it regulary and it is known to have the best compression ratio (and it is very reliable)
I'll see with Syx if he can add some info about exomizer in the challenge info like he does for rsx.
SyX has mentioned to me that he will try to add an exomiser in ROM example in the near future.
I like the idea of in-game decompression rather than pre-game decompression. This could even, on a multi-screen game, let you have different graphics on a per-screen basic rather than a per-level basis, etc. Great for boss graphics and music too, as mentioned.


it does mean that you will always have to have the exomiser scratch memory allocated during game play.

fano

About "stream" decompression , the method i used is to patch the code at 'exo_mainloop' to call my own routine.Here DE is used to point where exomizer writes data so you can know ammount of bytes written.This way, you can exit exomizer code when it wrote a minimal ammount of data, save all the register and restart where you stopped the exomizer code later.Just do not forget to save registers (if i remember well exomizer uses AF,AF',BC,DE,HL,IX,IY) and restore the stack in the same configuration (you may use another stack for exomizer itself)
Another solution could be to use interrupts to stop exomizer work and save the context to restore later.
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

SyX

I was waiting to the optimizing at the exomizer decruncher were finished. You can find the actual version of the exomizer decrunchers here. You can see a lot of files there, but you can split in two categories, the official exomizer decrunchers (deexo.asm, deexo_b.asm, deexo_simple.asm and deexo_simple_b.asm) and the optimized versions (the rest). For using the officials, you can use the fantastic guide in the wiki.

For the optimized ones, there is 8 versions (backward and forward from speed=0 to speed=3) and you need to use the tool exoopt.exe (i have attached a compiled version for windows) for processing the exomizer bitstream. And with this tool, you can ignore all the deexoopt* files, because the tool will generate the d.asm file with the correct decruncher source. The syntax of the tool is:
exoopt <input_file> <output_file> <table_address> [<type>]
  <input_file>     Origin file
  <output_file>    Genetated output file
  <table_address>  Hexadecimal address for the temporal 156 bytes table (mapbase*)
  <type>           Target decruncher (f0, f1, f2, f3, b0, b1, b2 and b3)

*If the lower byte of the mapbase is between $F0-$87, we save 1 byte in the decruncher.


The advantages of using these optimized versions are in speed and size of the decruncher. I have made a few test and here is the results (the roms and the source are included).

The size in bytes of the official versions are:
SIZE       normal     simple
forward      182        180
backward     169        167

And the speed in microseconds for decrunching the pacman loading screen:
SPEED      normal     simple
forward   513661     508587
backward  xxxxxx     xxxxxx
* The official backward versions crashed.


The size for the optimized (mapbit lower byte in the range $F0-$87):
SIZE       speed 0   speed 1   speed 2   speed 3
forward       148       150       166       203
backward      146       148       164       201

And the speed for the pacman screen:
SPEED      speed 0   speed 1   speed 2   speed 3
forward    625547    526591    465238    397573
backward   623828    523844    461542    393899


If somebody has any doubts, only ask :)

fano

Thx a lot ! i saw you added ROM example with this  ;)
"NOP" is the perfect program : short , fast and (known) bug free

Follow Easter Egg products on Facebook !

SyX

Yes, i put the source of the rom example there :)

Powered by SMFPacks Menu Editor Mod