Dumb question #15839 - BITBUSTER 1.2

tastefulmrship · 16:58, 11 October 11

@Syx

Lots of great stuff to think about, thanks!
Out of interest, which version of Packfire were you using? Was it the v1.2h release?

EDIT: Using 1.2g, my two test screens (AMPSKULL.SCR & MASKFACE.SCR) do pack better with PackFire than all of the other external packers. ~~Is there a Z80 decompressor available yet?~~

redbox · 18:05, 11 October 11

SyX provided his port of the Packfire decompressor in his post...

tastefulmrship · 18:17, 11 October 11

Quote from: redbox on 18:05, 11 October 11
SyX provided his port of the Packfire decompressor in his post...

Yeah, I guess that's my dumb-move-of-the-day! I even had it in an assembler window in WinAPE!
As the modern philosopher Homer would say: "D'oh!"

SyX · 18:19, 11 October 11

Quote from: redbox on 14:27, 11 October 11
Packfire rulez! Just need to optimise and it's a real Exomizer beater.

Although that is not an easy task, jejeje. PF was designed to make great use of the 68000 registers and addressing modes by one of the best 68000 assembly coders. He helped me optimize my aplib port to the 68000 and we were able to shrink it to 164 bytes, only one byte more than the Z80 version, and we're talking about a CPU whose opcodes are at least twice the size in bytes.

Quote from: redbox on 14:27, 11 October 11
What were you doing with Renegade?

I patched it to use the 3-fire-button joystick that I made a few weeks ago; you can find it here

Quote from: TFM/FS on 16:21, 11 October 11
So... the best PC cruncher is "just" 25% better than the best CPC cruncher. Actually not really much. But once a while it will be needed :-)))

And if you need help adding it to one of your projects, just tell me

Jejeje, you know how I'm old school for a few things (the config of my text editor has not changed since the Cygnus editor days) and new school for all my cross-development tools, jejeje.

Quote from: tastefulmrship on 16:58, 11 October 11
Out of interest, which version of Packfire were you using? Was it the v1.2h release?

I'm using the latest version (1.2h), but the decruncher sources have not changed for a few versions; only the cruncher has changed, to improve the compression and fix bugs

Quote from: tastefulmrship on 16:58, 11 October 11
EDIT: Using 1.2g, my two test screens (AMPSKULL.SCR & MASKFACE.SCR) do pack better with PackFire than all of the other external packers. Is there a Z80 decompressor available yet?

Remember that I have only converted the tiny mode (use the -t parameter when crunching; with it the crunched sizes are similar to Exomizer's). I didn't convert the large mode (it is a variant of LZMA; tiny is an LZ like aplib and Exomizer) because it would be so slow that it would have no practical use on the CPC... it's slow even on the 68000, and there it uses every trick in the book.

redbox · 19:00, 11 October 11

Missed your Renegade patch SyX - very nice though. Didn't realise Amstrad.es has a forum, I need to learn Spanish now (I have been trying French but that is hard, so maybe Spanish will be a nice diversion!).

I love compression because it's real uber programming and maths. The Packfire compressor and decompressor is very exciting, think I'll take a look through your source and try and see what's going on because for once hopefully I'll understand it because it's in Z80

SyX · 20:44, 11 October 11

Quote from: redbox on 19:00, 11 October 11
Missed your Renegade patch SyX - very nice though. Didn't realise Amstrad.es has a forum, I need to learn Spanish now (I have been trying French but that is hard, so maybe Spanish will be a nice diversion!).

Everybody is welcome at the Spanish forum; it's full of great content and we don't bite people who talk in other languages

Totally agreed, learning a new language is always FUN

Quote from: redbox on 19:00, 11 October 11
I love compression because it's real uber programming and maths. The Packfire compressor and decompressor is very exciting, think I'll take a look through your source and try and see what's going on because for once hopefully I'll understand it because it's in Z80

I usually convert or write a compressor to get familiar with a new CPU (my last one was a conversion of aplib to the 6800, for a Dragon project), because you need to use a large part of the instruction set; it's a much better "hello world" program

Here is the original 68000 source, "hypercommented" (originally in Spanish, sorry); you can use it to better understand how I converted it:

; ------------------------------------------
; PackFire 1.2f - (tiny depacker)
; ------------------------------------------

; ------------------------------------------
; packed data in a0                                     ; A0 = Source (used as a pointer to
                                                        ; the frequency table)
; dest in a1                                            ; A1 = Destination
start:                  lea     26(a0),a2               ; A2 = A0 + 26  <- Compressed data

                        move.b  (a2)+,d7                ; D7.B = (A2)   <- 1st compressed byte
                                                        ; A2++
                                                        ; The bit counter is initialised

lit_copy:               move.b  (a2)+,(a1)+             ; (A1).B = (A2).B <- Copy one byte
                                                        ; A1++
                                                        ; A2++
main_loop:              bsr.b   get_bit                 ; CALL get_bit <- .B is a short branch
                        bcs.b   lit_copy                ; JR   C,lit_copy <- CS Carry Set
                                                        ; This last part means that while
                                                        ; the bits read are 1, bytes are
                                                        ; copied literally.
                        
                        moveq   #-1,d3                  ; D3.L = $FFFFFFFF
get_index:              addq.l  #1,d3                   ; D3++
                        bsr.b   get_bit                 ; CALL get_bit
                        bcc.b   get_index               ; JR   NC,get_index <- CC Carry Clear
                        cmp.w   #$10,d3                 ; D3 == $0010?
                        beq.b   depack_stop             ; JR   Z,depack_stop
                                                        ; Decompression has finished
                                                        
                        bsr.b   get_pair                ; CALL get_pair
                        move.w  d3,d6                   ; D6.W = D3.W <- Saved for use as
                                                        ; the counter in copy_bytes

                        cmp.w   #2,d3                   ; Compare D3 with $0002
                        ble.b   out_of_range            ; JR  <=,out_of_range
                                                        ;             <- LE Less or equal
                        moveq   #0,d3                   ; D3.L = 0
                                                        ; The last 3 lines do this:
                                                        ; if D3 > 2, then D3 = 0
                        
out_of_range:           moveq   #0,d0                   ; D0.L = 0
                        move.b  table_len(pc,d3.w),d1   ; D1.B = table_len[D3.W]
                        move.b  table_dist(pc,d3.w),d0  ; D0.B = table_dist[D3.W]
                        bsr.b   get_bits                ; CALL get_bits
                        bsr.b   get_pair                ; CALL get_pair
                                                        ; This yields the offset and length
                                                        ; of the already decompressed byte
                                                        ; string that is copied next.
                        move.l  a1,a3                   ; A3.L = A1.L
                        sub.l   d3,a3                   ; A3.L -= D3.L
copy_bytes:             move.b  (a3)+,(a1)+             ; (A1).B = (A3).B
                                                        ; A3++
                                                        ; A1++
                        subq.w  #1,d6                   ; D6.W--
                        bne.b   copy_bytes              ; JR   NZ,copy_bytes <- D6 == 0?
                        bra.b   main_loop               ; JR   main_loop

table_len:              dc.b    $04,$02,$04                
table_dist:             dc.b    $10,$30,$20                

get_pair:               sub.l   a6,a6                   ; A6.L = 0
                        moveq   #$f,d2                  ; D2.L = $0000000F <- D2 is constant
calc_len_dist:          move.w  a6,d0                   ; D0.W = A6.W
                        and.w   d2,d0                   ; D0.W &= D2.W <- D0.W 0-$F
                        bne.b   node                    ; JR   NZ,node
                        moveq   #1,d5                   ; D5.L = 1
node:                   move.w  a6,d4                   ; D4.W = A6.W
                        lsr.w   #1,d4                   ; D4.W >> 1
                        move.b  (a0,d4.w),d1            ; D1 = A0[D4.W] <- Reads a byte
                                                        ; from the frequency table, so
                                                        ; D4 < 26 or, equivalently,
                                                        ; A6 < 26 * 2, which lets us use
                                                        ; an 8-bit register on the Z80.
                                                        ;
                        moveq   #1,d4                   ; D4.L = 1
                        and.w   d4,d0                   ; D0.W &= D4.W <- D0.W &= 1
                        beq.b   nibble                  ; JR   Z,nibble
                                                        ; The last three instructions can
                                                        ; be replaced by:
                                                        ; BIT 0,D0 / JR Z,nibble
                        lsr.b   #4,d1                   ; D1.B >> 4
                                                        ; This selects between the high
                                                        ; and the low nibble of the byte
                                                        ; read from the frequency table
                                                        ; (so rather than 26 bytes it is
                                                        ; 52 nibbles); the two alternate,
                                                        ; starting with the low one.
                                                        
nibble:                 move.w  d5,d0                   ; D0.W = D5.W
                        and.w   d2,d1                   ; D1.W &= $F <- D1.W 0-$F
                                                        ; keep only the nibble
                        lsl.l   d1,d4                   ; D4.L << D1 <- 1 << D1
                                                        ; D4 = 2^D1 <- 1 - 32768
                        add.l   d4,d5                   ; D5.L += D4.L
                        addq.w  #1,a6                   ; A6.W++
                                                        ; Remember that A6 < 26 * 2
                        dbf     d3,calc_len_dist        ; D3.W--
                                                        ; D3 == $FFFF?
                                                        ; JR   NZ,calc_len_dist
                                                        ; The 68000's DJNZ counts down to
                                                        ; -1 instead of 0 as on the Z80.
                                                        
get_bits:               moveq   #0,d3                   ; D3.L = 0
getting_bits:           subq.b  #1,d1                   ; D1.B--
                        bhs.b   cont_get_bit            ; JR   NC,cont_get_bit <- HS = CC
                        add.w   d0,d3                   ; D3.W += D0.W
depack_stop:            rts                             ; RET <- This is also an exit point
cont_get_bit:           bsr.b   get_bit                 ; CALL get_bit
                        addx.l  d3,d3                   ; D3 = D3 * 2 + Carry
                                                        ; Besides Carry, the 68000 status
                                                        ; register has an X (eXtended) flag
                                                        ; that is used as the extra carry
                                                        ; bit in arithmetic operations; it
                                                        ; is usually a copy of Carry, as in
                                                        ; this case, but not always.
                        bra.b   getting_bits            ; JR   getting_bits
                        
get_bit:                add.b   d7,d7                   ; D7 << 1
                        bne.b   byte_done               ; JR   NZ,byte_done <- RET NZ
                        move.b  (a2)+,d7                ; D7.B = (A2)
                                                        ; A2++
                        addx.b  d7,d7                   ; D7 = D7 * 2 + Carry
byte_done:              rts                             ; RET
                                                        ; This is the typical routine for
                                                        ; extracting bits from the
                                                        ; compressed data.
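
For readers who do not read 68000 assembly, the listing above can be modelled in Python roughly as follows. This is a sketch written from the listing alone, not from the PackFire sources or documentation: the name depack_tiny is hypothetical, and the model has not been checked against files produced by the real cruncher.

```python
# Values copied from table_len / table_dist in the listing.
TABLE_LEN = (0x04, 0x02, 0x04)
TABLE_DIST = (0x10, 0x30, 0x20)


def depack_tiny(packed: bytes) -> bytes:
    """Python model of the 68000 tiny depacker (hypothetical helper name)."""
    # The first 26 bytes hold 52 nibbles, low nibble of each byte first.
    nibbles = []
    for byte in packed[:26]:
        nibbles += [byte & 0x0F, byte >> 4]

    buf = packed[26]  # D7: bit buffer, already carrying its marker bit
    pos = 27          # A2: next byte of the mixed bit/literal stream
    out = bytearray()

    def get_bit() -> int:
        nonlocal buf, pos
        carry = buf >> 7              # add.b d7,d7 shifts the top bit out
        buf = (buf << 1) & 0xFF
        if buf == 0:                  # only the marker was left: refill
            nxt = packed[pos]
            pos += 1
            buf = ((nxt << 1) | carry) & 0xFF  # addx.b shifts the marker back in
            carry = nxt >> 7
        return carry

    def get_bits(count: int, base: int) -> int:
        value = 0
        for _ in range(count):
            value = value * 2 + get_bit()      # addx.l d3,d3
        return base + value

    def get_pair(index: int) -> int:
        # Walk the nibble table from entry 0 up to `index`; the running
        # base restarts at 1 every 16 entries (A6 & 15 == 0).
        base = start = 1
        for i in range(index + 1):
            if i % 16 == 0:
                base = 1
            start = base
            base += 1 << nibbles[i]
        return get_bits(nibbles[index], start)

    out.append(packed[pos])           # the first byte is always a literal
    pos += 1
    while True:
        if get_bit():                 # 1 bit: copy one literal byte
            out.append(packed[pos])
            pos += 1
            continue
        index = 0
        while not get_bit():          # count 0 bits up to the next 1 bit
            index += 1
        if index == 0x10:             # end-of-data marker
            break
        length = get_pair(index)
        sel = length if length <= 2 else 0
        offset = get_pair(get_bits(TABLE_LEN[sel], TABLE_DIST[sel]))
        for _ in range(length):
            out.append(out[-offset])  # copy from already decompressed data
    return bytes(out)
```

The loop in get_pair restarts from table entry 0 on every call, which is where the time goes when no expanded table is kept in RAM. As a hand-built check (not a real PackFire file), an all-zero table followed by the bytes 80 'a' C0 'b' 'c' 90 00 02 should decode to abcabcabc.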

Nich · 21:15, 11 October 11

Quote from: Nich on 11:45, 09 October 11
I really ought to write a CPCWiki article on how to use Exomizer with a CPC...

I've done it!

It's rather verbose, and it assumes that you are using a Windows machine (sorry, Linux users!), and there are still some things I want to add to the article, such as how to use backwards compression. I also want to provide a version of the decompression routines that actually work with WinAPE without having to make significant corrections to it. Let me know what you think of the article!

SyX · 21:35, 11 October 11

It looks great, Nich!!!

I think it is perfect for everybody interested in how to use a cross compressor (they all work exactly the same) in their projects, and kudos for explaining the safety offset; not everybody knows about that feature

ervin · 23:13, 11 October 11

Fantastic stuff Nich!
I'm going to find that very handy indeed.
Thanks!

Executioner · 00:31, 12 October 11

Quote from: Nich on 21:15, 11 October 11
I also want to provide a version of the decompression routines that actually work with WinAPE without having to make significant corrections to it.

There's actually only a couple of minor changes:

1. Remove the : from the comment line at the top or add an extra ;
2. Remove one t from the gett4bits label
3. Replace the [] with () for all the [iy+d] instructions

Maybe these changes should be recommended to the original author, and maybe WinAPE should support square brackets for indexed addressing even though it's not MAXAM compatible.
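
The three changes listed above could be applied automatically with something like the following. It is only a sketch: the function name and the sample lines in the test are made up for illustration and are not taken from the actual exomizer depacker source, and it does not try to handle a ; inside a string literal.

```python
import re


def patch_for_winape(source: str) -> str:
    """Apply the three fixes from the post to a depacker source (hypothetical helper)."""
    patched = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(";") and not stripped.startswith(";;") and ":" in stripped:
            # 1. Whole-line comment containing a colon: add an extra ';'
            line = line.replace(";", ";;", 1)
        else:
            # Only touch the code part, never the trailing comment.
            code, sep, comment = line.partition(";")
            # 2. Remove one 't' from the gett4bits label
            code = code.replace("gett4bits", "get4bits")
            # 3. [iy+d] becomes (iy+d)
            code = re.sub(r"\[(iy[^\]]*)\]", r"(\1)", code, flags=re.IGNORECASE)
            line = code + sep + comment
        patched.append(line)
    return "\n".join(patched)
```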

redbox · 10:16, 12 October 11

Nice work Nich.

The decompressor on the CPCRulez website assembles with Maxam/WinAPE out of the box.

I think you just need to Remark one of the comments at the bottom.

TFM · 17:18, 12 October 11

Quote from: SyX on 18:19, 11 October 11
And if you need help adding it to one of your projects, just tell me

Actually I'll take you up on that offer for our Giana Sisters clone :-) I intend to continue this project now; I have had no time to work on it since February. I'll come back to crunching as soon as needed :-)))

Nich · 19:08, 12 October 11

Quote from: Executioner on 00:31, 12 October 11
There's actually only a couple of minor changes:

1. Remove the : from the comment line at the top or add an extra ;
2. Remove one t from the gett4bits label
3. Replace the [] with () for all the [iy+d] instructions

Maybe these changes should be recommended to the original author, and maybe WinAPE should support square brackets for indexed addressing even though it's not MAXAM compatible.

Is it possible for you to fix the bug that prevents colons from being included within comments?

redbox · 19:18, 12 October 11

Just use a double semicolon ;; at the start of the comment.

Executioner · 23:59, 13 October 11

Quote from: Nich on 19:08, 12 October 11
Is it possible for you to fix the bug that prevents colons from being included within comments?

If it were a bug, I'd fix it, but Maxam does the same. I'm thinking maybe I could set up some default options where this (and eg. operator precedence) could be changed just by adding assembler options.

TFM · 18:19, 14 October 11

The Executioner is right.

This is not a bug, it's a feature! It allows you to add code after comments on the same line. And it's good to have it that way.
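
The behaviour being discussed can be illustrated with a toy line splitter. This models only what is described in this thread (a colon separates statements, a ; comment ends at the next colon, and a ;; comment runs to the end of the line); it is not taken from Maxam or WinAPE, the function name is hypothetical, and string literals are ignored.

```python
def split_statements(line: str) -> list[str]:
    """Return the code statements an assembler would see on one source line."""
    statements = []
    current = ""
    in_comment = False
    for i, ch in enumerate(line):
        if in_comment:
            if ch == ":":            # a colon ends a single-';' comment
                in_comment = False
        elif ch == ";":
            if current.strip():
                statements.append(current.strip())
            current = ""
            if line.startswith(";;", i):
                break                # ';;' comments out the rest of the line
            in_comment = True
        elif ch == ":":              # statement separator
            if current.strip():
                statements.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        statements.append(current.strip())
    return statements


print(split_statements("ld a,1 ; first: ld b,2"))  # code after a comment
print(split_statements("; depacker: z80 version"))  # comment text parsed as code
print(split_statements(";; depacker: z80 version"))  # nothing left to assemble
```

Under this model the second line leaves "z80 version" to be assembled as code, which is the error Nich ran into, while the ;; form leaves nothing.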

Nich · 19:25, 16 October 11

Quote from: TFM/FS on 18:19, 14 October 11
This is not a bug, it's a feature! It allows you to add code after comments on the same line. And it's good to have it that way.

Why would you want to do that? Every programming example I've seen puts comments after the code if the comment is on the same line.

Anyway, if the 'bug' is a feature of Maxam, then I accept Executioner's decision not to fix this problem.

Executioner · 23:46, 17 October 11

Quote from: Nich on 19:25, 16 October 11
Why would you want to do that? Every programming example I've seen puts comments after the code if the comment is on the same line.

I agree it's a bit silly, and I'd like to fix it, but that may break a lot of existing code. Maybe instead of a load of options in the files allowing you to change the way WinAPE assembler works there could be one check box, "Strict Maxam Compatibility" which enabled code after comments, and disabled such things as macros. Problem is you'd need to turn it on/off for different files, so it may be better to put something like ;;$M at the top of any file you want strict compatibility in.

TFM · 22:13, 18 October 11

Quote from: Nich on 19:25, 16 October 11
Why would you want to do that? Every programming example I've seen puts comments after the code if the comment is on the same line.

Haha, have you ever seen Odiesoft's code? Every line is nearly full (250 chars?). Opcodes, comments, opcodes ... Why? Because memory is limited, and the READ instruction is limited. We're talking about a CPC, not emulation.

So my code is nice at the beginning, but as more functions are added, sometimes every character has to be saved to squeeze it into a single file.

However, that's the way we did it in the old days (and I still like to work that way, without PCs!). Today everybody first buys a PC to be able to program for the CPC. Today you don't care about RAM and file size, but in the old days it was different. And THEREFORE Arnor was smart enough to introduce that FEATURE

TFM · 22:17, 18 October 11

Quote from: Executioner on 23:46, 17 October 11
I agree it's a bit silly, and I'd like to fix it, but that may break a lot of existing code. Maybe instead of a load of options in the files allowing you to change the way WinAPE assembler works there could be one check box, "Strict Maxam Compatibility" which enabled code after comments, and disabled such things as macros. Problem is you'd need to turn it on/off for different files, so it may be better to put something like ;;$M at the top of any file you want strict compatibility in.

This is now very IMHO.... If I have to assemble code then I have no time to waste. So I can use WinAPE and if it works then it's great! But if there are problems, then I will just take another assembler - for me this is MAXAM on ROM. And if I use that under WinAPE then it runs very quickly. But I will not waste time searching for the cause of an incompatibility. Maybe other people think the same way. So whatever you do, IMHO it would be great to keep it easy

Metalbrain · 19:19, 17 January 12

Hi all!
I've just seen the CPCWiki article only uses the official exomizer release, and therefore you don't get my optimizations. Anyway, the depackers have been recently optimized by Antonio Villena, and I'm gonna start a thread about it.

Quote from: Executioner on 00:31, 12 October 11

2. Remove one t from the gett4bits label
3. Replace the [] with () for all the [iy+d] instructions

Oops, I don't know how those got into the official exomizer release; I guess I didn't check it too closely because it happened about ten months after I had released my depackers. The new versions I just sent to Magnus Lind shouldn't have those bugs.

Metalbrain · 19:35, 17 January 12

Quote from: SyX on 18:26, 10 October 11
Apropos of compressor (the Babel subject is other world, jejejeje, and wait to add the dialects to the discussion ), i don't trust in exomizer, i have used in a few projects without problems even in backward mode, but i have crunched a few files that the z80 decruncher can not decrunch and passed a lot of time debugging my code

Do you still have those files?

Quote from: SyX on 18:26, 10 October 11
i have converted packfire tiny mode to z80, and how aplib, it doesn't need a buffer in ram how exomizer (the table is in the first 26 bytes of every crunched file) and the decrunched files are the same size that exomizer (or even a few bytes smaller), but is much much more slower and the tiny mode only can compress up 32 KBs, not a single byte more.

When I first converted the exomizer depackers to Z80, I noticed the format was wasting 2 bits, so I made an optimizer to cut those 2 bits and also re-sort the bit stream to be read from left to right (faster on the Z80 thanks to ADD instructions). Checking Packfire's tiny mode, I noticed it's basically the same as exomizer but cuts one of those redundant bits from the "official exomizer 2 format", but not the other one, so it only has a 12.5% probability of improving the resulting size by one byte, not the 25% that my optimizer has.
The fact that your depacker needs no table is interesting, but I'm not sure if the extra time it takes to get the info from the minitable is really worth it.
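
The 12.5% and 25% figures follow from byte rounding: removing a few bits from a stream only saves a byte when the stream's bit length drops across a multiple of 8. A quick check, assuming the bit length modulo 8 is equally likely to be any of the 8 residues (that uniformity and the function name are assumptions for illustration):

```python
def byte_saving_probability(bits_saved: int) -> float:
    """Chance that dropping `bits_saved` bits (1 to 7) shortens the file by one byte."""

    def size_in_bytes(bits: int) -> int:
        return (bits + 7) // 8  # round up to whole bytes

    # Bit lengths 8..15 cover every residue modulo 8 exactly once.
    lengths = range(8, 16)
    hits = sum(size_in_bytes(n - bits_saved) < size_in_bytes(n) for n in lengths)
    return hits / len(lengths)


print(byte_saving_probability(1))  # one redundant bit removed
print(byte_saving_probability(2))  # both redundant bits removed
```

This prints 0.125 and 0.25, matching the two percentages in the post.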

SyX · 21:02, 17 January 12

Quote from: Metalbrain on 19:35, 17 January 12
Do you still have those files?

I think not; it happened mainly when I was making the tapes for the Mojon Twins games (a problem with a sequence of opcodes generated by z88dk!?!?!?!?), but I will test your new version and if I get another "decrunching bug" I will send it to you

Quote from: Metalbrain on 19:35, 17 January 12
The fact that your depacker needs no table is interesting, but I'm not sure if the extra time it takes to get the info from the minitable is really worth it.

Yes, those are the problems, but it was more a proof of concept and an exercise to refresh my 68000. The "damned" decruncher abuses the 68000 registers and addressing modes, which are so much more flexible than those of our poor Z80; I had to use self-modifying code, and that is not always the best solution for so little code
