Hi there, I made a decrunch routine for the Amiga Shrinkler cruncher.
This statistical cruncher uses a stripped-down LZMA algorithm, so it is very efficient but also very slow, and it needs 2.5 KB of memory to store the probabilities.
Compared with Exomizer on the well-known Boulder Dash executable, you gain 10% over Exomizer.
The decrunch speed depends not on the target size but on the crunched size -> expect more than 15 s for a 4K intro.
Starting from 350 bytes, thanks to Hicks, Antonio Villena & Urusergi, the routine was quickly cut down to 245 bytes.
Madram joined the game and sent me his version, re-optimised and simplified (D4 in 8 bits): 209 bytes for a recall version!
Parameters to use with this version:
IX=source
DE=destination
NOTICE: The source is meant to be used with an assembler that handles operator priority correctly. It needs adapting for old assemblers (remove parentheses, use division instead of shifting, ...)
Shrinkler Amiga/Linux/Windows executables here: http://crinkler.net/shrinkler45.zip
Official topic here: http://ada.untergrund.net/?p=boardthread&id=264
With the cruncher, use the options -d -p -9:
-d to crunch without a header in the output
-p to avoid trouble on Windows (not needed on Linux)
-9 to crunch better; the default setting does not give good results on small files
Sample test with a crunched screen (rasm source):
buildsna
bankset 0
org #100
run #100
di
ld sp,#100
ld ix,#4000
ld de,#C000
breakpoint
call shrinkler_decrunch
breakpoint
jr $
include 'shrinkler.asm'
org #4000
incbin 'defaultscr.shr'
Wow! Well done!
Thanks! I updated the first post with the official cruncher executables and the official Amiga topic.
I'll look at it as soon as possible.
However with 1min to fill the banks, I doubt I'll use it
Quote from: krusty_benediction on 11:28, 31 January 18
I'll look at it as soon as possible.
However with 1min to fill the banks, I doubt I'll use it
In fact I misled you.
The decrunch speed depends not on the target size but on the crunched size ;D roughly 5000 NOPs per crunched byte.
My test was 2748 bytes crunched, so 13.7 s for 16K. Once the bits are decoded, it's just ldi or ldir.
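For anyone who wants to estimate the time for their own files, the figures above can be written out as a small Python sketch. It assumes that one NOP lasts 1 µs on the CPC (a rule of thumb, not something stated in this thread), and the function name is made up for illustration.

```python
# Rough decrunch-time estimate from the figures quoted in the post.
NOPS_PER_CRUNCHED_BYTE = 5000   # "roughly 5000 nops per crunched byte"
NOP_SECONDS = 1e-6              # assumption: 1 NOP = 1 microsecond on a CPC


def estimated_decrunch_seconds(crunched_size):
    """Estimated decrunch time; it depends only on the crunched size."""
    return crunched_size * NOPS_PER_CRUNCHED_BYTE * NOP_SECONDS


if __name__ == "__main__":
    # The 2748-byte test file from the post
    print(round(estimated_decrunch_seconds(2748), 1))
```

The unpacked size does not appear anywhere in the formula, which is the point of the correction above.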
Quote from: roudoudou on 10:42, 31 January 18
I may send a commented version if people want to try to optimize the code
Yes, please, please :)
Yeah! I gained 2 more bytes with an optimisation!
Updated the first post with new source and commented version
Quote from: roudoudou on 10:42, 31 January 18
Hi there, I made a decrunch routine for the Amiga Shrinkler cruncher.
This statistical cruncher uses a stripped-down LZMA algorithm, so it is very efficient but also very slow, and it needs memory to store the probabilities.
Compared with Exomizer on the well-known Boulder Dash executable, you gain 10% over Exomizer.
But...
But the decrunch routine is 180 bytes bigger and the buffer needs 3 KB anywhere in memory.
Decrunch speed is roughly 1 KB/s.
The decrunch speed depends not on the target size but on the crunched size -> 5000 NOPs per crunched byte.
Do not hesitate to report bugs to me.
I may send a commented version if people want to try to optimise the code.
The version here is supposed to be compatible with all assemblers.
Shrinkler Amiga/Linux/Windows executables here: http://crinkler.net/shrinkler45.zip
Official topic here: http://ada.untergrund.net/?p=boardthread&id=264
Well done !
New file in the first post shrinkler_z80_330.asm (http://www.cpcwiki.eu/forum/programming/shrinkler-z80-decrunch-routine/?action=dlattach;attach=24591)
I re-wrote the variables initialization
I modified the spread of flags
I factored some call parameters
Finally, I re-wrote the many calls to the virtual register D6
The result is that the binary is now 330 bytes 8)
With Hicks's advice, some size optimisations at the end -> now 326 bytes ;D
First post updated with source
323 ;D 319
The current version is now 304 bytes :D
Quote from: roudoudou on 21:21, 02 February 18
The current version is now 304 bytes :D
-2 bytes
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, $08
dec hl
lddr
inc (hl)
shrinkler_lit:
Another -2 bytes
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
Easy -1 byte
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
@roudoudo - have the optimisations found so far resulted in any speed increases?
I submitted to Roudoudou a lot of optimisations by PM, but I will write here now if other people work on it too (hi antonio!).
shrinkler_nonewword
ld (shrinkler_d4),de : ld (shrinkler_d4+2),hl ; written in little endian
ex af,af' ; retrieve previous carry
ld hl,(shrinkler_d2) : adc hl,hl : ld (shrinkler_d2),hl
ld hl,(shrinkler_d3) : add hl,hl : ld (shrinkler_d3),hl ; 0x22
Becomes (for -1) :
shrinkler_nonewword
ld (shrinkler_d4),de : ld (shrinkler_d4+2),hl ; written in little endian
ex af,af'
ld hl,shrinkler_d2+1
rl (hl) : dec hl : rl (hl) : dec hl
sla (hl) : dec hl : rl (hl)
But you must modify the order of the labels : d3 - d6 - d2 becomes d3 - d2 - d6.
Quote from: ervin on 08:34, 03 February 18
@roudoudo - have the optimisations found so far resulted in any speed increases?
The speed gain/loss is not significant. Thanks Antonio for the optims!
Hicks, the registers order of d3 (first) & d6 (last) can't be changed without changing the init
I updated first post with 299 bytes version
I modified my post: the same trick works with the d3 - d2 - d6 order...
The first must be d3 and the last d4 (not d6), so we can swap d2 / d6.
... (2s)
OK, I managed to make the rl optimisation work
(updated the first post)
ld hl,shrinkler_d3
sla (hl) : inc hl : rl (hl) : inc hl
ex af,af' ; retrieve previous carry
rl (hl) : inc hl : rl (hl)
This part:
shrinkler_readoffset:
ld a,3
call shrinkler_getnumber
; always returns with carry and HL=BC
ld hl,2
xor a
sbc hl,bc
Since carry is always 1 on return, then:
shrinkler_readoffset:
ld a,3
call shrinkler_getnumber
; always returns with carry and HL=BC
ld hl,2+1 ; +1 in order to compensate Carry
sbc hl,bc
Then -1 byte
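To double-check the trick, here is a small Python model of `sbc hl,bc` (HL - BC - carry on 16 bits). It is only a sketch; the helper names are invented for the example.

```python
def sbc16(hl, bc, carry):
    """Model of Z80 'sbc hl,bc': HL - BC - carry, wrapped to 16 bits."""
    return (hl - bc - carry) & 0xFFFF


def old_offset(bc):
    # ld hl,2 : xor a (clears carry) : sbc hl,bc
    return sbc16(2, bc, 0)


def new_offset(bc):
    # ld hl,2+1 : sbc hl,bc, entered with carry = 1
    return sbc16(3, bc, 1)


if __name__ == "__main__":
    # Both forms agree for every possible BC
    print(all(old_offset(bc) == new_offset(bc) for bc in range(0x10000)))
```

The extra 1 loaded into HL cancels the incoming carry, so the `xor a` is no longer needed.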
Another -1 byte:
shrinkler_d5: ld de,#0101
ld hl,(shrinkler_a5+1)
push hl
add hl,de
pop de
Becomes:
shrinkler_d5: ld hl,#0101
ld de,(shrinkler_a5+1)
add hl,de
A lot of CPU saved too!
I like where this thread is going :)
296!
-1 byte:
ex de,hl
ld hl,1 ; d7
shrinkler_bitsloop:
exx
call shrinkler_getbit
exx
adc hl,hl
ld a,(de)
sub 2
ld (de),a
jr nc,shrinkler_bitsloop
ld b,h
ld c,l
ret
Becomes:
ld bc,1
shrinkler_bitsloop:
exx
call shrinkler_getbit
exx
rl c : rl b
ld a,(hl)
sub 2
ld (hl),a
jr nc,shrinkler_bitsloop
ret
Saves CPU too.
valid! updated first post, blablabla
-2 bytes, but without applying the last -1
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, $08
dec hl
lddr
inc (hl)
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
ld de, (shrinkler_a5+1)
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; always returns with carry and HL=BC
ld hl, 3
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
ret
;--------------------------------------------------
shrinkler_getnumber:
; Out: Number in HL
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), 0
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
ex de, hl
ld hl, 1 ; d7
shrinkler_bitsloop:
call shrinkler_getbit
adc hl, hl
ld a, (de)
sub 2
ld (de), a
jr nc, shrinkler_bitsloop
ld b, h
ld c, l
ret
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d4)
add hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read as little endian
adc hl, hl
ex af, af'
ld a, h
or l
or d
or e
jr nz, shrinkler_nonewword
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read as big endian!
ex af, af' ; inject the previous carry
adc hl, hl
ex hl, de
adc hl, hl
ex af, af' ; save carry
shrinkler_nonewword:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; but written as little endian
ld hl, shrinkler_d3
sla (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; must re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, 4
shrinkler_shift4:
srl b
rr c
dec a
jr nz, shrinkler_shift4
xor a
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
ld h, a
ld l, a
ld a, 16
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
pop de
jr shrinkler_d3ret
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
pop hl
dec a
add a, (hl)
ld (hl), a
inc hl
ld a, (hl)
adc a, $0f ; (a1)+#FFF
ld (hl), a
scf ; SET CARRY
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
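The comments in shrinkler_zero and shrinkler_one describe the adaptive probability update. Here is a minimal Python model of that arithmetic as I read it from the listing: adjust is 1/16 (the shrinkler_shift4 loop) and a decoded "one" adds #FFF. The function name and the #8000 starting value used in the demo are assumptions for illustration.

```python
ADJUST_SHIFT = 4  # the shrinkler_shift4 loop divides the probability by 16


def update_prob(prob, bit):
    """16-bit probability update, modelled on shrinkler_zero / shrinkler_one."""
    new = prob - (prob >> ADJUST_SHIFT)   # hl = d1 - d1/16
    if bit:
        new += 0x0FFF                     # shrinkler_one: (a1) + #FFF
    return new & 0xFFFF


if __name__ == "__main__":
    p = 0x8000                            # assumed initial probability
    for bit in (1, 1, 1, 0):
        p = update_prob(p, bit)
        print(hex(p))
```

Each decoded bit pulls the stored value one sixteenth of the way towards the bit that was just seen, so a long run of ones drifts towards the top of the 16-bit range and a long run of zeros towards the bottom.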
Corrected with the Hicks code:
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, $08
dec hl
lddr
inc (hl)
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
ld de, (shrinkler_a5+1)
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; always returns with carry and HL=BC
ld hl, 3
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
ret
;--------------------------------------------------
shrinkler_getnumber:
; Out: Number in HL
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), 0
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
ld bc, 1
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
ld a, (hl)
sub 2
ld (hl), a
jr nc, shrinkler_bitsloop
ret
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d4)
add hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read as little endian
adc hl, hl
ex af, af'
ld a, h
or l
or d
or e
jr nz, shrinkler_nonewword
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read as big endian!
ex af, af' ; inject the previous carry
adc hl, hl
ex hl, de
adc hl, hl
ex af, af' ; save carry
shrinkler_nonewword:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; but written as little endian
ld hl, shrinkler_d3
sla (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; must re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, 4
shrinkler_shift4:
srl b
rr c
dec a
jr nz, shrinkler_shift4
xor a
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
ld h, a
ld l, a
ld a, 16
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
pop de
jr shrinkler_d3ret
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
pop hl
dec a
add a, (hl)
ld (hl), a
inc hl
ld a, (hl)
adc a, $0f ; (a1)+#FFF
ld (hl), a
scf ; SET CARRY
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
Another -1
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, $08
dec hl
lddr
inc (hl)
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
ld de, (shrinkler_a5+1)
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; always returns without carry and HL=BC
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
ret
;--------------------------------------------------
shrinkler_getnumber:
; Out: Number in HL
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), 0
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
ld bc, 1
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
jp p, shrinkler_bitsloop
ret
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d4)
add hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read as little endian
adc hl, hl
ex af, af'
ld a, h
or l
or d
or e
jr nz, shrinkler_nonewword
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read as big endian!
ex af, af' ; inject the previous carry
adc hl, hl
ex hl, de
adc hl, hl
ex af, af' ; save carry
shrinkler_nonewword:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; but written as little endian
ld hl, shrinkler_d3
sla (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; must re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, 4
shrinkler_shift4:
srl b
rr c
dec a
jr nz, shrinkler_shift4
xor a
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
ld h, a
ld l, a
ld a, 16
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
pop de
jr shrinkler_d3ret
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
pop hl
dec a
add a, (hl)
ld (hl), a
inc hl
ld a, (hl)
adc a, $0f ; (a1)+#FFF
ld (hl), a
scf ; SET CARRY
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
Another easy -1
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, $08
dec hl
lddr
inc (hl)
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
ld de, (shrinkler_a5+1)
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; return without carry and HL=BC
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
ret
;--------------------------------------------------
shrinkler_getnumber:
; Out: Number in HL
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), 0
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
ld bc, 1
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d4)
add hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read as little endian
adc hl, hl
ex af, af'
ld a, h
or l
or d
or e
jr nz, shrinkler_nonewword
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read as big endian!
ex af, af' ; inject the previous carry
adc hl, hl
ex hl, de
adc hl, hl
ex af, af' ; save carry
shrinkler_nonewword:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; but written as little endian
ld hl, shrinkler_d3
sla (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; must re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, 4
shrinkler_shift4:
srl b
rr c
dec a
jr nz, shrinkler_shift4
xor a
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
ld h, a
ld l, a
ld a, 16
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
pop de
jr shrinkler_d3ret
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
pop hl
dec a
add a, (hl)
ld (hl), a
inc hl
ld a, (hl)
adc a, $0f ; (a1)+#FFF
ld (hl), a
scf ; SET CARRY
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
The JP P does not work because the overflow may occur at the first DEC (hl).
BTW, for short modifications, is it possible to post only a few lines? ;D
Quote from: roudoudou on 22:06, 03 February 18
the JP P does not work because the overflow may occur at the first DEC (hl)
BTW for short modif is it possible to post only a few lines? ;D
Sorry. This is the last time I do it, with this -1:
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, $08
dec hl
lddr
inc (hl)
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
ld de, (shrinkler_a5+1)
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; return without carry and HL=BC
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
ret
;--------------------------------------------------
shrinkler_getnumber:
; Out: Number in HL
ld bc, 1
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), b
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d4)
add hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read as little endian
adc hl, hl
ex af, af'
ld a, h
or l
or d
or e
jr nz, shrinkler_nonewword
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read as big endian!
ex af, af' ; inject the previous carry
adc hl, hl
ex hl, de
adc hl, hl
ex af, af' ; save carry
shrinkler_nonewword:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; but written as little endian
ld hl, shrinkler_d3
sla (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; must re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, 4
shrinkler_shift4:
srl b
rr c
dec a
jr nz, shrinkler_shift4
xor a
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
ld h, a
ld l, a
ld a, 16
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
pop de
jr shrinkler_d3ret
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
pop hl
dec a
add a, (hl)
ld (hl), a
inc hl
ld a, (hl)
adc a, $0f ; (a1)+#FFF
ld (hl), a
scf ; SET CARRY
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
Thanks, moving the LD BC,1 up works
now it's 292 bytes :)
Another -1
ld a, $e0
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4
ex hl, de
About this code:
;--------------------------------------------------
shrinkler_getnumber:
; Out: Number in HL
ld bc, 1
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), b
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
It never fails, because you only need to load up to 16 bits into BC, and within the positive range you can go up to 63 (the positive range is between 0 and 127).
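The counter argument can be checked with a quick Python model of the two loops. This is a sketch that only follows the byte at (hl) and ignores the decoded bits themselves; `n` (the number of passes through shrinkler_numberloop) and the function name are introduced for the example.

```python
def bits_loop_iterations(n):
    """Count the passes of shrinkler_bitsloop after n passes of shrinkler_numberloop."""
    counter = 0
    for _ in range(n):                   # shrinkler_numberloop: inc (hl) : inc (hl)
        counter = (counter + 2) & 0xFF
    counter = (counter - 1) & 0xFF       # dec (hl) after the loop
    iterations = 0
    while True:                          # shrinkler_bitsloop
        iterations += 1
        counter = (counter - 2) & 0xFF   # dec (hl) : dec (hl)
        if counter & 0x80:               # sign flag set -> ret m
            return iterations


if __name__ == "__main__":
    # For 16-bit numbers the loop runs exactly n times
    print(all(bits_loop_iterations(n) == n for n in range(1, 17)))
```

For a number that fits in BC, the counter never gets anywhere near 128, so in this model the sign test after the second `dec (hl)` only triggers when the count is exhausted.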
I can't get it to work with my test file. Maybe I missed a modification?
Quote from: roudoudou on 23:03, 03 February 18
i can't get it work with my test file. Maybe i miss a modif?
Send me your file and I can try. Another -1
ld a, $e1
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
ld h, 0
ld l, h
shrinkler_muluw:
add hl, hl
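For readers wondering why the constant changes from $e0 to $e1, here is a small Python simulation of the counting trick (`add a,a` shifts the top bit of A into carry, and `jr c` loops while it is set). It is a sketch that models only the A register and the carry flag; the function name is invented.

```python
def run_shift4(a):
    """Simulate 'srl b : rr c : add a,a : jr c,loop' for the A register only.

    Returns (number of passes, final A, final carry).
    """
    passes = 0
    while True:
        passes += 1
        carry = a >> 7            # add a,a moves bit 7 into carry
        a = (a << 1) & 0xFF
        if not carry:             # jr c falls through when carry is clear
            return passes, a, carry


if __name__ == "__main__":
    print(run_shift4(0xE0))       # loop counter only
    print(run_shift4(0xE1))       # loop counter plus a leftover value in A
```

Both constants give four passes and leave carry clear. $e0 leaves A = 0, which makes the `xor a` unnecessary, while $e1 leaves A = 16, which looks like the reason the `ld a, 16` before shrinkler_muluw can be dropped.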
Attached my test. It's a ZX Spectrum screen. Use sjasmplus as assembler
Quote from: antoniovillena on 23:16, 03 February 18
Attached my test. It's a ZX Spectrum screen. Use sjasmplus as assembler
there is no manic.shr in the archive
here is my shr. Expect 16384 bytes in output (Amstrad screen)
I can not decompress 32k. Check if the first 14k are ok. manic.shr is generated by compressor.
I will migrate in the future to Ticks
http://retrolandia.net/foro/showthread.php?tid=43&pid=654#pid654
Because I would like to release a speed-optimised version, and this tool counts the Z80 cycles of the whole execution. It also generates files to check whether the file was extracted correctly.
I got it: that changed the carry (from always set to always reset). Now it works!
288 bytes
Quote from: roudoudou on 23:42, 03 February 18
I get it, that changed the carry (always setted to always reset) now it works!
288 bytes
Now tested with Ticks. It costs 21,642,848 cycles to decrunch your file. Attached, in case it is useful for you.
Quote from: antoniovillena on 00:01, 04 February 18
Now tested with Ticks. It costs 21.642.848 cycles decrounch your file. Attached if can be useful for you.
CPC emulators can "tick" on demand.
On Amstrad machines we have wait states, so the final count is higher: 5,834,708 NOPs, that is to say 23,338,832 cycles.
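The conversion between the two counts is just a factor of 4, since a Z80 NOP takes 4 T-states. A tiny Python sketch, assuming both measurements were taken on the same file and the same version of the routine (the thread does not say so explicitly); the names are made up:

```python
CYCLES_PER_NOP = 4            # a Z80 NOP takes 4 T-states

TICKS_CYCLES = 21642848       # count reported by Ticks (no wait states)
CPC_NOPS = 5834708            # count reported on the CPC


def nops_to_cycles(nops):
    """Convert a CPC NOP count into Z80 cycles."""
    return nops * CYCLES_PER_NOP


def wait_state_overhead(cpc_nops, raw_cycles):
    """Relative extra cost of the CPC count over the raw cycle count."""
    return (nops_to_cycles(cpc_nops) - raw_cycles) / raw_cycles


if __name__ == "__main__":
    print(nops_to_cycles(CPC_NOPS))
    print(round(wait_state_overhead(CPC_NOPS, TICKS_CYCLES) * 100, 1), "% slower")
```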
Quote from: roudoudou on 00:09, 04 February 18
CPC emulators can "tick" on demand.
On Amstrad machines we have wait-states so the final count is higher. 5.834.708 nops that to say 23.338.832 cycles
Yes. Spectrum emulators also have contention. But it's a good number to compare with other algorithms.
The next table compares the compression ratio with other algorithms.
Numbers are file sizes in bytes.
File       Shrinkler  Exomizer     aPLib    saukav      zx7b   BBuster
----------------------------------------------------------------------
lena1k           776       812       872       873       902       905
lena16k        13796     13581     14635     14649     14689     14798
lena32k        28368     28019     29991     30071     30272     30446
alice1k          548       613       617       611       631       636
alice16k        6872      7266      7659      7738      8175      8429
alice32k       12868     13461     14473     14535     16074     16570
128rom1k         840       884       889       913       923       925
128rom16k      11848     12260     12434     12728     12806     12882
128rom32k      23648     24415     24820     26157     26524     26708
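To put the table into percentages, here is a short Python sketch computing the relative gain of Shrinkler over Exomizer for each file. The sizes are copied from the table above; the dictionary layout and the function name are just for illustration.

```python
# (Shrinkler size, Exomizer size) in bytes, copied from the table
SIZES = {
    "lena1k": (776, 812),
    "lena16k": (13796, 13581),
    "lena32k": (28368, 28019),
    "alice1k": (548, 613),
    "alice16k": (6872, 7266),
    "alice32k": (12868, 13461),
    "128rom1k": (840, 884),
    "128rom16k": (11848, 12260),
    "128rom32k": (23648, 24415),
}


def gain_over_exomizer(shrinkler, exomizer):
    """Fraction of the Exomizer size saved by Shrinkler (negative = bigger)."""
    return (exomizer - shrinkler) / exomizer


if __name__ == "__main__":
    for name, (shr, exo) in SIZES.items():
        print(f"{name:<10} {gain_over_exomizer(shr, exo) * 100:+.1f}%")
```

On these files the gain varies a lot with the data: it is largest on alice1k and negative on lena16k and lena32k, where Exomizer produces the smaller file.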
This last table compares speed with other algorithms.
Numbers are execution cycles.
File          Shrinkler   deexov4     aPLib   BBuster   zx7mega    saukav    zx7bf2
-----------------------------------------------------------------------------------
lena1k         13757267    303436    176642    106746     95255     76547     81040
lena16k       238317371   4407913   2961621   1908398   1727095   1646032   1462568
lena32k       484967405   8443253   5820921   3651800   3300486   3231882   2803116
alice1k        10060954    274111    136224     98914     89385     70869     73459
alice16k      131592504   2973592   2143122   1812259   1614225   1338287   1328886
alice32k      249719379   5378511   4189855   3614393   3230255   2550243   2654236
128rom1k       13773150    249124    131667     82637     74110     60222     62000
128rom16k     197319929   3571407   2292945   1550682   1407478   1392317   1180569
128rom32k     394594060   7355277   4583902   3107867   2825773   1926027   2381847
-----------------------------------------------------------------------------------
routine size        288       201       197       168       244      ~200       191
https://github.com/antoniovillena/zx7b
Another -1
ld hl, (shrinkler_d2)
xor a
sbc hl, de
pop bc
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
jr shrinkler_d3ret
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
ld a, (bc)
sub 1
ld (bc), a
inc bc
ld a, (bc)
sbc a, $f0 ; (a1)+#FFF
ld (bc), a
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
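The `sub 1` / `sbc a, $f0` pair may look odd next to the "+#FFF" comment. Here is a Python sketch comparing it with the earlier add / adc / scf version, byte by byte. The function names are invented, and the range checked assumes the stored probability is at most #F000 at this point (it has just been reduced by one sixteenth).

```python
def add_fff(x):
    """Earlier version: add #0FFF to the 16-bit value, then scf."""
    return (x + 0x0FFF) & 0xFFFF, 1


def sub_f001(x):
    """New version: sub 1 on the low byte, sbc a,#F0 on the high byte."""
    lo = (x & 0xFF) - 1
    borrow = 1 if lo < 0 else 0
    hi = (x >> 8) - 0xF0 - borrow
    carry = 1 if hi < 0 else 0           # carry flag left by the sbc
    return ((hi & 0xFF) << 8) | (lo & 0xFF), carry


if __name__ == "__main__":
    # Same value and same final carry over the assumed range
    print(all(add_fff(x) == sub_f001(x) for x in range(0xF001)))
```

Subtracting #F001 is the same as adding #0FFF modulo 65536, and over this range the subtraction always borrows, so the carry comes out set without a separate `scf`.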
Another -1
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
sbc hl, hl
shrinkler_muluw:
add hl, hl
Total: 286 bytes
Quote from: antoniovillena on 00:36, 04 February 18
Size Shrinkler Exomizer aPLib saukav zx7b BBuster
I didn't know about saukav vs zx7b and its performance. I will add it to my assembler for on-the-fly data/code compression.
Another -1. I'm sorry, this time I post the whole code because of the reordering. Basically, you can avoid the final ret by putting a routine that acts as a dummy right after it.
shrinkler_getnumber:
; Out: Number in HL
ld bc, 1
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), b
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d4)
add hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read as little endian
adc hl, hl
ex af, af'
ld a, h
or l
or d
or e
jr nz, shrinkler_nonewword
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read as big endian!
ex af, af' ; inject the previous carry
adc hl, hl
ex hl, de
adc hl, hl
ex af, af' ; save carry
shrinkler_nonewword:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; but written as little endian
ld hl, shrinkler_d3
rl (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; must re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, $e1
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
sbc hl, hl
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
pop bc
jr c, shrinkler_one
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
jr shrinkler_d3ret
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, 8
dec hl
lddr
inc (hl)
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6
rl (hl)
jr nc, shrinkler_getlit
shrinkler_a5:
ld de, $1234
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
ld de, (shrinkler_a5+1)
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; return without carry and HL=BC
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
ld a, (bc)
sub 1
ld (bc), a
inc bc
ld a, (bc)
sbc a, $f0 ; (a1)+#FFF
ld (bc), a
ex de, hl
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
But the shrinkler_one routine is not harmless, since it makes 2 writes, to address 2 and address 3.
So I swapped shrinkler_zero and shrinkler_one and changed the JR accordingly. Then only shrinkler_d3 is written with garbage, not memory elsewhere.
Another -1
shrinkler_readbit:
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read in little endian
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
ld e, 4
ex af, af'
add ix, de
ex af, af' ; inject the previous carry
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read in big endian!
adc hl, hl
ex hl, de
shrinkler_rb1:
adc hl, hl
shrinkler_rb2:
ex af, af' ; save carry
ld (shrinkler_d4+2), hl ; but written in little endian
ld (shrinkler_d4), de
ld hl, shrinkler_d3
rl (hl)
inc hl
rl (hl)
inc hl
ex af, af' ; retrieve previous carry
rl (hl)
inc hl
rl (hl)
jr shrinkler_getbit1
Sorry again. I am posting the whole file because there are many changes. 276 bytes.
shrinkler_getnumber:
; Out: Number in HL
ld bc, 1
ld hl, shrinkler_d6+2
ld (hl), a
dec hl
ld (hl), b
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d3+1)
add hl, hl
ld (shrinkler_d3+1), hl
shrinkler_d4l:
ld hl, 0
adc hl, hl
ex de, hl
shrinkler_d4h:
ld hl, $8000
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
ld e, 4
ex af, af'
add ix, de
ex af, af' ; inject the previous carry
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read in big endian!
adc hl, hl
ex hl, de
shrinkler_rb1:
adc hl, hl
shrinkler_rb2:
ld (shrinkler_d4h+1), hl ; but written in little endian
ld (shrinkler_d4l+1), de
shrinkler_d2:
ld hl, 0
adc hl, hl
ld (shrinkler_d2+1), hl
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
ld (shrinkler_a5+1), de
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6+1), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+2) ; have to re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
shrinkler_d6:
ld hl, 0
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, $e1
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
shrinkler_d3:
ld de, 1
; input: DE x BC
; output: DEHL
sbc hl, hl
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2+1)
xor a
sbc hl, de
pop bc
jr nc, shrinkler_zero
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
ld a, (bc)
sub 1
ld (bc), a
inc bc
ld a, (bc)
sbc a, $f0 ; (a1)+#FFF
ld (bc), a
ex de, hl
jr shrinkler_d3ret
shrinkler_decrunch:
ld (shrinkler_a5+1), hl
; Init range decoder state
ld hl, shrinkler_pr+1
ld bc, 1536*2
ld (hl), $80
dec hl
ld (hl), c
ld de, shrinkler_pr+2
ldir
shrinkler_lit:
; Literal
scf
shrinkler_getlit:
call nc, shrinkler_getbit
ld hl, shrinkler_d6+1
rl (hl)
jr nc, shrinkler_getlit
ld de, (shrinkler_a5+1)
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, $0101
shrinkler_a5:
ld de, 0
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; return without carry and HL=BC
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2+1), hl
ld hl, (shrinkler_d3+1)
sbc hl, de
shrinkler_d3ret:
ld (shrinkler_d3+1), hl
exx
ret
shrinkler_pr EQU $
The 276-byte version only allows one call to the routine. For a reentrant version, take the latest 284-byte one and change this routine. You will then have a 283-byte version.
shrinkler_readbit:
ld hl, (shrinkler_d3)
add hl, hl
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read in little endian
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
ld e, 4
ex af, af'
add ix, de
ex af, af' ; inject the previous carry
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read in big endian!
adc hl, hl
ex hl, de
shrinkler_rb1:
adc hl, hl
shrinkler_rb2:
ld (shrinkler_d4+2), hl ; but written in little endian
ld (shrinkler_d4), de
ld hl, (shrinkler_d2)
adc hl, hl
ld (shrinkler_d2), hl
jr shrinkler_getbit1
sure, i will put the previous one and specify the difference
Excellent work!
For the "1-call-only version", we can remove "ld (shrinkler_a5+1),hl" at "shrinkler_decrunch" label (beginning), and write destination address directly in "shrinkler_a5" label... Then -3 bytes: 273!
Obviously I will document it. The same optimisation is possible with the reusable one, instead of setting HL.
-5 bytes and a small speed improvement to the 283-byte version (now 278)
-5 bytes => replacing shrinkler_a5 with IY
speed improvement => moving shrinkler_d6 outside the shrinkler_getlit loop
shrinkler_getnumber:
; Out: Number in HL
ld bc, 1
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), b
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
ld hl, (shrinkler_d3)
add hl, hl
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2) ; read in little endian
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
ld e, 4
ex af, af'
add ix, de
ex af, af' ; inject the previous carry
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) new value read in big endian!
adc hl, hl
ex hl, de
shrinkler_rb1:
adc hl, hl
shrinkler_rb2:
ld (shrinkler_d4+2), hl ; but written in little endian
ld (shrinkler_d4), de
ld hl, (shrinkler_d2)
adc hl, hl
ld (shrinkler_d2), hl
jr shrinkler_getbit1
;--------------------------------------------------
shrinkler_getkind:
;Use parity as context
push de
pop iy
xor a
ld l, a
inc a
and e
ld h, a
shrinkler_altgetbit:
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld a, (shrinkler_d3+1) ; have to re-read the high 8 bits of the value...
add a, a
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
push hl
ld e, (hl)
inc hl
ld d, (hl)
; D1 = One prob
push de
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, $e1
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4
ex hl, de
sbc hl, bc ; hl=d1-d1/16
ex hl, de
ld (hl), d
dec hl
ld (hl), e
pop bc ; bc=d1 initial
ld de, (shrinkler_d3)
; input: DE x BC
; output: DEHL
sbc hl, hl
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
dec a
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
xor a
sbc hl, de
pop bc
jr nc, shrinkler_zero
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
; move+add out of order!
ld a, (bc)
sub 1
ld (bc), a
inc bc
ld a, (bc)
sbc a, $f0 ; (a1)+#FFF
ld (bc), a
ex de, hl
jr shrinkler_d3ret
shrinkler_decrunch:
; Init range decoder state
ld hl, shrinkler_dr+1536*2+8
ld bc, 1536*2
ld (hl), c
inc hl
ld (hl), $80
ld de, shrinkler_dr+1536*2+7
lddr
ld c, 8
dec hl
lddr
inc (hl)
; Literal
shrinkler_lit:
scf
ld hl, shrinkler_d6
shrinkler_getlit:
call nc, shrinkler_getbit
rl (hl)
jr nc, shrinkler_getlit
push iy
pop de
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
sbc hl, hl
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, 0
push iy
pop de
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
; return without carry and HL=BC
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
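For readers following the listing above: the shrinkler_muluw loop (commented "input: DE x BC, output: DEHL") reads like a classic shift-and-add 16x16 to 32-bit multiply. The Python below is only a sketch of my reading of that loop, not part of the routine; the function name muluw and the sample operands are made up for illustration.

```python
def muluw(de, bc):
    """Model of the shrinkler_muluw loop: returns (DE, HL) holding DE * BC."""
    hl = 0                               # sbc hl,hl with carry clear
    for _ in range(16):                  # A counts 16 iterations
        hl <<= 1                         # add hl,hl
        carry = hl >> 16
        hl &= 0xFFFF
        de = (de << 1) | carry           # rl e / rl d
        carry = de >> 16
        de &= 0xFFFF
        if carry:                        # top multiplier bit was set
            hl += bc                     # add hl,bc
            if hl > 0xFFFF:              # carry out of HL
                hl &= 0xFFFF
                de = (de + 1) & 0xFFFF   # inc de
    return de, hl


def product32(x, y):
    """Join the DE:HL pair into one 32-bit integer."""
    de, hl = muluw(x, y)
    return (de << 16) | hl


print(hex(product32(0x1234, 0x8000)))
```

The multiplier bits leave DE from the top while the partial product grows into it from the bottom, which is why no extra registers are needed.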
Sorry mistake
-4 bytes
shrinkler_getbit:
exx
shrinkler_getbit1:
ld hl, (shrinkler_d3) ; have to re-read the high 8 bits of the value...
add hl, hl
jr nc, shrinkler_readbit
...
shrinkler_readbit:
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
Are the following useful?
shrinkler_getkind:
;Use parity as context
;push de
;pop iy
ld iyh,d
ld iyl,e
shrinkler_getlit:
call nc, shrinkler_getbit
rl (hl)
jr nc, shrinkler_getlit
;push iy
;pop de
ld d,iyh
ld e,iyl
shrinkler_d5:
ld hl, 0
;push iy
;pop de
ld d,iyh
ld e,iyl
push de / pop de -> 1 byte each
push iy / pop iy -> 2 bytes each
ld <reg8>,<ireg8> -> 2 bytes each
So no, not useful: push de + pop iy is 3 bytes, while ld iyh,d + ld iyl,e is 4.
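The byte counts above can be tallied with a few lines of Python. This is only a bookkeeping sketch: the sizes are the standard Z80 encodings (the IYH/IYL loads are the undocumented FD-prefixed forms), while the SIZES table and the size helper are names invented for this example.

```python
# Encoded size in bytes of each instruction being compared.
SIZES = {
    "push de": 1,   # D5
    "pop de": 1,    # D1
    "push iy": 2,   # FD E5
    "pop iy": 2,    # FD E1
    "ld iyh,d": 2,  # FD 62
    "ld iyl,e": 2,  # FD 6B
    "ld d,iyh": 2,  # FD 54
    "ld e,iyl": 2,  # FD 5D
}


def size(sequence):
    """Total encoded size of a list of instructions."""
    return sum(SIZES[instruction] for instruction in sequence)


de_to_iy_stack = size(["push de", "pop iy"])
de_to_iy_regs = size(["ld iyh,d", "ld iyl,e"])
iy_to_de_stack = size(["push iy", "pop de"])
iy_to_de_regs = size(["ld d,iyh", "ld e,iyl"])

print(de_to_iy_stack, de_to_iy_regs, iy_to_de_stack, iy_to_de_regs)
```

Each register-to-register copy costs one byte more than the stack version, and the routine has three such copies.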
Would it be possible to update both routines each time (1-call & normal one)?
Normal is 274, but the current "1-call" is 276; the latest optimisations are not integrated (with them, I think we are close to 265)...
Thanks Rdd!
Quote from: roudoudou on 09:35, 06 February 18
push de / pop de -> 1 byte
push iy / pop iy -> 2 bytes
ld <reg8>,<ireg8> -> 2 bytes
so no, not useful
Indeed, it takes a few more bytes, but it also takes fewer NOPs. ;D
Quote from: Hicks on 11:15, 06 February 18
Would it possible to update the two routines each time (1-call & normal one)?
Normal is 274, but actual "1-call" is 276, the last optimisations are not integrated (with the last, I think we are close from 265)...
Thanks Rdd!
tonight ;)
Quote from: ervin on 12:17, 06 February 18
Indeed, it takes a few more bytes, but it also takes fewer NOPs. ;D
The impact is negligible.
The purpose of Shrinkler is to offer extreme compression for 4K intros.
If you need speed, there are several alternatives.
Another -1
ld a, $eb
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4-1
sbc hl, bc ; hl=d1-d1/16
ex hl, de
...
shrinkler_cont:
sub $b
jr nz, shrinkler_muluw
ld hl, (shrinkler_d2)
sbc hl, de
Wow! That's a great trick!
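For anyone wondering why the constant matters, here is a small Python model of the `add a,a` / `jr c` loop. It is a sketch of my reading of the snippet: the same byte serves as the shift-by-4 counter and leaves the multiply counter in A. With $e1 the loop exits with A=16 (counted down with `dec a`); with $eb it exits with A=176, which is 16 x $b, so `sub $b` reaches zero after 16 passes. The byte $eb is also the opcode of `ex de,hl`, which seems to be why jumping to shrinkler_shift4-1 is possible. The function name shift_counter is invented here.

```python
def shift_counter(a):
    """Model the shift4 loop: return (passes through the loop, final A)."""
    passes = 0
    while True:
        passes += 1          # one srl b / rr c pair
        a <<= 1              # add a,a
        carry = a > 0xFF
        a &= 0xFF
        if not carry:        # jr c falls through once carry is clear
            return passes, a


old_passes, old_a = shift_counter(0xE1)
new_passes, new_a = shift_counter(0xEB)
print(old_passes, old_a, new_passes, new_a)
```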
-1 byte ;D
SHRINKLER_READBIT: LD (SHRINKLER_D3),HL
LD HL,(SHRINKLER_D4)
ADC HL,HL
EX HL,DE
LD HL,(SHRINKLER_D4+2)
JR NZ,READBIT1
ADC HL,HL
JR NZ,READBIT2
; HL=DE=0
LD E,&04
ADD IX,DE
LD L,(IX-1)
LD H,(IX-2)
LD E,(IX-3)
LD D,(IX-4) ; DEHL=(a4) big endian value read!
SCF
ADC HL,HL
EX HL,DE
READBIT1: ADC HL,HL
READBIT2: LD (SHRINKLER_D4),DE
LD (SHRINKLER_D4+2),HL ; written in little endian
LD HL,(SHRINKLER_D2)
ADC HL,HL
LD (SHRINKLER_D2),HL
JR GETBIT1
-2 bytes
shrinkler_readbit:
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2)
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
ld e, 4
add ix, de
ld l, (ix-1)
ld h, (ix-2)
ld e, (ix-3)
ld d, (ix-4) ; DEHL=(a4) big endian value read!
add hl, hl
inc hl
ex de, hl
Wow. We saw similar optimization at the same time
Quote from: Urusergi on 23:21, 19 February 18
-1 byte ;D
-2 bytes :doh: It was obvious :-[
I take my hat off, you're the Master
It was also obvious that the carry is always set after an ADC HL,HL with a zero result. But it took a long time to discover.
Quote from: Urusergi on 23:42, 19 February 18
-2 bytes :doh: It was obvious :-[
I take my hat off, you're the Master
-1
This one is not obvious
shrinkler_readbit:
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2)
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
ruti: ld d, (ix) ; DEHL=(a4) big endian value read!
ld e, (ix+1)
ex de, hl
inc ix
inc ix
ret nz
call ruti-1
add hl, hl
inc hl
ex de, hl
Quote from: antoniovillena on 00:07, 20 February 18
-1
This one is not obvious
This one is very good. I haven't checked, but I get the impression that it is slower?
Another -1
shrinkler_readbit:
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2)
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
shrinkler_rb0:
ccf
ld d, (ix) ; DEHL=(a4) big endian value read!
ld e, (ix+1)
ex de, hl
inc ix
inc ix
jr nc, shrinkler_rb0
add hl, hl
inc hl
ex de, hl
shrinkler_rb1:
Quote from: Urusergi on 00:37, 20 February 18
Very good this one, I haven't checked but I get the impression that is slower? ???
This one is a little faster (same size):
shrinkler_readbit:
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2)
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
shrinkler_rb0:
ccf
ld d, (ix) ; DEHL=(a4) big endian value read!
ld e, (ix+1)
ex de, hl
inc ix
inc ix
jr nc, shrinkler_rb0
adc hl, hl
ex de, hl
shrinkler_rb1:
great! I will update official sources tonight after testing!
-6 bytes
SHRINKLER_DECRUNCH: ; Init range decoder state
LD HL,&0C00+8+SHRINKLER_DR
LD DE,&0C00+7+SHRINKLER_DR
LD BC,&0C00
LD (HL),C
INC HL
LD (HL),&80
LDDR
LD C,&08
DEC HL
LDDR
INC (HL)
PUSH IY
POP DE
LIT: SCF
LD HL,SHRINKLER_D6
Also, we have to eliminate all sequences like these:
PUSH IY
POP DE
&
PUSH DE
POP IY
It can be reduced by one more byte if we use DE as the destination:
SHRINKLER_DECRUNCH: ; Init range decoder state
PUSH DE
LD HL,&0C00+8+SHRINKLER_DR
LD DE,&0C00+7+SHRINKLER_DR
LD BC,&0C00
LD (HL),C
INC HL
LD (HL),&80
LDDR
LD C,&08
DEC HL
LDDR
INC (HL)
POP DE
LIT: SCF
LD HL,SHRINKLER_D6
thanks, i updated first post with 255 bytes (one call version) and 262 bytes (multiple calls version)
now, source=IX and destination=DE
Quote from: Urusergi on 22:03, 20 February 18
-6 bytes
Incredible. You broke the 256 bytes barrier
Another -1
shrinkler_getkind:
;Use parity as context
ld l, 1
shrinkler_altgetbit:
ld a, l
and e
ld h, a
dec hl
ld (shrinkler_d6), hl
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
call shrinkler_altgetbit
-3 bytes in onecall version if shrinkler_pr is even with this code:
shrinkler_decrunch:
; Init range decoder state
ld hl, shrinkler_pr
shrinkler_repeat:
ld a, l
rrca
and $80
ld (hl), a
inc hl
ld a, h
cp (shrinkler_pr+$d00)>>8
jr nz, shrinkler_repeat
Same length without the even-address condition:
shrinkler_decrunch:
; Init range decoder state
ld hl, shrinkler_pr
shrinkler_repeat:
ld (hl), 0
inc hl
ld (hl), $80
inc hl
ld a, h
cp (shrinkler_pr+$d00)>>8
jr nz, shrinkler_repeat
-1 byte
shrinkler_decrunch:
; Init range decoder state
ld bc, shrinkler_pr
xor a
shrinkler_repeat:
ld (bc), a
xor $80
inc bc
ld h, ($f400-shrinkler_pr)>>8
add hl, bc
jr nc, shrinkler_repeat
-2 bytes to the recall version:
shrinkler_decrunch:
; Init range decoder state
ld hl, shrinkler_dr
xor a
ex af, af'
xor a
ld bc, $070d
ld (hl), 1
shrinkler_repeat:
inc hl
ex af, af'
ld (hl), a
djnz shrinkler_repeat
ld a, $80
dec c
jr nz, shrinkler_repeat
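To check what the loop above writes, here is a Python model of it. It is a sketch under my reading of the code: B=7 covers the rest of the 8 state bytes after the initial 1, then C counts twelve 256-byte passes in which A and A' alternate between 0 and $80, giving 1536 little-endian words of $8000. The function name init_state is invented for this example.

```python
def init_state():
    """Return the bytes written starting at shrinkler_dr by the init loop."""
    mem = [1]                        # ld (hl),1
    a, a_alt = 0, 0                  # xor a / ex af,af' / xor a
    b, c = 0x07, 0x0D                # ld bc,$070d
    while True:
        while True:
            a, a_alt = a_alt, a      # ex af,af'
            mem.append(a)            # inc hl / ld (hl),a
            b = (b - 1) & 0xFF       # djnz (B=0 means 256 passes)
            if b == 0:
                break
        a = 0x80                     # ld a,$80
        c -= 1                       # dec c
        if c == 0:                   # jr nz not taken
            return mem


state = init_state()
print(len(state), state[:10])
```

The total of 8 + 1536*2 bytes matches the size used by the LDDR-based initialisation it replaces.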
;D like it!
first post updated with 250 bytes onecall and 259 bytes recall versions
Quote from: roudoudou on 09:29, 21 February 18
;D like it!
first post updated with 250 bytes onecall and 259 bytes recall versions
You can replace >>8 with /256 if your assembler has problems with it
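A quick Python check that the two forms agree for 16-bit addresses, assuming the assembler's / is an integer division. The base address 0x4000 standing in for shrinkler_pr is hypothetical.

```python
def high_byte_shift(addr):
    return addr >> 8        # the >>8 form


def high_byte_div(addr):
    return addr // 256      # the /256 form, as an integer division


SHRINKLER_PR = 0x4000       # hypothetical address, for illustration only
limit = high_byte_div(SHRINKLER_PR + 0xD00)
print(hex(limit))
```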
-1 byte. Sorry, complete code
shrinkler_getnumber:
; Out: Number in HL
ld bc, 1
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), b
shrinkler_numberloop:
inc (hl)
inc (hl)
call shrinkler_getbit
jr c, shrinkler_numberloop
dec (hl)
shrinkler_bitsloop:
call shrinkler_getbit
rl c
rl b
dec (hl)
dec (hl)
ret m
jr shrinkler_bitsloop
;--------------------------------------------------
; Out: Bit in C
shrinkler_readbit:
pop de
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
adc hl, hl
ex de, hl
ld hl, (shrinkler_d4+2)
jr nz, shrinkler_rb1
adc hl, hl
jr nz, shrinkler_rb2
; HL=DE=0
shrinkler_rb0:
ccf
ld d, (ix) ; DEHL=(a4) big endian value read!
ld e, (ix+1)
ex de, hl
inc ix
inc ix
jr nc, shrinkler_rb0
adc hl, hl
ex de, hl
shrinkler_rb1:
adc hl, hl
shrinkler_rb2:
ld (shrinkler_d4), de
ld (shrinkler_d4+2), hl ; written in little endian
ld hl, (shrinkler_d2)
adc hl, hl
ld (shrinkler_d2), hl
jr shrinkler_getbit1
shrinkler_getkind:
;Use parity as context
ld l, 1
shrinkler_altgetbit:
ld a, l
and e
ld h, a
dec hl
ld (shrinkler_d6), hl
shrinkler_getbit:
exx
shrinkler_getbit1:
ld hl, (shrinkler_d3)
push hl
add hl, hl
jr nc, shrinkler_readbit
ld hl, (shrinkler_d6)
add hl, hl
ld de, shrinkler_pr+2 ; cause -1 context
add hl, de
ld e, (hl)
inc hl
ld d, (hl)
ld b, d
ld c, e ; bc=de=d1 / hl=a1
ld a, $eb
shrinkler_shift4:
srl b
rr c
add a, a
jr c, shrinkler_shift4-1
sbc hl, bc ; hl=d1-d1/16
ex de, hl
ld b, (hl)
ld (hl), d
dec hl
ld c, (hl)
ld (hl), e
ex (sp), hl
ex de, hl
sbc hl, hl
shrinkler_muluw:
add hl, hl
rl e
rl d
jr nc, shrinkler_cont
add hl, bc
jr nc, shrinkler_cont
inc de
shrinkler_cont:
sub $b
jr nz, shrinkler_muluw
pop bc ; bc=d1 initial
ld hl, (shrinkler_d2)
sbc hl, de
jr nc, shrinkler_zero
shrinkler_one:
; oneprob = 1 - (1 - oneprob) * (1 - adjust) = oneprob - oneprob * adjust + adjust
ld a, (bc)
sub 1
ld (bc), a
inc bc
ld a, (bc)
sbc a, $f0 ; (a1)+#FFF
ld (bc), a
ex de, hl
jr shrinkler_d3ret
shrinkler_decrunch:
; Init range decoder state
ld hl, shrinkler_dr
xor a
ex af, af'
xor a
ld bc, $070d
ld (hl), 1
shrinkler_repeat:
inc hl
ex af, af'
ld (hl), a
djnz shrinkler_repeat
ld a, $80
dec c
jr nz, shrinkler_repeat
shrinkler_lit:
scf
ld hl, shrinkler_d6
shrinkler_getlit:
call nc, shrinkler_getbit
rl (hl)
jr nc, shrinkler_getlit
ldi
; After literal
call shrinkler_getkind
jr nc, shrinkler_lit
; Reference
call shrinkler_altgetbit
jr nc, shrinkler_readoffset
shrinkler_readlength:
ld a, 4
call shrinkler_getnumber
shrinkler_d5:
ld hl, 0
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
ld a, 3
call shrinkler_getnumber
ld hl, 2
sbc hl, bc
ld (shrinkler_d5+1), hl
jr nz, shrinkler_readlength
shrinkler_zero:
; oneprob = oneprob * (1 - adjust) = oneprob - oneprob * adjust
ld (shrinkler_d2), hl
ld hl, (shrinkler_d3)
sbc hl, de
; oneprob*adjust < oneprob so carry is always cleared...
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ret
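The probability update described in the comments of the listing above (oneprob - oneprob * adjust, plus adjust for a one bit) can be modelled in a few lines of Python. This is a sketch of my reading of the code: the shift4 loop gives adjust = 1/16, and the `sub 1` / `sbc a,$f0` pair subtracts $F001 from the stored 16-bit value, which is the same as adding $0FFF modulo 65536 (the "(a1)+#FFF" comment). The function names are invented here.

```python
ADJUST_SHIFT = 4   # the shrinkler_shift4 loop divides by 16


def update_zero(prob):
    """New probability after decoding a 0 bit: prob - prob/16."""
    return prob - (prob >> ADJUST_SHIFT)


def update_one(prob):
    """New probability after decoding a 1 bit, done the way the asm does it."""
    lowered = prob - (prob >> ADJUST_SHIFT)
    return (lowered - 0xF001) & 0xFFFF    # sub 1 / sbc a,$f0 on the 16-bit word


print(hex(update_zero(0x8000)), hex(update_one(0x8000)))
```

Starting from the initial value $8000, a zero bit lowers the estimate and a one bit raises it, each by about one sixteenth of the remaining distance.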
Another -1
shrinkler_getnumber:
; Out: Number in BC
ld bc, 1
ld hl, shrinkler_d6+1
ld (hl), a
dec hl
ld (hl), c
shrinkler_numberloop:
inc (hl)
call shrinkler_getbit
inc (hl)
jr c, shrinkler_numberloop
shrinkler_bitsloop:
dec (hl)
dec (hl)
ret m
call shrinkler_getbit
rl c
rl b
jr shrinkler_bitsloop
Another -1
shrinkler_readlength:
call shrinkler_getnumber
shrinkler_d5:
ld hl, 0
add hl, de
ldir
; After reference
call shrinkler_getkind
jr nc, shrinkler_lit
shrinkler_readoffset:
dec a
call shrinkler_getnumber
shrinkler_d3ret:
ld (shrinkler_d3), hl
exx
ld a, 4
ret
Quote: I am not 100% sure about it, but...
After some tests, this doesn't always work, but it may be interesting (as comments) for extreme demo coders. So the final size is 248/256 bytes (and with this dirty trick 243/252).
Interesting, I'll try tonight (and I hope I will find test files that work and some that don't!)
-1 byte recall
shrinkler_readbit:
pop de
ld (shrinkler_d3), hl
ld hl, (shrinkler_d4)
ld de, (shrinkler_d4+2)
shrinkler_rb0:
adc hl, hl
ld (shrinkler_d4), hl
ex de, hl
jr nz, shrinkler_rb2
adc hl, hl
jr nz, shrinkler_rb3
; HL=DE=0
shrinkler_rb1:
ccf
ld d, (ix) ; DEHL=(a4) big endian value read!
ld e, (ix+1)
ex de, hl
inc ix
inc ix
jr nc, shrinkler_rb1
jr shrinkler_rb0
shrinkler_rb2:
adc hl, hl
shrinkler_rb3:
ld (shrinkler_d4+2), hl ; written in little endian
ld hl, (shrinkler_d2)
adc hl, hl
ld (shrinkler_d2), hl
jr shrinkler_getbit1
-2 bytes onecall
shrinkler_readbit:
ld (shrinkler_d3+1), hl
shrinkler_d4l:
ld hl, 0
shrinkler_d4h:
ld de, #8000
shrinkler_rb0:
adc hl, hl
ld (shrinkler_d4l+1), hl
ex de, hl
jr nz, shrinkler_rb2
adc hl, hl
jr nz, shrinkler_rb3
shrinkler_rb1:
ccf
ld d, (ix)
ld e, (ix+1)
ex de, hl
inc ix
inc ix
jr nc, shrinkler_rb1
jr shrinkler_rb0
shrinkler_rb2:
adc hl, hl
shrinkler_rb3:
ld (shrinkler_d4h+1), hl
shrinkler_d2:
ld hl, 0
adc hl, hl
ld (shrinkler_d2+1), hl
jr shrinkler_getbit1
-1 byte
shrinkler_readbit:
ld (shrinkler_d3+1), hl
shrinkler_d4l:
ld hl, 0
shrinkler_d4h:
ld de, #8000
shrinkler_rb0:
adc hl, hl
ld (shrinkler_d4l+1), hl
ex de, hl
jr nz, shrinkler_rb2-1
adc hl, hl
jr nz, shrinkler_rb3
shrinkler_rb1:
ld h, (ix)
ld l, (ix+1)
inc ix
inc ix
ccf
jr c, shrinkler_rb0
ex de, hl
jr shrinkler_rb1-3
shrinkler_rb2:
ld l, d ; adc hl, hl with ed prefix
shrinkler_rb3:
ld (shrinkler_d4h+1), hl
I have released a loader for ZX Spectrum with the one call version:
https://github.com/antoniovillena/zx7b/blob/master/shr8k.asm
So from an input binary file assembled with org $8000, you get a compressed TAP file like this:
https://github.com/antoniovillena/zx7b/blob/master/demo.tap
This one can also be applied to the recall version, bringing it to 254 bytes. I have both versions here:
https://github.com/antoniovillena/zx7b/blob/master/shrinkler.asm
https://github.com/antoniovillena/zx7b/blob/master/shrinkler_onecall.asm
I also put the shrinkler_decrunch routine first (to avoid a call) and removed the shrinkler_d5 self-modifying code from shrinkler.asm. Now this routine can run from ROM.
Quote from: antoniovillena on 16:13, 02 March 18
-1 byte
Madram joined the game and sent me a 209 bytes version, entirely rewritten and simplified
First page updated with the source
Ahah, hooray for Madram! :)
That's very interesting... let me ask you one question...
In the recent decrunch routine there is the line:
probs=($+256)&#FF00
How would this be written in MAXAM? Or.. what does this mean?
I guess it means: "probs" is set to the next address aligned to 256.
$ is the current address. Let's say #1234
so "probs" is #1300.
I don't know Maxam well enough to "translate" this expression to it.
Thanks! That explains it.
In Maxam it would be close to this:
DS $/256*256+256-$
probs...
then next byte would be located at &XX00. :)
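The two forms discussed above can be compared with a little Python, assuming Maxam's / is an integer division. The function names are invented for this sketch; 0x1234 is the example address used above.

```python
def probs_address(here):
    """probs=($+256)&#FF00 : next 256-byte boundary after the current address."""
    return (here + 256) & 0xFF00


def maxam_padding(here):
    """DS $/256*256+256-$ : number of bytes to skip to reach that boundary."""
    return here // 256 * 256 + 256 - here


for here in (0x1234, 0x12FF, 0x1200):
    print(hex(here), hex(probs_address(here)), maxam_padding(here))
```

Note that both forms skip a whole page when the current address is already aligned (0x1200 also gives 0x1300).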
Sorry for reviving an older topic, but out of younger interest...
Hi there. I tested some files and it seems that Shrinkler is usually better than anything else. But I lack a manual.
Does anybody know where to get a Shrinkler Manual?
Where can I find the up to date de-crunch routine? (Or is it still the one here in the thread?)
EDIT: BTW: Shrinkler 4.6 was released 2020.2.22
Shrinkler is the BEST compressor, period. The downside is that it is terribly slow at depacking.
I can't help you about the manual though, but I'm sure someone else can.
Also, this graph may be of interest to you if you accept a lower compression ratio, but require more speed:
https://github.com/emmanuel-marty/lzsa
(scroll down a bit)
Quote from: Targhan on 22:30, 12 May 20
Shrinkler is the BEST compressor, period. The counterpart is that it is terribly slow on depacking.
I can't help you about the manual though, but I'm sure someone else can.
Also, this graph may be of interest to you if you accept a lower compression ratio, but require more speed:
https://github.com/emmanuel-marty/lzsa
(scroll down a bit)
That graph is interesting, but turns out the data I compress in my games doesn't behave like that.
I was using UCL for a long time, then moved to ZX7 in a few projects, and it turns out ApLib with the apultra compressor is the best for me, hitting the sweet spot between speed and compression.
So it all depends on your use case AND your data!
Quote from: GUNHED on 14:51, 12 May 20
EDIT: BTW: Shrinkler 4.6 was released 2020.2.22
i did not check newer version. Hope the decruncher is still the same ;D
the "official" decruncher is still on page 1 of this topic, maybe not the most optimised since i rewrote a little part of the initialisation to gain 1 byte :P
The differences between 4.5 and 4.6 seem to target files longer than 1 MB (so they write). Later today I will try to compare them to see whether the compressed output differs in size.
Quote from: Targhan on 22:30, 12 May 20
Shrinkler is the BEST compressor, period. The counterpart is that it is terribly slow on depacking.
Also, this graph may be of interest to you if you accept a lower compression ratio, but require more speed:
https://github.com/emmanuel-marty/lzsa
(scroll down a bit)
Thanks for the graph. That's very interesting.
Now, that raises the question if it would make sense to reprogram the decompressor optimized for decompression speed. :)
Quote from: GUNHED on 07:35, 13 May 20
Now, that raises the question if it would make sense to reprogram the decompressor optimized for decompression speed. :)
It won't change anything, since there is one 32-bit multiplication per decrunched BIT (not byte)...
If I ask Toto for an instant hardware multiplication, then decrunching may get about 5 times faster => still approx. 120 NOPs per decrunched bit (a 4K intro would decrunch in 4 seconds instead of 20)
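The rough numbers above can be reproduced with a back-of-the-envelope Python calculation. It assumes the figure of roughly 5000 NOPs per crunched byte quoted earlier in the thread, a 4096-byte crunched intro, and 1 microsecond per NOP on the CPC; the variable names are made up, and the 5x factor is the hypothetical speed-up mentioned above.

```python
NOPS_PER_CRUNCHED_BYTE = 5000    # rough figure quoted earlier in the thread
NOP_MICROSECONDS = 1             # one NOP lasts 1 microsecond on the CPC
CRUNCHED_SIZE = 4096             # a 4K intro
SPEEDUP = 5                      # hypothetical gain from hardware multiplication

seconds_now = CRUNCHED_SIZE * NOPS_PER_CRUNCHED_BYTE * NOP_MICROSECONDS / 1_000_000
seconds_fast = seconds_now / SPEEDUP
nops_per_bit_now = NOPS_PER_CRUNCHED_BYTE / 8
nops_per_bit_fast = nops_per_bit_now / SPEEDUP

print(seconds_now, seconds_fast, nops_per_bit_now, nops_per_bit_fast)
```

That lands at about 20 seconds today and about 4 seconds with the speed-up, with roughly 125 NOPs per bit, in line with the "approx. 120" estimate.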
To ask Tot0 is a good idea, he can put that math stuff inside the CPC Minibooster. :laugh:
Where can we find the most recent decrunch routine please?
Quote from: GUNHED on 15:22, 12 October 23
Where can we find the most recent decrunch routine please?
https://www.cpcwiki.eu/forum/programming/modified-shrinkler-without-parity-context/msg186981/#msg186981