Hi everyone.
One last gasp at optimising my sprite code in Chunky Pixel Curator.
8)
The sprite code can go down 2 paths.
One for "mode 1" sprites, and one for "mode 0" sprites.
Mode 1 sprites are about as optimised as they can possibly be, so I'm happy with that.
Mode 0 sprites however have a strange little quirk.
I *think* they are as fast as they can possibly be - I certainly can't figure out how to make them faster.
But there is still that nagging feeling...
LABEL1:
ld a,(hl)
cp c
jr z,LABEL2
ld (de),a
inc e
ld (de),a
dec e
LABEL2:
inc e
inc e
inc l
The idea is that a byte value is read, compared with the transparent value held in C, and then the code either plots 2 pixels (for mode 0 pixel width), or skips ahead 2 pixels if the byte value represents transparency.
Skipping the plotting of the pixels is fine, but when plotting the pixels, it DECs the screen pointer so that it can fall through to the same code that runs when the 2 pixels are skipped.
(That is tricky to explain - apologies if it's a confusing mess).
Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?
Thanks for any suggestions.
Quote from: ervin
Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?
Thanks for any suggestions.
Store your data in another way.
Not amazing, and real ugly, but I think, if I have the maths right, this would be 1 nop faster on a write:
ld a,(hl)
cp c
jr nz,LabelWrite
inc e
jr SkipWrite
.LabelWrite
ld (de),a
inc e
ld (de),a
.SkipWrite
inc e
inc l
So on write, you'd have the slower jump rather than just passing over it, but drop 1 each of inc e and dec e, so 1 nop faster (I think). But the blanks are slower, so it's no good if you have lots of transparency on your sprites.
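The maths above can be checked with a quick tally. This is only a sketch: the NOP costs are an assumption taken from the usual CPC timing tables (1 NOP = 1 microsecond), not something stated in the thread, and the names are made up for illustration.

```python
from fractions import Fraction

# Assumed CPC instruction costs in NOPs (standard timing tables, not from the thread).
NOPS = {
    "ld a,(hl)": 2, "cp c": 1, "ld (de),a": 2,
    "inc e": 1, "dec e": 1, "inc l": 1,
    "jr taken": 3,       # jr, or jr cc when the condition holds
    "jr not taken": 2,   # jr cc when the condition fails
}

def cost(*instructions):
    """Sum the NOP cost of one pass through a code path."""
    return sum(NOPS[i] for i in instructions)

# Original routine (LABEL1/LABEL2).
original_write = cost("ld a,(hl)", "cp c", "jr not taken", "ld (de),a",
                      "inc e", "ld (de),a", "dec e", "inc e", "inc e", "inc l")
original_skip = cost("ld a,(hl)", "cp c", "jr taken", "inc e", "inc e", "inc l")

# Variant with LabelWrite/SkipWrite.
variant_write = cost("ld a,(hl)", "cp c", "jr taken", "ld (de),a",
                     "inc e", "ld (de),a", "inc e", "inc l")
variant_skip = cost("ld a,(hl)", "cp c", "jr not taken", "inc e",
                    "jr taken", "inc e", "inc l")

saved_per_write = original_write - variant_write   # NOPs gained on a plotted byte
lost_per_skip = variant_skip - original_skip       # NOPs lost on a transparent byte

# Fraction of opaque bytes at which both routines cost the same.
break_even = Fraction(lost_per_skip, saved_per_write + lost_per_skip)

print(original_write, original_skip, variant_write, variant_skip, break_even)
```

Under those assumed timings the variant saves 1 NOP per plotted byte and loses 2 per transparent byte, so it would only pay off when more than two thirds of the sprite bytes are opaque.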
Thanks for your suggestions guys.
Axelay - I gave it a go, and I got very similar results in terms of speed.
I guess the transparent vs non-transparent pixels balance it out.
Ervin: why do you need sprites? Are you making a game for our beloved CPC?
Quote from: Ast on 13:05, 10 December 14
Ervin: why do you need sprites? Are you making a game for our beloved CPC?
Chunky Pixel Curator - WIP (http://www.cpcwiki.eu/forum/programming/chunky-pixel-curator-teaser/)
8)
For transparent = 0 try:
transparent:
inc e
inc e
inc l
start:
repeat N_PIXELS
or (hl)
ret z ; if z go transparent
ld (de),a
inc e
ld (de),a
inc e
inc l
xor a
rend
end:
The stack requires something like this:
stack_table:
repeat N_PIXELS
dw transparent
rend
dw end
The last pixel in the sprite must be transparent, and SP should start at the right index in that table, based on the number of transparent pixels in the sprite.
That's 8/11 NOPs (transparent/written pixel).
Another suggestion for compressed sprites:
; sp = sprite data (routine addresses and pixel bytes)
; hl = screen address
; bc = screen modulo (added to hl by skipline)
draw_sprite:
ret
repeat N_MAX_PIXELS_WIDTH
inc l
rend
transparent:
ret
repeat N_MAX_PIXELS_WIDTH/2
pop de
ld (hl),e
inc l
ld (hl),e
inc l
ld (hl),d
inc l
ld (hl),d
inc l
rend
write_even:
ret
repeat N_MAX_PIXELS_WIDTH/2-1
pop de
ld (hl),e
inc l
ld (hl),e
inc l
ld (hl),d
inc l
ld (hl),d
inc l
rend
write_odd:
pop de
dec sp
ld (hl),e
inc l
ld (hl),e
inc l
ret
skipline:
add hl,bc
ret
end:
The sprite compressed format would look like this:
sprite_desc:
; line 0
dw transparent - (n transparent pixels)
dw write_even - (n pixels to write/2) * 9 ; (9 = write 2 pixels routine len)
db pixels data ; (even block of pixel data)
...
dw transparent - (n transparent pixels)
dw write_odd - (n pixels to write/2) * 9
db pixels data ; (odd block of pixel data)
...
dw skipline
; line 1
...
dw end
This costs about 1 NOP per transparent pixel and 7.5 per written pixel.
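To make the "label minus offset" arithmetic in that format concrete, here is a small sketch. The label addresses are hypothetical (a real assembler would supply them), the helper names are made up, and the NOP costs are assumed from the usual CPC timing tables.

```python
# One unrolled "write 2 bytes" block: pop de, then 4 x (ld (hl),r / inc l).
WRITE_PAIR_BYTES = 1 + 4 * (1 + 1)   # code length in bytes: 9, as in the listing
WRITE_PAIR_NOPS = 3 + 4 * (2 + 1)    # assumed costs: pop de = 3, ld (hl),r = 2, inc l = 1

def skip_entry(transparent_label, n_transparent):
    """Address to ret into so that exactly n 'inc l' (1 byte each) run before the label's ret."""
    return transparent_label - n_transparent

def write_entry(write_label, n_bytes):
    """Address to ret into so that n_bytes // 2 write blocks run before the code at the label."""
    return write_label - (n_bytes // 2) * WRITE_PAIR_BYTES

# Hypothetical label addresses, for illustration only.
TRANSPARENT = 0x8010
WRITE_EVEN = 0x8050

print(hex(skip_entry(TRANSPARENT, 3)))   # entry for 3 transparent bytes
print(hex(write_entry(WRITE_EVEN, 4)))   # entry for 4 written bytes (2 blocks)
print(WRITE_PAIR_NOPS / 2)               # NOPs per written byte inside a block
```

With an offset of 0 both helpers return the label itself, which is why the instruction at each label is just a ret.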
Thanks Rhino - some *excellent* information there!
@Rhino (http://www.cpcwiki.eu/forum/index.php?action=profile;u=174):
I think you've got the rets in the wrong places. Why are the rets at the beginnings of the subroutines? The draw_sprite ret will jump to the transparent label, where another ret is waiting, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...
But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer... Thanks:)
Quote from: ssr86 on 15:27, 11 December 14
@Rhino:
I think you've got the rets in the wrong places. Why are the rets at the beginnings of the subroutines? The draw_sprite ret will jump to the transparent label, where another ret is waiting, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...
But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer... Thanks:)
The labels are at the end of the routines. The sprite data effectively creates jumps to, say, (transparent - 3) for three pixels' worth of transparency, which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent). It's a pretty neat technique, if a little harder to read.
Quote from: andycadley on 16:51, 11 December 14
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.
:o
wow :)
Thank you for the explanation. I hadn't read the minus part correctly... Now I think I understand.
Sorry, my code above is not well explained and does not correspond to a full implementation.
This could be an implementation:
; hl = sprite data
; de = screen
; bc = module
draw_sprite:
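ld (end+1),sp ; save the caller's SP into the operand of "ld sp,0" at end
di ; an interrupt would push onto SP and corrupt the sprite data
ld sp,hl ; sp = sprite data
ex de,hl ; hl = screen
ret ; jump to the first routine address in the sprite data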
repeat 6
inc l
rend
transparent:
ret
repeat N_MAX_PIXELS_WIDTH/2
pop de
ld (hl),e
inc l
ld (hl),e
inc l
ld (hl),d
inc l
ld (hl),d
inc l
rend
write_even:
ret
repeat N_MAX_PIXELS_WIDTH/2-1
pop de
ld (hl),e
inc l
ld (hl),e
inc l
ld (hl),d
inc l
ld (hl),d
inc l
rend
write_odd:
pop de
dec sp
ld (hl),e
inc l
ld (hl),e
inc l
ret
skipline:
add hl,bc
ret
long_skip:
pop de
add hl,de
ret
end:
ld sp,0
ei
ret
By using long_skip for more than 6 transparent pixels, or to skip precisely across several lines, you can save some additional NOPs on transparent pixels.
...
dw long_skip
dw (n pixels to skip)
...
Is it possible to skip a line with a single add on the CPC?
What value should bc be loaded with for the standard screen configuration?
Quote from: ssr86 on 05:14, 12 December 14
Is it possible to skip a line with a single add on the CPC?
What value should bc be loaded with for the standard screen configuration?
Due to the Amstrad's awkward screen layout, it can't be done with a single add.
You can use set/res to handle the 800/1000/1800/2000/2800/3000/3800 (hex) scanline offsets within a character row, but if you need to continue past that then you need to add a value on.
You can avoid it a little by making each char line 1 scanline tall, but the screen becomes shorter.
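To illustrate why one constant is not enough, here is a sketch of the address arithmetic. It assumes the standard screen setup (base #C000, 80 bytes per character row, 8 scanlines per row, #800 between scanlines), and the function names are made up for illustration.

```python
SCREEN_BASE = 0xC000     # assumed standard screen start
BYTES_PER_ROW = 80       # bytes per character row (#50)
SCANLINE_STEP = 0x800    # distance between scanlines of one character row

def screen_address(x_byte, y):
    """Address of byte column x_byte on pixel line y (0..199)."""
    return SCREEN_BASE + (y // 8) * BYTES_PER_ROW + (y % 8) * SCANLINE_STEP + x_byte

def line_steps(lines=200):
    """Set of 16-bit values you would have to add to move down one line."""
    return {(screen_address(0, y + 1) - screen_address(0, y)) & 0xFFFF
            for y in range(lines - 1)}

print(sorted(hex(step) for step in line_steps()))
```

Two different steps come out: #800 inside a character row, and #C850 (that is, #50 - #3800 modulo 64K) when moving from the last scanline of one row to the first scanline of the next.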
Yes, @Rhino (http://www.cpcwiki.eu/forum/index.php?action=profile;u=174) isn't it more like this?
start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi
bytes1:
;; 1 byte
ldi
ret
defw bytes3
defw bytes2
defw bytes1
If you use a 4x4 pixel size mode for your Chunky Pixel Curator (by the way, congratulations on it!), you can skip screen scanlines with just an add.
To do this, you only need to switch between two values to add (#2000 - sprite width and #e050 - sprite width for the standard screen), using the first for even lines and the second for odd lines.
Suppose you can use two registers to store these values; then at the beginning of the draw sprite routine you can do:
- if the sprite must be drawn on an even scanline, bc = #2000-sprite width, de = #e050-sprite width
- if the sprite must be drawn on an odd scanline, bc = #e050-sprite width, de = #2000-sprite width
And in the sprite description data:
sprite_desc:
; line 0
...
dw skip_evenline
; line 1
...
dw skip_oddline
...
and back to the code:
skip_evenline:
add hl,bc
ret
skip_oddline:
add hl,de
ret
Because in the implementation you can use bc for this, but not de, you can do something like:
skip_oddline:
ld de,xxxx
add hl,de
ret
and at the beginning, when you have the right add values in bc and de:
ld (skip_oddline + 1),de
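As a sanity check on the two add values, the sketch below walks HL down a few sprite lines. It assumes the standard screen at #C000, a sprite that starts on an even line, and lines that do not cross a 256-byte page (the real routine advances with inc l). The function names are made up for illustration.

```python
def skip_values(sprite_width_bytes):
    """The two 16-bit add values: after an even line, after an odd line."""
    even = (0x2000 - sprite_width_bytes) & 0xFFFF
    odd = (0xE050 - sprite_width_bytes) & 0xFFFF
    return even, odd

def line_starts(start, sprite_width_bytes, n_lines):
    """Screen address at the start of each sprite line, beginning on an even line."""
    even, odd = skip_values(sprite_width_bytes)
    hl = start
    starts = []
    for line in range(n_lines):
        starts.append(hl)
        hl = (hl + sprite_width_bytes) & 0xFFFF            # drawing moves hl across the line
        hl = (hl + (even if line % 2 == 0 else odd)) & 0xFFFF  # skip_evenline / skip_oddline
    return starts

print([hex(a) for a in line_starts(0xC000, 8, 4)])
```

The addresses alternate between +#2000 within a character row and the start of the next character row (+#50), because #2000 + #e050 wraps around 64K to #50.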
Quote from: arnoldemu on 10:08, 12 December 14
Yes, @Rhino (http://www.cpcwiki.eu/forum/index.php?action=profile;u=174) isn't it more like this?
start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi
bytes1:
;; 1 byte
ldi
ret
defw bytes3
defw bytes2
defw bytes1
Yes, that's clearer.
But isn't it the same as the "write" code inside the repeat macros?