sprite routine question

ervin · 00:05, 10 December 14

Hi everyone.

One last gasp at optimising my sprite code in Chunky Pixel Curator.

The sprite code can go down 2 paths.
One for "mode 1" sprites, and one for "mode 0" sprites.

Mode 1 sprites are about as optimised as they can possibly be, so I'm happy with that.

Mode 0 sprites however have a strange little quirk.
I *think* they are as fast as they can possibly be - I certainly can't figure out how to make them faster.
But there is still that nagging feeling...

LABEL1:

ld a,(hl)
cp c
jr z,LABEL2

ld (de),a
inc e
ld (de),a
dec e

LABEL2:

inc e
inc e
inc l

The idea is that a byte value is read, compared with the transparent value held in C, and then the code either plots 2 pixels (for mode 0 pixel width), or skips ahead 2 pixels if the byte value represents transparency.

Skipping the plotting of the pixels is fine, but when plotting the pixels, it has to DEC the screen pointer so that it can fall through to the same code that runs when the 2 pixels are skipped.
(That is tricky to explain - apologies if it's a confusing mess).

Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?

Thanks for any suggestions.

Ast · 10:29, 10 December 14

Quote from: ervin

Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?

Thanks for any suggestions.

Store your data in a different way.

Axelay · 10:30, 10 December 14

Not amazing, and real ugly, but I think, if I have the maths right, this would be 1 nop faster on a write:

ld a,(hl)
cp c
jr nz,LabelWrite
inc e
jr SkipWrite
.LabelWrite
ld (de),a
inc e
ld (de),a
.SkipWrite
inc e
inc l

So on write, you'd have the slower jump rather than just passing over it, but drop 1 each of inc e and dec e, so 1 nop faster (I think). But the blanks are slower, so it's no good if you have lots of transparency on your sprites.
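
A rough way to check that arithmetic is to add up the per-instruction costs for both versions. The sketch below assumes the usual CPC timings in NOPs (1 NOP = 1 microsecond): `ld a,(hl)` and `ld (de),a` cost 2, `cp c`, `inc` and `dec` cost 1, and `jr` costs 3 when taken or 2 when a conditional `jr` falls through. The names `NOPS`, `cost`, `average` and the path lists are made up for illustration.

```python
# Assumed CPC instruction costs in NOPs (1 NOP = 1 microsecond).
NOPS = {
    "ld a,(hl)": 2, "ld (de),a": 2, "cp c": 1,
    "inc e": 1, "dec e": 1, "inc l": 1,
    "jr taken": 3, "jr cc not taken": 2,
}

def cost(path):
    """Sum the NOP cost of a list of executed instructions."""
    return sum(NOPS[i] for i in path)

# Original routine (LABEL1/LABEL2).
orig_write = ["ld a,(hl)", "cp c", "jr cc not taken",
              "ld (de),a", "inc e", "ld (de),a", "dec e",
              "inc e", "inc e", "inc l"]
orig_skip = ["ld a,(hl)", "cp c", "jr taken", "inc e", "inc e", "inc l"]

# Variant with the extra jump (LabelWrite/SkipWrite).
alt_write = ["ld a,(hl)", "cp c", "jr taken",
             "ld (de),a", "inc e", "ld (de),a", "inc e", "inc l"]
alt_skip = ["ld a,(hl)", "cp c", "jr cc not taken",
            "inc e", "jr taken", "inc e", "inc l"]

def average(write, skip, transparent_fraction):
    """Average NOPs per source byte for a given share of transparent bytes."""
    return write * (1 - transparent_fraction) + skip * transparent_fraction

print("original:", cost(orig_write), "write /", cost(orig_skip), "skip")
print("variant: ", cost(alt_write), "write /", cost(alt_skip), "skip")
```

Under these assumptions the variant is 1 NOP faster per written byte and 2 NOPs slower per transparent byte, so the two versions break even when about a third of the bytes are transparent.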

ervin · 12:20, 10 December 14

Thanks for your suggestions guys.

Axelay - I gave it a go, and I got very similar results in terms of speed.
I guess the transparent vs non-transparent pixels balance it out.

Ast · 13:05, 10 December 14

Ervin: why do you need sprites? Are you making a game for our beloved CPC?

ervin · 13:24, 10 December 14

Quote from: Ast on 13:05, 10 December 14
Ervin: why do you need sprites? Are you making a game for our beloved CPC?

Chunky Pixel Curator - WIP

Rhino · 01:25, 11 December 14

For transparent = 0 try:


transparent:

inc e
inc e
inc l

start:

repeat N_PIXELS

or (hl)
ret z    ; if z go transparent

ld (de),a
inc e
ld (de),a
inc e
inc l

xor a

rend

end:

The stack requires something like this:


stack_table:

repeat N_PIXELS

dw transparent

rend

dw end

The last pixel in the sprite must be transparent, and SP should start at the right index in that table, based on the number of transparent pixels in the sprite.

8/11 nops

Rhino · 10:05, 11 December 14

Another suggestion for compressed sprites:


    ; sp = sprite
    ; hl = screen
    ; bc = screen modulo (value added to reach the next line)

draw_sprite:
    ret

    repeat N_MAX_PIXELS_WIDTH

    inc l

    rend

transparent:

    ret

    repeat N_MAX_PIXELS_WIDTH/2

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_even:

    ret

    repeat N_MAX_PIXELS_WIDTH/2-1

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_odd:
    pop    de
    dec    sp
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l

    ret

skipline:
    add    hl,bc
    ret

end:

The sprite compressed format would look like this:


sprite_desc:

    ; line 0

    dw    transparent - (n transparent pixels)
    dw    write_even - (n pixels to write/2) * 9 ; (9 = write 2 pixels routine len)
    db    pixels data   ; (even block of pixel data)
    ...
    dw    transparent - (n transparent pixels)
    dw    write_odd - (n pixels to write/2) * 9
    db    pixels data  ; (odd block of pixel data)
    ...
    dw    skipline

    ; line 1
    ...
    dw    end

This is about 1 nop per transparent and 7.5 per write.
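
The 7.5 figure can be reproduced by counting one unrolled block. This sketch assumes the usual CPC costs (`pop de` 3 NOPs, `ld (hl),r` 2, `inc l` 1) and one-byte encodings for all three instructions; the names `BLOCK`, `NOPS` and `SIZE` are illustrative only.

```python
# One unrolled "write 2 source bytes" block from the routine above.
BLOCK = ["pop de",
         "ld (hl),e", "inc l", "ld (hl),e", "inc l",
         "ld (hl),d", "inc l", "ld (hl),d", "inc l"]

# Assumed CPC timings in NOPs and Z80 instruction sizes in bytes.
NOPS = {"pop de": 3, "ld (hl),e": 2, "ld (hl),d": 2, "inc l": 1}
SIZE = {"pop de": 1, "ld (hl),e": 1, "ld (hl),d": 1, "inc l": 1}

block_nops = sum(NOPS[i] for i in BLOCK)    # cost of one block
block_bytes = sum(SIZE[i] for i in BLOCK)   # length of one block
nops_per_source_byte = block_nops / 2       # each block consumes 2 data bytes

print(block_nops, block_bytes, nops_per_source_byte)   # 15 9 7.5
```

The 9-byte block length is where the `* 9` in the sprite description comes from, and each transparent byte costs a single `inc l`. The `ret` that chains to the next entry is paid once per run rather than per byte, which is why both figures are approximate.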

ervin · 13:44, 11 December 14

Thanks Rhino - some *excellent* information there!

ssr86 · 15:27, 11 December 14

@Rhino:
I think you got the rets in the wrong places - why are the rets at the beginnings of the subroutines? The draw_sprite ret will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer... Thanks:)

andycadley · 16:51, 11 December 14

Quote from: ssr86 on 15:27, 11 December 14
@Rhino:
I think you got the rets in the wrong places - why are the rets at the beginnings of the subroutines? The draw_sprite ret will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer... Thanks:)

The labels are at the end of the routines. The sprite data effectively creates jumps to, say, (transparent - 3) for three pixels' worth of transparency, which is why the instruction at the label itself is just a return (0 transparent pixels would be transparent - 0 = transparent). It's a pretty neat technique, if a little harder to read.
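
To make the "label at the end" idea concrete, here is a small model of the transparent slide: a run of one-byte `inc l` instructions followed by a `ret`, entered part-way through. The address `TRANSPARENT` and the helper name `run_slide` are hypothetical; only the `transparent - n` arithmetic comes from the posts above.

```python
def run_slide(label_addr, n_skip, slide_len=6):
    """Model a slide of one-byte `inc l` instructions ending at label_addr.

    The slide occupies [label_addr - slide_len, label_addr) and the `ret`
    sits at label_addr itself. Entering at label_addr - n_skip executes
    exactly n_skip increments before reaching the `ret`.
    """
    if not 0 <= n_skip <= slide_len:
        raise ValueError("run is longer than the slide")
    pc = label_addr - n_skip      # entry address stored in the sprite data
    moved = 0                     # how far the screen pointer has advanced
    while pc < label_addr:        # each `inc l` is one byte long
        moved += 1
        pc += 1
    return pc, moved              # pc now points at the `ret`

TRANSPARENT = 0x8006              # hypothetical address of the label
for n in (0, 3, 6):
    pc, moved = run_slide(TRANSPARENT, n)
    print(f"enter at {TRANSPARENT - n:#06x}: {moved} bytes skipped, ret at {pc:#06x}")
```

The write slides can be modelled the same way, except that each step back from the label is a whole 9-byte block rather than a single instruction.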

ssr86 · 17:22, 11 December 14

Quote from: andycadley on 16:51, 11 December 14
The labels are at the end of the routines. The sprite data effectively creates jumps to, say, (transparent - 3) for three pixels' worth of transparency, which is why the instruction at the label itself is just a return (0 transparent pixels would be transparent - 0 = transparent). It's a pretty neat technique, if a little harder to read.

wow

Thank you for the explanation. I hadn't read the minus part correctly... Now I think I understand.

Rhino · 18:02, 11 December 14

Quote from: ssr86 on 15:27, 11 December 14
@Rhino:
I think you got the rets in the wrong places - why are the rets at the beginnings of the subroutines? The draw_sprite ret will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer... Thanks:)

Sorry, my code above is not well explained and does not correspond to a full implementation.

This could be an implementation:


    ; hl = sprite data
    ; de = screen
    ; bc = screen modulo (value added to reach the next line)

draw_sprite:
    ld    (end+1),sp
    di
    ld    sp,hl
    ex    de,hl
    ret

    repeat 6

    inc l

    rend

transparent:
    ret

    repeat N_MAX_PIXELS_WIDTH/2

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_even:
    ret

    repeat N_MAX_PIXELS_WIDTH/2-1

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_odd:
    pop    de
    dec    sp
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l

    ret

skipline:
    add    hl,bc
    ret

long_skip:
    pop    de
    add    hl,de
    ret

end:
    ld    sp,0
    ei
    ret

By using long_skip for runs of more than 6 transparent pixels, or to skip precisely across several lines, you can save some additional NOPs on transparent pixels.
...
dw long_skip
dw (n pixels to skip)
...

ssr86 · 05:14, 12 December 14

Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?

arnoldemu · 10:06, 12 December 14

Quote from: ssr86 on 05:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?

Due to the Amstrad's awkward screen layout it can't be done with a single add.

You can use set/res to handle the 800/1000/1800/2000/2800/3000/3800 offsets within a character row, but to continue into the next row you still need to add a value on.

You can avoid it a little by making each char line 1 scanline tall, but the screen becomes shorter.
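
For reference, the layout in question can be written as a formula. The sketch assumes the standard screen (base #C000, 80 bytes per line); `line_address` is a made-up helper name.

```python
SCREEN_BASE = 0xC000   # assumed: default screen base
BYTES_PER_ROW = 80     # assumed: standard 80-byte-wide screen

def line_address(y):
    """Address of the first byte of scanline y on the standard screen."""
    return SCREEN_BASE + (y % 8) * 0x800 + (y // 8) * BYTES_PER_ROW

# Step from each scanline to the next, as a 16-bit value to add.
steps = [(line_address(y + 1) - line_address(y)) & 0xFFFF for y in range(16)]
print([f"{s:04X}" for s in steps])
```

Seven steps of #0800 are followed by one of #C850, so no single constant covers every line.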

arnoldemu · 10:08, 12 December 14

Quote from: andycadley on 16:51, 11 December 14
The labels are at the end of the routines. The sprite data effectively creates jumps to, say, (transparent - 3) for three pixels' worth of transparency, which is why the instruction at the label itself is just a return (0 transparent pixels would be transparent - 0 = transparent). It's a pretty neat technique, if a little harder to read.

Yes, @Rhino isn't it more like this?


start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi 
bytes1:
;; 1 byte
ldi
ret

defw bytes3
defw bytes2
defw bytes1

Rhino · 17:56, 12 December 14

Quote from: ssr86 on 05:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?

If you use a 4x4 pixel size mode for your Chunky Pixel Curator (by the way, congratulations on it!), you can skip screen scanlines with just an add.
To do it, you only need to switch between two possible values to add (#2000-sprite width and #e050-sprite width for the standard screen), using the first value for even lines and the second for odd.

Suppose you can use two registers to store these values, then at the beginning of the draw sprite routine you can do:
- if the sprite must be drawn on an even scanline, bc = #2000-sprite width, de = #e050-sprite width
- if the sprite must be drawn on an odd scanline, bc = #e050-sprite width, de = #2000-sprite width

And in the sprite description data:

sprite_desc:

; line 0

...
dw skip_evenline

; line 1
...
dw skip_oddline
...

and back to the code:

skip_evenline:
add hl,bc
ret

skip_oddline:
add hl,de
ret

Because in the implementation above you can use bc for this, but not de (the write routines pop pixel data into it), you can do something like:

skip_oddline:
ld de,xxxx
add hl,de
ret

and at the beginning, when you have the right add values in bc and de:
ld (skip_oddline + 1),de
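
The two constants can be checked against the standard screen layout. The sketch below assumes base #C000, 80 bytes per line, and that each 4-scanline-tall chunky row is addressed by its first scanline (0, 4, 8, ...); the helper names are illustrative, and the sprite width still has to be subtracted as described above.

```python
def line_address(y):
    """Address of scanline y on the standard screen (assumed base #C000)."""
    return 0xC000 + (y % 8) * 0x800 + (y // 8) * 80

def chunky_row_address(row):
    """Address of the first scanline of a 4-scanline-tall chunky row."""
    return line_address(row * 4)

# 16-bit value to add to get from each chunky row to the next one.
steps = [(chunky_row_address(r + 1) - chunky_row_address(r)) & 0xFFFF
         for r in range(6)]
print([f"#{s:04X}" for s in steps])
```

The steps alternate between #2000 (even row to odd row) and #E050 (odd row to the next even row), which is why two values are enough.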

Rhino · 18:02, 12 December 14

Quote from: arnoldemu on 10:08, 12 December 14
Yes, @Rhino isn't it more like this?

start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi
bytes1:
;; 1 byte
ldi
ret

defw bytes3
defw bytes2
defw bytes1

Yes, that's clearer.
But shouldn't it be the same for the "write" code inside the repeat macros?
