CPCWiki forum

General Category => Programming => Topic started by: ervin on 00:05, 10 December 14

Title: sprite routine question
Post by: ervin on 00:05, 10 December 14
Hi everyone.

One last gasp at optimising my sprite code in Chunky Pixel Curator.
8)

The sprite code can go down 2 paths.
One for "mode 1" sprites, and one for "mode 0" sprites.

Mode 1 sprites are about as optimised as they can possibly be, so I'm happy with that.

Mode 0 sprites however have a strange little quirk.
I *think* they are as fast as they can possibly be - I certainly can't figure out how to make them faster.
But there is still that nagging feeling...

LABEL1:

ld a,(hl)
cp c
jr z,LABEL2

ld (de),a
inc e
ld (de),a
dec e

LABEL2:

inc e
inc e
inc l


The idea is that a byte value is read, compared with the transparent value held in C, and then the code either plots 2 pixels (for mode 0 pixel width), or skips ahead 2 pixels if the byte value represents transparency.

Skipping the plotting of the pixels is fine, but when plotting the pixels, it DECs the screen pointer so that it can fall through to the same code that runs when the 2 pixels are skipped.
(That is tricky to explain - apologies if it's a confusing mess).

Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?

Thanks for any suggestions.
Title: Re: sprite routine question
Post by: Ast on 10:29, 10 December 14
Quote from: ervin

Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?

Thanks for any suggestions.
Store your data in another way.
Title: Re: sprite routine question
Post by: Axelay on 10:30, 10 December 14
Not amazing, and real ugly, but I think, if I have the maths right, this would be 1 nop faster on a write:


ld a,(hl)
cp c
jr nz,LabelWrite
inc e
jr SkipWrite
.LabelWrite
ld (de),a
inc e
ld (de),a
.SkipWrite
inc e
inc l



So on write, you'd have the slower jump rather than just passing over it, but drop 1 each of inc e and dec e, so 1 nop faster (I think).  But the blanks are slower, so it's no good if you have lots of transparency on your sprites.
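
For anyone who wants to check the maths, here is a rough tally in Python (not Z80). It assumes the usual CPC timings in NOPs (1 NOP = 1 microsecond): ld a,(hl) = 2, cp c = 1, jr taken = 3, jr cc not taken = 2, ld (de),a = 2, inc/dec of an 8-bit register = 1. All variable names are made up for this sketch.

```python
from fractions import Fraction

# Assumed CPC costs in NOPs for the instructions used by both loop bodies.
LD_A_HL, CP_C, JR_TAKEN, JR_NOT_TAKEN, LD_DE_A, INC_DEC = 2, 1, 3, 2, 2, 1

# Original loop: "jr z" skips the write; the write path needs "dec e"
# before falling into the shared inc e / inc e / inc l tail.
orig_write = (LD_A_HL + CP_C + JR_NOT_TAKEN
              + LD_DE_A + INC_DEC + LD_DE_A + INC_DEC  # write, inc e, write, dec e
              + 3 * INC_DEC)                           # inc e, inc e, inc l
orig_skip = LD_A_HL + CP_C + JR_TAKEN + 3 * INC_DEC

# Alternative loop: "jr nz" jumps to the write; the skip path pays an extra "jr".
alt_write = (LD_A_HL + CP_C + JR_TAKEN
             + LD_DE_A + INC_DEC + LD_DE_A             # write, inc e, write
             + 2 * INC_DEC)                            # inc e, inc l
alt_skip = (LD_A_HL + CP_C + JR_NOT_TAKEN
            + INC_DEC + JR_TAKEN                       # inc e, jr SkipWrite
            + 2 * INC_DEC)                             # inc e, inc l


def average_nops(write, skip, opaque):
    """Average cost per sprite byte when a fraction `opaque` is plotted."""
    return write * opaque + skip * (1 - opaque)


# Fraction of opaque bytes at which both versions cost the same on average.
break_even = Fraction(alt_skip - orig_skip,
                      (alt_skip - orig_skip) + (orig_write - alt_write))

print(orig_write, orig_skip, alt_write, alt_skip, break_even)
```

Under those assumed timings the alternative only wins when more than the break-even fraction of the sprite bytes are opaque.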

Title: Re: sprite routine question
Post by: ervin on 12:20, 10 December 14
Thanks for your suggestions guys.

Axelay - I gave it a go, and I got very similar results in terms of speed.
I guess the transparent vs non-transparent pixels balance it out.

Title: Re: sprite routine question
Post by: Ast on 13:05, 10 December 14
Ervin: why do you need sprites? Are you making a game for our beloved CPC?
Title: Re: sprite routine question
Post by: ervin on 13:24, 10 December 14
Quote from: Ast on 13:05, 10 December 14
Ervin: why do you need sprites? Are you making a game for our beloved CPC?
Chunky Pixel Curator - WIP (http://www.cpcwiki.eu/forum/programming/chunky-pixel-curator-teaser/)
8)
Title: Re: sprite routine question
Post by: Rhino on 01:25, 11 December 14
For transparent = 0 try:



transparent:

inc e
inc e
inc l

start:

repeat N_PIXELS

or (hl)
ret z    ; if z go transparent

ld (de),a
inc e
ld (de),a
inc e
inc l

xor a

rend

end:



The stack requires something like this:



stack_table:

repeat N_PIXELS

dw transparent

rend

dw end



Last pixel in the sprite must be transparent and sp should start with the right index on that table based on the number of transparent pixels in the sprite.

8/11 nops
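
To make the "right index" concrete, here is one reading of it as a small Python sketch. It assumes the table layout shown above (N_PIXELS words pointing at transparent, followed by one word pointing at end) and that every transparent pixel pops exactly one word. The function name and the example address are hypothetical.

```python
def start_sp(stack_table, n_pixels, n_transparent):
    """Initial SP so that the last 'ret z' pops the 'end' entry.

    stack_table   -- address of the table (hypothetical value in the example)
    n_pixels      -- N_PIXELS, the number of 'transparent' words in the table
    n_transparent -- transparent pixels in the sprite, including the last one
    """
    if not 1 <= n_transparent <= n_pixels + 1:
        raise ValueError("table too short for this many transparent pixels")
    # The final pop must land on index n_pixels (the 'end' word), so the
    # first pop starts n_transparent - 1 words before it.
    first_index = n_pixels - (n_transparent - 1)
    return stack_table + 2 * first_index  # each table entry is a 2-byte word


print(hex(start_sp(0x8000, 16, 5)))
```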
Title: Re: sprite routine question
Post by: Rhino on 10:05, 11 December 14
Another suggestion for compressed sprites:



    ; sp = sprite
    ; hl = screen
    ; bc = screen module

draw_sprite:
    ret

    repeat N_MAX_PIXELS_WIDTH

    inc l

    rend

transparent:

    ret

    repeat N_MAX_PIXELS_WIDTH/2

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_even:

    ret

    repeat N_MAX_PIXELS_WIDTH/2-1

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_odd:
    pop    de
    dec    sp
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l

    ret

skipline:
    add    hl,bc
    ret

end:


The sprite compressed format would look like this:


sprite_desc:

    ; line 0

    dw    transparent - (n transparent pixels)
    dw    write_even - (n pixels to write/2) * 9 ; (9 = write 2 pixels routine len)
    db    pixels data   ; (even block of pixel data)
    ...
    dw    transparent - (n transparent pixels)
    dw    write_odd - (n pixels to write/2) * 9
    db    pixels data  ; (odd block of pixel data)
    ...
    dw    skipline

    ; line 1
    ...
    dw    end


This is about 1 nop per transparent and 7.5 per write.
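
The 7.5 figure and the block length of 9 can be re-derived with a quick Python tally. It assumes the usual CPC timings (pop de = 3 NOPs, ld (hl),r = 2, inc l = 1) and that every instruction in the block is one byte long. The list name is made up for the sketch.

```python
# One unrolled "write 2 pixels" block: (mnemonic, size in bytes, cost in NOPs).
WRITE_BLOCK = (
    [("pop de", 1, 3)]
    + [("ld (hl),e", 1, 2), ("inc l", 1, 1)] * 2   # first pixel, 2 screen bytes
    + [("ld (hl),d", 1, 2), ("inc l", 1, 1)] * 2   # second pixel, 2 screen bytes
)

block_len = sum(size for _, size, _ in WRITE_BLOCK)    # bytes per block
block_nops = sum(nops for _, _, nops in WRITE_BLOCK)   # NOPs per block
nops_per_pixel = block_nops / 2                        # two pixels per block

print(block_len, block_nops, nops_per_pixel)
```

This leaves out the ret that ends each run (3 NOPs under the same assumption), which is shared by all the pixels in that run.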
Title: Re: sprite routine question
Post by: ervin on 13:44, 11 December 14
Thanks Rhino - some *excellent* information there!
Title: Re: sprite routine question
Post by: ssr86 on 15:27, 11 December 14
@Rhino (http://www.cpcwiki.eu/forum/index.php?action=profile;u=174):
I think you've got the ret's in the wrong places - why are the ret's at the beginnings of the subroutines? The draw_sprite ret will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack then points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer...  Thanks:)
Title: Re: sprite routine question
Post by: andycadley on 16:51, 11 December 14
Quote from: ssr86 on 15:27, 11 December 14
@Rhino:
I think you've got the ret's in the wrong places - why are the ret's at the beginnings of the subroutines? The draw_sprite ret will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack then points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer...  Thanks:)
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.
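
As a sketch of how the words in the sprite data would be computed, here is the same arithmetic in Python. The label addresses are hypothetical placeholders, the function names are made up, and the block length of 9 is taken from the comment in Rhino's sprite description.

```python
WRITE_BLOCK_LEN = 9  # bytes per unrolled "write 2 pixels" block, per Rhino's comment


def transparent_entry(transparent_label, n_skip):
    """Entry address that runs n_skip one-byte 'inc l' instructions, then the ret."""
    return transparent_label - n_skip


def write_entry(write_label, n_pixels):
    """Entry address that runs n_pixels // 2 write blocks before reaching write_label."""
    return write_label - (n_pixels // 2) * WRITE_BLOCK_LEN


# Hypothetical addresses, just to show the calculation.
TRANSPARENT = 0x4006
WRITE_EVEN = 0x4100

print(hex(transparent_entry(TRANSPARENT, 3)))  # 3 bytes before the ret
print(hex(write_entry(WRITE_EVEN, 4)))         # 2 blocks before the ret
```

An entry of zero in either case is just the label itself, which is why each label sits on a bare ret.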
Title: Re: sprite routine question
Post by: ssr86 on 17:22, 11 December 14
Quote from: andycadley on 16:51, 11 December 14
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.
:o
wow :)

Thank you for the explanation. I hadn't read the minus part correctly... Now I think I understand.
Title: Re: sprite routine question
Post by: Rhino on 18:02, 11 December 14
Quote from: ssr86 on 15:27, 11 December 14
@Rhino:
I think you've got the ret's in the wrong places - why are the ret's at the beginnings of the subroutines? The draw_sprite ret will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack then points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer...  Thanks:)

Sorry, my code above is not well explained and does not correspond to a full implementation.

This could be an implementation:



    ; hl = sprite data
    ; de = screen
    ; bc = module

draw_sprite:
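    ld    (end+1),sp    ; save SP into the operand of "ld sp,0" at end
    di                  ; an interrupt would push onto the sprite data
    ld    sp,hl         ; sp = sprite data
    ex    de,hl         ; hl = screen
    ret                 ; jump to the first routine address in the sprite data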

    repeat 6

    inc l

    rend

transparent:
    ret

    repeat N_MAX_PIXELS_WIDTH/2

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_even:
    ret

    repeat N_MAX_PIXELS_WIDTH/2-1

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_odd:
    pop    de
    dec    sp
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l

    ret

skipline:
    add    hl,bc
    ret

long_skip:
    pop    de
    add    hl,de
    ret

end:
    ld    sp,0
    ei
    ret



Using long_skip for >6 transparent pixels, or to do a precise skip across several lines, you can gain some additional nops on transparent pixels.
...
dw long_skip
dw (n pixels to skip)
...

Title: Re: sprite routine question
Post by: ssr86 on 05:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?   
Title: Re: sprite routine question
Post by: arnoldemu on 10:06, 12 December 14
Quote from: ssr86 on 05:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?
Due to the amstrad's awkward screen layout it can't be done with a single add.

You can use set/res to handle the 800/1000/1800/2000/2800/3000/3800 offsets, but if you need to continue past that then you need to add a value on.

You can avoid it a little by making each char line 1 scanline tall, but the screen becomes shorter.
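
The awkwardness shows up if you tabulate the step between consecutive scanlines. This Python sketch assumes the standard screen (base &C000, 80 bytes per character row, 8 scanlines per character row at &800 intervals). The function name is made up.

```python
SCREEN_BASE = 0xC000
BYTES_PER_ROW = 80     # one character row on the standard screen
SCANLINE_STEP = 0x800  # distance between scanlines inside a character row


def line_address(y, x_byte=0):
    """Address of byte x_byte on scanline y of the standard screen."""
    return SCREEN_BASE + (y // 8) * BYTES_PER_ROW + (y % 8) * SCANLINE_STEP + x_byte


# 16-bit step from each scanline to the next, for the first 8 scanlines.
deltas = [(line_address(y + 1) - line_address(y)) & 0xFFFF for y in range(8)]
print([hex(d) for d in deltas])
```

Seven of the eight steps are the same, but the step out of the last scanline of a character row is different, so one fixed value in bc cannot cover every line.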
Title: Re: sprite routine question
Post by: arnoldemu on 10:08, 12 December 14
Quote from: andycadley on 16:51, 11 December 14
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.
Yes, @Rhino (http://www.cpcwiki.eu/forum/index.php?action=profile;u=174) isn't it more like this?



start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi
bytes1:
;; 1 byte
ldi
ret

defw bytes3
defw bytes2
defw bytes1

Title: Re: sprite routine question
Post by: Rhino on 17:56, 12 December 14
Quote from: ssr86 on 05:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?

If you use a 4x4 pixel size mode for your Chunky Pixel Curator (by the way, congratulations on it!), you can skip screen scanlines with just an add.
To do it, you only need to switch between two possible values to add (#2000-sprite width and #e050-sprite width for the standard screen), using the first value for even lines and the second for odd.

Suppose you can use two registers to store these values, then at the beginning of the draw sprite routine you can do:
- if the sprite must be drawn on an even scanline, bc = #2000-sprite width, de = #e050-sprite width
- if the sprite must be drawn on an odd scanline, bc = #e050-sprite width, de = #2000-sprite width
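
Where the two constants come from can be sketched in Python. This assumes each chunky row is 4 scanlines tall, so the rows of a character row start at scanline 0 and scanline 4 (offset #2000), and that the screen pointer has already advanced by the sprite width in bytes when the skip runs. The function name is hypothetical.

```python
def skip_values(sprite_width_bytes, bytes_per_row=0x50):
    """Return (even_to_odd, odd_to_even) 16-bit values to add to the screen pointer."""
    # Even chunky row -> odd chunky row: 4 scanlines down in the same character row.
    even_to_odd = (0x2000 - sprite_width_bytes) & 0xFFFF
    # Odd chunky row -> even chunky row: back to scanline 0 of the next character row.
    odd_to_even = (bytes_per_row - 0x2000 - sprite_width_bytes) & 0xFFFF
    return even_to_odd, odd_to_even


print([hex(v) for v in skip_values(0)])
print([hex(v) for v in skip_values(8)])
```

With a width of 0 this gives the bare #2000 and #e050 constants. Any real sprite width is simply subtracted from both.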


And in the sprite description data:

sprite_desc:

    ; line 0

    ...
    dw    skip_evenline

    ; line 1
    ...
    dw    skip_oddline
    ...


and back to the code:

skip_evenline:
    add hl,bc
    ret

skip_oddline:
    add hl,de
    ret

Because the implementation can use bc for this, but not de, you can do something like:

skip_oddline:
    ld de,xxxx
    add hl,de
    ret

and at the beginning, when you have the right add values in bc and de:
    ld (skip_oddline + 1),de
Title: Re: sprite routine question
Post by: Rhino on 18:02, 12 December 14
Quote from: arnoldemu on 10:08, 12 December 14
Yes, @Rhino (http://www.cpcwiki.eu/forum/index.php?action=profile;u=174) isn't it more like this?



start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi
bytes1:
;; 1 byte
ldi
ret

defw bytes3
defw bytes2
defw bytes1



Yes, that's clearer.
But shouldn't it be the same for the "write" code inside the repeat macros?