sprite routine question

Started by ervin, 01:05, 10 December 14

ervin

Hi everyone.

One last gasp at optimising my sprite code in Chunky Pixel Curator.
8)

The sprite code can go down 2 paths.
One for "mode 1" sprites, and one for "mode 0" sprites.

Mode 1 sprites are about as optimised as they can possibly be, so I'm happy with that.

Mode 0 sprites however have a strange little quirk.
I *think* they are as fast as they can possibly be - I certainly can't figure out how to make them faster.
But there is still that nagging feeling...

LABEL1:

ld a,(hl)
cp c
jr z,LABEL2

ld (de),a
inc e
ld (de),a
dec e

LABEL2:

inc e
inc e
inc l


The idea is that a byte value is read, compared with the transparent value held in C, and then the code either plots 2 pixels (for mode 0 pixel width), or skips ahead 2 pixels if the byte value represents transparency.

Skipping the plotting of the pixels is fine, but when plotting the pixels, it DECs the screen pointer so that it can fall through to the same code that runs when the 2 pixels are skipped.
(That is tricky to explain - apologies if it's a confusing mess).

Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?

Thanks for any suggestions.

Ast

Quote from: ervin

Does anyone know of some amazing trick that will allow me to avoid having to DEC E before LABEL2, in order to put the screen pointer in the right spot?
Or is this about as optimised as it gets?

Thanks for any suggestions.
Store your data in another way.
_____________________

Ast/iMP4CT. "By the power of Grayskull, i've the power"

http://amstradplus.forumforever.com/index.php
http://impdos.wikidot.com/
http://impdraw.wikidot.com/

All friends are welcome !

Axelay

Not amazing, and real ugly, but I think, if I have the maths right, this would be 1 nop faster on a write:


ld a,(hl)
cp c
jr nz,LabelWrite
inc e
jr SkipWrite
.LabelWrite
ld (de),a
inc e
ld (de),a
.SkipWrite
inc e
inc l



So on write, you'd have the slower jump rather than just passing over it, but drop 1 each of inc e and dec e, so 1 nop faster (I think).  But the blanks are slower, so it's no good if you have lots of transparency on your sprites.
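
Counting it out seems to back this up. The listing below is only a sketch: it annotates both versions with the usual CPC timings in NOPs (1 microsecond each), taking JR as 3 NOPs when it jumps and 2 when a conditional JR falls through. The labels are the ones from the posts above.

```asm
; ervin's original            ; write  skip
    ld a,(hl)                 ;   2      2
    cp c                      ;   1      1
    jr z,LABEL2               ;   2      3
    ld (de),a                 ;   2
    inc e                     ;   1
    ld (de),a                 ;   2
    dec e                     ;   1
LABEL2:
    inc e                     ;   1      1
    inc e                     ;   1      1
    inc l                     ;   1      1
                              ;  14      9

; Axelay's variant            ; write  skip
    ld a,(hl)                 ;   2      2
    cp c                      ;   1      1
    jr nz,LabelWrite          ;   3      2
    inc e                     ;          1
    jr SkipWrite              ;          3
.LabelWrite
    ld (de),a                 ;   2
    inc e                     ;   1
    ld (de),a                 ;   2
.SkipWrite
    inc e                     ;   1      1
    inc l                     ;   1      1
                              ;  13     11
```

With a fraction p of opaque bytes, the two versions average 9 + 5p and 11 + 2p NOPs per byte, so on these figures the variant only comes out ahead when more than two thirds of the sprite is opaque.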


ervin

Thanks for your suggestions guys.

Axelay - I gave it a go, and I got very similar results in terms of speed.
I guess the transparent vs non-transparent pixels balance it out.


Ast

Ervin: why do you need sprites? Are you making a game for our beloved CPC?

ervin

Quote from: Ast on 14:05, 10 December 14
Ervin: why do you need sprites? Are you making a game for our beloved CPC?
Chunky Pixel Curator - WIP
8)

Rhino

For transparent = 0 try:



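transparent:

inc e        ; skip 2 screen bytes
inc e
inc l        ; next sprite byte

start:

repeat N_PIXELS

or (hl)      ; a = 0 here, so a = sprite byte and z is set if that byte is 0
ret z        ; if z go transparent (address popped from stack_table)

ld (de),a    ; plot the byte twice for mode 0 pixel width
inc e
ld (de),a
inc e
inc l        ; next sprite byte

xor a        ; a = 0 again for the next or (hl)

rend

end: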



The stack requires something like this:



stack_table:

repeat N_PIXELS

dw transparent

rend

dw end



The last pixel in the sprite must be transparent, and sp should start at the right index in that table, based on the number of transparent pixels in the sprite.

8/11 nops
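
One possible way to enter the routine, as a sketch only. T (the sprite's transparent pixel count, final pixel included), draw_mode0 and restore_sp are names made up for illustration, and the end: shown here stands in for the empty end: label above. It also assumes interrupts were enabled on entry and must stay off while sp points into the table, since an interrupt would push a return address over its entries.

```asm
; hl = sprite data, de = screen address
T equ 5                       ; hypothetical: transparent pixels in this sprite

draw_mode0:
    ld (restore_sp+1),sp      ; patch the operand of "ld sp,0" below
    di                        ; keep interrupts from pushing onto stack_table
    ld sp,stack_table+2*(N_PIXELS-T+1)  ; T-1 "transparent" entries, then "end"
    xor a                     ; the unrolled loop expects a = 0 on entry
    jp start

end:                          ; reached through the final "dw end" entry
restore_sp:
    ld sp,0                   ; operand patched on entry
    ei
    ret
```

The start offset follows from the table layout: the first T-1 transparent pixels each pop one "dw transparent", and the last one has to pop "dw end", which sits at stack_table+2*N_PIXELS.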

Rhino

Another suggestion for compressed sprites:



    ; sp = sprite
    ; hl = screen
    ; bc = screen modulo (offset skipline adds to reach the next line)

draw_sprite:
    ret

    repeat N_MAX_PIXELS_WIDTH

    inc l

    rend

transparent:

    ret

    repeat N_MAX_PIXELS_WIDTH/2

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_even:

    ret

    repeat N_MAX_PIXELS_WIDTH/2-1

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_odd:
    pop    de
    dec    sp
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l

    ret

skipline:
    add    hl,bc
    ret

end:


The compressed sprite format would look like this:


sprite_desc:

    ; line 0

    dw    transparent - (n transparent pixels)
    dw    write_even - (n pixels to write/2) * 9 ; (9 = length of the routine that writes 2 pixels)
    db    pixels data   ; (even block of pixel data)
    ...
    dw    transparent - (n transparent pixels)
    dw    write_odd - (n pixels to write/2) * 9
    db    pixels data  ; (odd block of pixel data)
    ...
    dw    skipline

    ; line 1
    ...
    dw    end


This is about 1 nop per transparent pixel and 7.5 per written pixel.
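
To make the format concrete, here is a sketch of one encoded line. The pixel byte values are invented, and it reads "(n pixels to write/2)" as an integer division, so a block of 3 bytes uses one 9-byte routine plus write_odd.

```asm
sprite_desc:

    ; line 0: skip 4, write 4 bytes, skip 2, write 3 bytes
    dw    transparent - 4     ; runs 4 "inc l", then the ret at transparent
    dw    write_even - 2*9    ; 2 routines of 9 bytes = 4 data bytes, each stored twice
    db    #c0,#0c,#cc,#30     ; hypothetical pixel data (even block)
    dw    transparent - 2     ; runs 2 "inc l"
    dw    write_odd - 1*9     ; 1 routine (2 bytes), then write_odd stores the 3rd
    db    #c0,#0c,#cc         ; hypothetical pixel data (odd block)
    dw    skipline            ; add hl,bc

    dw    end
```

Each ret pops the next word as its jump target, and each pop de consumes two data bytes, so code addresses and pixel data can share the same stream. After write_odd the stream continues at an odd offset, which the Z80 does not mind.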

ervin

Thanks Rhino - some *excellent* information there!

ssr86

@Rhino:
I think you've got the rets in the wrong places. Why are the rets at the beginnings of the subroutines? The ret in draw_sprite will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer...  Thanks:)

andycadley

Quote from: ssr86 on 16:27, 11 December 14
@Rhino:
I think you've got the rets in the wrong places. Why are the rets at the beginnings of the subroutines? The ret in draw_sprite will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer...  Thanks:)
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.

ssr86

Quote from: andycadley on 17:51, 11 December 14
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.
:o
wow :)

Thank you for the explanation. I hadn't read the minus part correctly... Now I think I understand.

Rhino

Quote from: ssr86 on 16:27, 11 December 14
@Rhino:
I think you've got the rets in the wrong places. Why are the rets at the beginnings of the subroutines? The ret in draw_sprite will jump me to the transparent label, where another ret awaits, so I'll jump to write_even immediately without incrementing l... write_even also begins with a ret, so the program will most probably hang because the stack points to the pixel bytes...

But great ideas. I've implemented a similar compressed sprite routine (idea by ralferoo) in cpcepsprites but my pixel data isn't on the stack but in a separate buffer...  Thanks:)

Sorry, my code above is not well explained and does not correspond to a full implementation.

This could be an implementation:



    ; hl = sprite data
    ; de = screen
    ; bc = modulo (offset skipline adds to reach the next line)

draw_sprite:
    ld    (end+1),sp
    di
    ld    sp,hl
    ex    de,hl
    ret

    repeat 6

    inc l

    rend

transparent:
    ret

    repeat N_MAX_PIXELS_WIDTH/2

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_even:
    ret

    repeat N_MAX_PIXELS_WIDTH/2-1

    pop    de
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l
    ld    (hl),d
    inc l
    ld    (hl),d
    inc l

    rend

write_odd:
    pop    de
    dec    sp
    ld    (hl),e
    inc l
    ld    (hl),e
    inc l

    ret

skipline:
    add    hl,bc
    ret

long_skip:
    pop    de
    add    hl,de
    ret

end:
    ld    sp,0
    ei
    ret



By using long_skip for more than 6 transparent pixels, or to skip precisely across several lines, you can gain some additional nops on transparent pixels.
...
dw long_skip
dw (n pixels to skip)
...


ssr86

Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?   

arnoldemu

Quote from: ssr86 on 06:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?
Due to the Amstrad's awkward screen layout it can't be done with a single add.

You can use set/res to handle the 800/1800/2000/2800/3000/3800 offsets, but if you need to continue beyond that then you need to add a value on.

You can avoid it a little by making each char line 1 scanline tall, but the screen becomes shorter.
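
For the general case, the usual next-scanline step looks roughly like the sketch below. It assumes the standard screen (base #c000, 80 bytes per character row) and hl holding a screen address; next_line is just a name picked here.

```asm
; hl = screen address, returns hl = address one scanline lower
next_line:
    ld a,h
    add a,#08        ; +#800: next scanline inside the same character row
    ld h,a
    ret nc           ; no carry: still inside the #c000-#ffff screen
    ld a,l           ; carry: wrapped past #ffff, so move on to the
    add a,#50        ; next character row by adding #c050
    ld l,a
    ld a,h
    adc a,#c0
    ld h,a
    ret
```

This is the "add a value on" case: seven scanlines out of eight only touch h, and the eighth needs the extra 16-bit correction.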
My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

arnoldemu

Quote from: andycadley on 17:51, 11 December 14
The labels are at the end of the routines, the sprite data effectively creates jumps to say (transparent - 3) for three pixels worth of transparency which is why the instruction at the point of the label is just a return (because 0 transparent pixels would be transparent - 0 = transparent) - it's a pretty neat technique, if a little harder to read.
Yes, @Rhino isn't it more like this?



start:
bytes3:
;; 3 bytes
ldi
bytes2:
;; 2 bytes
ldi
bytes1:
;; 1 byte
ldi
ret

defw bytes3
defw bytes2
defw bytes1


Rhino

Quote from: ssr86 on 06:14, 12 December 14
Is it possible to skip line with a single add on the cpc?...
What value should bc be loaded with for the standard screen configuration?

If you use a 4x4 pixel size mode for your Chunky Pixel Curator (by the way, congratulations on it!), you can skip screen scanlines with only an add.
To do it, you only need to switch between two possible values to add (#2000-sprite width and #e050-sprite width for the standard screen), using the first value for even lines and the second for odd lines.

Suppose you can use two registers to store these values, then at the beginning of the draw sprite routine you can do:
- if the sprite must be drawn on an even scanline, bc = #2000-sprite width, de = #e050-sprite width
- if the sprite must be drawn on an odd scanline, bc = #e050-sprite width, de = #2000-sprite width
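
Where the two constants come from, as a worked sketch. It assumes the standard screen (80 bytes, #50, per character row, scanlines #800 apart) and that the two chunky lines of a character row sit on scanlines 0 and 4. SPRITE_WIDTH, STEP_EVEN and STEP_ODD are names invented here, with SPRITE_WIDTH standing for however many bytes the routine has advanced across the row.

```asm
SPRITE_WIDTH equ 8                  ; hypothetical: bytes advanced per sprite row
STEP_EVEN    equ #2000              ; scanline 0 -> scanline 4: 4 * #800
STEP_ODD     equ #e050              ; scanline 4 -> scanline 0 of the next row:
                                    ; #50 - #2000 = #e050 (modulo #10000)

    ; sprite starting on an even line
    ld bc,STEP_EVEN-SPRITE_WIDTH    ; added after even lines
    ld de,STEP_ODD-SPRITE_WIDTH     ; added after odd lines
```

For a sprite starting on an odd line the two loads simply swap their constants.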


And in the sprite description data:

sprite_desc:

    ; line 0

    ...
    dw    skip_evenline

    ; line 1
    ...
    dw    skip_oddline
    ...


and back to the code:

skip_evenline:
    add hl,bc
    ret

skip_oddline:
    add hl,de
    ret

Because in the implementation you can use bc for this, but not de, you can do something like:

skip_oddline:
    ld de,xxxx
    add hl,de
    ret

and at the beginning, when you have the right add values in bc and de:
    ld (skip_oddline + 1),de

Rhino

Quote from: arnoldemu on 11:08, 12 December 14
Yes, @Rhino isn't it more like this?




Yes, that's clearer.
But shouldn't it be the same as the "write" code inside the repeat macros?
