quick question about overwriting bytes for chunky pixel sprites

Started by ervin, 12:03, 03 November 14


ervin

Hi everyone.

Some of you may remember the Chunky Pixel Collision program I've been writing for the last couple of years.
Well, development continues, and it is now at the point that the program itself is finished!
I just need to put the graphics and sound in (which is quite tedious compared to the programming stuff, I must say).

I just have one little question to ask before I willingly accept that I can optimise no more...

Let's say HL is the current sprite pointer, and DE is the screen pointer.

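A:
ld a,(hl)    ;; fetch the sprite byte
cp #99       ;; is it the transparent marker?
jr z,B       ;; yes: skip the write
ld (de),a    ;; no: write it to the screen

B:
inc e        ;; next screen byte
inc l        ;; next sprite byte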


Now, I'm using #99 as the value for a transparent byte.
It could be anything really, except for a valid CPC colour value.
I'm not using 0 because I want to have 16 colours available, instead of 15 colours plus a transparent.

So, if the current byte is #99, it is transparent, so I perform JR Z in order to skip writing to the screen.
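
For anyone who wants to prototype this away from the CPC, here is a minimal Python model of the loop described above. It is only a sketch: `draw_row` and its parameters are hypothetical names, and plain lists stand in for sprite and screen memory.

```python
TRANSPARENT = 0x99  # marker byte from the post; not a valid chunky colour byte


def draw_row(sprite, screen, start=0, transparent=TRANSPARENT):
    """Copy sprite bytes onto the screen list, skipping marker bytes.

    Hypothetical helper modelling LD A,(HL) / CP / JR Z / LD (DE),A.
    """
    for offset, value in enumerate(sprite):
        if value != transparent:            # CP #99 / JR Z,B
            screen[start + offset] = value  # LD (DE),A
    return screen
```

Note that #00 passes through as an ordinary colour byte, which is the property the marker value is meant to preserve.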

My question is whether there is a quicker way.
This is working well, and is fast, but you programmers will be familiar with that niggling feeling in the back of your skull... "there must be a faster way"...

Is there some way of doing this with a combination of AND, OR or XOR that means I can skip the CP and JR Z instructions?

Thanks for any ideas.

MaV

Hm, here's a quickshot which eliminates the comparison: Use #7F as the value for the transparent byte, then test for parity after ld a,(hl). That should save you 2 or 1 µsec depending on whether you take the jump or don't.


A:
ld a, (hl)
jp po, B     ;; is it #7F ?
ld (de), a   ;; nay
B:           ;; yeah
inc e
inc l


I didn't test it (thoroughly) but it should work.
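
The parity property itself can be checked off-target. The sketch below takes the sixteen chunky byte values listed later in this thread and confirms that each has an even number of set bits while #7F has an odd number. `has_even_parity` is a hypothetical helper name, and the check says nothing about which Z80 instructions actually update the parity flag.

```python
# The 16 chunky-pixel byte values used for the sprite data.
CHUNKY_BYTES = [0x00, 0xC0, 0x0C, 0xCC, 0x30, 0xF0, 0x3C, 0xFC,
                0x03, 0xC3, 0x0F, 0xCF, 0x33, 0xF3, 0x3F, 0xFF]


def has_even_parity(value):
    """True if the byte has an even number of 1 bits (the Z80 'PE' case)."""
    return bin(value & 0xFF).count("1") % 2 == 0


all_colours_even = all(has_even_parity(b) for b in CHUNKY_BYTES)
marker_is_odd = not has_even_parity(0x7F)
```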
Black Mesa Transit Announcement System:
"Work safe, work smart. Your future depends on it."

arnoldemu

Quote from: MaV on 13:26, 03 November 14
Hm, here's a quickshot which eliminates the comparison: Use #7F as the value for the transparent byte, then test for parity after ld a,(hl). That should save you 2 or 1 µsec depending on whether you take the jump or don't.



LD A,(HL) does not affect the flags. You need at least an OR A (or something similar) after it to set them.



My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

ervin

Ah, ok.
Thanks guys.

Looks like it's as fast as it gets then!


Axelay

Do you have a spare register you can preload &99 into to make the comparison against, so cp a,c rather than cp a,&99?

MaV

Quote from: arnoldemu on 14:11, 03 November 14
LD A,(HL) does not affect the flags. You need at least an OR A (or something similar) after it to set them.
Crap! :(

ervin

Quote from: Axelay on 14:32, 03 November 14
Do you have a spare register you can preload &99 into to make the comparison against, so cp a,c rather than cp a,&99?

Yep, I already do that.
I just put #99 into the example code to more clearly illustrate what is going on.

In the actual code, I've got regC holding #99, so I've got CP C as the comparison.
Which is kind of appropriate really!
8)

arnoldemu

Ok, so this idea wouldn't work then ;)

You could put #99 into another register.

Then you can do:


ld a,(hl)
cp b
jr z,label
...


From looking at your code, transparent takes:
2+2+3+1+1 = 9

opaque takes:
2+2+2+2+1+1 = 10

trying:


ld a,(hl)
or a
jr z,b
inc a
ld (de),a
b:
inc e
inc l


With this version, transparent takes:
2+1+3+1+1 = 8

opaque takes:
2+1+2+1+2+1+1 = 10

gives: transparent taking 8 and opaque taking 10 (ld (de),a is 2 µs), so the transparent case is 1 µs faster and the opaque case costs the same.
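The pixels would need to be encoded differently so that 00 becomes transparent: every colour byte is stored as its value minus 1.

FF is stored as FE, so that the INC brings it back to FF. 00 is stored as FF, which the INC wraps round to 00.

This works because you're limited to these values, and 01 (the only value that would be stored as 00) is not one of them: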

00
c0
0c
cc
30
f0
3c
fc
03
c3
0f
cf
33
f3
3f
ff

I can't use DEC instead (storing each value plus 1), because FF would then have to be stored as 00, which would be recognised as transparent.
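
The INC-versus-DEC argument can be checked with a short script. This is a sketch under one assumption not stated in the thread: the usual mode 0 bit layout, in which pen bit 0 maps to bits 7/6, bit 2 to bits 5/4, bit 1 to bits 3/2 and bit 3 to bits 1/0. `chunky_byte` is a hypothetical helper name.

```python
def chunky_byte(pen):
    """Mode 0 byte with both pixels set to the same pen (0-15).

    Assumed layout: pen bit 0 -> bits 7/6, bit 1 -> bits 3/2,
    bit 2 -> bits 5/4, bit 3 -> bits 1/0.
    """
    masks = {0: 0xC0, 1: 0x0C, 2: 0x30, 3: 0x03}
    value = 0
    for bit, mask in masks.items():
        if pen & (1 << bit):
            value |= mask
    return value


CHUNKY_BYTES = [chunky_byte(pen) for pen in range(16)]

# INC scheme: store value - 1, leaving 00 free to mean "transparent".
stored_for_inc = {(v - 1) & 0xFF for v in CHUNKY_BYTES}
# DEC scheme: store value + 1; FF would be stored as 00 and clash.
stored_for_dec = {(v + 1) & 0xFF for v in CHUNKY_BYTES}

inc_scheme_clashes = 0x00 in stored_for_inc
dec_scheme_clashes = 0x00 in stored_for_dec
```

With these sixteen values only the DEC scheme produces a stored 00, which matches the reasoning above.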

I hope I got my cycle counting correct for jr.
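
One way to double-check a tally like this is to sum per-instruction costs in a script. The costs below are the commonly quoted CPC timings in microseconds (one NOP = 1 µs); they are assumptions brought in for this sketch, not figures taken from this thread, and `total` is a hypothetical helper.

```python
# Assumed CPC instruction costs in microseconds (NOP units).
COST = {
    "ld a,(hl)": 2, "cp n": 2, "cp r": 1, "or a": 1, "inc a": 1,
    "jr taken": 3, "jr not taken": 2, "ld (de),a": 2,
    "inc e": 1, "inc l": 1,
}


def total(*instructions):
    """Sum the assumed cost of one path through the loop body."""
    return sum(COST[name] for name in instructions)


TAIL = ("inc e", "inc l")

# Compare against an immediate value (cp #99).
cp_n_transparent = total("ld a,(hl)", "cp n", "jr taken", *TAIL)
cp_n_opaque = total("ld a,(hl)", "cp n", "jr not taken", "ld (de),a", *TAIL)

# Compare against a preloaded register (cp c).
cp_r_transparent = total("ld a,(hl)", "cp r", "jr taken", *TAIL)
cp_r_opaque = total("ld a,(hl)", "cp r", "jr not taken", "ld (de),a", *TAIL)

# Zero test with data stored minus 1 (or a / inc a).
or_transparent = total("ld a,(hl)", "or a", "jr taken", *TAIL)
or_opaque = total("ld a,(hl)", "or a", "jr not taken", "inc a",
                  "ld (de),a", *TAIL)

print(cp_n_transparent, cp_n_opaque)
print(cp_r_transparent, cp_r_opaque)
print(or_transparent, or_opaque)
```

Changing an entry in `COST` re-derives every path, which makes it easy to test a different timing assumption.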






ervin

Thanks for that.

Unfortunately I really need #00 to represent a colour, as there will likely be rooms where the full 16 colours are used.
:(

arnoldemu

Quote from: ervin on 14:57, 03 November 14
Thanks for that.

Unfortunately I really need #00 to represent a colour, as there will likely be rooms where the full 16 colours are used.
:(
Yes, my code allows that: 00 means transparent, and FF is stored for actual pixel data 00.

It will find FF, it will recognise it's not zero, it will increment it to 00 and write it. ;)

All that is different is that your pixel data is stored minus 1, and 00 represents transparent.
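
As a rough illustration of the encoding described above, the sketch below converts source pixels to the stored form and models the draw loop. The names `encode`, `draw_row` and the use of `None` for a transparent source pixel are hypothetical choices for this example.

```python
TRANSPARENT = None  # hypothetical marker for "no pixel" in the source art


def encode(pixels):
    """Store each colour byte minus 1 (mod 256); transparent becomes 00."""
    stored = []
    for p in pixels:
        if p is TRANSPARENT:
            stored.append(0x00)
        elif p == 0x01:
            # 01 would also be stored as 00; it is not a chunky byte value.
            raise ValueError("01 cannot be represented in this scheme")
        else:
            stored.append((p - 1) & 0xFF)
    return stored


def draw_row(stored, screen, start=0):
    """Model of LD A,(HL) / OR A / JR Z / INC A / LD (DE),A."""
    for offset, value in enumerate(stored):
        if value != 0:                                   # OR A / JR Z
            screen[start + offset] = (value + 1) & 0xFF  # INC A / LD (DE),A
    return screen
```

For example, `encode([None, 0x00, 0xFF, 0xC0])` yields the stored bytes 00, FF, FE and BF, and drawing them restores 00, FF and C0 while leaving the first screen byte untouched.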





ervin

Ah, interesting!
(Sorry I was a bit slow in following the technique being described in your previous post).

That's a really clever idea!

opqa

I don't know anything about the game, so maybe what I'm about to say is too obvious and you have already discarded it. But there is always the option of improving speed by sacrificing memory. In your example you could avoid all the jumps and comparisons by writing a dedicated printing routine for each group of sprites that share the same transparent/opaque pattern. For instance, for a sprite 10 bytes wide, if the first line has three transparent bytes, then four opaque, then another three transparent, you would store only the opaque bytes for this sprite in memory and do:


;; starting from top left corner
inc e
inc e
inc e        ;; skip the three transparent bytes

ld b,4       ;; four opaque bytes to copy
.firstline_loop
ld a,(hl)
ld (de),a
inc e
inc l
djnz firstline_loop

;jump to next line and go on...


There are several possibilities: you can unroll all the loops for even more speed, and you can also try to share code between all the lines that have the same transparent/opaque pattern.
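
A PC-side tool could derive such routines from the sprite data. The following is a minimal sketch of that idea, assuming the source art marks transparent bytes with #99 as earlier in the thread. `row_runs` and `emit_row` are hypothetical names, and the emitted text is fully unrolled and illustrative rather than tuned.

```python
from itertools import groupby


def row_runs(row, transparent=0x99):
    """Split a row into (is_opaque, length) runs."""
    return [(opaque, len(list(group)))
            for opaque, group in groupby(row, key=lambda b: b != transparent)]


def emit_row(row, transparent=0x99):
    """Emit unrolled Z80-style source lines for one sprite row.

    Only opaque bytes are assumed to be stored, so HL advances
    (inc l) only when a byte is actually copied.
    """
    lines = []
    for opaque, length in row_runs(row, transparent):
        for _ in range(length):
            if opaque:
                lines += ["ld a,(hl)", "ld (de),a", "inc l"]
            lines.append("inc e")
    return lines
```

For the 10-byte example row (three transparent, four opaque, three transparent) this produces three leading `inc e` lines, four copy groups and three trailing `inc e` lines.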

ssr86

Try "compressed sprites" with the CPCEPsprites app... :-[ ...
What opqa proposes are "line-compressed" sprites. There are also a few "fetch next code" methods to choose from, maybe "list of calls" or (if you don't use interrupts) "stack+address list". The generated code won't look too pretty, but it might still be of some use...

ervin

Thanks for the suggestions guys, but it's not really suitable for my program.
I perform real-time sprite scaling (with insane amounts of self-modifying code) so the sprite routines need to be flexible.

Executioner


