Author Topic: Dual playfield mode 0 sprites + packing  (Read 2392 times)

Offline ssr86

Dual playfield mode 0 sprites + packing
« on: 00:30, 09 December 14 »
Here's a method for 2-bits-per-pixel packing of dual playfield mode 0 sprites. It saves half of the memory needed for the sprites, but (as is always the case) at the cost of some speed...
I don't think it will be very useful for any of you, but here goes...

1. Configure the dual playfield palette so that bits 0,1 or bits 2,3 of the ink number are used for the background inks.
These are the "SSBB" and the standard "BBSS" configurations [see next post].
[actually the list of possible configurations was more useful when I started writing this, but only because I messed up the bit order in the pixel bytes and had to find some other "good" configurations]
I'll use the standard "BBSS" configuration, as used by Arnoldemu in his examples.
 
2. Store sprite data in a 2-bits-per-pixel-packed and pre-rotated (one rlca/rrca) form.
This means storing 2 pixel pairs in one byte, so the memory needed for the sprite data is halved.
We use inks 0, 4, 8, c for the background, so in sprite data bits 2 and 3 of every ink number are always zero.
So if we have two sprite pixel bytes XY and ST of the bit form x0-y0-x2-y2-x1-y1-x3-y3 and s0-t0-s2-t2-s1-t1-s3-t3, then after zeroing the redundant bits they become x0-y0-0-0-x1-y1-0-0 and s0-t0-0-0-s1-t1-0-0.
We shift ST right by two bits and combine it with XY into one byte: x0-y0-s0-t0-x1-y1-s1-t1.
Then we rotate every packed byte left by one bit (like rlca), so our packed byte pair becomes y0-s0-t0-x1-y1-s1-t1-x0.
(With this packing the pre-rotation has to go left: after a right rotation, rlca + and would still give XY, but rrca + and would give s1-t1-0-0-s0-t0-0-0, i.e. ST with its two ink bits swapped.)
To get the left byte (XY) use rrca + and %11001100.
To get the right byte (ST) use rlca + and %11001100.
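
The packing step can be sketched in Z80 code as below. This is only an illustration, not a finished converter: pack_pair is a made-up name, the register usage is arbitrary, and it assumes the "BBSS" configuration with the packed byte rotated left once, which is the variant for which rrca gives the left byte and rlca the right byte. The comments walk through one concrete byte pair.

```z80
;; hypothetical helper: pack two unpacked BBSS sprite bytes into one
;; in:  hl = ^unpacked sprite data (byte XY followed by byte ST)
;; out: a  = packed, pre-rotated byte
;;      hl = advanced by 2, c corrupted
;;
;; worked example:
;;   XY = %11001000 (left pixel ink 3, right pixel ink 1)
;;   ST = %00001000 (left pixel ink 2, right pixel ink 0)
pack_pair:
    ld a,(hl)        ;; a = XY
    and %11001100    ;; x0-y0-0-0-x1-y1-0-0      -> %11001000
    ld c,a           ;; keep for later
    inc hl
    ld a,(hl)        ;; a = ST
    and %11001100    ;; s0-t0-0-0-s1-t1-0-0      -> %00001000
    rrca
    rrca             ;; 0-0-s0-t0-0-0-s1-t1      -> %00000010
    or c             ;; x0-y0-s0-t0-x1-y1-s1-t1  -> %11001010
    rlca             ;; y0-s0-t0-x1-y1-s1-t1-x0  -> %10010101
    inc hl
    ret
```

Checking the result by hand: rrca on %10010101 gives %11001010, and "and %11001100" leaves %11001000 (XY); rlca on %10010101 gives %00101011, and "and %11001100" leaves %00001000 (ST).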

 
Storing sprites in such form saves us half the memory needed for the data and we can unpack the byte pairs during drawing.
Here's an example of a sprite routine doing this:

Code: [Select]
;; de=^sprite
;; hl=^screen
    ld b,%11001100    ;; preload b with bitmask
    ld iyh,sprite_height
draw_looph:
    ld iyl,sprite_width    ;; number of packed bytes per sprite line
draw_loopw:
    ld a,(de)    ;; get the byte pair in byte-packed form
    ld c,a        ;; save for later
    rrca        ;; rotate to get them in place
    and b        ;; get left pixel bit pairs

    or (hl)        ;; combine with background
    ld (hl),a    ;; save to screen memory
    inc hl        ;; go to next screen position
    ld a,c        ;; restore the packed byte
    rlca        ;; rotate to get them in place
    and b        ;; get right pixel bit pairs
    or (hl)        ;; combine with background
    ld (hl),a    ;; save to screen memory
    inc hl        ;; go to next screen position
    inc de        ;; next sprite data byte

    dec iyl
    jp nz,draw_loopw
    ;;
    ;; get next line address code
    ;;
    dec iyh
    jp nz,draw_looph
    ret

If we count the execution time of only the inside of the loop we get 22 nops (19 if we can align the screen and sprite data) per byte pair.
For comparison the normal (using unpacked data) routine (inside of the loop presented below) would take 20 nops (16 if aligned) per byte pair:

Code: [Select]
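    ld a,(de)    ;; get sprite byte
    or (hl)      ;; combine with background
    ld (hl),a    ;; save to screen memory
    inc hl
    inc de
    ld a,(de)    ;; same again for the second byte of the pair
    or (hl)
    ld (hl),a
    inc hl
    inc de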

Storing the data in a pre-rotated state isn't necessary, but it makes the code somewhat more elegant: one rlca/rrca for each half-byte.
You could store the data unrotated and need no rotation for one half-byte but two rlca/rrca for the other.
However, when using this method for flipping or shifting, the pre-rotated storage is better because it evens out the drawing times of the two versions of the sprite.


We could use a similar approach for flipping/shifting dual-playfield mode 0 sprites.
Instead of storing two consecutive sprite bytes in one packed byte, store one byte of the normal form and the corresponding byte of the flipped/shifted form.
So the XY above would be a normal form byte and ST could be a byte of the flipped/shifted form.
Note that you wouldn't even have to change the direction of storing/loading data.

Code: [Select]
draw_normal:
;; de=^sprite
;; hl=^screen
    ld c,%11001100    ;; preload c with the bit mask
    ld iyh,sprite_height
dnorm_looph:
    ld b,sprite_width    ;; load b with loop count
dnorm_loopw:
    ld a,(de)
    rrca
    and c        ;; get left pixel pair bits - the normal sprite version bits
   
    or (hl)   
    ld (hl),a
    inc hl
    inc de

    djnz dnorm_loopw
    ;;
    ;; get next line address code
    ;;
    dec iyh
    jp nz,dnorm_looph
    ret

It's 12 nops per byte (10 if screen and data aligned) - I'm counting only the inside of the loop
For comparison the normal (using unpacked data) routine takes 10 nops (8 if aligned) per byte.

Code: [Select]
draw_flipped:    ;; or shifted
;; de=^sprite
;; hl=^screen
    ld c,%11001100    ;; preload c with the bit mask
    ld iyh,sprite_height
dflip_looph:
    ld b,sprite_width    ;; load b with loop count
dflip_loopw:
    ld a,(de)
    rlca
    and c        ;; get right pixel pair bits - the flipped (shifted) version bits
   
    or (hl)   
    ld (hl),a
    inc hl
    inc de

    djnz dflip_loopw
    ;;
    ;; get next line address code
    ;;
    dec iyh
    jp nz,dflip_looph
    ret

It takes 12 nops per byte (10 if screen and data aligned) - I'm counting only the inside of the loop.
For comparison the normal (using unpacked data) routine takes 10 nops (8 if aligned) per byte

Note that the only difference between the two routines is the direction of rotation (rlca/rrca)
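
Since only that one opcode differs, the two routines could in principle be merged into a single one whose rotate instruction is patched before drawing. The following is a rough sketch of that idea, not code from the attached example: select_version, draw_packed and rot_op are made-up names, and it relies on self-modifying code, so it has to run from RAM. The opcodes are %00001111 for rrca and %00000111 for rlca.

```z80
;; hypothetical: choose which packed version draw_packed will draw
;; in: a = 0 for the normal version, non-zero for the flipped/shifted one
select_version:
    or a
    ld a,%00001111    ;; opcode of rrca (normal version); ld does not change flags
    jr z,sel_store
    ld a,%00000111    ;; opcode of rlca (flipped/shifted version)
sel_store:
    ld (rot_op),a     ;; patch the rotate instruction inside the loop
    ret

draw_packed:
;; de=^sprite
;; hl=^screen
    ld c,%11001100    ;; preload c with the bit mask
    ld iyh,sprite_height
dpack_looph:
    ld b,sprite_width    ;; load b with loop count
dpack_loopw:
    ld a,(de)
rot_op:
    rrca        ;; overwritten by select_version with rrca or rlca
    and c       ;; keep the bits of the selected version
    or (hl)
    ld (hl),a
    inc hl
    inc de
    djnz dpack_loopw
    ;;
    ;; get next line address code
    ;;
    dec iyh
    jp nz,dpack_looph
    ret
```

The inner loop is the same as in the two separate routines, so the per-byte cost should not change; the only extra cost is the one-off call to select_version.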

Another idea would be to use mode 3 with this - use the hidden four bits to store the flipped or shifted version of the sprite...
However - only 4 colors total...
Maybe mode3 could be used in some other way... Maybe use the hidden bits for storing background data?
Haven't really thought about mode3 to tell you the truth so I don't know if even the flipping/shifting would be any good...
« Last Edit: 01:06, 22 April 15 by ssr86 »

Offline ssr86

Re: Dual playfield mode 0 sprites + packing
« Reply #1 on: 00:35, 09 December 14 »
List of possible "dual playfield mode 0" palette configurations:

legend:
"BBSS" means that bits 3 and 2 are background ink bits and bits 0,1 are sprite ink bits
LBn - left pixel background ink bit n
RBn - right pixel background ink bit n
LSn - left pixel sprite ink bit n
RSn - right pixel sprite ink bit n
n=0,1

1. BBSS 
 - screen data bytes of the form LS0 - RS0 - LB0 - RB0 - LS1 - RS1 - LB1 - RB1
 - inks 0, 4, 8, c for background inks
 - inks 1 = 5 = 9 = d, 2 = 6 = a = e and 3 = 7 = b = f for sprite (0 as transparent)
 - use and %00110011 for erasing sprite bytes

ink: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
     B0 S1 S2 S3 B1 S1 S2 S3 B2 S1 S2 S3 B3 S1 S2 S3


2. BSBS
 - screen data bytes of the form LS0 - RS0 - LS1 - RS1 - LB0 - RB0 - LB1 - RB1
 - inks 0, 2, 8, a for background
 - inks 1 = 3 = 9 = b, 4 = 6 = c = e and 5 = 7 = d = f for sprite
 - use and %00001111 for erasing sprite bytes

ink: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
     B0 S1 B1 S1 S2 S3 S2 S3 B2 S1 B3 S1 S2 S3 S2 S3


3. BSSB
 - screen data bytes of the form LB0 - RB0 - LS1 - RS1 - LS0 - RS0 - LB1 - RB1
 - inks 0, 1, 8, 9 for background
 - inks 2 = 3 = a = b, 4 = 5 = c = d and 6 = 7 = e = f for sprite
 - use and %11000011 for erasing sprite bytes

ink: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
     B0 B1 S1 S1 S2 S2 S3 S3 B2 B3 S1 S1 S2 S2 S3 S3


4. SBBS
 - screen data bytes of the form LS0 - RS0 - LB1 - RB1 - LB0 - RB0 - LS1 - RS1
 - inks 0, 2, 4, 6 for background
 - inks 1 = 3 = 5 = 7, 8 = a = c = e and 9 = b = d = f for sprites
 - use and %00111100 for erasing sprite bytes

ink: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
     B0 S1 B1 S1 B2 S1 B3 S1 S2 S3 S2 S3 S2 S3 S2 S3


5. SSBB
 - screen data bytes of the form LB0 - RB0 - LS0 - RS0 - LB1 - RB1 - LS1 - RS1
 - inks 0, 1, 2, 3 for background
 - inks 4 = 5 = 6 = 7, 8 = 9 = a = b and c = d = e = f for sprites
 - use and %11001100 for erasing sprite bytes

ink: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
     B0 B1 B2 B3 S1 S1 S1 S1 S2 S2 S2 S2 S3 S3 S3 S3


6. SBSB
 - screen data bytes of the form LB0 - RB0 - LB1 - RB1 - LS0 - RS0 - LS1 - RS1
 - inks 0, 1, 4, 5 for background
 - inks 2 = 3 = 6 = 7, 8 = 9 = c = d and a = b = e = f for sprites
 - use and %11110000 for erasing sprite bytes

ink: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
     B0 B1 S1 S1 B2 B3 S1 S1 S2 S2 S3 S3 S2 S2 S3 S3



The remaining possibilities:

The following four give you 2 colors for background (you could use xor for drawing/erasing) and 8 colors for sprites:

7. BSSS
    [this and SBBB seem to be the only "dual-playfield" modes possible on the enterprise computers... (because of the fixbias thing)]
8. SBSS
9. SSBS
10. SSSB

and

11. SBBB
12. BSBB
13. BBSB
14. BBBS

These give you 2 color sprites (you could use xor for drawing/erasing them) and 8 colors for background.
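
For configurations 1 to 6, the "and" masks listed above could be used in an erase loop along these lines. It is only a sketch under a few assumptions: erase_sprite and erase_mask are made-up names, sprite_width is taken to be the sprite width in screen bytes, and the code for stepping to the next screen line is left out, as in the drawing routines.

```z80
;; hypothetical erase routine: clears the sprite ink bits, keeps the background
;; hl=^screen
erase_mask equ %00110011    ;; BBSS; use the mask listed for your configuration
erase_sprite:
    ld c,erase_mask         ;; preload c with the erase mask
    ld iyh,sprite_height
erase_looph:
    ld b,sprite_width       ;; assumed: sprite width in screen bytes
erase_loopw:
    ld a,(hl)               ;; get screen byte
    and c                   ;; zero the sprite bits, background bits survive
    ld (hl),a               ;; save back to screen memory
    inc hl                  ;; go to next screen position
    djnz erase_loopw
    ;;
    ;; get next line address code
    ;;
    dec iyh
    jp nz,erase_looph
    ret
```

No sprite data is read here, so erasing only needs the screen address and the sprite dimensions.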
« Last Edit: 21:13, 09 December 14 by ssr86 »

Offline ssr86

Re: Dual playfield mode 0 sprites + packing
« Reply #2 on: 23:56, 21 April 15 »
Quote
We could use a similar approach for flipping/shifting dual-playfield mode 0 sprites.
Instead of storing two consecutive sprite bytes in one packed byte, store one byte of the normal form and the corresponding byte of the flipped/shifted form.
So the XY above would be a normal form byte and ST could be a byte of the flipped/shifted form.
Note that you wouldn't even have to change the direction of storing/loading data.
I've made a working example for this...

uses 22x24 pixels sprite in dpm0

timings:

draw normal (without the packing etc.): ~3280 nops (for comparison)
draw when packed: 3822 nops (both versions - the difference is only whether the routine uses rlca or rrca instructions inside the loop)
erase sprite: 2551 nops

I think it's a good tradeoff: you halve the memory normally needed to store the two sprites (when preflipping) and it's faster than using the other flipping/shifting methods (?).
All for an additional 2 nops per pixel byte...

Because I don't have a converter for creating the packed and bit-prerotated data, I do it at startup...

I enclose the horizontal flip version, but everything will be the same for the others; only the logic that chooses which version to use, and when, would be different...
You could also use it for a two-frame animation, or really for whatever two dpm0 sprites you want to pack together...

In the end decided to also attach the shifting version...

Would really appreciate some feedback...
« Last Edit: 03:08, 22 April 15 by ssr86 »