#CPCTelera : Colorize sprite WIP

Started by Arnaud, 21:09, 24 September 17

Arnaud

Hello,

I'm working on a new feature for CPCTelera: replacing a color in a sprite.
The functions for Mode 0 and Mode 1 and the example are done.

All remarks are welcome  :)

[attachimg=1]

With @Docent's optimizations (it's much faster!):

[attachimg=2]

Sykobee (Briggsy)

A useful feature indeed. How does this implementation work?


IIRC Barbarian did something similar when unpacking its 3-colour (C64-originated) graphics into 15 colours.

Arnaud

Quote from: Sykobee (Briggsy) on 12:13, 25 September 17
A useful feature indeed. How does this implementation work?

I have two main functions:
void cpct_spriteColorizeM0(u8* sprite, u8* spriteColor, u8 width, u8 height, u8 oldColor, u8 newColor);

Replaces a color in a sprite, either in place or while copying from one sprite to another.
Useful for replacing colors before drawing the sprite.

void cpct_drawSpriteColorizeM0(u8* sprite, u8* destMem, u8 width, u8 height, u8 oldColor, u8 newColor);

Coding in progress. Replaces a color in a sprite and draws it at the same time.

Next I have to write the same functions for spriteMasked and spriteMaskedAligned, and all of them again for Mode 1.
Well, 12 functions to code  :o
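
As a rough illustration of what such a colour replacement has to do in Mode 0, here is a minimal C sketch. It is not the CPCTelera code (which is written in Z80 assembly); the helper names and `sketch_colorizeM0` are made up for this example. It assumes the standard CPC Mode 0 layout: two 16-colour pixels per byte, with interleaved bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPC Mode 0 stores two pixels per byte. From bit 7 down to bit 0:
   p0b0 p1b0 p0b2 p1b2 p0b1 p1b1 p0b3 p1b3 (pN = pixel N, bK = pen bit K).
   These hypothetical helpers work on the right-hand pixel (mask 0x55). */
static uint8_t m0_encode_right(uint8_t pen) {
    return (uint8_t)(((pen & 1) << 6) | ((pen & 2) << 1) |
                     ((pen & 4) << 2) | ((pen & 8) >> 3));
}

static uint8_t m0_decode_right(uint8_t bits) {
    return (uint8_t)(((bits >> 6) & 1) | (((bits >> 2) & 1) << 1) |
                     (((bits >> 4) & 1) << 2) | ((bits & 1) << 3));
}

/* Illustrative stand-in, NOT the CPCTelera routine: copy a sprite of
   width*height bytes from src to dst, replacing pen oldColor with newColor.
   src and dst may point to the same buffer. */
void sketch_colorizeM0(const uint8_t *src, uint8_t *dst,
                       uint8_t width, uint8_t height,
                       uint8_t oldColor, uint8_t newColor) {
    size_t n = (size_t)width * height;
    assert(oldColor < 16 && newColor < 16);
    for (size_t i = 0; i < n; i++) {
        /* Move the left pixel into the right-hand slot before decoding. */
        uint8_t left  = m0_decode_right((uint8_t)((src[i] & 0xAA) >> 1));
        uint8_t right = m0_decode_right((uint8_t)(src[i] & 0x55));
        if (left == oldColor)  left = newColor;
        if (right == oldColor) right = newColor;
        dst[i] = (uint8_t)((m0_encode_right(left) << 1) | m0_encode_right(right));
    }
}
```

Decoding every pixel to a pen number is slow but easy to follow; a real routine would compare the encoded bit patterns directly to avoid the decode step.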

Arnaud

- Added cpct_drawSpriteColorize to draw a sprite with one color replaced

- Updated the example:

  • Added some stars to use the new function
  • Fixed the double buffer glitch
Example animation and code updated in first post.

johnlobo


Very good idea, and very useful, indeed.


Looking forward to seeing the masked aligned sprite implementation finished, so I can start using it.

Arnaud

Quote from: johnlobo on 14:16, 27 September 17
Very good idea, and very useful, indeed.
Thanks  :D

Quote from: johnlobo on 14:16, 27 September 17
Looking forward to seeing the masked aligned sprite implementation finished, so I can start using it.

First post updated with:

  • Added cpct_drawSpriteMaskedAlignedColorizeM0
  • Updated the example to use it
Of course it's not perfectly optimized, but it works.

Docent

#6
Quote from: Arnaud on 21:09, 24 September 17
Hello,

I'm working on a new feature for CPCTelera: replacing a color in a sprite.
For the moment I have coded the asm functions for Mode 0 and the example.

All remarks are welcome  :)

[attachimg=1]

Hi,
Here's an optimized version for mode 0:

start:
; fetch the real function parameters here instead of the code below
ld c, #55
ld d, oldcolor
ld e, newcolor
exx
ld hl, src
ld de, dest
ld ixl, width
ld c, height
convertloop:
ld b, ixl ; 2
lineloop:
ld a, (hl) ; 2
exx ; 1
ld l, a ; 1
and c ; 1
cp d ; 1
jr nz, noteq1 ; 2
ld a, e ; 1
noteq1:
ld h, a ; 1
ld a, l ; 1
srl a ; 2
and c ; 1
cp d ; 1
jr nz, noteq2 ; 2
ld a, e ; 1
noteq2:
rlca    ; 1
or h ; 1
exx ; 1
ld  (de), a ; 2
inc hl ; 2
inc de ; 2
djnz lineloop ; 3
dec c ; 1
jr nz, convertloop ; 2


It is ~40% faster (30 vs 47 mcycles), but it uses the alternate register set.
I'd also turn convertPixel into a macro: you'd save 8 mcycles per call and get rid of the push/pop hl there (another 7 mcycles), unless it is also called from other places.
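
For readers who do not read Z80 fluently, the per-byte logic of the routine above can be modelled in C roughly as follows. This is only a sketch: `recolor_byte_m0` is a made-up name, and it assumes, as the asm appears to, that the old and new colours are already encoded in the right-hand pixel's bit positions (mask 0x55).

```c
#include <stdint.h>

/* Hypothetical C model of one pass of the inner loop.
   oldEnc/newEnc are colour patterns pre-encoded in the bits selected by 0x55. */
uint8_t recolor_byte_m0(uint8_t b, uint8_t oldEnc, uint8_t newEnc) {
    /* Right-hand pixel: mask it out and compare the encoded pattern directly. */
    uint8_t right = (uint8_t)(b & 0x55);
    if (right == oldEnc) right = newEnc;

    /* Left-hand pixel: shift it into the right-hand slot, then do the same. */
    uint8_t left = (uint8_t)((b >> 1) & 0x55);
    if (left == oldEnc) left = newEnc;

    /* Shift the left pixel back and merge both pixels. */
    return (uint8_t)((left << 1) | right);
}
```

Because the comparison is done on encoded patterns, no per-pixel decoding is needed; the colours only have to be encoded once, before the loop starts.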

Arnaud

First post updated with:

- Added the following functions:

  • cpct_drawSpriteMaskedColorizeM0
  • cpct_spriteMaskedColorizeM0
-> All functions for Mode 0 are done.

- Optimizations (but I still have work to do), thanks @Docent

- Updated the example with the new functions

Arnaud

Added the first function for Mode 1:

  • cpct_spriteColorizeM1
[attachimg=1]

SRS

#9
uuh looks like c64.  :o

;)

Arnaud

Quote from: SRS on 19:48, 01 October 17
uuh looks like c64.  :o

;)


You are right  :D , this is not the best color choice. Fortunately the example has better colors.

SRS

I really like the way you are improving CPCtelera. If I ever find the endurance needed, I have two or three ideas to program for the CPC using CPCtelera.

Arnaud

#12
All Mode 1 functions are coded (zip added in first post) and I have made some small optimizations.

I lost a lot of time because I forgot to disable the firmware in my test project, and using the alternate registers then made the CPC crash randomly  :doh:

[attachimg=1]

Screenshot with all colorize functions called

Now I have to add comments to the example project and write the documentation header for all functions.

Docent

#13
Quote from: Arnaud on 20:24, 03 October 17
All Mode 1 functions are coded (zip added in first post) and I have made some small optimizations.

I lost a lot of time because I forgot to disable the firmware in my test project, and using the alternate registers then made the CPC crash randomly  :doh:

[attachimg=1]

Screenshot with all colorize functions called

Now I have to add comments to the example project and write the documentation header for all functions.


When using the alternate registers you need to disable interrupts or patch the interrupt vectors, so that the system routines are not called.
If you have problems using the routine I posted earlier, you can try the one below: it only uses af' and is only 2 mcycles slower than the one with the alternate registers (and still 30-40% faster).

; hl - source data ptr
; de - dest data ptr
; c  - src color
; ixl - dest color
; ixh - width
; a' - height
start:
ld a, width
ld (convertloop+2), a
ld a, height
convertloop:
ld ixh, #width ; 2
ex af, af' ; 1
lineloop:
ld a, (hl) ; 2
and #55 ; 2
cp c ; 1
jr nz, next ; 2
ld  a, ixl ; 2
next:
ld b, a ; 1
ld a, (hl) ; 2
and #aa ; 2
rra ; 1
cp c ; 1
jr nz, next2 ; 2
ld a, ixl ; 2
next2:
rlca ; 1
or b ; 1
ld (de),a ; 2
inc hl ; 2
inc de ; 2
dec ixh ; 2
jr nz, lineloop ; 2
ex af, af' ; 1
dec a ; 1
jr nz, convertloop ; 2


I had a bit of free time, so I made a routine for mode 1 that uses the alternate register set. It's a lot faster: the main loop takes only 55 mcycles to process 4 pixels, so it's faster per pixel than the mode 0 one :)

start:
ld c, #88
ld d, oldcolor
ld e, newcolor
ld ixl, width
exx
ld hl, src
ld de, dest
ld c, height
convertloop:
ld b, ixl ; 2
lineloop:
ld a, (hl) ; 2
exx ; 1

ld l, a ; 1
and c ; 1
cp d ; 1
jr nz, noteq1 ; 2
ld a, e ; 1

noteq1:
ld h, a ; 1

srl l ; 2
ld a, l ; 1
and c ; 1
cp d ; 1
jr nz, noteq2 ; 2
ld a, e ; 1

noteq2:
rlca ; 1
or h ; 1
ld h, a ; 1

srl l ; 2
ld a, l ; 1
and c ; 1
cp d ; 1
jr nz, noteq3 ; 2
ld a, e ; 1

noteq3:
rlca ; 1
rlca ; 1
or h ; 1
ld h, a ; 1

srl l ; 2
ld a, l ; 1
and c ; 1
cp d ; 1
jr nz, noteq4 ; 2
ld a, e ; 1

noteq4:
rlca ; 1
rlca ; 1
rlca ; 1
or h ; 1

exx ; 1
ld  (de), a ; 2
inc hl ; 2
inc de ; 2
djnz lineloop ; 3
dec c ; 1
jr nz, convertloop ; 2



btw: ld (pixels_AB), a    takes 3 mcycles, not 2


Arnaud

Hello @Docent,
I'm trying to use your code in my project and I'm having problems.

The replaced color should be red (second balloon; the first is the original):

[attachimg=1]

I don't see the error in my code. Maybe the color conversion is not compatible with your code? (see asm/cpct_spriteColorizeM1.asm)

Thanks,
Arnaud.

Docent

#15
Quote from: Arnaud on 21:24, 10 October 17
Hello @Docent,
I'm trying to use your code in my project and I'm having problems.

The replaced color should be red (second balloon; the first is the original):

[attachimg=1]

I don't see the error in my code. Maybe the color conversion is not compatible with your code? (see asm/cpct_spriteColorizeM1.asm)

Thanks,
Arnaud.

Hi,
Yes, try to replace srl l with sla l and rlca with rrca and it should work correctly with your color codes.
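
One plausible reading of why the shift direction matters, sketched in C: in Mode 1 each byte holds four pixels, and pixel p uses bits 7-p and 3-p. If the colour codes are encoded for the leftmost pixel (mask 0x88), every other pixel has to be shifted left into that slot before the comparison, and the result shifted right again afterwards. `recolor_byte_m1` is a hypothetical name, and the encoding of the colour arguments is an assumption, not something taken from the CPCTelera sources.

```c
#include <stdint.h>

/* Hypothetical C model: oldEnc/newEnc are colour patterns encoded in the
   leftmost pixel's bit positions (bits 7 and 3, mask 0x88). */
uint8_t recolor_byte_m1(uint8_t b, uint8_t oldEnc, uint8_t newEnc) {
    uint8_t out = 0;
    for (int p = 0; p < 4; p++) {
        /* Shift pixel p LEFT into the pixel-0 slot and isolate it. */
        uint8_t px = (uint8_t)((b << p) & 0x88);
        if (px == oldEnc) px = newEnc;
        /* Shift the (possibly replaced) pixel RIGHT, back to its own slot. */
        out |= (uint8_t)(px >> p);
    }
    return out;
}
```

With colour codes encoded for the rightmost pixel instead (mask 0x11), the two shift directions would simply be swapped.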

Arnaud

Quote from: Docent on 12:03, 11 October 17
Hi,
Yes, try to replace srl l with sla l and rlca with rrca and it should work correctly with your color codes.


Great, that's it! Thanks @Docent.

Now mode 0.

Arnaud

#17
Hello,
now I'm working on optimizing M0.

Here are two simple pieces of code; I wonder if they can be optimized:

cpct_drawSpriteMaskedColorizeM1.asm

    inc hl                 ;; [2] Next byte sprite Color source
    ld    a, (hl)          ;; [2] A = (HL) current Byte of sprite Color
    dec hl                 ;; [2] Previous byte sprite Mask source



cpct_drawSpriteMaskedAlignedColorizeM0.asm

drawByte:
    push hl                ;; [4] Store HL (Current Source address)

mask_table = .+1           ;; Placeholder for Mask Table
    ld   h, #00            ;; [2] H = Mask table address (High Byte)
    ld   l, a              ;; [1] Access mask table element (table must be 256-byte aligned)
    ld   a, (de)           ;; [2] Get the value of the byte of the screen where we are going to draw
    and (hl)               ;; [2] Erase background part that is to be overwritten (Mask step 1)
    or   l                 ;; [1] Add up background and sprite information in one byte (Mask step 2)
    ld  (de), a            ;; [2] Save modified background + sprite data information into memory

    pop hl                 ;; [3] Recover HL (Current Source address)


Can the push hl and pop hl be replaced by something faster?

Thanks,
Arnaud

Edit : Sources updated in first post

arnoldemu

@Arnaud:

I generally use BC.

I have 1 table with the mask, and after that the table for re-colour:

e.g.
B=0x01 -> 0x0100->0x01ff is mask
B=0x02 -> 0x0200->0x02ff is re-colour.

I can use inc b/dec b to go to each table.

I then have DE pointing to graphics and HL to screen.

Maybe this helps?
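
A rough C model of this two-table idea, assuming Mode 0, pen 0 as the transparent colour, and both tables indexed by the sprite byte. In the asm version each table would sit on its own 256-byte page, so that switching tables only changes the high byte of the pointer. All names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t mask_table[256];     /* 1-bits where the sprite pixel is pen 0 (transparent) */
static uint8_t recolour_table[256]; /* sprite byte with oldEnc pixels swapped for newEnc */

/* oldEnc/newEnc: colour patterns encoded in the right-hand pixel slot (mask 0x55).
   Both must be non-zero so that recolouring never changes transparency. */
void build_tables(uint8_t oldEnc, uint8_t newEnc) {
    assert(oldEnc != 0 && newEnc != 0);
    for (int b = 0; b < 256; b++) {
        uint8_t right = (uint8_t)(b & 0x55);
        uint8_t left  = (uint8_t)((b >> 1) & 0x55);
        /* Keep the background wherever the sprite pixel is transparent. */
        mask_table[b] = (uint8_t)((left == 0 ? 0xAA : 0) | (right == 0 ? 0x55 : 0));
        if (right == oldEnc) right = newEnc;
        if (left == oldEnc)  left = newEnc;
        recolour_table[b] = (uint8_t)((left << 1) | right);
    }
}

/* Combine one screen byte with one sprite byte: mask, then OR in the recoloured sprite. */
uint8_t draw_byte(uint8_t screen, uint8_t sprite) {
    return (uint8_t)((screen & mask_table[sprite]) | recolour_table[sprite]);
}
```

The tables cost 512 bytes and have to be rebuilt whenever the colour pair changes, but drawing then needs only two lookups per byte and no comparisons.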
My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

arnoldemu

Quote from: Arnaud on 21:24, 10 October 17
Hello @Docent,
I'm trying to use your code in my project and I'm having problems.

The replaced color should be red (second balloon; the first is the original):

[attachimg=1]

I don't see the error in my code. Maybe the color conversion is not compatible with your code? (see asm/cpct_spriteColorizeM1.asm)

Thanks,
Arnaud.
@Arnaud: Are you using a lookup table to re-colour?

Arnaud

#20
Quote from: arnoldemu on 19:08, 22 October 17
@Arnaud: Are you using a lookup table to re-colour?


Yes, it's inside the CPCTelera function.

Edit : Comparison with optimized version (in GIF) in first post.

Docent

#21
Quote from: Arnaud on 18:35, 22 October 17
Hello,
now I'm working on optimizing M0.

Here are two simple pieces of code; I wonder if they can be optimized:

drawByte:
    push hl                ;; [4] Store HL (Current Source address)

mask_table = .+1           ;; Placeholder for Mask Table
    ld   h, #00            ;; [2] H = Mask table address (High Byte)
    ld   l, a              ;; [1] Access mask table element (table must be 256-byte aligned)
    ld   a, (de)           ;; [2] Get the value of the byte of the screen where we are going to draw
    and (hl)               ;; [2] Erase background part that is to be overwritten (Mask step 1)
    or   l                 ;; [1] Add up background and sprite information in one byte (Mask step 2)
    ld  (de), a            ;; [2] Save modified background + sprite data information into memory

    pop hl                 ;; [3] Recover HL (Current Source address)


Can the push hl and pop hl be replaced by something faster?



Sure, try this:

readPixelB:
    rlca                   ;; [1] A = xAxA xAxA << 1  : AxAx AxAx
    or   h                 ;; [1] A |= H (xBxB xBxB)  : ABAB ABAB
mask_table = .+1           ;; Placeholder for Mask Table
    ld   h, #00            ;; [2] H = Mask table address (High Byte)
    ld   l, a              ;; [1] Access mask table element (table must be 256-byte aligned)
    exx                    ;; [1] Switch to Alternate registers

    ld   a, (de)           ;; [2] Get the value of the byte of the screen where we are going to draw
    exx                    ;; [1] Switch register set back (HL = mask table entry)
    and (hl)               ;; [2] Erase background part that is to be overwritten (Mask step 1)
    or   l                 ;; [1] Add up background and sprite information in one byte (Mask step 2)
    exx                    ;; [1] Switch register set again (HL, DE = sprite and screen pointers)
    ld  (de), a            ;; [2] Save modified background + sprite data information into memory

    inc  hl                ;; [2] Next byte sprite source
    inc  de                ;; [2] Next byte sprite colorized
    djnz lineLoop          ;; [3] Decrement B (Width) if B != 0 goto lineLoop


It is 5 mcycles faster.

Docent

#22

I had a closer look at cpct_drawSpriteMaskedAlignedColorizeM0.asm with my optimizations above applied, and there are two additional optimizations you can make. As a bonus you'll also get rid of the self-modifying code.
1. Replace
   ld (startLine), de     ;; [6] Store DE start line (DestMem)
with push de
and
   ld de, #0000           ;; [3] DE = Start Line
with pop de
Also add pop de after the end: label. This change saves 2 mcycles per sprite line.

2. Remove
   ld a, d                ;; [1] Store High Byte (D) of MaskTable address (DE)
   ld (mask_table), a     ;; [4] Store A (High Byte of MaskTable)

and insert ld b, d between convertPixel and ld d, a

and replace
   ld   h, #00            ;; [2] H = Mask table address (High Byte)
with
   ld h, b
This change saves another 1 mcycle per sprite byte.

Arnaud

Thanks @Docent,
your optimizations also work on cpct_drawSpriteMaskedAlignedColorizeM1.asm

Arnaud

#24
Hello,
I need a little help with my latest optimizations. I'm separating the functions that set the color to replace from the functions that do the replacement itself.

The function to set the color is cpct_setReplaceColorsM0 and the function to change the colors is cpct_spriteColorizeM0 (the project is in color0.zip).

Here is my problem (in drawing.c): the balloons are not properly colored (only the first call to cpct_setReplaceColors works) when I use the same color argument in the two cpct_setReplaceColors calls:

void ColorSprite(u8 color)
{
    // Replace the two colors 1 and 2 of sprite baloon
    cpct_setReplaceColorsM0(1, color);
    cpct_spriteColorizeM0(g_baloon, gSpriteColorized, G_BALOON_W, G_BALOON_H);                // Colors are consecutives       
   
    cpct_setReplaceColorsM0(2, color + 1);
    cpct_spriteColorizeM0(gSpriteColorized, gSpriteColorized, G_BALOON_W, G_BALOON_H);
}


When I use another variable to store the color (color2), it works:
void ColorSprite(u8 color)
{
    u8 color2 = color + 1;
    // Replace the two colors 1 and 2 of sprite baloon
    cpct_setReplaceColorsM0(1, color);
    cpct_spriteColorizeM0(g_baloon, gSpriteColorized, G_BALOON_W, G_BALOON_H);                // Colors are consecutives       
   
    cpct_setReplaceColorsM0(2, color2);
    cpct_spriteColorizeM0(gSpriteColorized, gSpriteColorized, G_BALOON_W, G_BALOON_H);
}


I don't see the problem; thanks for any help.
Arnaud.
