Hello,
I'm trying to copy a sprite into another sprite before drawing it to the screen. I have coded a function in C, but it's really too slow.
void BlitBackbuffer(u8 destX, u8 destY, u8 *srcSprite, u8 srcWidth, u8 srcHeight)
{
u8 yDest = srcHeight;
u8* destMem = sBackBuffer + destX + VIEW_CX*destY;
while (yDest != 0)
{
cpct_memcpy(destMem, srcSprite, srcWidth);
srcSprite += srcWidth;
destMem += VIEW_CX;
--yDest;
}
}
Has anyone already done this in assembly, or does anyone have an idea to speed up my code?
Here is my test project with CPCTelera 1.4.2.
Thanks,
Arnaud.
To speed it up, you could write a dedicated function for each possible sprite width (source or destination, depending on your data).
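A minimal sketch of that idea in plain C, assuming the `u8` typedef from CPCtelera and a destination row stride passed as a parameter. `blit_generic` and `blit_w4` are hypothetical names, not CPCtelera functions. With the width fixed at compile time, the row copy can be unrolled, so there is no inner loop and no call per row:

```c
#include <string.h>

typedef unsigned char u8;   /* assumption: same meaning as CPCtelera's u8 */

/* Generic version: width is a run-time parameter (what the first post does). */
void blit_generic(u8 *dest, u8 stride, const u8 *src, u8 width, u8 height)
{
    while (height--) {
        memcpy(dest, src, width);
        src  += width;    /* next sprite row */
        dest += stride;   /* next back buffer row */
    }
}

/* Hypothetical dedicated version for sprites exactly 4 bytes wide:
   the row copy is unrolled, so there is no inner loop and no call. */
void blit_w4(u8 *dest, u8 stride, const u8 *src, u8 height)
{
    while (height--) {
        dest[0] = src[0];
        dest[1] = src[1];
        dest[2] = src[2];
        dest[3] = src[3];
        src  += 4;
        dest += stride;
    }
}
```

Both functions write the same bytes; whether the dedicated one is actually faster on the CPC depends on the code SDCC generates for it.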
Quote from: Arnaud on 18:12, 17 August 17
Try this asm copy routine:
ld a, yDest
ld de, destMem
ld hl, srcSprite
ld bc, srcWidth
copy_loop:
push bc
push de
ldir
ld bc, #VIEW_CX
pop de
ex de, hl
add hl, bc
ex de, hl
pop bc
dec a
jr nz, copy_loop
You'll probably need some additional code to save the registers that SDCC requires and to set up the initial registers differently.
How about this as your main.c?
#include <cpctelera.h>
#include "test.h"
#define VIEW_X 0
#define VIEW_Y 0
u8 buffer0[G_TEST_W*G_TEST_H];
u8 buffer1[G_TEST_W*G_TEST_H];
void main(void)
{
u8* pBuffer;
u8 shift;
u8 x;
cpct_disableFirmware();
pBuffer=buffer0;
shift=0;
x=VIEW_X;
while(1)
{
cpct_memcpy(pBuffer,g_test,G_TEST_W*G_TEST_H);
cpct_drawSprite(pBuffer,cpctm_screenPtr(CPCT_VMEM_START,x,VIEW_Y),G_TEST_W,G_TEST_H);
cpct_memset_f8(pBuffer,0x00,G_TEST_W*G_TEST_H);
if (shift==0){
pBuffer=buffer1;
shift=1;
x+=G_TEST_W;
}
else{
pBuffer=buffer0;
shift=0;
x-=G_TEST_W;
}
cpct_drawSprite(pBuffer,cpctm_screenPtr(CPCT_VMEM_START,x,VIEW_Y),G_TEST_W,G_TEST_H);
}
}
Or you could do it with one buffer.
(I should have written it that way in the first place.)
#include <cpctelera.h>
#include "test.h"
#define VIEW_X 0
#define VIEW_Y 0
u8 buffer[G_TEST_W*G_TEST_H];
void main(void)
{
u8 shift;
u8 x;
cpct_disableFirmware();
shift=0;
x=VIEW_X;
while(1)
{
cpct_memcpy(buffer,g_test,G_TEST_W*G_TEST_H);
cpct_drawSprite(buffer,cpctm_screenPtr(CPCT_VMEM_START,x,VIEW_Y),G_TEST_W,G_TEST_H);
cpct_memset_f8(buffer,0x00,G_TEST_W*G_TEST_H);
if (shift==0){
shift=1;
x+=G_TEST_W;
}
else{
shift=0;
x-=G_TEST_W;
}
cpct_drawSprite(buffer,cpctm_screenPtr(CPCT_VMEM_START,x,VIEW_Y),G_TEST_W,G_TEST_H);
}
}
@ervin (http://www.cpcwiki.eu/forum/index.php?action=profile;u=82), @Docent (http://www.cpcwiki.eu/forum/index.php?action=profile;u=1534), @roudoudou (http://www.cpcwiki.eu/forum/index.php?action=profile;u=1714): thanks for the advice, it runs faster now.
Here is the CPCTelera code for the copy:
void CopyData(u8 yDest, u8* destMem, u8 *srcSprite, u8 srcWidth)
{
__asm
push ix; Save ix before making changes
ld ix, #0; ix points to the top of the stack
add ix, sp
ld a, 4(ix); yDest
ld e, 5(ix); destMem
ld d, 6(ix)
ld l, 7(ix); srcSprite
ld h, 8(ix)
ld c, 9(ix); srcWidth
ld b, #0
copy_loop :
push bc
push de
ldir
ld bc, #VIEW_CX
pop de
ex de, hl
add hl, bc
ex de, hl
pop bc
dec a
jr nz, copy_loop
pop ix; Restore IX before returning
__endasm;
}
Hello,
I need more help ;) because I also want to draw masked sprites.
I tried to modify the previous assembly code. It seems easy, but I wasn't able to do it (I am really lost with asm).
Here is the C code I'd like to convert to asm; it's adapted from CPCTelera's cpct_drawSpriteMaskedAlignedTable:
void CopyDataMasked(u8 yDest, u8* destMem, u8 *srcSprite, u8 srcWidth, u8* maskTable) {
while (yDest != 0) {
u8 i = 0;
for (i = 0; i < srcWidth; i++) {
u8 sprite = srcSprite[i];
u8 mask = maskTable[sprite];
u8 dest = destMem[i];
dest &= mask;
dest |= sprite;
destMem[i] = dest;
}
srcSprite += srcWidth;
destMem += VIEW_CX;
--yDest;
}
}
My goal is to modify the previous assembly code (Reply#5) by adding this part from CPCTelera (cf. cpct_drawSpriteMaskedAlignedTable.asm), in order to have transparency:
ld a, (bc) ;; [2] Get next byte from the sprite
ld l, a ;; [1] Access mask table element (table must be 256-byte aligned)
ld a, (de) ;; [2] Get the value of the byte of the screen where we are going to draw
and (hl) ;; [2] Erase background part that is to be overwritten (Mask step 1)
or l ;; [1] Add up background and sprite information in one byte (Mask step 2)
ld (de), a ;; [2] Save modified background + sprite data information into memory
I think I have to modify this part of the code (Reply#5), but I don't know how to do it:
ex de, hl
add hl, bc
ex de, hl
Thanks,
Arnaud.
I think the part of the code you have to modify is the ldir.
You must replace the ldir with a loop that does the transparency part.
@Arnaud (http://www.cpcwiki.eu/forum/index.php?action=profile;u=1424) You may start by looking at the code SDCC generates from your C source and modifying it. That would be an interesting way to start if you still need to train your asm skills.
With respect to what you want to do, you are solving only part of the problem. Blitting a sprite into another sprite buffer requires knowing the sizes of both. Your code assumes a constant size for the "canvas" sprite. I would prefer writing a solution for any sprite size, as that would be valid for all CPCtelera users. In fact, that one is on the to-do list.
What about starting with your first attempts at the problem? We may assist you in creating a good solution, and that would also help you improve your asm skills :)
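As a starting point, here is a minimal C sketch of a blit that takes the size of both sprites, assuming the `u8` typedef from CPCtelera. `blit_sprite_into` is a hypothetical name and the bounds check is an added assumption, not something CPCtelera provides:

```c
#include <string.h>

typedef unsigned char u8;   /* assumption: same meaning as CPCtelera's u8 */

/* Hypothetical helper: copies a w x h sprite into a canvas sprite of
   canvas_w x canvas_h bytes at byte position (x, y).
   Returns 0 and copies nothing if the sprite would not fit entirely. */
int blit_sprite_into(u8 *canvas, u8 canvas_w, u8 canvas_h,
                     u8 x, u8 y, const u8 *sprite, u8 w, u8 h)
{
    u8 *dest;

    if ((unsigned)x + w > canvas_w || (unsigned)y + h > canvas_h)
        return 0;

    dest = canvas + (unsigned)y * canvas_w + x;
    while (h--) {
        memcpy(dest, sprite, w);
        sprite += w;          /* next sprite row */
        dest   += canvas_w;   /* next canvas row */
    }
    return 1;
}
```

The only difference from the constant-size version is that the row stride comes from `canvas_w` instead of VIEW_CX.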
Quote from: Arnaud on 07:56, 19 August 17
Here you go..
ld a, yDest
ld de, destMem
ld hl, maskTable
exx
ld hl, srcSprite
exx
copy_loop:
ex af, af'
ld c, srcWidth
push de
line_loop:
exx
ld a, (hl) ; sprite = srcSprite[i];
inc hl
exx
ld l, a ;
ld a, (de) ; dest = destMem[i];
and (hl) ; dest &= maskTable[sprite];
or l ; dest |= sprite;
ld (de),a ; destMem[i] = dest;
inc de
dec c
jr nz, line_loop
ld bc, #VIEW_CX
pop de
ex de, hl
add hl, bc ; destMem += VIEW_CX;
ex de, hl
ex af, af'
dec a
jr nz, copy_loop
As before, you need some additional code to save the registers that SDCC requires and to set up the initial registers differently.
Have fun :)
BTW: maskTable needs to be aligned on a 256-byte boundary.
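The alignment matters because the routine does `ld l, a`: it overwrites only the low byte of HL with the sprite byte and keeps the table's high byte. A small C sketch of the two address calculations (the function names and the example addresses are hypothetical, chosen only for illustration):

```c
#include <stdint.h>

/* Address the Z80 code reads from: H keeps the table's high byte,
   L is overwritten with the sprite byte ("ld l, a"). */
uint16_t mask_lookup_address(uint16_t table_addr, uint8_t sprite_byte)
{
    return (uint16_t)((table_addr & 0xFF00u) | sprite_byte);
}

/* Address that maskTable[sprite] means in the C version. */
uint16_t intended_address(uint16_t table_addr, uint8_t sprite_byte)
{
    return (uint16_t)(table_addr + sprite_byte);
}
```

With a table at 0x4100 both functions agree for every sprite byte. With a table at 0x4120, sprite byte 0x7F should read from 0x419F, but the asm reads from 0x417F.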
@Docent (http://www.cpcwiki.eu/forum/index.php?action=profile;u=1534): Thanks a lot for the conversion.
@ronaldo (http://www.cpcwiki.eu/forum/index.php?action=profile;u=1227): You are right, and I have to add and rename some parameters in order to make the function useful to CPCtelera users.
Well, to begin with, I'm trying to adapt the code to the CPCtelera format in an .s file; afterwards I'll add and rename the parameters.
I carefully read the parameters in the right order, but it crashes :doh:
Here is the asm code.
extern void drawBackBufferMaskedAlignedTable(u8 dest_y, u8* dest_mem, u8 *src_sprite, u8 src_width, u8* mask_table) __z88dk_callee;
_drawBackBufferMaskedAlignedTable::
push ix; Save ix before making changes
ld ix, #0; ix points to the top of the stack
add ix, sp
ld a, 4(ix); yDest
ld e, 5(ix); destMem
ld d, 6(ix)
ld l, 10(ix); maskTable
ld h, 11(ix)
exx
ld l, 7(ix); srcSprite
ld h, 8(ix)
exx
copy_loop_masked :
ex af, af'
ld c, 9(ix); srcWidth
ld b, #0
push de
line_loop :
exx
ld a, (hl); sprite = srcSprite[i];
inc hl
exx
ld l, a;
ld a, (de); dest = destMem[i];
and (hl); dest &= maskTable[sprite];
or l; dest |= sprite;
ld(de), a; destMem[i] = dest;
inc de
dec c
jr nz, line_loop
ld bc, #44 ;VIEW_CX
pop de
ex de, hl
add hl, bc; destMem += VIEW_CX;
ex de, hl
ex af, af'
dec a
jr nz, copy_loop_masked
pop ix; Restore IX before returning
And the full project.
Thanks for help,
Arnaud
What is the effect of __z88dk_callee?
Don't you have to restore the stack yourself at the end of the function?
According to the CPCtelera comments in the code, I have to put the return address back on the stack, as this function uses the __z88dk_callee convention.
So you should be right: I have to restore the stack at the end of the function. I'm trying to do this in a small example.
Well, I don't know what to do.
When my asm code is inline, it works.
void copyData(u8* sprite, u8* memory, u8 width, u8 height)
{
sprite;
memory;
width;
height;
__asm
push ix; Save ix before making changes
ld ix, #0; ix points to the top of the stack
add ix, sp
ld l, 4(ix); sprite
ld h, 5(ix)
ld e, 6(ix); memory
ld d, 7(ix)
ld c, 8(ix); width
ld b, #0
ld a, 9(ix); height
copy_loop:
push bc
push de
ldir
pop de
ex de, hl
ld bc, #VIEW_CX
add hl, bc
ex de, hl
pop bc
dec a
jr nz, copy_loop
pop ix; Restore IX before returning
__endasm;
}
When I put it in an .s file, it doesn't work.
I don't know if I must set the __z88dk_callee function attribute; some CPCtelera functions use it, others don't, but with or without it my code crashes.
I guess some registers or SP are not restored.
Weird.
Is it OK to use the alternate register set?
You don't have firmware or CPCtelera code running under interrupts that uses it?
With __z88dk_callee I don't think you need to push the return address.
Just adjust the stack according to the parameters pushed.
Quote from: Xifos on 12:40, 21 August 17
Weird.
There has to be a logical explanation :D
Quote from: Xifos on 12:40, 21 August 17
Is it ok to use the alternate register set ?
I don't know
Quote from: Xifos on 12:40, 21 August 17
You don't have firmware or cpctelera code under interrupts using it ?
I disabled interrupts.
Quote from: Xifos on 12:40, 21 August 17
With __z88dk_callee i don't think you need to push the return address.
Just adjust stack according to the parameters pushed.
I was thinking about that, but in the asm generated by SDCC it seems the compiler updates SP according to the size of the parameters:
call _drawBackBuffer
ld hl, #6
add hl, sp
ld sp, hl
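The `#6` here and the `4(ix)`, `5(ix)`, ... offsets used in the routines above follow from the parameter sizes. A minimal C sketch of that arithmetic, assuming the stack-based convention seen in this thread (a u8 takes 1 byte, a pointer takes 2, the first parameter sits right after the saved IX and the return address); `param_offsets` is a hypothetical helper, and other SDCC versions or calling conventions may lay things out differently:

```c
#include <stddef.h>

/* Offset of the first parameter from IX after "push ix / ld ix,#0 / add ix,sp":
   2 bytes of saved IX + 2 bytes of return address. */
#define FIRST_PARAM_OFFSET 4

/* Fills offsets[i] with the IX-relative offset of parameter i and returns the
   total number of parameter bytes the caller pushed.
   sizes[i] is 1 for a u8 and 2 for a pointer (16-bit on the Z80). */
unsigned param_offsets(const unsigned char *sizes, size_t count, unsigned *offsets)
{
    unsigned total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        offsets[i] = FIRST_PARAM_OFFSET + total;
        total += sizes[i];
    }
    return total;
}
```

For a function taking two pointers and two u8 values, like copyData, the sizes {2, 2, 1, 1} give offsets 4, 6, 8 and 9 and a total of 6 bytes, which is what the caller removes from the stack after the call. With __z88dk_callee that cleanup is the callee's job instead.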
In reply #13, were you talking about the copyData function when you said it was not working from an .s file?
In that case, forget about the alternate registers; that's not the problem.
Without the __z88dk_callee thing, the copyData function should work, even from an asm file
(except that you must be sure #VIEW_CX is defined).
I must say I am not at ease with the linker...
Quote from: Arnaud on 18:08, 20 August 17
You need to terminate your asm function with ret; otherwise, when called, it won't return and execution will continue past the end of the function.
The ret is not needed if you put the asm code between __asm/__endasm directives in C source code (and you don't use __naked in the function definition): SDCC will add it itself.
Of course, the ret!
I should have seen it!
:doh:
Quote from: Docent on 14:45, 21 August 17
Yes, that's it! :D
I really need to acquire some basic asm knowledge.
Now I'll try to get the second function working.
I have added a new parameter, sprite_width, to replace the constant VIEW_CX.
To store this parameter temporarily, I put it in an asm variable. Is that a good way to do it?
extern void drawBackBuffer(u8 *sprite, u8 sprite_width, u8* memory, u8 width, u8 height);
_drawBackBuffer::
push ix; Save ix before making changes
ld ix, #0; ix points to the top of the stack
add ix, sp
ld l, 4(ix); sprite
ld h, 5(ix)
ld c, 6(ix); sprite_width
ld b, #00
ld (sprite_width), bc;
ld e, 7(ix); memory
ld d, 8(ix)
ld c, 9(ix); width
ld a, 10(ix); height
copy_loop:
push bc
push de
ldir
pop de
ex de, hl
ld bc, (sprite_width)
add hl, bc
ex de, hl
pop bc
dec a
jr nz, copy_loop
pop ix; Restore IX before returning
ret
sprite_width:
.dw #0000
Here is the code for the back buffer copy:
- drawBackBuffer
- drawBackBufferMasked
- drawBackBufferMaskedAlignedTable
I changed the order of the parameters and added a comment header.
All remarks are welcome :D
I'll make a better example project and propose it all to CPCtelera.
Small optimisation :
_drawBackBuffer::
push ix; Save ix before making changes
ld ix, #0; ix points to the top of the stack
add ix, sp
ld l, 4(ix); sprite
ld h, 5(ix)
ld a, 10(ix); buffer_width
ld (buffer_width+1), a;
ld b, #00
ld a, 7(ix); height
ld e, 8(ix); memory
ld d, 9(ix)
copy_loop:
ld c, 6(ix); width
push de
ldir
pop de
ex de, hl
buffer_width:
ld c, 0
add hl, bc
ex de, hl
dec a
jr nz, copy_loop
pop ix; Restore IX before returning
ret
Quote from: demoniak on 20:36, 22 August 17
You managed to squeeze 15 T-states out of the copy loop, nice!
But I don't like self-modifying code, so I thought I'd try to get rid of it. I went with a few undocumented instructions and saved 10 more T-states than your code and 25 compared to my initial version: over 21% speedup :)
_drawBackBuffer::
push ix; Save ix before making changes
ld ix, #0; ix points to the top of the stack
add ix, sp
ld l, 4(ix); sprite
ld h, 5(ix)
ld c, 6(ix); width
ld a, 7(ix); height
ld e, 8(ix); memory
ld d, 9(ix)
ld b, 10(ix); buffer_width
ld ixh, b ; undocumented opcode: 0xdd, 0x60
ld ixl, c ; undocumented opcode: 0xdd, 0x69 ; 8
ld b, #00
copy_loop:
push de
ldir
pop de
ex de, hl
ld c, ixh ; undocumented opcode: 0xdd, 0x4c
add hl, bc
ex de, hl
ld c, ixl ; undocumented opcode: 0xdd, 0x4d
dec a
jr nz, copy_loop
pop ix; Restore IX before returning
ret
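The savings quoted in this exchange can be checked by adding up the per-row loop overhead of the three versions, ldir excluded. This is only a sketch, assuming the standard documented Z80 T-state counts; on a real CPC instructions are stretched by the hardware, so measured timings differ. The function names are hypothetical:

```c
/* Standard Z80 T-state counts (assumption: documented Zilog timings;
   the CPC stretches some instructions, so real timings differ). */
enum {
    T_PUSH = 11, T_POP = 10, T_EX_DE_HL = 4, T_ADD_HL_BC = 11,
    T_DEC_A = 4, T_JR_TAKEN = 12,
    T_LD_BC_MEM = 20,   /* ld bc, (nn)   */
    T_LD_C_IXD  = 19,   /* ld c, d(ix)   */
    T_LD_C_N    = 7,    /* ld c, n       */
    T_LD_C_IXHL = 8     /* ld c, ixh / ld c, ixl (undocumented) */
};

/* Per-row loop overhead, ldir excluded, for the three versions in the thread. */
int overhead_variable(void)      /* version using the sprite_width variable */
{
    return 2 * T_PUSH + 2 * T_POP + 2 * T_EX_DE_HL + T_LD_BC_MEM
         + T_ADD_HL_BC + T_DEC_A + T_JR_TAKEN;
}

int overhead_selfmod(void)       /* self-modifying version */
{
    return T_LD_C_IXD + T_PUSH + T_POP + 2 * T_EX_DE_HL + T_LD_C_N
         + T_ADD_HL_BC + T_DEC_A + T_JR_TAKEN;
}

int overhead_ixh_ixl(void)       /* ixh/ixl version */
{
    return T_PUSH + T_POP + 2 * T_EX_DE_HL + 2 * T_LD_C_IXHL
         + T_ADD_HL_BC + T_DEC_A + T_JR_TAKEN;
}
```

Under these assumptions the three loops cost 97, 82 and 72 T-states per row, which reproduces the 15, 10 and 25 T-state differences mentioned above. The percentage gain also depends on the ldir cost, which grows with the sprite width, so it is not computed here.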
I am sorry, this is off topic, but:
Does the SDCC assembler support instructions written directly with ixl/ixh and iyl/iyh?
I am still using .db #0xDD or #0xFD in my code.
Here is the optimized version, thanks to @Docent (http://www.cpcwiki.eu/forum/index.php?action=profile;u=1534) and @demoniak (http://www.cpcwiki.eu/forum/index.php?action=profile;u=5).
Quote from: Xifos on 10:20, 23 August 17
In CPCtelera, the undocumented opcodes are used through macros:
;; Macro: ld__ixh_b
;; Opcode for "LD IXH, B" instruction
;;
.macro ld__ixh_b
.DW #0x60DD ;; ld ixh, b
.endm
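The macro works because a 16-bit word is stored low byte first on the Z80, so `.DW #0x60DD` places the prefix byte 0xDD in memory followed by 0x60, which is the `ld ixh, b` encoding given earlier in the thread. A small C sketch of that byte order (`word_to_bytes` is a hypothetical helper, only for illustration):

```c
#include <stdint.h>

/* Splits a 16-bit word into the two bytes a little-endian word directive
   such as .DW places in memory: low byte first. */
void word_to_bytes(uint16_t word, uint8_t out[2])
{
    out[0] = (uint8_t)(word & 0xFF);   /* emitted first  */
    out[1] = (uint8_t)(word >> 8);     /* emitted second */
}
```

The same two bytes could be written as `.db #0xDD` followed by `.db #0x60`; the word form is just more compact.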
Here are the example project and the latest sources: