Demoniak is on the right track, but his code uses Maxam-style assembler syntax. SDCC uses the ASxxxx assembler, which has a different syntax, so you either have to run the code through Norecess' assembler syntax converter or rewrite it by hand. Rewritten, the code looks like this (untested):
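/* Defined before main() so the compiler knows it when it is called. */
void poke(unsigned int a) {
    __asm
        ld h,5(ix)      ; high byte of a, read from the stack
        ld l,4(ix)      ; low byte of a
        ld (hl),#255    ; ASxxxx wants a # in front of immediate values
    __endasm;
}

void main() {
    unsigned int addr = 0xc000;
    /* Fill 0xc000 to 0xffff inclusive. The 16-bit counter wraps
       to 0 after 0xffff, which ends the loop. */
    do {
        poke(addr);
    } while (++addr != 0);
}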
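Another important point: you can't refer to a function's parameters by name inside the assembler block. You have to read them from the stack, which is why 'hl' is loaded manually with the value of 'a'. The offsets 4(ix) and 5(ix) assume that the compiler has set up ix as the frame pointer, so that the saved ix and the return address occupy the first four bytes and 'a' comes right after them.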
It should be faster than BASIC even if you use the firmware routines, because the code is compiled rather than interpreted, and of course the resulting executable is smaller too.
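For comparison, here is a rough sketch of the same fill in plain C, writing through a pointer instead of inline assembler. This is not Demoniak's approach, just an alternative. fill_region is a made-up name, not an SDCC library function, and the 0xC000 start address and 0x4000 length are taken from the loop above.

```c
/* Hypothetical helper (not part of SDCC): store `value` in `len`
   consecutive bytes starting at `base`. */
void fill_region(volatile unsigned char *base, unsigned int len,
                 unsigned char value)
{
    while (len--) {
        *base++ = value;
    }
}

/* On the CPC the call would look like this, assuming the region
   to fill starts at 0xC000 and is 0x4000 bytes long:

       fill_region((volatile unsigned char *)0xC000, 0x4000, 255);
*/
```

Since there is no function call per byte and nothing is read back from the stack, this would probably also be quicker than calling poke() in a loop, but I haven't measured it.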