Hi,
Usually we program the Z80 by calling subroutines with the parameters either in registers or pushed onto the stack, e.g.
org #8000
ld b, 5
loop: push bc
ld hl, message1
call strout
ld hl, message2
call strout
pop bc
djnz loop
halt
strout: ld a, (hl)
or a
ret z
push hl
call #bb5a
pop hl
inc hl
jr strout
message1: text "Hello, World!", 0
message2: text "abcdefghijklmnopqrstuvwxyz", 0
What if, instead of keeping the logic outside the stack and using the stack only for calling and returning in the usual sense with call and ret, we put the actual logic on the stack, 'called' each routine with ret, and popped parameters that are already sitting on the stack?
Would we gain performance and a smaller application size, or the opposite?
e.g.
org #8000
ld sp, logic
ret
; note: logic can be any mix of routine addresses (which are called with ret) and literals or labels (which are moved to registers with pop)
dw endprog
dw loopend, message2, strout, message1, strout
loop: dw 5, loopstart
logic:
loopstart: pop bc
ld b, c
ret
loopend: djnz loopagain
ret
loopagain: ld sp, loop
ret
endprog: halt
strout: pop de ; for hl later
ld hl, 0 ; uses firmware so change back to normal stack behaviour
add hl, sp
ex de, hl ; hl = pointer to string, de = original stack
push de ; preserve original stack
ld sp, fwstack
stroutloop: ld a, (hl)
or a
jr z, stroute
push hl
call #bb5a
pop hl
inc hl
jr stroutloop
stroute: push de ; restore original stack
pop sp
ret
fwstacke: defs 20
fwstack:
message1: text "Hello, World!", 0
message2: text "abcdefghijklmnopqrstuvwxyz", 0
In many ways the logic flow is more readable in the second example, and it also lends itself really nicely to Reverse Polish maths expressions. Loop counters don't need to live in registers that nothing else uses; they could also be labels pointing to words in memory (variables).
Any thoughts?
PS: if it isn't obvious, the logic is written backwards as it is on the stack, and every ret is actually a call to the next command on the stack. Each command pops its parameters from the stack.
With the above examples, not including the strout function...
The first example uses 99 T-states and 19 bytes, against 71 T-states and 11 bytes for the second.
At a glance, if the stack doesn't have to be changed too frequently to satisfy external calls, then it looks to be the faster and smaller code, and we know that the firmware doesn't have to be called. An interrupt handler might of course have to jump to somewhere that changes the stack to normal behaviour.
What are the reasons you wouldn't generate code like example 2 if you were compiling a high-level language? It appears to be a nice fit.
The problem with "return oriented programming" is that it's difficult to handle branched control flow. And because the stack will almost inevitably get written to at various points (such as on interrupt) it's difficult to run the logic multiple times unless you also keep a whole copy elsewhere.
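To make the branching problem concrete, here is a minimal sketch of what a conditional jump could look like in this style. The names jumpifz, compare and elsepart are hypothetical (none of them appear in the code above), and the sketch assumes that the previous routine left its result in the Z flag and that interrupts are disabled while SP points into the table.

```z80
; hypothetical primitive, used in a table as:
;     dw compare, jumpifz, elsepart
; where elsepart is the table address to continue from when Z is set
jumpifz:    pop hl              ; HL = branch target (pop leaves the flags alone)
            ret nz              ; Z clear: carry on with the next table entry
            ld sp, hl           ; Z set: move the "program counter" (SP)
            ret                 ; continue from the new table position
```

So every branch needs its own small routine plus an extra table word, and the table has to survive intact if the logic is to run a second time.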
It's sometimes a useful technique for code generation logic, where you're only going to run the results once, for the reasons you've outlined above.
Being on the stack doesn't in itself make it change. An interrupt can of course preserve the stack, as I did in strout.
I actually thought it is more readable.
PS: if it isn't obvious, the logic is written backwards as it is on the stack, and every ret is actually a call to the next command on the stack. Each command pops its parameters from the stack.
Sorry, I was a bit wrong here: the logic should be written forwards on the stack, because ret traverses the stack upward, but I think the point is clear.
logic: dw loopstart, 5
loop: dw strout, message1, strout, message2, loopend
dw endprog
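A sketch of how the whole second example might look with the table in forward order. It makes a few assumptions that aren't in the listing above: interrupts stay disabled while SP points into the table, the table position is kept in a hypothetical tablesp variable while the firmware is called, B is saved around the firmware call, and the firmware stack is enlarged to 64 bytes as a guess.

```z80
            org #8000
            di                      ; assumption: nothing may push while SP is in the table
            ld sp, logic
            ret                     ; "call" the first table entry

logic:      dw loopstart, 5
loop:       dw strout, message1
            dw strout, message2
            dw loopend
            dw endprog

loopstart:  pop bc                  ; C = loop count taken from the table
            ld b, c
            ret
loopend:    djnz loopagain
            ret                     ; B = 0: carry on to endprog
loopagain:  ld sp, loop             ; rewind the table pointer
            ret
endprog:    halt                    ; with interrupts disabled this stops for good

strout:     pop hl                  ; HL = string address from the table
            ld (tablesp), sp        ; remember the table position
            ld sp, fwstack          ; ordinary stack for the firmware call
stroutloop: ld a, (hl)
            or a
            jr z, stroute
            push hl
            push bc                 ; B is the loop counter, keep it safe
            call #bb5a
            pop bc
            pop hl
            inc hl
            jr stroutloop
stroute:    di                      ; assumption: the firmware may have re-enabled interrupts
            ld sp, (tablesp)        ; back to the table
            ret                     ; "call" the next table entry

tablesp:    dw 0
fwstacke:   defs 64                 ; size is a guess
fwstack:
message1:   text "Hello, World!", 0
message2:   text "abcdefghijklmnopqrstuvwxyz", 0
```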
Quote from: zhulien on 11:58, 11 June 23
then it looks to be the faster and smaller code
Let's ignore the strout function and look only at the alternative way of coding:
- classic way: 18 bytes (until halt)
- alternative way: 30 bytes (until halt)
So it isn't smaller.
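The tally can be checked by annotating each line of the two original listings with its encoded size. This is a reading aid rather than a program to assemble on its own: strout, message1 and message2 are the labels from the listings above, and the classic loop label is renamed cloop here (a made-up name) only to avoid a clash.

```z80
; classic way
            ld b, 5                 ; 2 bytes
cloop:      push bc                 ; 1
            ld hl, message1         ; 3
            call strout             ; 3
            ld hl, message2         ; 3
            call strout             ; 3
            pop bc                  ; 1
            djnz cloop              ; 2   total 18

; alternative way (table order as in the original listing)
            ld sp, logic            ; 3
            ret                     ; 1
            dw endprog              ; 2
            dw loopend, message2, strout, message1, strout  ; 10
loop:       dw 5, loopstart         ; 4
logic:
loopstart:  pop bc                  ; 1
            ld b, c                 ; 1
            ret                     ; 1
loopend:    djnz loopagain          ; 2
            ret                     ; 1
loopagain:  ld sp, loop             ; 3
            ret                     ; 1   total 4 + 16 + 10 = 30
```

Each table entry costs 2 bytes against 3 for a call, but every routine address, every parameter word and the loop helper routines all count towards the total.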
Quote from: zhulien on 11:58, 11 June 23
An interrupt handler might of course have to jump to somewhere that changes the stack to normal behaviour.
If you don't want to destroy the "logic table" it is not possible to have interrupts anymore, as they will at least push the PC. So it will always destroy one entry in the logic table.
And for the same reason it's not possible to use the stack at all for the tasks it was designed for (subroutine calls, temporary register storage). So it will be nearly impossible to write more complex code this way.
Btw, there is no POP SP ;)
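For reference, a minimal sketch of the strout routine from the second example without POP SP. The saved table pointer stays in DE and is put back with LD SP,HL, and the PUSH DE that wrote into the table is dropped. The DI before restoring SP and the PUSH/POP DE around the firmware call are precautions added here, on the assumption that the firmware might re-enable interrupts or change DE.

```z80
strout:     pop de              ; DE = string address from the table
            ld hl, 0
            add hl, sp          ; HL = current table position
            ex de, hl           ; HL = string address, DE = saved table position
            ld sp, fwstack      ; ordinary stack for the firmware call
stroutloop: ld a, (hl)
            or a
            jr z, stroute
            push hl
            push de             ; keep the saved table position safe
            call #bb5a
            pop de
            pop hl
            inc hl
            jr stroutloop
stroute:    di                  ; nothing may push once SP is back in the table
            ex de, hl           ; HL = saved table position
            ld sp, hl           ; LD SP,HL stands in for the missing POP SP
            ret                 ; "call" the next table entry
fwstacke:   defs 20             ; size as in the original, possibly too small
fwstack:
```

Keeping the saved value in a memory word and using LD (nn),SP and LD SP,(nn) would work as well, at the cost of a few more bytes.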
Good points. I forgot about the return address that an interrupt pushes; the entire application in this mode would need to run with interrupts disabled.
Still, as an idea, I wonder if there can be benefits, e.g. creating a stack of compiled sprite addresses for a screen refresh, or processing a list of something else...
I guess it would be like programming in BASIC with no GOSUB command and only GOTO for flow control.
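A rough sketch of that idea, limited to one short burst with interrupts off. All the names (drawall, drawlist, drawdone, oldsp, sprite1, sprite2) are made up for illustration, the sprite routines are placeholders that just poke a byte at an assumed screen address, and it assumes interrupts were enabled when drawall was called.

```z80
drawall:    di                      ; nothing may push while SP walks the list
            ld (oldsp), sp          ; keep the real stack pointer
            ld sp, drawlist
            ret                     ; jump to the first routine in the list

drawlist:   dw sprite1, sprite2, drawdone

drawdone:   ld sp, (oldsp)          ; back to the real stack
            ei
            ret                     ; return to whoever called drawall

oldsp:      dw 0

; placeholder routines: no push or call inside, and each ends with ret
sprite1:    ld a, #f0
            ld (#c000), a           ; assumed screen address
            ret
sprite2:    ld a, #0f
            ld (#c001), a
            ret
```

Because nothing writes to the list while interrupts are off, it stays intact and can be reused on the next refresh. Each entry takes 2 bytes instead of a 3-byte call, and the ret that ends each routine doubles as the jump to the next one.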