Hi everyone,
I'm here with what might be an interesting problem, or maybe something that can be done much more easily and I just can't see what's right in front of my nose:
In a mode 1 graphic, I need an assembly routine that quickly changes the pen colors as follows:
0 [00] -> 0 [00]
1 [01] -> 2 [10]
2 [10] -> 3 [11]
3 [11] -> 1 [01]
Working with the notoriously convoluted way bitmaps are encoded on the CPC (if you don't know it yet, see this page (https://gist.github.com/neuro-sys/eeb7a323b27a9d8ad891b41144916946#video-mode)), I tried to find a simple sequence of shifts and bit operations that can be applied to a whole byte (= 4 px), so I don't have to process every pixel individually.
I came up with the following plan: (o=original color bit, n=new color bit)
n0 = o1
n1 = o0 xor o1
and ended up with this routine:
;(assuming a holds the original byte)
ld b,a ;backup
rlca:rlca:rlca:rlca ;switch bits 0 and 1 of all 4 pixels
ld c,a ;backup this shifted byte
xor b ;n1 = o0 xor o1
and #f0 ;isolate bits 0
ld b,a ;backup
ld a,c ;restore shifted original byte
and #0f ;isolate bits 1 (n0 = o1)
or b ;put the two halves together
;done
...but somehow I have the feeling that I'm not seeing the wood for the trees and there must be a much simpler solution...
Any thoughts?
Have a nice evening!
Use tables, precompute your changes
Yes, of course, but I forgot to mention that it should be quick AND memory-efficient ;)
My first thought is why rotate the actual pixel choice rather than just rotate palette entries?
Beyond that, yeah probably just a 256 byte, page aligned, precomputed table of results is the best way. Shifting and computing, even if you can use RLD, is probably going to be a lot slower.
Quote from: andycadley on 17:50, 31 July 23: "My first thought is why rotate the actual pixel choice rather than just rotate palette entries?"
Well, there is a reason ;-)
Quote from: andycadley on 17:50, 31 July 23: "Beyond that, yeah probably just a 256 byte, page aligned, precomputed table of results is the best way. Shifting and computing, even if you can use RLD, is probably going to be a lot slower."
I thought about using RLD, but it would have to be done twice in order to reset (hl) to the original byte.
As for the timing: It isn't really THAT critical, it's not inside, say, a sprite routine. It's a drawing thing that can take some nops. Anyway, maybe I'm over-optimizing, I just love these kinds of puzzles...
Quote from: roudoudou on 17:36, 31 July 23: "Use tables, precompute your changes"
@arnolde (@arnoldemu, is that you?!)
Exactly what @roudoudou said. Since a given source byte always produces the same destination byte, use a simple translation table for that. And since that table is exactly 256 bytes long, it is a perfect opportunity to align it on a 256-byte boundary (a huge optimisation here).
Regarding the memory consumption: depending on the context, you could generate those tables at program startup (with Z80 code).
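A minimal sketch of such a startup generator, under a few assumptions: `table` is a hypothetical label placed on a 256-byte boundary, and the mode 1 layout is the one where bits 7-4 hold bit 0 of each pixel and bits 3-0 hold bit 1. It fills the table with the 0->0, 1->2, 2->3, 3->1 mapping.

```
; Sketch only. 'table' is a hypothetical, page-aligned label.
; Assumes bits 7-4 = bit 0 of each pixel, bits 3-0 = bit 1.
gen_table:
        ld hl,table             ; L = 0 because the table is page aligned
gen_loop:
        ld a,l                  ; source byte = table index
        and #0f                 ; keep the low nibble (o1 of each pixel)
        ld b,a
        ld a,l
        rlca:rlca:rlca:rlca     ; swap nibbles: o1 bits move to the high nibble
        xor b                   ; low nibble becomes o0 xor o1
        ld (hl),a               ; store the translated byte
        inc l                   ; next index, wraps to 0 after 255
        jr nz,gen_loop
        ret
```

The generator costs a handful of bytes of code, and the 256 bytes of table can live in any buffer that is free at that point.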
And if you really don't want to go the table route: I noticed that your original code does not use the D and E registers, so I would suggest preloading them with #F0 and #0F and using AND D and AND E instead of AND #F0 and AND #0F (saving 2 NOPs in the process).
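As a sketch, the original routine with that change applied could look like this. The only assumption is that DE is free and is loaded once outside the per-byte code; the result is the same as with the immediate masks.

```
; Once, outside the loop (assumes DE is not needed elsewhere):
        ld de,#f00f             ; D = #f0, E = #0f

; Per byte (A holds the original byte):
        ld b,a                  ; backup
        rlca:rlca:rlca:rlca     ; swap nibbles
        ld c,a                  ; backup the swapped byte
        xor b                   ; both nibbles = high nibble xor low nibble
        and d                   ; keep the high nibble (was AND #f0)
        ld b,a
        ld a,c                  ; restore the swapped byte
        and e                   ; keep the low nibble (was AND #0f)
        or b                    ; put the two halves together
```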
LD L,(HL)
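That one-liner presumably refers to indexing a page-aligned table: with H holding the table's high byte and L holding the source byte, a single load fetches the translated byte. A rough sketch, assuming a hypothetical page-aligned `table` label and DE pointing at the byte to convert:

```
; Sketch only. 'table' and the use of DE are assumptions.
        ld h,table/256          ; H = high byte of the table address (set once)
        ld a,(de)               ; fetch the source byte
        ld l,a                  ; L = index into the table
        ld a,(hl)               ; A = translated byte
        ld (de),a               ; write it back
```

Using LD L,(HL) in place of LD A,(HL) leaves the result in L instead, which is handy when A is needed for something else.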
One moment...
This document has an error.
Mode 1
bit7   bit6   bit5   bit4   bit3   bit2   bit1   bit0
p0(0)  p1(0)  p2(0)  p3(0)  p0(1)  p1(1)  p2(1)  p3(1)
The first nibble is the low bit: bits 7-4 hold bit 0 of each pixel, bits 3-0 hold bit 1.
Quote from: arnolde on 17:41, 31 July 23: "Yes, of course, but I forgot to mention that it should be quick AND memory-efficient ;)"
With 4 pens, it can't be memory-expensive.
Or am I missing something in your request?
; A = source byte, result in A
LD C,A
AND #0F
LD B,A
LD A,C
RLCA : RLCA : RLCA : RLCA
XOR B

; HL = address of the source byte, result written back to (HL)
LD A,(HL)
RRD
AND #0F
XOR (HL)
LD (HL),A
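For a whole block of bytes, the memory-based sequence above could be wrapped in a simple loop. This is only a sketch: the entry conditions (HL = start address, BC = number of bytes, BC greater than 0) and the label names are assumptions, and it treats the area as one contiguous range.

```
; Sketch only. In: HL = first byte, BC = byte count (must be > 0).
recolor:
        ld a,(hl)               ; A = source byte
        rrd                     ; (HL) = source byte with nibbles swapped
        and #0f                 ; A = original low nibble
        xor (hl)                ; high = old low nibble, low = old high xor old low
        ld (hl),a               ; store the recoloured byte
        inc hl
        dec bc
        ld a,b
        or c                    ; BC = 0 yet?
        jr nz,recolor
        ret
```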