The V9990 E-VDP-III or 'Graphics 9000' is a Video Display Processor (VDP) by Yamaha which is based on an unreleased VDP (V9978) that was intended for the unreleased MSX3.
This chip is available on graphics cards for the MSX and CPC.
Documentation
Programming Documentation Manual for the V9990 VDP from Yamaha as a PDF.
Technical
The official 'Application Manual' describes the V9990 ports, registers and commands. These additional notes apply to the V9990 used in the CPC Powergraph video board and explain details that are not clearly described in, or are omitted from, the manual. Note that these notes do not cover 100% of the chip's operation; they describe only the features and combinations tested to date.
In the following, "databus" means that the V9990 doesn't drive the bus: it provides no data, so the value read is the "floating bus" value, i.e. whatever is on the Z80 databus at the time. On the CPC this is most commonly &ff, but there are some CPC configurations where it differs.
Kanji ROM
Testing is limited because the Powergraph doesn't have a Kanji ROM.
- Reading from the Kanji ROM ports without a Kanji ROM returns databus values.
Ports
- If a command doesn't require data to be read or written then reading from P#2 will cause /WAIT to be asserted to the Z80.
- Status (Port #5) is always readable, regardless of whether soft reset is active.
- Control (Port #7) is always writable, regardless of whether soft reset is active.
- The MCS bit in Port #7 is always writable and its value is always readable through the status "MCS" bit. This is documented. The soft reset state has no effect.
- Reading from the write only ports (or unused ports) returns databus values.
- The bits in the interrupt port (#6) are always updated (e.g. when a command is completed bit 2 is set). An interrupt request to the Z80 will only be triggered if the corresponding interrupt request is enabled in register R#9.
- If a command transfers data, and it expects data to be written (e.g. LMMC) to the command data port (P#2), then you can't use a read of the command data port to clear the data request. It must be a write. Similarly, if a command requests data to be read (e.g. LMCM) then you can use a write to the command data port to clear the data request.
Registers
- If you read from a write-only register you will see the databus value because the V9990 doesn't assert data on the bus.
- Some registers have additional bits which are not documented. The mask describes which bits are read/write and which are unchanged.
Where the mask has a '1' bit, this bit is read/write. Where the mask has a '0' bit this bit remains at 0 and can't be changed.
- Register 7 (SCREEN MODE): Mask is &FF.
- Register 9 (INTERRUPT READ/WRITE): Mask is &87.
- Register 15 (BACK DROP COLOR): Mask is &FF.
- Register 22 (SCROLL CONTROL): Mask is &C1.
- Register 25 (SPRITE PATTERN GENERATOR TABLE BASE ADDRESS): Mask is &CF.
- Register 26 (LCD CONTROL): Mask is &FF.
- Register 27 (PRIORITY CONTROL): Mask is &FF.
- The register index is masked with &3f. This means, for example, that reading "register 64" is the same as reading register 0.
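The register masks and the index masking above can be modelled in a few lines of Python. This is only a sketch of the behaviour described in these notes: the helper names `effective_register` and `stored_value` are made up for illustration, and only the registers listed above are modelled.

```python
# Read/write masks for the registers listed in these notes.
REGISTER_MASKS = {
    7: 0xFF,   # SCREEN MODE
    9: 0x87,   # INTERRUPT READ/WRITE
    15: 0xFF,  # BACK DROP COLOR
    22: 0xC1,  # SCROLL CONTROL
    25: 0xCF,  # SPRITE PATTERN GENERATOR TABLE BASE ADDRESS
    26: 0xFF,  # LCD CONTROL
    27: 0xFF,  # PRIORITY CONTROL
}


def effective_register(index):
    """Register actually selected: the index is masked with &3f."""
    return index & 0x3F


def stored_value(index, value):
    """Value expected on read-back after writing `value`.

    Bits with a 0 in the mask stay at 0. Registers that are not in
    REGISTER_MASKS are not modelled and raise KeyError.
    """
    return value & REGISTER_MASKS[effective_register(index)]


if __name__ == "__main__":
    print(effective_register(64))        # same register as index 0
    print(hex(stored_value(9, 0xFF)))    # only the masked bits survive
```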
Palette
- Palette red data has a mask of &9F (the red bits and the color key bit); blue and green have a mask of &1F.
e.g.
    LD BC,&FF64
    LD A,14
    OUT (C),A        ;; select R14
    LD BC,&FF63
    LD A,0
    OUT (C),A        ;; R14 = 0
    LD BC,&FF61
    LD A,&FF         ;; <- reading this component back returns &9F
    OUT (C),A
    LD A,&FF         ;; <- reading this component back returns &1F
    OUT (C),A
    LD A,&FF         ;; <- reading this component back returns &1F
    OUT (C),A
- Bits 1,0 of R14 define a ternary counter. If the counter is set to 3, reads return 0 and writes are ignored. If auto increment is enabled, it still increments as normal. When the ternary counter is 2 or 3, the palette index increments and the ternary counter goes to 0.
e.g.
    LD BC,&FF64
    LD A,14
    OUT (C),A        ;; select R14
    LD BC,&FF63
    LD A,3
    OUT (C),A        ;; ternary counter = 3
    LD BC,&FF61
    LD A,&FF
    OUT (C),A        ;; write is not done.

    LD BC,&FF64
    LD A,14
    OUT (C),A        ;; select R14
    LD BC,&FF63
    LD A,3
    OUT (C),A        ;; ternary counter = 3
    LD BC,&FF61
    IN A,(C)         ;; read returns last value written to port
e.g.
    LD BC,&FF64
    LD A,14
    OUT (C),A        ;; select R14
    LD BC,&FF63
    LD A,2
    OUT (C),A        ;; ternary counter = 2
    LD BC,&FF61
    LD A,&FF
    OUT (C),A        ;; palette index will be 1, and ternary counter is 0.
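The way the palette pointer advances can be written down as a small Python model. This is a sketch of the behaviour observed above, not a description of the chip's internals. The function names are hypothetical, and it assumes the palette index occupies bits 7..2 of R14 (these notes only state that bits 1,0 hold the ternary counter).

```python
def split_r14(r14):
    """Split an R14 value into (palette index, ternary counter).

    Assumption: the palette index is held in bits 7..2.
    """
    return (r14 >> 2) & 0x3F, r14 & 0x03


def advance(index, counter):
    """Pointer state after one palette data access with auto increment on.

    A counter of 2 or 3 moves on to the next palette index and resets the
    counter to 0; otherwise only the counter is incremented.
    """
    if counter >= 2:
        return (index + 1) & 0x3F, 0
    return index, counter + 1


if __name__ == "__main__":
    state = split_r14(2)          # the last example above writes 2 to R14
    print(state, "->", advance(*state))
```

Starting from R14 = 2, one access gives palette index 1 and counter 0, which matches the last example.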
Interrupts
- If interrupt requests are enabled in R#9, the Powergraph generates a maskable interrupt on the CPC.
Reset
The following has been tested using the "software reset" which is triggered through the control port.
- After reset all registers are reset to 0.
- The selected register (port 4) is set to 0 and port read and write increment is not inhibited.
- If reset is held using the "software reset" then reading or writing to the VRAM data port P#0 will cause the CPC to hang. I believe the V9990 is continuously asserting /WAIT to the Z80 but I can't confirm through code.
- Reset will stop any commands in progress and will clear pending interrupts.
- If reset is held: ports 3, 4, 7, 8, 9, 10, 11, 12, 13, 14 and 15 read databus values, port 1 reads 0, and ports 5 and 6 read status.
VRAM
- VRAM reads and writes via registers 0,1,2 (write address), registers 3,4,5 (read address) and port 0 use *logical* addresses, not physical addresses. Writing in one mode and then reading back in another can yield data in a different order, because the addresses are translated from logical to physical based on the mode.
- In terms of physical addresses, if you consider physical address 0-&3ffff to be VRAM0 and physical address &40000-&7ffff to be VRAM1, then in P1 mode, the physical address equals logical address and in bitmap modes, every even address maps to &0-&3ffff (every even is VRAM0) and every odd address maps to &40000-&7ffff (VRAM 1). This is described in the PDF.
- Setting a partial VRAM write address is OK (i.e. not setting the complete address with r0, r1 and r2). e.g. set r0,r1,r2 to &80000, now set r0 to 1. Write will go to &000001. Now set r1 to 2. Write will go to &000201. Now set r2 to 3. Write will go to &030201. There doesn't appear to be a write buffer.
- The V9990 seems to have a 1 byte read buffer. Setting a partial address (not setting the complete address with r3,r4 and r5) and reading from port 0 will return the data from the previous address for the first read, but further reads return correct values. This is only true of r3 and r4. e.g. set r3,r4,r5 to &80000. Now set r3 to 1. First read will come from &000000, second and subsequent reads now come from &000001. Now set r4 to 2. First read comes from &000001, second and subsequent reads come from &000201. Now set r5 to 3. First read comes from &030201, subsequent reads come from &030201.
- If the VRAM size is set to 128KB, a byte is written through port 0, and the VRAM size is then set to 512KB, you can read the same byte through port 0 in multiple locations. The written data is effectively mirrored in the address space. The PDF describes part of this because it describes recommended addresses to use where the location is common to all VRAM sizes.
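The logical to physical translation described above can be sketched in Python. The notes only say that even logical addresses land in VRAM0 and odd ones in VRAM1 in bitmap modes. The offset inside each bank (`logical >> 1`) is an assumption made for this sketch, and the function name is hypothetical.

```python
VRAM1_BASE = 0x40000  # physical start of VRAM1 (&40000-&7ffff)


def logical_to_physical(logical, bitmap_mode):
    """Translate a logical VRAM address to a physical one.

    P1 mode (bitmap_mode=False): physical equals logical.
    Bitmap modes: even addresses go to VRAM0, odd addresses to VRAM1.
    Assumption: the offset within the bank is logical >> 1.
    """
    logical &= 0x7FFFF  # 512KB address space
    if not bitmap_mode:
        return logical
    bank = VRAM1_BASE if logical & 1 else 0
    return bank + (logical >> 1)


if __name__ == "__main__":
    for addr in range(4):
        print(hex(addr), "->", hex(logical_to_physical(addr, True)))
```

Under that assumption, consecutive bytes written in bitmap mode alternate between the two banks, which is why data written in one mode can come back in a different order when read in another.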
Command Engine
VRAM '11'
- When the VRAM size is set to '11' ('10' is 512KB), operations and the VRAM size behave as if a VRAM size of 512KB had been set.
'Stand-by mode'
- "Stand-by mode", where both bits of DSPM are 1, appears to be identical in operation to bitmap mode with the display enabled. The VRAM is readable and writable just as in bitmap mode. The PDF claims that the Kanji ROM is accessible. I didn't see this with the CPC Powergraph.
Coordinates
- Most commands seem to use internal image space coordinates for operation.
- sx,sy,dx and dy registers use image space coordinates.
- Image space width can be 256, 512, 1024 or 2048 pixels. Image space coordinates are not the same as the display width and display height.
- Internal coordinates using image space are masked before use using image space width and height.
- The image space height is computed from the VRAM size, bits per pixel and image space width using an equation like this:
    image space width in bytes = (image space width * image bpp) / 8
    height = vram size in bytes / image space width in bytes
For a vram size of 512KB this gives image space heights of &1000,&800,&400,&200,&100 and &80.
- Confirmed with PSET, BMLX and LMMC. When calculating the logical VRAM address of each line:
1) P1 mode is always 256 pixels wide, 4bpp (the width (XIMM) and bpp (CLRM) settings in R6 are ignored). Therefore the logical VRAM address of line y is: (256/2)*y.
2) P2 mode is always 512 pixels wide, 4bpp (the width (XIMM) and bpp (CLRM) settings in R6 are ignored). Therefore the logical VRAM address of line y is: (512/2)*y.
3) In bitmap mode the width (XIMM) and bpp (CLRM) settings are used. HSCN and C25M are ignored. Therefore the logical VRAM address of line y is y multiplied by the line length in bytes: XIMM/4 when 2bpp, XIMM/2 when 4bpp, XIMM when 8bpp and XIMM*2 when 16bpp (with XIMM taken as the image space width in pixels).
4) For the command engine, standby mode is identical in operation to bitmap mode.
5) Although the logical VRAM address of each line is calculated this way, a lot of the operations act as if the chip is in bitmap mode and take the bpp setting from CLRM into account when plotting pixels.
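The height equation and the per-line address rules above can be sketched in Python. The function names and the mode strings ("P1", "P2", "bitmap", "standby") are hypothetical, chosen only for this illustration.

```python
def image_space_height(vram_bytes, width_pixels, bpp):
    """Image space height from VRAM size, image space width and bpp."""
    width_bytes = (width_pixels * bpp) // 8
    return vram_bytes // width_bytes


def line_bytes(mode, ximm_pixels=None, bpp=None):
    """Bytes per line used when calculating the logical address of a line."""
    if mode == "P1":
        return 256 // 2          # always 256 pixels wide, 4bpp
    if mode == "P2":
        return 512 // 2          # always 512 pixels wide, 4bpp
    if mode in ("bitmap", "standby"):
        return (ximm_pixels * bpp) // 8
    raise ValueError("unknown mode: %r" % (mode,))


def line_address(y, mode, ximm_pixels=None, bpp=None):
    """Logical VRAM address of the start of line y."""
    return y * line_bytes(mode, ximm_pixels, bpp)


if __name__ == "__main__":
    vram = 512 * 1024
    for width in (256, 512, 1024, 2048):
        print(width, "pixels, 4bpp:", hex(image_space_height(vram, width, 4)))
    print(hex(line_address(10, "bitmap", ximm_pixels=1024, bpp=8)))
```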
Registers
LOP
- Testing indicates TP operates on the data that will be written (SC) and happens *before* write mask.
- DC is read from physical VRAM. WC is the result of LOP on SC and DC and is the data that will be written. WC is the data that is masked using the write mask.
Write Mask
- R46 (Write mask) is used when writing to physical VRAM0.
- R47 (Write mask) is used when writing to physical VRAM1.
- Testing indicates write mask takes effect just before the write to VRAM is committed.
So you need to know when each mask register is used and you need to understand the logical->physical VRAM mapping.
Write Mask in bitmap and standby modes
- Even logical VRAM addresses (logical address bit 0 = 0) map to VRAM0 and use R46 for masking.
- Odd logical VRAM addresses (logical address bit 0 = 1) map to VRAM1 and use R47 for masking.
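As a quick illustration of the rule above, this Python sketch picks the write mask register for a logical address in bitmap and standby modes. The helper names are hypothetical. How the mask bits themselves are applied is not modelled here.

```python
def write_mask_register(logical_addr):
    """Register that masks a write to this logical address.

    Even addresses map to VRAM0 (R46), odd addresses to VRAM1 (R47).
    """
    return 47 if logical_addr & 1 else 46


def mask_registers_for_run(start, count):
    """Mask register used for each byte of a run of consecutive addresses."""
    return [write_mask_register(start + i) for i in range(count)]


if __name__ == "__main__":
    # Four consecutive bytes starting at an even logical address.
    print(mask_registers_for_run(0x100, 4))
```

One consequence is that consecutive bytes alternate between the two mask registers, so a run starting on an odd address begins with R47.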
Commands
- Registers specific to each command can be initialized in any order. "op" register (R52) should be written last so that it uses the correct register settings.
- CE will be set to 1 in status when operation has started. CE will be set to 0 in status when operation has completed. CE will be set to 1 in interrupt when operation has completed.
Command Recommendations
The following are recommended to ensure correct and expected operation:
- To ensure full and correct operation it is advised to set ALL registers that a command uses. Setting registers once and performing multiple executions of commands can give incorrect operation or cause a hang sometimes.
- Always set up DY before commands that use it. There is a bug where pixels may not be committed to the expected VRAM location. This was seen with the LMMV command. Setting DY before each use of LMMV works around it.
- Always set NX,NY before using LMMV especially in P1 mode. If you don't do this then the command may hang.
CMMC
- Data written by CPU is byte by byte and it is 1 bit per pixel. Each byte therefore represents 8 pixels. If the value of a bit is 0 then background colour register is used for pixel, if bit is 1 then foreground colour register is used for pixel. Data is used bit 7 first, 6 is next, all the way down to bit 0.
- Font is defined left to right, top to bottom, line by line. It doesn't follow the same form as Kanji ROM.
- If NX is not a multiple of 8, then unused bits are used on the next line. i.e. the command uses all 8 bits until they are exhausted at which point another byte is fetched and so on.
- Pixels are written according to physical VRAM. Foreground low byte is for physical VRAM0. Background low byte is for physical VRAM1. Foreground high byte is for physical VRAM1, and background high byte is for physical VRAM1.
- To draw a font in the same colour ensure the same colour is repeated for all pixels in FC and BC (i.e. for 8 bit mode set FC upper and lower byte to the same, and set BC upper and lower byte to the same.)
- For LOP, SC comes from background colour or foreground colour depending on the value of the bit.
- NX and NY can be any size, they are not restricted to 16x16.
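The way CMMC consumes its 1 bit per pixel data can be sketched in Python. This only models the bit order and the carry-over of unused bits onto the next line described above. The function name is hypothetical, and the colours are passed through as opaque values rather than being written to VRAM.

```python
def cmmc_expand(data, nx, ny, fc, bc):
    """Expand 1bpp CPU data into ny rows of nx colours.

    Bits are used from bit 7 down to bit 0. A new byte is only fetched
    when all 8 bits of the previous one are used, so when nx is not a
    multiple of 8 the leftover bits start the next line.
    """
    bits = []
    for byte in data:
        for bit in range(7, -1, -1):
            bits.append((byte >> bit) & 1)
    rows = []
    for row in range(ny):
        chunk = bits[row * nx:(row + 1) * nx]
        rows.append([fc if b else bc for b in chunk])
    return rows


if __name__ == "__main__":
    # One byte is enough for a 4x2 glyph: the low nibble is the second line.
    for line in cmmc_expand([0b10110000], nx=4, ny=2, fc="F", bc="B"):
        print("".join(line))
```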
CMMM
- This is similar to CMMC except data is read from VRAM and not from the CPU.
CMMK
- This command is similar in operation to CMMM.
- This command will operate without Kanji ROM.
- It is not clear where the pixel data is coming from when no Kanji ROM is connected. It is not forced to ff or 0. It also doesn't come from vram. What is often seen is that the first 2 bytes are different from the rest, then the remainder of the bytes are the same and look to be based on BC low byte bit value.
POINT
- When reading 2bpp and 4bpp data with the POINT command in bitmap modes the pixel is moved into the topmost bits.
e.g.
if byte in vram is aabbccdd, bitmap mode and 2bpp:
    x = 0    aaxxxxxx
    x = 1    bbxxxxxx
    x = 2    ccxxxxxx
    x = 3    ddxxxxxx
- The undefined 'x' bits are from the upper byte of a preceding 16-bit POINT command. If the data in VRAM is &34,&12, a 16-bit pixel is &1234. A read in 2bpp/4bpp will use the bits from &12 from that operation. Therefore, to force these bits, do a 16-bit read of a known value and then do the 2bpp/4bpp read.
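The 2bpp case above can be expressed as a short Python sketch. It returns only the defined top two bits. The undefined 'x' bits are left at 0 here, because these notes do not pin down exactly how the stale bits are arranged. The function name is hypothetical.

```python
def point_2bpp_top_bits(vram_byte, x):
    """Defined bits of a 2bpp POINT result for pixel x of a byte.

    The selected pixel is moved into bits 7..6. Bits 5..0 are undefined
    on hardware and are returned as 0 in this model.
    """
    shift = 2 * (x & 3)
    return (vram_byte << shift) & 0xC0


if __name__ == "__main__":
    byte = 0b11100100  # aa=11, bb=10, cc=01, dd=00
    for x in range(4):
        print(x, format(point_2bpp_top_bits(byte, x), "08b"))
```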
BMLL
- If the source and destination ranges overlap, some unexpected data can be written. You should avoid overlapping areas or keep them at least 2 bytes apart. The result of overlapping areas can differ between runs. Investigation is ongoing to determine how this data can be predicted or forced.
- When DIX=0, then both source and destination vram addresses are incremented. When DIX=1, then both source and destination vram addresses are decremented.
- I can't currently see what DIY does in respect of BMLL. It seems it has no effect. This may be an error in the documentation.
SRCH
- To know if a match has been found always look at the status bit. BD will be 1 for a match and 0 for no match. The border x coordinate is always set to something even if there is no match.
- When searching sx and sy are masked before use. If the width is 256, then sx is masked with 255 before use.
- The resulting "found" x position is based on the unmasked value. i.e. if a matching pixel is at 30, and you specify sx>width, then the match is reported to be found at width+30.
- When searching in reverse, if no match is found an X coordinate of 0x07ff will be reported.
- When searching forwards, if no match is found then a multiple of the width is reported depending on sx. If sx<width then width is reported, if width<sx<width*2 then width*2 is reported etc. As expected, the "match" flag is not set.
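The reported X values above follow a simple pattern, sketched here in Python. The function names are hypothetical. The case where sx is exactly a multiple of the width is not covered by the notes, so its handling here (rounding up to the next multiple) is an assumption.

```python
def srch_no_match_x(sx, width, reverse):
    """X coordinate reported when SRCH finds no match."""
    if reverse:
        return 0x07FF
    # Forwards: the next multiple of the width above sx.
    return (sx // width + 1) * width


def srch_match_x(found_x, sx, width):
    """X coordinate reported for a match at masked position found_x.

    The search uses sx masked with width-1, but the reported position is
    based on the unmasked sx, so the removed multiple of width is added back.
    """
    return found_x + (sx - (sx & (width - 1)))


if __name__ == "__main__":
    print(srch_no_match_x(10, 256, reverse=False))
    print(srch_no_match_x(300, 256, reverse=False))
    print(srch_match_x(30, 256 + 10, 256))
```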
BMLX
- When DIX=1, under some conditions the two bytes at the start of the first line are in a different order than expected.
BMXL
- When the vram address is calculated from dx and dy, dx and dy are masked before use.
This means that transfer wraps within the same line and column. i.e. if drawing in reverse and overlapping x=0, and width is 256, then pixels will wrap to 256-x on the same line and if overlapping x=256 then pixels will wrap to 0.
LMMC
- A new byte is fetched only when the pixels for that byte have been exhausted. Therefore if NX is not a multiple of 4 for 2bpp, or not a multiple of 2 for 4bpp, the remaining pixels from the byte will be used on the next line.
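The carry-over of leftover pixels can be sketched in Python for 2bpp and 4bpp. The function name is hypothetical. It assumes the leftmost pixel sits in the topmost bits of each byte, which matches the POINT example elsewhere in these notes but was not checked separately for LMMC.

```python
def lmmc_rows(data, nx, ny, bpp):
    """Split CPU bytes into ny rows of nx pixels for 2bpp or 4bpp.

    A new byte is only fetched when the previous one is used up, so
    leftover pixels spill onto the next line.
    Assumption: the leftmost pixel is in the topmost bits of a byte.
    """
    per_byte = 8 // bpp
    mask = (1 << bpp) - 1
    pixels = []
    for byte in data:
        for i in range(per_byte):
            shift = 8 - bpp * (i + 1)
            pixels.append((byte >> shift) & mask)
    return [pixels[row * nx:(row + 1) * nx] for row in range(ny)]


if __name__ == "__main__":
    # 2bpp with NX=3: the fourth pixel of the first byte starts line 2.
    print(lmmc_rows([0b00011011, 0b11100100], nx=3, ny=2, bpp=2))
```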
LMMV
As recommended in the PDF set FC so that the same pixel is in all bits for the chosen mode, otherwise the result may not be what you expect.
- LMMV seems to operate as if it is in bitmap mode all the time.
- For LOP, SC comes from FC.
- When DIX=0, then FC low then FC high are used. The value restarts each line and FC high/low are not directly related to physical VRAM like WM is. e.g. When 8bpp, fc=&1234, DIX=0 and nx=4 then after LMMV has executed we will see the bytes in vram: &34,&12,&34,&12.
- When DIX=1, then the result is a bit strange. The order FC is used switches depending on the DX value.
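The DIX=0 behaviour in 8bpp can be sketched in Python. This models only the byte order described above (FC low, then FC high, restarting on each line). DIX=1 is not modelled because its ordering depends on DX in a way these notes do not fully describe. The function names are hypothetical.

```python
def lmmv_line_8bpp(fc, nx):
    """Bytes written for one line of an 8bpp LMMV with DIX=0."""
    low, high = fc & 0xFF, (fc >> 8) & 0xFF
    return [low if i % 2 == 0 else high for i in range(nx)]


def lmmv_fill_8bpp(fc, nx, ny):
    """All lines of the fill; the FC low/high sequence restarts each line."""
    return [lmmv_line_8bpp(fc, nx) for _ in range(ny)]


if __name__ == "__main__":
    print([hex(b) for b in lmmv_line_8bpp(0x1234, 4)])
```

With an odd NX every line still starts with the low byte, which is why setting both bytes of FC to the same value is the safe choice.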
LMMV - P1 mode
- In P1 mode, LMMV seems to operate as if in bitmap mode.
- For 2bpp,4bpp and 8bpp, low byte of FC is always read for the colour. FC high byte remains from the last bitmap LMMV command. i.e. To define the upper byte of FC for subsequent LMMV commands in P1 mode, set bitmap mode, setup LMMV, set FC, set NX>1 and execute LMMV command. Now execute commands in P1 mode.
ADVN
- The use of internal x,y coordinates and when they are updated is the same as PSET.
PSET
- For LOP, SC comes from FC.
- The internal x and y coordinates are masked before use. When width is 256 then mask is 0x0ff, when 512 the mask is 0x01ff, when 1024 the mask is 0x03ff and when 2048 the mask is 0x07ff. Therefore attempting to write to x=300 on a 256 wide screen actually writes to pixel x=0x02c because it is masked before use.
- During testing I found that the 'advance' part of the PSET command differs slightly from the official document.
- The document claims that when AXE and AXM bits of the command are 0 then both DX and DY registers are read and the internal x and y are updated. I didn't find this. I found that when AXE and AXM are 0 then only the DX register is read and the internal x coordinate is updated from it. DY is not read at this time.
- If the DY register is written, the internal y coordinate is updated immediately. Writing to either register 38 or register 39 updates the internal y coordinate.
- The internal X and Y coordinates are incremented or decremented based on AXE, AXM, AYE and AYM. The internal x and y will wrap within the width and height of the display. The width is wrapped using masking. e.g. When the screen is 256 pixels wide, x is 0 and is decremented it will wrap to 255 and if x is 255 and is incremented it will wrap to 0. The same is true of the Y coordinate.
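The masking and wrapping of the internal coordinates can be sketched in Python. The function names are hypothetical, and the sizes are assumed to be powers of two, as the image space widths listed in these notes are.

```python
def mask_coord(value, size):
    """Mask a coordinate to the image space size (a power of two)."""
    return value & (size - 1)


def step_coord(value, delta, size):
    """Increment or decrement a coordinate, wrapping by masking."""
    return (value + delta) & (size - 1)


if __name__ == "__main__":
    print(hex(mask_coord(300, 256)))   # x=300 on a 256 wide screen
    print(step_coord(0, -1, 256))      # decrement from 0 wraps
    print(step_coord(255, 1, 256))     # increment from 255 wraps
```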