CPCWiki forum

General Category => Amstrad CPC hardware => Topic started by: doragasu on 00:33, 15 January 17

Title: ROM board with a tiny DMA engine
Post by: doragasu on 00:33, 15 January 17
I'm thinking about making a ROM board for the CPCs (not plus). I want to use a CPLD instead of discrete logic for the ROM register and the decoding logic.

While having a look at the signals in the CPC EXP connector, I saw the BUSREQ and BUSAK signals and thought: why not try to fit a tiny DMA controller in the CPLD to accelerate copying data from ROM? Sounds nice to me, but as I have not studied CPC internals, I don't know if this is possible, or if it has some limitations. Some things that I think might be troublesome:
Any suggestion or warning is welcome  ;D
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 03:30, 15 January 17
doragasu,

Good luck trying to get a DMA to work, but from what I understand the Z80 "wait" signal is used to get video data and if you use this as well you will have trouble.

The BUSREQ/BUSACK are matched pairs for block transfers and the transfer must be such that the data transfer speed does not exceed the speed of the memory itself.

One other factor to account for is interrupt processing, which further complicates a DMA implementation; another is that external memory expansions may have other limitations.

rpalmer
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 14:06, 15 January 17
Thanks for the info. It looks like it will not be easy, so I think I'll abandon the idea  :( . I have found an expansion board using an Intel 8237 DMA controller, but it is for Aleste CPC clones, that had extra pins for DMA in the expansion port...
Title: Re: ROM board with a tiny DMA engine
Post by: robcfg on 14:43, 15 January 17
Could you take some pictures of that board, please?


Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 17:40, 15 January 17
I'm sorry but I don't have such pictures. When I say "I found an expansion board...", I mean I found data about it on the Internet. Fortunately somebody already took photographs, you can find them here (http://www.cpcwiki.eu/index.php/Magic_Sound_Board).
Title: Re: ROM board with a tiny DMA engine
Post by: robcfg on 19:47, 15 January 17
Cool, thanks!


Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 20:25, 15 January 17
OK, I have been reading here and there, and having a look at the CPC6128 schematics, and I think I'm missing some details to decide whether the DMA is feasible or not, so here are some more questions...

1.- READY signal pulse duration: I have read (http://cpctech.cpc-live.com/docs/instrtim.html) that each microsecond, the GA pulls the READY signal low to introduce WAIT states on the CPU while it fetches two bytes from memory. The reference above also states that this causes instruction execution times to be rounded to 1us, so if I understand this mechanism properly, between one and three wait states are inserted depending on the instruction being executed. Is this right? How is this accomplished? I don't think the GA is partially decoding instructions to know how many WAIT states to insert, I'm sure it must be a quite simple mechanism (that doesn't come to my mind).

2.- CPU WAIT states insertion: READY signal is used to insert WAIT states on the CPU. If I understand this mechanism correctly, this signal must only be asserted during memory reads and writes. So how does the GA know when to pull the READY signal low? Again I don't think the GA partially decodes instructions...

3.- /CPU signal: while browsing schematics, I can see that /CPU and /CASADDR signals from the GA, are both used to multiplex CPU and CRTC addresses (also rows and columns). So, I suppose /CPU signal should be 1 MHz with a 25% duty cycle, right? Also how are /CPU and READY signals related? Is the rising edge of /CPU synchronized with the falling edge of READY?

Is there detailed documentation about how exactly these signals work?
Title: Re: ROM board with a tiny DMA engine
Post by: PulkoMandy on 11:34, 16 January 17
You can look at the "gate array decapped" thread, where people have reverse engineered the gate array from pictures of the die. It is indeed a rather simple chip.


It just locks the CPU bus whenever it needs to access the RAM. Sometimes the CPU does not need the BUS, and no extra wait state results. Sometimes the CPU needs to access the bus, and it has to wait.


If you want to do DMA, you can use the BUSRQ/BUSAK signals to lock the Z80 out of the bus. But, you can't stop the gate array! This means your DMA engine will need to watch for the WAIT/READY actions from the gate array and only use the remaining cycles. Possible, but probably not that simple.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 22:45, 16 January 17
Thanks for the info!

So it just generates 1 wait state each microsecond "without looking" what's happening on the CPU, right? And if the CPU is not reading/writing, it just ignores the wait state, right?
Title: Re: ROM board with a tiny DMA engine
Post by: PulkoMandy on 11:05, 17 January 17
Yes, that's the idea. I'm not sure about the exact sequencing of what happens on the pins, but the CPU will wait only when it tries to access the bus; during wait states it can still perform any internal operations.
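
One way to picture the mechanism described above is a small timing model. This is only a rough sketch and not a description of the real Gate Array: it assumes that the CPU can get past the T2 state of a machine cycle in just one fixed clock phase out of every four (READY_PHASE is a made-up name and value), and it uses the machine-cycle lengths from the standard Z80 documentation.

```python
# Rough model (assumption): READY lets the Z80 pass T2 in only one
# phase out of every four 4 MHz clock cycles.
READY_PHASE = 1  # hypothetical phase (clock cycle number mod 4)


def run(m_cycles, start=0):
    """Return the clock cycle at which the instruction ends."""
    t = start
    for length in m_cycles:
        t2 = t + 1                      # T2 is the second clock of a machine cycle
        waits = (READY_PHASE - t2) % 4  # wait states until the free phase
        t += length + waits
    return t


def steady_state_us(m_cycles):
    """Duration in microseconds when the instruction runs back to back."""
    first_end = run(m_cycles)
    second_end = run(m_cycles, start=first_end)
    return (second_end - first_end) / 4  # 4 clock cycles per microsecond


# Machine-cycle lengths in T-states (standard Z80 figures).
NOP = [4]
LD_A_HL = [4, 3]    # LD A,(HL): 7 T-states
LDI = [4, 4, 3, 5]  # LDI: 16 T-states

for name, cycles in (("NOP", NOP), ("LD A,(HL)", LD_A_HL), ("LDI", LDI)):
    print(name, sum(cycles), "T-states ->", steady_state_us(cycles), "us")
```

In this model every instruction ends up a whole number of microseconds long although nothing ever looks at what the CPU is executing. Instructions that use I/O cycles are not covered by the sketch.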
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 12:44, 17 January 17
Pulkomandy,

The WAIT/READY, as I understand it, is mostly for video data access.
You can use this on a real Z80 DMA to pause the transfers, but how effective the DMA becomes is another issue.

See attached, CE/WAIT details for how this works.

rpalmer
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 15:26, 17 January 17
I have quickly browsed the Z80 DMA engine, and I have seen that it takes at least 3 cycles (750ns@4MHz) to read a byte, and another 3 to write it. Cycle time of the RAM chips is 270ns, so maybe it could be cut to 2 cycles to read and 2 to write. This is of course without taking into account the READY signal.

Unfortunately there is no access to the /CAS and /RAS signals. That would have allowed page reads/page writes in 1 cycle each (except for the first one).

It would be interesting to see the timing of the /CAS and /RAS signals generated by the GA.
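
The cycle counts in this post can be checked with a few lines of arithmetic. This is just a sketch of the numbers quoted above (4 MHz clock, 270 ns RAM cycle time); the function names are made up for illustration and READY/WAIT is ignored, as in the post.

```python
import math

CLOCK_HZ = 4_000_000       # CPC Z80 clock
CYCLE_NS = 1e9 / CLOCK_HZ  # one clock cycle = 250 ns
RAM_CYCLE_NS = 270         # RAM cycle time quoted above


def ns_per_byte(read_cycles, write_cycles):
    """Time to move one byte, ignoring READY/WAIT."""
    return (read_cycles + write_cycles) * CYCLE_NS


def min_whole_cycles(access_ns):
    """Smallest number of whole clock cycles that covers an access time."""
    return math.ceil(access_ns / CYCLE_NS)


print(ns_per_byte(3, 3))               # Z80 DMA figures from the post
print(ns_per_byte(2, 2))               # the shortened 2 + 2 variant
print(min_whole_cycles(RAM_CYCLE_NS))  # whole cycles needed for one RAM cycle
```

A 270 ns RAM cycle does not fit in a single 250 ns clock cycle, which is why 2 cycles per access is the smallest whole number under these assumptions.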
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 19:48, 24 January 17
I have coded a preliminary implementation of the DMA, that reads from flash/ROM (activating fn_oe and fn_ce) and writes to CPC RAM (activating n_wr and n_mreq). On this preliminary implementation, each read/write takes two CPU cycles (I suppose this can be optimized, more on this later). I have also implemented wait states.

I wrote a dirty simulation that copies 3 bytes using DMA, from ROM at $C000 to RAM at $1000. On the second byte write I have inserted a wait state, and two on the third write. This is the resulting simulation. Tips/corrections are welcome:

(http://i.imgur.com/7OgoLwk.png)

Now let's start with the questions:

1. How does it look? Is there anything wrong? Should this work?
2. As I wrote before, this can be optimized. At least I'm sure that I can read from Flash/ROM in a single cycle (Flash chip is rated 90 ns). But can I write to RAM also in a single cycle? I suspect it might be possible. Why? Because of two reasons:
- First: GA reads. I think I read somewhere (I cannot find where) that each microsecond, the GA inserts a wait state, and during it, it reads TWO BYTES. So if in a single cycle, two bytes are read, it should be possible to read just one, shouldn't it?
- Second: The wait states mechanism. If my interpretation of CPC6128 schematics AND the wait states mechanism is right, for example if the CPU wants to read a byte, and the GA inserts a wait state at T2 (see graph below), the CPU buses are "disconnected" from RAM (multiplexed) for the CRTC/GA to read the data. When the wait state is removed, the CPU continues the read operation: CPU address and control signals are applied to the RAM, which must perform the read IN A SINGLE CYCLE (T3 in graph below). Is this correct? If so, a single cycle must be enough for the RAM to be read (or I am missing something).

(http://i.imgur.com/24q7a3U.png)

But there is also something that contradicts this... I had a look at the DRAM datasheet (HM4864-2) and it has a 270 ns cycle time... a bit more than a 250 ns cycle... So maybe my interpretation of the schematics and/or the wait states mechanism is wrong...
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 23:46, 24 January 17
Liked. Not because I think you'll manage it, but because you've gone to the bother of trying it.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 08:39, 25 January 17
Quote from: Bryce
Liked. Not because I think you'll manage it, but because you've gone to the bother of trying it.

Bryce.

Hehe, thanks for my first like on this forum.

It looks like most people think that it is not doable, or that it is very very difficult to get DMA to work, but I'm still missing a detailed explanation about why. I understand that I must make wait states pause the DMA (using the READY signal), and that interrupts might cause problems. But if I implement the wait states and if interrupts are disabled when using DMA (which makes sense, because during DMA the CPU is halted), I don't see why this should not work. What am I missing?
Title: Re: ROM board with a tiny DMA engine
Post by: arnoldemu on 10:15, 25 January 17
Quote from: doragasu
Hehe, thanks for my first like on this forum.

It looks like most people thinks that it is not doable, or that it is very very difficult to get DMA to work, but I'm still missing a detailed explanation about why. I understand that I must make wait states to pause the DMA (using READY signal), and that interrupts might cause problems. But if I implement the wait states and if interrupts are disabled when using DMA (what makes sense, because during DMA the CPU is halted), I don't see why this should not work. What am I missing?
I don't know enough about the CPC hardware signals, so what I am saying can't be treated as fully correct.

From what I understand, using ready to pause the cpu should be ok. I don't see how it would cause a problem for the gate-array or the cpu.

True, it may cause interrupts to be delayed but I don't think they will be missed.
The Gate-Array will assert them until the CPU answers.

What you may see is that if interrupts are delayed by more than 32*64us then the next interrupt is moved and the one that synchronises with the vsync may not happen.

BASIC and firmware may see this as a missed interrupt. But for programs that enable and use the DMA and know about it, this is no problem at all.

Which method of DMA are you choosing?

Is it one that interleaves its accesses with the CPU so that it runs at the same time and CPU speed is not changed, or one that will halt the CPU?

The Atari ST blitter and Amiga blitter can do both.

My thinking about DMA is that transferring data using the CPU takes about 2 to 3 us per byte at best, more generally around 5us (LDI). So if you can read/write bytes faster than that then the DMA will be better. 1us read, 1us write is not bad; 1us read/write would be great.
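
To put the figures above into perspective, here is a small sketch that compares copy times for the per-byte costs mentioned in this post. The 16 KB block size is only an example chosen for illustration, and the function names are made up.

```python
ROM_BYTES = 16 * 1024  # example block size (one 16 KB ROM)


def copy_time_ms(n_bytes, us_per_byte):
    """Total copy time in milliseconds for a given per-byte cost."""
    return n_bytes * us_per_byte / 1000


def speedup(cpu_us_per_byte, dma_us_per_byte):
    """How many times faster the DMA is than the CPU loop."""
    return cpu_us_per_byte / dma_us_per_byte


LDI_US = 5  # per-byte cost of LDI quoted above

for label, us in (
    ("LDI", LDI_US),
    ("DMA, 1us read + 1us write", 2),
    ("DMA, 1us read/write", 1),
):
    print(f"{label}: {copy_time_ms(ROM_BYTES, us):.2f} ms, "
          f"{speedup(LDI_US, us):.1f}x vs LDI")
```

The comparison ignores the time needed to program the DMA registers and to request and release the bus.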

Another thought: the gate-array controls access to RAM for the Z80, and there is a 74ls which opens the bus to the Z80. What I don't know is if the bus is open and it closes the bus for its access or not. Now, because the gate-array controls the RAM, it can read two bytes by toggling cas/ras (not sure which) quickly to read two bytes at a time. You will not get access to these signals from the Z80 side.

I think of the ram living on the gate-array side and it is the gate-keeper and opens the gates for the z80 when it wants to.
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 12:13, 25 January 17
"Now, because the gate-array controls the ram, it can read two bytes by toggling cas/ras (not sure which) quickly to read two bytes at a time. You will not get access to these signals from the z80 side."

The RAS/CAS signals are for setting up the internal RAM address and not for reading/writing two bytes at a time. You can think of this as setting up the lower address and upper address of RAM within each chip.

The use of the ready line is (as I understand it) for access to RAM by the CRTC, so any use of it will most likely cause the display to "glitch" during display refresh.  This may not be a problem in the short term, but will get annoying if it occurs too many times.

Another issue is interrupts, which the CPC uses to keep track of the time since the machine was last reset/switched on. Stopping this will effectively leave timings out of sync for programs which rely on it (not to mention VSYNC issues).

Also the GA is not the master, but rather a cooperative friend as it interlaces video access with normal Z80 access and this is why the use of DMA is so difficult to get right on the CPC.

You could implement the Z80 DMA chip, but then the DMA transfer would be suspended whenever an interrupt is seen or when the "READY" line is active, but then you need the ready line to suspend the Z80. So it is like trying to turn the page of a book while you stand on it.


rpalmer
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 19:53, 25 January 17
Thanks for the information and thoughts, it is much appreciated!

Quote from: arnoldemu
Which method of DMA are you choosing?

Is it one that interleaves it's access with the cpu so that it runs at the same time and cpu speed is not changed or one that will halt the cpu?

I'm using #BUSREQ to stop the CPU, I'm afraid. I didn't even think about interleaving, and THAT for sure would be difficult. As the READY signal from GA is not directly tied to the latch (it goes through an 82 ohm resistor), I might drive the line and add additional wait states. But that for sure would be a challenge, and also I'm afraid I might stress the GA READY output.

Quote from: arnoldemu
My thoughts about DMA is that transfering data using the cpu is about 2/3us per cycle at the best, more generally it's around 5us (LDI). so if you can read/write bytes faster than that then the DMA will be better. 1us read, 1 us write is not bad, 1 us read/write would be great.

I'm positive that I can read from flash in 1 cycle. I suppose I can write in two cycles, maybe one could be possible. But as I do not know exactly how the GA works, I will not know until I test it. Add one wait state, and I should be able to read/write once each microsecond. But I could be wrong.

Quote from: arnoldemu
You will not get access to these signals from the z80 side.

Yeah, too bad I cannot access the signals, and also I have seen no detailed timing information about them.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 20:12, 25 January 17
Quote from: rpalmer
The RAS/CAS signal are for setting up the internal RAM address and not read/write two bytes at a time. You can think of this as setting up the lower address and upper address of RAM within each chip.

I think he is talking about Page Mode read, which can substantially accelerate sequential reads. You select the row using #RAS and then select the column using #CAS. Then if you want to read another column that is in the same row (page), you can immediately select the column using again #CAS without the need to select the row. Using this method, reads should take around 160 ns, instead of the usual 270 ns. But this doesn't explain how the GA is supposed to read two bytes in a wait cycle (250 ns).

Quote from: rpalmer
Another issue is interrupts which the CPC uses for the time the machine was last reset/switched on. Stopping this will effective leave timings out of sync for programs which rely on it (not to mention VSYNC issues).

Hum, I have to investigate this issue. Anyway IIRC #BUSREQ has priority over #IRQ, so maybe I do not even need to disable interrupts, the DMA will just delay them (and the user must make sure the transfer is small enough not to delay them too much).
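
As a rough guide to what "small enough" could mean, the sketch below treats the 32*64us figure mentioned earlier in the thread as the longest time the CPU may be kept off the bus, and derives a burst size from it. Using that figure as a hard budget is an assumption made for illustration, the names are made up, and bus request/release overhead is ignored.

```python
import math

# Assumed budget: the 32 * 64 us interrupt delay figure quoted in the thread.
INTERRUPT_BUDGET_US = 32 * 64


def max_burst_bytes(us_per_byte):
    """Largest burst that keeps the CPU stopped no longer than the budget."""
    return INTERRUPT_BUDGET_US // us_per_byte


def bursts_needed(total_bytes, us_per_byte):
    """Number of separate DMA bursts needed for a larger transfer."""
    return math.ceil(total_bytes / max_burst_bytes(us_per_byte))


print(max_burst_bytes(1))           # bytes per burst at 1 us per byte
print(max_burst_bytes(2))           # bytes per burst at 2 us per byte
print(bursts_needed(16 * 1024, 1))  # bursts for a 16 KB copy at 1 us per byte
```

Splitting a long copy into bursts like this gives the CPU a chance to service pending interrupts between bursts.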

Quote from: rpalmer
You could implement the Z80 DMA chip, but then the DMA transfer would be suspended whenever an interrupt is seen or when the "READY" line is active, but then you need the ready line to suspend the Z80.

About the interrupts, I'm not sure as I wrote above. And about pausing DMA during WAIT cycles, I don't think that it would slow DMA too much (and for sure not more than it already does to the CPU).
Title: Re: ROM board with a tiny DMA engine
Post by: robcfg on 20:30, 25 January 17
As for how the GA works, see the "Gate Array Decapped" thread, in the Hardware section of the forum.


Title: Re: ROM board with a tiny DMA engine
Post by: gerald on 21:11, 25 January 17
Quote from: rpalmer
Also the GA is not the master, but rather a cooperative friend as it interlaces video access with normal Z80 access and this is why the use of DMA is so difficult to get right on the CPC
It's the exact opposite.
The GA is the master; the Z80 just obeys the READY/WAITn signal to synchronise to its allocated DRAM access slot.
If you obey the rules, there should not be any issue having a DMA access.
Rules are:
- You must request the bus from the Z80 (BUSREQ/BUSAK)
- one access bus every 1ms, that is between 2 WAITn you can only access one address. You shall respect the WAITn !
- The address shall be stable half a 4MHz cycle before the read (like in the Z80 timing diagram). This is to allow the address to be stable for the RAS cycle.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 22:28, 25 January 17
Quote from: gerald
Rules are :
- You must request the bus to the Z80 (Busreq/busask)
- one access bus every 1ms, that is between 2 WAITn you can only access one address. You shall respect the WAITn !
- The address shall be stable half a 4MHz cycle before the read (like the in Z80 timing diagram). This is to allow the address to be stable for the RAS cycle.

I was suspecting there was a restriction like this. Why only one address per microsecond?
Title: Re: ROM board with a tiny DMA engine
Post by: gerald on 22:37, 25 January 17
Quote from: doragasu
I was suspecting there was a restriction like this. Why only one address per microsecond?
I made a typo: one access per microsecond (µs), not millisecond (ms).
The GA is doing all the magic of translating the access to the DRAM (ras/cas generation, address multiplexing), and only one access per microsecond is supported.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 08:26, 26 January 17
OK, clear then. I have to give a read to the GA decapping thread to read the juicy details. Anyway, for ROM to RAM writes I should be able to do a read/write every microsecond even with this restriction.
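
The "one access per microsecond" rule can be sketched as a simple slot plan. This is a hypothetical model, not the CPLD logic: it assumes each RAM access takes one 1 us slot and that a ROM read can share the slot of the RAM write it feeds, which is the expectation stated above and has not been verified on hardware. The function name and output format are made up.

```python
def schedule_copy(src, dst, length, src_in_ram=False):
    """Return a list of (microsecond slot, description) entries."""
    plan = []
    slot = 0
    for i in range(length):
        if src_in_ram:
            # A RAM read needs a RAM access slot of its own.
            plan.append((slot, f"read RAM ${src + i:04X}"))
            slot += 1
        # Assumption: a ROM read fits in the same slot as the RAM write.
        source = "RAM" if src_in_ram else "ROM"
        plan.append(
            (slot, f"write RAM ${dst + i:04X} (from {source} ${src + i:04X})")
        )
        slot += 1
    return plan


# Same transfer as the earlier simulation: 3 bytes from $C000 to $1000.
for slot, action in schedule_copy(0xC000, 0x1000, 3):
    print(f"us {slot}: {action}")
```

Under these assumptions a ROM to RAM copy moves one byte per microsecond, while a RAM to RAM copy would need two slots per byte.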

Thanks!
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 23:29, 26 January 17
I have started analysing the GA using the schematic from the GA decapped thread. Here you can have a look at some of the signals during a complete sequence (1 us):

(http://i.imgur.com/m7GCAp1.png)

Almost everything makes sense... except the READY signal. If you look at the #RAS and #CAS signals, you can see that each microsecond three reads are made: two for the GA (#CPU = 1) and one for the CPU (#CPU = 0). The two GA reads are done using page read mode, as I suspected (#RAS is strobed once, and #CAS is strobed twice). The CPU read is done only if #MREQ = 0 (and it is not a refresh cycle), otherwise #CAS is not lowered. It makes perfect sense...

... BUT... READY signal is lowered when the CPU performs the read, and not when the GA does it!!! Everywhere I have read just the opposite, so maybe my interpretation of the signals is just wrong...
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 23:58, 26 January 17
I assume you don't own a logic analyser? If I find time I'll wire up a GA and give you a screenshot of what the reality looks like.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 08:35, 27 January 17
Quote from: Bryce
I assume you don't own a logic analyser? If I find time I'll wire up a GA and give you a screenshot of what the reality looks like.

That would be much appreciated. I have a cheap 12 MHz Aliexpress one, but it is not fast enough to spot changes coming from rising and falling edges of a 16 MHz clock. If you could show in the same capture #PHI, #CAS, #RAS, READY, and #CPU, that could help a lot. #MREQ, #M1 and #244EN can also be interesting (and in fact any other control signal, but these are the ones I'm most intrigued about).
Title: Re: ROM board with a tiny DMA engine
Post by: gerald on 19:24, 27 January 17
Quote from: doragasu

Almost everything makes sense... excepting the READY signal. If you look at the #RAS and #CAS signals, you can see that each microsecond three reads are made: two for the GA (#CPU = 1) and one for the CPU (#CPU = 0). The two GA reads are done using page read mode, as I suspected (#RAS is strobed once, and #CAS is strobed twice). The CPU read is done only if #MREQ = 0 (and it is not a refresh cycle), otherwise #CAS is not lowered. It makes perfect sense...

... BUT... READY signal is lowered when the CPU performs the read, and not when the GA does it!!! Everywhere I have read just the opposite, so maybe my interpretation of the signals is just wrong...
Indeed, there is an error in the state decoding.
- READY : U304 output should not be inverted (ie it's a NAND2 with one inverted input)
- CCLK : one inverter is missing on the output (ie U306 is a NOR2)

Note that the schematic is a work in progress  ;)

Bonus, a trace captured on a 464 + 40010
[Attachment not available]
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 21:32, 27 January 17
@gerald Thanks a lot! It looks like I almost nailed it: other than the READY signal, my drawing looks the same as the capture :)

Quote from: gerald
- READY : U304 output should not be inverted (ie it's a NAND2 with one inverted input)

I suppose you mean it's an AND2 with one inverted input. I'll have to update my drawing. Thanks again for the info and for the RE work on the GA!
Title: Re: ROM board with a tiny DMA engine
Post by: gerald on 21:50, 27 January 17
Quote from: doragasu
I suppose you mean it's an AND2 with one inverted input. I'll have to update my drawing. Thanks again for the info and for the RE work on the GA!
Yes  :picard:
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 11:38, 28 January 17
I assume you don't need me to capture it now? :)

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: gerald on 13:04, 28 January 17
Quote from: Bryce
I assume you don't need me to capture it now? :)
If you're like me, nothing will prevent you from getting your analyser out. But it may not be the best time with all the rust you should have on your bench  ;)
Actually, that's a trace I captured when developing my RAM/ROM extension card.
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 13:33, 28 January 17
Tell me about it, I have never vacuumed this room as often as I have in the last 3 days! :D

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 18:44, 29 January 17
Quote from: Bryce
I assume you don't need me to capture it now? :)

Bryce.

I don't need it, but thank you very much anyway!

Working on the schematic right now...
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 08:55, 06 February 17
One more question: How much current can I safely draw from the expansion port 5V rail? I made a quick estimation, and I think my cart should not draw more than 50 mA: 19 mA the CPLD, 8 mA the transceivers, 22 mA the flash chip. Flash chip power draw is computed for 4e6 reads per second, and transceivers for inputs/outputs switching at 4 MHz (which will happen only for the clock line). So real power consumption should be much lower.
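
The estimate above adds up as follows. This is only the arithmetic from the post; the 50 mA budget is the poster's own upper bound and the dictionary keys are labels chosen for illustration.

```python
SUPPLY_V = 5.0  # expansion port supply rail
BUDGET_MA = 50  # upper bound stated in the post

# Worst-case figures quoted in the post, in mA.
draw_ma = {"CPLD": 19, "transceivers": 8, "flash": 22}

total_ma = sum(draw_ma.values())
power_mw = total_ma * SUPPLY_V

print(f"total: {total_ma} mA, {power_mw:.0f} mW at {SUPPLY_V:.0f} V")
print("within budget" if total_ma <= BUDGET_MA else "over budget")
```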

I don't think 50 mA can cause problems, but I'm asking just to be on the safe side...
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 11:22, 06 February 17
50mA is absolutely no problem to take from the expansion port. Other expansions pull a lot more (>200mA).

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 13:50, 06 February 17
While Bryce is correct that 50 mA will not hurt, it must also be considered against what expansions are before/after yours, as they too will draw from the CPC if they have no separate power supply.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 15:06, 06 February 17
As always, thanks for the quick responses!

Quote from: rpalmer
While bryce is correct that 50ma will not hurt, it must also be considered against what expansions are before/after yours as they too will draw from the CPC if they have no separate power supplied.

Of course, but I'm afraid that is out of my control ;)
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 15:08, 06 February 17
On Centronics versions of the CPC, the supply traces to the expansion port are seriously wide and can easily supply whatever your PSU can throw at it.
It's only on edge connector CPCs that I'd be worried. The last few millimeters of 5V positive trace go down to 0.4mm width. Theoretically it could supply about 1.2A, however it will be getting hot at about 1A and would instantly burn through if any short circuit occurred.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: zhulien on 01:08, 11 February 17
I couldn't find much info on the net for these hd64b180rop (http://www.microbeetechnology.com.au/store/hd64b180rop-6mhz-enhanced-z80-microprocessor.html) but it says it has enhancements such as DMA built in.  I couldn't find whether they are pin compatible with the Z80 or not either.
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 22:50, 12 February 17
Quote from: zhulien
i couldn't find much info on the net for these hd64b180rop (http://www.microbeetechnology.com.au/store/hd64b180rop-6mhz-enhanced-z80-microprocessor.html) but it says it has enhancements such as dma inbuilt.  I couldn't find whether they are pin compatible with z80 or not either.

Eh.... No. It has 64pins and a Z80 has 40pins, so it's definitely not pin compatible.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: zhulien on 06:10, 14 February 17
hehe, yes, i missed that 64pins note. silly me.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 08:56, 14 February 17
A render while I wait for the PCBs to arrive...

(http://i.imgur.com/nRwkCap.png)

And now a question:

Currently I have to map 10 registers in the I/O range: one for the ROM selection, and another 9 for the DMA engine. The ROM select register is at 0xDFXX, but where should I map all the other nine? For the current tests I'm using bits A7 ~ A15, so they are mapped at 0xD80X, 0xD88X, 0xD90X, 0xD98X, etc... but I'm not sure this is a good place to map them. As always, suggestions are welcome.
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 10:28, 14 February 17
A fine looking piece of kit. What's the third row of holes above the header for?

Regarding the address assignments, I'd take a look at the table here: http://www.cpcwiki.eu/index.php/I/O_Port_Summary and choose some area that doesn't clash with other popular hardware or choose an address that's already used for something that shouldn't or wouldn't be used with your device at the same time.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: arnoldemu on 11:20, 14 February 17
Quote from: doragasu
A render while I wait for the PCBs to arrive...



And now a question:

Currently I have to map on the I/O range 10 registers, the one for the ROM selection, and other 9 for the DMA engine. The ROM select register is at 0xDFXX, but where should I map all the other nine? For the current tests I'm using bits A7 ~ A15, so they are mapped at 0xD80X, 0xD88X, 0xD90X, 0xD98X, etc... but I'm not sure this is a good place to map them. As always, suggestions are welcome.

The CPC I/O ports are partially decoded.

So your choice of D8xx and D9xx may not be good:

Hardware device        Read/Write  b15 b14 b13 b12 b11 b10 b9  b8  b7  b6  b5  b4  b3  b2  b1  b0
Gate-Array             Write only  0   1   -   -   -   -   -   -   -   -   -   -   -   -   -   -
RAM Configuration      Write only  0   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
CRTC                   Read/Write  -   0   -   -   -   -   r1  r0  -   -   -   -   -   -   -   -
ROM select             Write only  -   -   0   -   -   -   -   -   -   -   -   -   -   -   -   -
Printer port           Write only  -   -   -   0   -   -   -   -   -   -   -   -   -   -   -   -
8255 PPI               Read/Write  -   -   -   -   0   -   r1  r0  -   -   -   -   -   -   -   -
Expansion Peripherals  Read/Write  -   -   -   -   -   0   -   -   -   -   -   -   -   -   -   -
Ideally choose a range within Expansion Peripherals and one that is not yet taken.

Your ROM select port is fine; you can freely use that.
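
To see what partial decoding means for a given address, the table above can be turned into a small checker. This is a sketch built only from that table (the r1/r0 register-select bits are ignored); the names are made up, and 0xFB00 is just an arbitrary example address, not a recommendation.

```python
# (device, mask, value): the device responds when (port & mask) == value.
DEVICES = [
    ("Gate-Array",            0xC000, 0x4000),  # b15 = 0, b14 = 1
    ("RAM Configuration",     0x8000, 0x0000),  # b15 = 0
    ("CRTC",                  0x4000, 0x0000),  # b14 = 0
    ("ROM select",            0x2000, 0x0000),  # b13 = 0
    ("Printer port",          0x1000, 0x0000),  # b12 = 0
    ("8255 PPI",              0x0800, 0x0000),  # b11 = 0
    ("Expansion Peripherals", 0x0400, 0x0000),  # b10 = 0
]


def selected(port):
    """Return the devices that decode the given 16-bit I/O address."""
    return [name for name, mask, value in DEVICES if (port & mask) == value]


for port in (0xDF00, 0xD800, 0xD980, 0xFB00):
    print(f"{port:#06x}: {', '.join(selected(port))}")
```

In this model 0xD8xx and 0xD9xx have b13 = 0, so writes to them also hit the ROM select register, while 0xDFxx selects only ROM select. Whether a particular address inside the Expansion Peripherals range is already taken by other hardware still has to be checked against the I/O port summary.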


Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 08:56, 15 February 17
Quote from: Bryce
A fine looking piece of kit. What's the third row of holes above the header for?

Third row is connected pin by pin to the second one. It's to be able to solder to the PCB a card edge connector instead of a DIL pin header, in case you prefer connecting the cart directly to the CPC instead of using an MX4. (BTW the render has a flaw, the MX4 connector must be soldered on the other side of the PCB, but I'm too lazy to change it).

Thanks all for the info on the port mappings.
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 10:17, 15 February 17
Quote from: doragasu
Third row is connected pin by pin to the second one. It's to be able to solder to the PCB a card edge connector instead of a DIL pin header, in case you prefer connecting the cart directly to the CPC instead of using an MX4. (BTW the render has a flaw, the MX4 connector must be soldered on the other side of the PCB, but I'm too lazy to change it).

Thanks all for the info on the port mappings.

Nice solution.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 20:30, 17 February 17
I have been working on the DMA. Now I synchronize writes with the READY signal. In the following simulation:
Byte read/write (steps 5 to 7) take 1us (4 clock cycles). If my current understanding of the CPC internals is correct, this should work.

(http://i.imgur.com/XlQjagD.png)


I have synthesized the design, and it fits in a 2€ CPLD  :) (currently using 104/128 macrocells).


Bonus: I have implemented another DMA mode. Does anyone dare to guess what it is for?  :P
(http://i.imgur.com/S9IZ7h8.png)

Title: Re: ROM board with a tiny DMA engine
Post by: arnoldemu on 12:34, 18 February 17
I believed it would be possible. Thank you for making it true :)

Is the new dma mode a blitter with masking like on Amiga?

Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 14:25, 18 February 17
Well, I still don't know if it's possible. It should be, according to my understanding of the CPC. But I cannot be sure until I test this on the real machine (captures above are simulations, with stimulus generated by me).

The first DMA mode works with the CPU stopped. Although the "bonus" DMA mode works in parallel with the CPU, I'm afraid it is not a blitter. I do not know how the Amiga blitter works, but I suppose it works in parallel with the CPU, accessing memory while the CPU does not need it. I don't think that can be done with the CPC because I suppose the RAM chips are almost always busy. It looks like the GA is always reading from RAM, even when screen is not being drawn (during CRT vertical blanking for instance).

What could maybe be done is a mode running in parallel with the CPU and stealing cycles from it. E.g. it could run in parallel with CPU, and let the CPU access RAM once each 2 microseconds (instead of once each microsecond). But I don't think this could be very useful...
Title: Re: ROM board with a tiny DMA engine
Post by: AugustoRuiz on 22:52, 12 March 18
Doragasu cannot log in right now, so I'm posting news on his behalf...


Quote from: doragasu
After a long hiatus, I am resuming this project. I got the PCB fabricated and assembled a prototype. I plugged it into a CPC and, after a bit of debugging, got the basic ROM switching function almost working. I say almost because the ROMs apparently get enumerated properly (the ones that print messages on screen print them as intended), but once the "READY" prompt appears, the CPC freezes: there is no response to keypresses... I would appreciate help debugging this thing, because I am currently out of ideas.
Also, while debugging this, I noticed something that looks strange. If I unplug the CPC and test continuity between #EXP (pin 48) and GND, I find there is a short. In the schematic I have, this pin is only connected to the PIO and to a 2k2 pull-up, so I suppose I should not see this short! I have 2 CPCs, and this is happening in both of them, so if the PIO has been damaged (presumably because of the ROMBA), it has happened in both of them. So is my PIO broken, or am I missing something? Can anyone please check whether there is a short between pin 48 of the expansion port and GND?
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 10:29, 13 March 18
Answer to doragasu: If you look at the schematics of the disk drive section of the 6128, you'll see that /EXP is connected to GND via LK7. I suspect that Amstrad intended using the EXP signal to detect whether a disk drive is connected (the DDI-1 does the same thing when connected to a 464). Because of this, the EXP signal can't be used on the CPC for anything else.
So your CPCs aren't broken, that's the way it's meant to be.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 13:08, 13 March 18
Bryce,

You can still use the /EXP pin on the expansion port provided that the Disk Interface is not connected.

rpalmer
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 13:10, 13 March 18
On a 464 yes, but on a 6128 the "interface" is always connected.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: gerald on 17:33, 13 March 18
And on a 6128, if you open the link, your cpc will try to boot from a CPM floppy. No more basic  :o
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 22:48, 13 March 18
Quote from: gerald
And on a 6128, if you open the link, your cpc will try to boot from a CPM floppy. No more basic  :o

The reason is the DISC ROM initialisation: if the current ROM is detected as ROM zero, it boots CP/M; otherwise the initialisation continues and returns to BASIC.
See below (I disassembled the DISC ROM from KDS and added the comments over a decade ago):

LC1BC
;***************************************************************
;  INITIALISATION OF DISC ROM
;  INPUT
;      C FLAG IF ROM NOT IN SLOT 7
;  CALLED BY
;      NONE.
;
;***************************************************************
       JR   C,LC1C4
       CALL &B912       ;CHECK CURRENT ROM
       OR   A           ;ROM ZERO ?
       JR   Z,LC1DC     ;YES, REBOOT CP/M
LC1C4  PUSH IY          ;NO
       PUSH DE
       LD   DE,&FB00    ;-&500
       ADD  HL,DE       ;SIZE OF WORK SPACE NEEDED FOR DISC ROM
       PUSH HL
       INC  HL
       PUSH HL
       POP  IY
       CALL LC5DD       ;SETUP DPB'S AND SEND SPECIFY COMMAND
       CALL LCCA0       ;SETUP AMSDOS CALLS TO DISC ROM
       POP  HL
       POP  DE
       POP  IY
       SCF
       RET
;
LC1DC
; PURPOSE  BOOT CP/M SYSTEM.
; INPUT
;      NONE
; CALLED BY
;      LC1BC - COMMON ENTRY POINT FOR AUTO BOOT and |CPM
;
       LD   SP,&C000
       LD   IY,&AC48    ;Chain Address to next ticker block
       LD   DE,&AD33    ;ADDRESS OF AREA TO CLEAR
       LD   BC,&A5      ;165 BYTES
       CALL LCAAF       ;CLEAR MEMORY FROM &AD33 TO &ADD8
       LD   HL,&AD41
       DEC  (HL)
       LD   A,&81
       LD   (03),A
       XOR  A
       LD   (04),A
       LD   HL,LC033    ;START OF CONTROL-? JUMP ENTRIES
       LD   DE,&BE80    ;LOCATION TO STORE JUMP TABLE
       LD   BC,&3F      ;SIZE = 63 BYTES
       LDIR             ;DOWNLOAD RSX JUMP TABLE
       CALL LC0C0
       CALL LC5DD       ;SETUP DPB'S AND SEND FDC SPECIFY COMMAND
LC20A  LD   C,&41       ;SECTOR &41
       LD   DE,00       ;(REG E) DRIVE A, (REG D) TRACK = 0
       LD   HL,&0100    ;LOAD ADDRESS
       CALL LC666       ;READ IN BOOT SECTOR
       CALL C,LC2AC     ;CHECK SECTOR CONTENTS
       JR   NC,LC224    ;READ FAILED OR EMPTY BOOT SECTOR
       EX   DE,HL       ;PUT LOAD ADDRESS INTO REG DE
       LD   BC,LC17F    ;ADDRESS OF JUMP VECTOR
       LD   SP,&AD33    ;SET STACK POINTER
       JP   LC177       ;SETUP INTERRUPT ADDRESS & JUMP TO &100
LC224  LD   A,&0F       ;CP/M BOOT FAILED ALERT NUMBER
       CALL LCAB8       ;DISPLAY ALERT
       JR   LC20A       ;TRY AGAIN
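
Editor's note on the `ADD HL,DE` with `DE = &FB00` near the top of the listing: Z80 16-bit addition wraps around at 65536, so adding &FB00 is the same as subtracting &500 (1280 bytes). A minimal Python sketch of that arithmetic follows; the starting value of HL is a made-up number used only for illustration, not something taken from the ROM.

```python
WORD = 0x10000  # Z80 register pairs are 16 bits wide


def add16(hl: int, de: int) -> int:
    """Model ADD HL,DE: 16-bit addition with the carry discarded."""
    return (hl + de) % WORD


# Hypothetical top-of-free-memory value passed in HL (illustration only).
top = 0xA6FB
new_top = add16(top, 0xFB00)
reserved = top - new_top

print(f"HL before: &{top:04X}, after: &{new_top:04X}, reserved: &{reserved:X} bytes")
```

So the routine lowers HL by &500 bytes to make room for the disc ROM work space.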
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 01:38, 17 March 18
At last I recovered access to my account  ;D

Thanks for the replies. Finally I found the problem and got the ROMBA working (at least as a simple Rombox).

Now I have Augusto working on a test program for the DMA modes  ;)
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 23:04, 17 April 18
Guess what, DMA is working!!! Both DMA modes in fact:
- We can play digital audio with approximately 0% CPU overhead. The limits are the bits per sample ( 8 ) and the maximum sample length (64 KiB). Sampling rate can be configured on the fly, and ranges from 3906 Hz to 50 kHz (and beyond).
- We can transfer data from external ROM to internal CPC RAM at 1 byte per microsecond. While doing DMA the Z80 is stopped, but this is much faster than doing the copy using software routines.
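
Editor's note: the lower sampling-rate limit quoted above, 3906 Hz, is what you get when dividing 1 MHz by 256. The sketch below assumes (this is not confirmed in the post) that the rate is a 1 MHz timebase divided by an integer, and uses the 64 KiB limit to estimate the longest sample that can be played at a given rate.

```python
BASE_CLOCK_HZ = 1_000_000      # assumption: 1 MHz timebase, one slot per microsecond
MAX_SAMPLE_BYTES = 64 * 1024   # 64 KiB limit from the post, 8 bits per sample


def sample_rate_hz(divider: int) -> float:
    """Sampling rate for a given integer divider (hypothetical register value)."""
    if divider < 1:
        raise ValueError("divider must be at least 1")
    return BASE_CLOCK_HZ / divider


def max_duration_s(divider: int) -> float:
    """Longest playable sample, in seconds, at that rate."""
    return MAX_SAMPLE_BYTES / sample_rate_hz(divider)


for divider in (256, 64, 20):
    print(f"divider {divider:3d}: {sample_rate_hz(divider):8.2f} Hz, "
          f"up to {max_duration_s(divider):6.2f} s")
```

Under that assumption, a divider of 256 gives 3906.25 Hz (about 16.8 s of audio) and a divider of 20 gives 50 kHz (about 1.3 s).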

Too bad the CPLD is not bigger, because on a bigger one, I could try making a 2D DMA engine, and THAT could be really useful to speed up rendering sprites!
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 23:07, 18 April 18
Quick and dirty DMA test ROM by Augusto Ruiz. The ROM does the following:

1. Display the CPCTelera logo while playing "Hadouken" audio sample (using DMA).
2. Copy the background (16 KiB) to the video memory 50 times using a SW routine, then scroll it a bit.
3. Copy the background (16 KiB) to the video memory 50 times using DMA, then scroll it a bit.
4. Print the screen green.

https://www.youtube.com/watch?v=Yejb4n5Q0BQ

Using DMA is about 5 times faster!  :o 8)
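
Editor's note: a rough sanity check of the "about 5 times" figure. The DMA rate (1 byte per microsecond) comes from the thread. The software costs are assumptions, since the actual copy routine used in the test ROM is not shown: roughly 5 us per byte for an unrolled LDI loop and 6 us per byte for LDIR on a CPC.

```python
SCREEN_BYTES = 16 * 1024   # one background copy
COPIES = 50                # the test repeats the copy 50 times


def total_us(us_per_byte: int) -> int:
    """Total time in microseconds to copy the background COPIES times."""
    return SCREEN_BYTES * COPIES * us_per_byte


dma_us = total_us(1)  # 1 byte per microsecond, as stated in the thread

# Assumed software costs per byte (not taken from the test ROM).
for name, cost in (("unrolled LDI", 5), ("LDIR", 6)):
    sw_us = total_us(cost)
    print(f"{name}: {sw_us / 1e6:.3f} s vs DMA {dma_us / 1e6:.3f} s, "
          f"speed-up x{sw_us / dma_us:.1f}")
```

With these assumed costs the speed-up lands between 5x and 6x, which is in line with what the video shows.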
Title: Re: ROM board with a tiny DMA engine
Post by: GUNHED on 01:29, 19 April 18
Quite nice!  :) :) :)


And everybody else always told me that DMA wouldn't be possible on CPC.  ;)
Title: Re: ROM board with a tiny DMA engine
Post by: Munchausen on 01:40, 19 April 18
This is truly amazing work!


I wonder about the potential for it to operate alongside the M4, perhaps they can be mapped separately?
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 09:44, 19 April 18
Quote from: GUNHED
Quite nice!  :) :) :)

And everybody else always told me that DMA wouldn't be possible on CPC.  ;)

And everybody else was correct. The "DMA" that Doragasu is showing isn't true DMA, because the CPU is being halted while the transfers are happening. In true DMA the CPU would continue to execute commands while the RAM is being accessed by others.

But I'm still impressed with this project, very cool indeed.

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: rpalmer on 23:57, 19 April 18
Quote from: Bryce
And everybody else was correct. The "DMA" that Doragasu is showing isn't true DMA, because the CPU is being halted while the transfers are happening. In true DMA the CPU would continue to execute commands while the RAM is being accessed by others.

But I'm still impressed with this project, very cool indeed.

Bryce.

Sorry Bryce, but a DMA can "suspend" a Z80 to transfer data (see the attached Product Specification for how a Z80 DMA can operate). Doragasu's DMA engine would be operating in something like "Byte-at-a-time" mode (as defined in the attached PDF).

rpalmer
Title: Re: ROM board with a tiny DMA engine
Post by: Bryce on 09:40, 20 April 18
Ok, that's slightly closer to what I would consider "real DMA". I thought he was loading the entire 16K in a single halt. However, the CPU is still not getting to do a lot while the transfer is occurring, is it?

Bryce.
Title: Re: ROM board with a tiny DMA engine
Post by: tjohnson on 09:53, 20 April 18
I don't really understand what I'm looking at. The logo moves, stops, then the sound plays, then the screen changes and wobbles. Is the DMA taking place to transfer the sound while the logo is moving?


Title: Re: ROM board with a tiny DMA engine
Post by: Munchausen on 15:19, 20 April 18
Quote from: tjohnson
I don't really understand what I'm looking at. The logo moves, stops, then the sound plays, then the screen changes and wobbles. Is the DMA taking place to transfer the sound while the logo is moving?


It's just a quick timing test.


After the logo, an unchanging screen is displayed (with red at top and bottom). Actually, the screen is displayed 50 times, copying from ROM, but it is just the same screen, so it appears that nothing has changed. Then the screen wobbles, and afterwards the same screen is displayed again. This time, the screen is again displayed 50 times, but now copying from ROM using DMA. Finally, the display goes green. Notice how much shorter the time to display the screen 50 times is with DMA than without.


I guess that in principle, this can be applied to any data being copied from ROM to RAM. I didn't really follow why it can't be used for sprites? Is it because it would have to be copied to a non-linear set of locations in video RAM? I guess that you could set up a number of small DMA transfers (one for each scan line of the sprite?) I guess it is not useful for small sprites because the setup time overhead would dominate the time taken for the transfer so it would be pointless?
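
Editor's note: the "non-linear set of locations" can be made concrete. On the default CPC screen (base &C000, 80 bytes per character row, no hardware scrolling, all of which are assumptions here), consecutive scan lines sit &800 bytes apart within a character row. A linear DMA copy therefore cannot place a sprite in one go and would need one transfer per sprite row. A minimal sketch:

```python
SCREEN_BASE = 0xC000       # assumption: default screen base
BYTES_PER_CHAR_ROW = 80    # assumption: default screen width in bytes
LINES = 200


def line_address(y: int, x_byte: int = 0) -> int:
    """Address of byte x_byte on scan line y of the default CPC screen."""
    if not (0 <= y < LINES and 0 <= x_byte < BYTES_PER_CHAR_ROW):
        raise ValueError("coordinate outside the default screen")
    return SCREEN_BASE + (y % 8) * 0x800 + (y // 8) * BYTES_PER_CHAR_ROW + x_byte


def sprite_transfers(x_byte: int, y: int, width: int, height: int):
    """One (destination address, length) pair per sprite row."""
    return [(line_address(y + row, x_byte), width) for row in range(height)]


for dest, length in sprite_transfers(x_byte=10, y=6, width=4, height=4):
    print(f"&{dest:04X}: {length} bytes")
```

For a 4-byte-wide sprite starting at line 6 the destinations are &F00A, &F80A, &C05A and &C85A, so no two rows are contiguous.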
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 20:04, 22 April 18
@Bryce (http://www.cpcwiki.eu/forum/index.php?action=profile;u=225) Your initial guess is correct: I request the bus from the CPU, so the CPU is completely halted while the 16 KB are transferred. I could try not requesting the bus, inserting wait states instead (to pause the Z80 while the DMA does its transfers), and also lowering the DMA transfer rate (e.g. 1 write every 2 us instead of 1 write per us) to let the CPU do things in parallel. But I am not sure that would be more useful than what I have running right now, because it would mean doubling the time needed for a DMA transfer while heavily slowing down the CPU (50%?) during the transfer. BTW, I disagree when you say this is not DMA. It is not a requirement for DMA that the CPU runs in parallel (this is just a desirable feature), and in fact many old systems halt the CPU for ROM to RAM DMA (e.g. the Gameboy Advance and the Genesis/Megadrive).
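
Editor's note: the trade-off described above can be put into numbers with a very simple model. It assumes one RAM access slot per microsecond that goes either to the DMA engine or to the CPU; real Z80 wait-state behaviour is more complicated, so treat this only as a rough sketch.

```python
SLOT_US = 1  # assumption: one RAM access slot per microsecond


def bus_request_mode(n_bytes: int) -> dict:
    """Current mode: CPU halted, every slot goes to the DMA engine."""
    return {"transfer_us": n_bytes * SLOT_US, "cpu_slots": 0}


def cycle_steal_mode(n_bytes: int, period_us: int = 2) -> dict:
    """Hypothetical mode: DMA takes one slot out of every period_us."""
    transfer_us = n_bytes * period_us
    return {"transfer_us": transfer_us, "cpu_slots": transfer_us - n_bytes * SLOT_US}


n = 16 * 1024
halted = bus_request_mode(n)
stealing = cycle_steal_mode(n)
print("bus request:", halted)
print("cycle steal:", stealing)

# CPU slots available over the same window if the halted transfer runs first.
window_us = stealing["transfer_us"]
print("CPU slots after a halted transfer:", window_us - halted["transfer_us"])
```

In this model both approaches leave the CPU the same number of slots over a 32768 us window, which fits the doubt expressed above about whether cycle stealing would be more useful.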

@tjohnson (http://www.cpcwiki.eu/forum/index.php?action=profile;u=2129) What is happening is what @Munchausen (http://www.cpcwiki.eu/forum/index.php?action=profile;u=792) has explained, with the addition of the "hadouken" audio sample playback. The DMA can also be used to play digital audio samples, this time without halting the CPU (let's call this "Audio DMA"). When you start an Audio DMA transfer, samples are read from the specified ROM address and written directly to the "DAC" at the programmed sampling rate, without disturbing CPU operation (in fact I synchronize reads so the CPU can continue reading from ROM while the DMA engine also reads audio samples from the ROM!). Basically this allows playing audio samples with almost 0% CPU usage (just the writes to the DMA registers to start the operation).

@Munchausen (http://www.cpcwiki.eu/forum/index.php?action=profile;u=792) About moving sprites, basically it is what you have already written: for this to be useful, the DMA engine needs to be aware of the screen layout, to be able to copy "square/rectangle" ROM regions. You can program a transfer for each line, but starting a DMA transfer requires 6 writes to the I/O region, so the setup overhead adds up... Also, for moving sprites, the DMA engine should have some masking capabilities (e.g. if the pixel colour is 0xF, skip the write). I could implement these features on a bigger CPLD/FPGA, but I would not like to increase the cost of the cartridge.
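
Editor's note: a back-of-the-envelope estimate of when a per-line DMA transfer would pay off. The 6 I/O writes per transfer and the 1 us per byte DMA rate come from the thread. The other costs are assumptions: 4 us per I/O write (one OUT (C),r on a CPC, ignoring the time to load the registers) and 5 us per byte for a software copy using LDI.

```python
IO_WRITES_PER_TRANSFER = 6   # from the thread
US_PER_IO_WRITE = 4          # assumption: one OUT (C),r, register loads ignored
DMA_US_PER_BYTE = 1          # from the thread
SW_US_PER_BYTE = 5           # assumption: unrolled LDI copy


def dma_row_us(width: int) -> int:
    """Cost of one sprite row via DMA: setup writes plus the transfer itself."""
    return IO_WRITES_PER_TRANSFER * US_PER_IO_WRITE + width * DMA_US_PER_BYTE


def sw_row_us(width: int) -> int:
    """Cost of one sprite row copied in software."""
    return width * SW_US_PER_BYTE


def break_even_width() -> int:
    """Smallest row width in bytes at which DMA is strictly faster."""
    width = 1
    while dma_row_us(width) >= sw_row_us(width):
        width += 1
    return width


print("DMA wins from", break_even_width(), "bytes per row")
```

With these assumed costs DMA only wins for rows of 7 bytes or more, and in practice the break-even point would be higher once register setup is counted.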


Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 20:09, 22 April 18
Quote from: Munchausen
I wonder about the potential for it to operate alongside the M4, perhaps they can be mapped separately?

I do not know about M4 internals, but unfortunately I don't think they can "cooperate" (e.g. use DMA to transfer data from M4 memory to internal CPC RAM), because the current implementation only allows reading from the ROM embedded inside the cartridge.
Title: Re: ROM board with a tiny DMA engine
Post by: Munchausen on 00:58, 23 April 18
Quote from: doragasu
@Munchausen (http://www.cpcwiki.eu/forum/index.php?action=profile;u=792) About moving sprites, basically it is what you have already written: for this to be useful, the DMA engine needs to be aware of the screen layout, to be able to copy "square/rectangle" ROM regions. You can program a transfer for each line, but starting a DMA transfer requires 6 writes to the I/O region, so the setup overhead adds up... Also, for moving sprites, the DMA engine should have some masking capabilities (e.g. if the pixel colour is 0xF, skip the write). I could implement these features on a bigger CPLD/FPGA, but I would not like to increase the cost of the cartridge.


Thanks for the explanation. I'm quite amazed how well this works. I don't know what the extra cost would be, but it would be very cool if you could DMA whole sprites, with masking!
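
Editor's note: to illustrate what "masking" would mean for such an engine, here is a small software model of a masked write. It assumes screen mode 0 (two pixels per byte, left pixel in bits 7, 5, 3, 1 and right pixel in bits 6, 4, 2, 0) and uses colour 0xF as the transparent colour, as in doragasu's example. This is a sketch of the idea only, not something the cartridge implements.

```python
LEFT_PIXEL_MASK = 0xAA    # mode 0: bits 7, 5, 3, 1 hold the left pixel
RIGHT_PIXEL_MASK = 0x55   # mode 0: bits 6, 4, 2, 0 hold the right pixel


def masked_byte(dest: int, src: int) -> int:
    """Merge src into dest, skipping pixels whose colour is 0xF (all bits set)."""
    out = dest
    for mask in (LEFT_PIXEL_MASK, RIGHT_PIXEL_MASK):
        if (src & mask) != mask:              # pixel is not colour 0xF: copy it
            out = (out & ~mask & 0xFF) | (src & mask)
    return out


def masked_row(dest_row: bytes, src_row: bytes) -> bytes:
    """Apply the masked write to one sprite row."""
    return bytes(masked_byte(d, s) for d, s in zip(dest_row, src_row))


background = bytes([0xFF, 0x3C, 0x0F])
sprite = bytes([0xAA, 0xFF, 0xA0])
print(masked_row(background, sprite).hex())
```

A pixel with all four colour bits set has every bit under its mask set, which is why the check works regardless of how the colour bits are ordered inside the byte.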


Quote from: doragasu
I do not know about M4 internals, but unfortunately I don't think they can "cooperate" (e.g. use DMA to transfer data from M4 memory to internal CPC RAM) because current implementation only allows reading from the ROM embedded inside the cartridge.


I was thinking more that if you could have both connected at the same time, it would be possible to download things using the M4 and copy them into your DMA ROM expansion. But the M4 already maps 16 ROMs, so is it possible to perhaps map your expansion to roms 17-32?


I guess the next question would be, if you can copy to the AY with the only cost being the setup overhead, can you use it to do DMA to other peripherals? Though I'm not sure there is a really good use case.

Title: Re: ROM board with a tiny DMA engine
Post by: Duke on 00:59, 24 April 18
Quote from: Munchausen
I was thinking more that if you could have both connected at the same time, it would be possible to download things using the M4 and copy them into your DMA ROM expansion. But the M4 already maps 16 ROMs, so is it possible to perhaps map your expansion to roms 17-32?
Actually the M4 maps ROMs 0 to 31, but only the ones that are in use. Unused ROM slots can be used by any other hardware (no ROMDIS or other action is taken by the M4). So there should be no problems as long as it doesn't clash with the M4 ROM number itself (default 6).
Title: Re: ROM board with a tiny DMA engine
Post by: doragasu on 09:44, 24 April 18
I have currently mapped ROMs 0 to 511 on $DF00 and $DF01 IO addresses. Changing the mapping is just a matter of modifying the mux in the VHDL code, very easy. But you have to be careful with modifications because the CPLD is almost full (96% used so far) and I have yet to add a bit of code to allow writing to the Flash from the CPC itself. If the new mapping requires more logic, it might not fit.
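
Editor's note: 512 ROMs need a 9-bit ROM number, which does not fit into a single 8-bit I/O write, hence the two addresses. The post does not say how the bits are split, so the sketch below simply assumes that $DF00 takes the low 8 bits and $DF01 takes bit 8. That layout is hypothetical.

```python
ROM_SELECT_LOW = 0xDF00    # from the post
ROM_SELECT_HIGH = 0xDF01   # from the post
MAX_ROM = 511


def rom_select_writes(rom: int) -> list[tuple[int, int]]:
    """I/O writes (port, value) to select a ROM, under the assumed bit split."""
    if not 0 <= rom <= MAX_ROM:
        raise ValueError("ROM number must be in 0..511")
    # Assumption: low byte to $DF00, ninth bit to $DF01.
    return [(ROM_SELECT_LOW, rom & 0xFF), (ROM_SELECT_HIGH, rom >> 8)]


for rom in (7, 256, 511):
    print(rom, [(hex(port), value) for port, value in rom_select_writes(rom)])
```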

Currently I do not support DMA to the IO range. I'm starting to repeat myself a lot, but OK, I could implement it on a bigger CPLD/FPGA...

For the CPLD, currently I am using a Lattice LCMXO256C. It has 128 macrocells and costs 2 Eur. I could swap it for a LCMXO640C, that has 320 macrocells, and I suppose all the features we are discussing should fit inside it. But price rises from 2 Eur to 7.5 Eur. For that price maybe it would be better using a small FPGA instead of a CPLD. But that would require redesigning the PCB.

Currently the DMA is more of a proof of concept than something really useful. It happened just because a lot of the CPLD was unused and I wanted to do something with all that logic. But if this attracts interest, I might consider designing a second iteration of the cart with a simple sprite engine  :D
Title: Re: ROM board with a tiny DMA engine
Post by: Munchausen on 11:48, 24 April 18
I for one would love it. DMA to any IO device would be awesome!


I started to wonder the other day about what can be achieved if the RAM is exchanged for dual port SRAM...
Title: Re: ROM board with a tiny DMA engine
Post by: GUNHED on 17:17, 24 April 18
This project is getting more and more amazing. Will be fun to code a game for it using 1 us transfers.


First thing I would like to do is a video player ;-)

Title: Re: ROM board with a tiny DMA engine
Post by: LambdaMikel on 08:53, 26 April 18
... finally Captain Future Video in HiRes TrueColor and 50 FPS  :D
Anybody signed up for doing the DIY 4 GB RAM extension yet?
 
Title: Re: ROM board with a tiny DMA engine
Post by: GUNHED on 15:29, 26 April 18
I want a pair of them!  :) :) :)