Rombox V SSD

Started by ComSoft6128, 21:44, 29 December 17

ComSoft6128

Hi,

Just a general query.
How does an old-style Rombox (e.g. Rombo) or the modern equivalent (Mega-Flash) compare in access speed with an SSD?
To the user it might look the same when a program is loaded but is it?

Cheers,

Peter

Bryce

It depends on the complete system. An SSD connected to a CPC would be the same speed as a ROMBox because the top speed is determined by the Z80, not the media device. An SSD on a PC can achieve speeds that an old EPROM would never reach.

Bryce.

rpalmer

comsoft6128,

As I understand it, an SSD is simply the solid-state equivalent of a SATA/PATA storage device. Modern SSDs can achieve speeds far beyond what the CPC can handle, since the CPC is clocked at a little less than 4 MHz.

To give you an example, some modern SSDs can reach read/write speeds of the order of 400+ megabytes/second (I would not be surprised if 1 gigabyte/second were available), whereas the CPC can at best transfer about 140 kilobytes/second (SF-II).
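As a rough sanity check of the gap described above, the ratio can be worked out in a few lines of Python. This is only a sketch: the two rates are the ones quoted in the post, and the decimal unit sizes (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes) are an assumption, not something the post states.

```python
# Figures quoted in the post above; the decimal unit sizes are an assumption.
SSD_BYTES_PER_SEC = 400 * 1000**2   # "400+ megabytes/second"
CPC_BYTES_PER_SEC = 140 * 1000      # "about 140 kilobytes/second"


def speed_ratio(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


ratio = speed_ratio(SSD_BYTES_PER_SEC, CPC_BYTES_PER_SEC)
print(f"The SSD could deliver data roughly {ratio:,.0f}x faster than the CPC can read it")
```

Whatever the exact units, the SSD is thousands of times faster than the CPC side of the link, which is why the drive ends up waiting for the Z80.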

To put this into perspective, the CPC would see no difference between access to an SSD and access to a ROM/RAM disc (in fact, a device that operates faster than the CPC will, if designed correctly, simply twiddle its thumbs waiting for the CPC).

rpalmer.

GUNHED

IDE8255 can transfer up to 180 KB/s.  :)  But RAM / ROM access can be way quicker. An LD A,(HL) instruction reads one byte and takes 2 µs. That is 0.5 MB/s (theoretical maximum).  :)
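The 0.5 MB/s figure follows directly from one byte every 2 µs. A minimal sketch of that arithmetic, assuming a decimal megabyte and ignoring everything a real copy loop would need besides the read itself (the function name is only illustrative):

```python
def bytes_per_second(bytes_moved: int, microseconds: int) -> float:
    """Throughput for `bytes_moved` bytes transferred in `microseconds`."""
    return bytes_moved * 1_000_000 / microseconds


# One LD A,(HL) reads one byte in 2 microseconds (figure from the post above).
rate = bytes_per_second(1, 2)
print(f"{rate / 1_000_000} MB/s")
```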
http://futureos.de --> Get the revolutionary FutureOS (Update: 2023.11.30)
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-) (Updated: 2021.12.26)

rpalmer

GUNHED,

While I don't wish to nit-pick, that instruction is not how I/O works on a Z80. The CPC uses the I/O instruction INI or IN r,(C) where r is the register to load to.

You need an INI followed by INC B to get just one byte of data from an I/O port (the INC B restores the B register). The INI and INC B instructions total 24 T-states (20 for INI and 4 for INC). This means that the max transfer rate is 4 MHz/(1024*24), which is 162.76K/s, and since we know the CPC effectively runs at about 3.5 MHz (due to interruptions via the CRTC), the transfer rate is going to be about 140K/s.
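The calculation in this post can be reproduced as follows. This is a sketch of the poster's own arithmetic, not a measurement: the 24 T-states per byte and the 3.5 MHz "effective" clock are taken from the post as given, and the helper name is hypothetical.

```python
def transfer_rate_kib(clock_hz: int, t_states_per_byte: int) -> float:
    """KiB per second when every byte costs `t_states_per_byte` clock cycles."""
    return clock_hz / (1024 * t_states_per_byte)


T_STATES_PER_BYTE = 20 + 4   # INI + INC B, as counted in the post above

at_4_mhz = transfer_rate_kib(4_000_000, T_STATES_PER_BYTE)
at_3_5_mhz = transfer_rate_kib(3_500_000, T_STATES_PER_BYTE)
print(f"4.0 MHz: {at_4_mhz:.1f} KiB/s")
print(f"3.5 MHz: {at_3_5_mhz:.1f} KiB/s")
```

With these inputs the 3.5 MHz case lands slightly above 140K/s, which is consistent with the "about 140K/s" in the post.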

rpalmer

GUNHED

Quote from: rpalmer on 22:40, 01 January 18
GUNHED,

While I don't wish to nit-pick, that instruction is not how I/O works on a Z80. The CPC uses the I/O instruction INI or IN r,(C) where r is the register to load to.

You need an INI followed by INC B to get just one byte of data from an I/O port (the INC B restores the B register). The INI and INC B instructions total 24 T-states (20 for INI and 4 for INC). This means that the max transfer rate is 4 MHz/(1024*24), which is 162.76K/s, and since we know the CPC effectively runs at about 3.5 MHz (due to interruptions via the CRTC), the transfer rate is going to be about 140K/s.

rpalmer


Well, I don't want to nit-pick either. But in the case of the 8255IDE you can use four subsequent INI instructions and then just reload B (LD B,D for example, which takes just one µs). You can make a block or a loop around a block. Then you reach the speed which I told you. I have the routines up and running. See the Wiki page for further details.  :)
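A back-of-envelope version of the unrolled approach described above, as a sketch only. It assumes INI takes 5 µs on the CPC (the figure quoted elsewhere in this thread), uses the 1 µs for LD B,D given in the post, and ignores any loop overhead around the block; the constant and function names are made up for illustration.

```python
# CPC timings in microseconds. INI = 5 us is the figure quoted elsewhere in
# this thread; LD B,D = 1 us is from the post above. Both are assumptions here.
INI_US = 5
LD_B_D_US = 1


def unrolled_rate(inis_per_reload: int) -> float:
    """Bytes per second for N x INI followed by one LD B,D, fully unrolled."""
    block_us = inis_per_reload * INI_US + LD_B_D_US
    return inis_per_reload * 1_000_000 / block_us


rate = unrolled_rate(4)   # 4 bytes every 21 microseconds
print(f"{rate / 1000:.1f} kB/s, {rate / 1024:.1f} KiB/s")
```

Under these assumptions the result sits a little above the 180 KB/s quoted earlier, which leaves some room for the loop instructions a real routine would add.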


In addition, you're wrong about the CRTC: it extends Z80 instructions to full µs frames, but the Z80 still runs at 4 MHz, because the CPC has a 16 MHz crystal divided by 4.


There are numerous Z80 timing sheets for the CPC out there, just take a look.  :)


rpalmer

GUNHED,

Quote from: GUNHED on 23:21, 02 January 18
In addition, you're wrong about the CRTC: it extends Z80 instructions to full µs frames, but the Z80 still runs at 4 MHz, because the CPC has a 16 MHz crystal divided by 4.

I suspect that you do not really understand how the CRTC (and GA chip) access the DRAM then to display the picture.

Let me inform you of HOW IT DOES THIS

1. The CRTC (6845) triggers the GA to get display data (principally via the DISPEN signal - see the CPC main board schematic). Only the CRTC register settings determine when to access video data to display a picture and not the GA otherwise it would be pointless to have the CRTC at all.

2. The GA then interrupts the current instruction (or next instruction) to get said data, so in conclusion the CRTC DOES indirectly slow the Z80 - end of story.

This means the Z80 is NOT running at a full 4 MHz all the time but at a little less, hence it is often stated to be approximately 3.5 MHz overall.

If all of this is confusing then I have nothing to add to simplify it further, it is what it is.

Quote from: GUNHED on 23:21, 02 January 18
Well, I don't want to nit-pick either. But in the case of the 8255IDE you can use four subsequent INI instructions and then just reload B (LD B,D for example, which takes just one µs). You can make a block or a loop around a block. Then you reach the speed which I told you. I have the routines up and running. See the Wiki page for further details.

What is the hardware for the 8255IDE you are using? If the address select of the 8255 uses A8/A9, then you cannot use 4 INIs, as 2 of the INIs do not access the 16-bit data of the IDE (they access the control register and the non-data lines of the IDE port). If you use other addresses to control the 8255, then you are accessing the data as 8-bit only, so only half of the true 16-bit IDE data is available and the max transfer rate is HALF of the full data transferred to the IDE. Of course, in 8-bit mode that is not the case. Again, you did not say much about the hardware configuration, so your transfer speed is unqualified and more than likely calculated wrongly.

I was also just saying that the single instruction you mentioned, "LD A,(HL)" (its speed being 7 T-states), would imply a max transfer speed of 180K/s, when in fact the instruction required to get data from an I/O device is INI (its speed being 20 T-states). So you can see that the speed was wrong based solely on what you said.

Yes you can increase the transfer speed by doing many INIs followed by a "LD B,D" and this would give a higher transfer speed at 4MHz, but as I said the CPC Z80 is interrupted for video access so the max speed IS approximately 3.5/4 or 87.5% of any MAX based on the 4 MHz clock.

rpalmer



arnoldemu

Quote from: rpalmer on 03:08, 03 January 18
What is the hardware for 8255IDE you are using?
I believe @GUNHED is using Yarek's 8255IDE (see here http://8bit.yarek.pl/interface/yamod.ide8255/ ). He also has an internal 4mb from yarek. Yarek's design has 4 ports so you can do INI 4 times before reloading B.

My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

arnoldemu

Quote from: rpalmer on 03:08, 03 January 18
I suspect that you do not really understand how the CRTC (and GA chip) access the DRAM then to display the picture.

Let me inform you of HOW IT DOES THIS

1. The CRTC (6845) triggers the GA to get display data (principally via the DISPEN signal - see the CPC main board schematic). Only the CRTC register settings determine when to access video data to display a picture and not the GA otherwise it would be pointless to have the CRTC at all.

2. The GA then interrupts the current instruction (or next instruction) to get said data, so in conclusion the CRTC DOES indirectly slow the Z80 - end of story.

This means that at times the Z80 is NOT running at 4MHz all the time, but a little less than that at times, hence it is often stated to be approximately 3.5 MHz overall.

If all of this is confusing then I have nothing to add to simplify it further, it is what it is.
I find that a little confusing, especially point 2 where you say it "interrupts the current instruction". For me this causes confusion with Z80 or peripheral interrupts.

My take on it: As you have said the CRTC supplies memory addresses and says when the border is active or not. It controls the timing of the display in terms of vsync and hsync and that is it.

The GA:
- fetches the data for the display and outputs the pixels (or border if it's active) based on the current mode.
- controls ram/rom
- controls access to the main ram itself (it controls the 'gateway').
- outputs the pixels to the screen
- controls the video blanking.
- tweaks and then outputs the hsync (as composite sync)
- tweaks and then outputs the vsync (as composite sync)

To stop the GA and Z80 fetching from memory at the same time the GA has priority and tells the z80 to wait.

rpalmer

Quote from: arnoldemu on 09:55, 03 January 18
That is a little confusing especially point 2 where you say it "interrupts the current instruction". This causes confusion with z80 or peripheral interrupts.

It does the interruption via the "WAIT" signal into the Z80 and times this with the instruction op-code fetch cycle or interrupt acknowledge cycle (which it can detect from the /M1 signal). Without knowing when the Z80 is fetching instructions, the access to video data would interfere with the instruction data itself and make the Z80 go into whoop-whoop land.

This form of interruption enables the video display to ALWAYS guarantee access to video data when required since the Z80 is always executing an instruction from memory (excluding the HALT instruction).

Note 1: schematic ICs 114 and 115 act as data bus isolators to ensure the z80 data bus is not corrupted during video data access via the CRTC.
Note 2: This is why it is nearly impossible to add DMA: the WAIT signal and video access are so tightly coupled by this relationship that a DMA transfer to or from internal memory would be, at best, only as fast as the CPU doing the same job.

rpalmer

CloudStrife

Quote from: rpalmer on 03:08, 03 January 18
GUNHED,

I suspect that you do not really understand how the CRTC (and GA chip) access the DRAM then to display the picture.

[...]

This means that at times the Z80 is NOT running at 4MHz all the time, but a little less than that at times, hence it is often stated to be approximately 3.5 MHz overall.

[...]

And what is funny is that your 3.5 MHz value is not useful in this case, and YOUR values are wrong...

Yes, the GA inserts WAIT states that slow down the Z80 on certain instructions...
But on the CPC, INC B takes 1 µs. So 4 T-states @ 4 MHz, without slowing...
And INI? It takes 5 µs. So 20 T-states @ 4 MHz without slowing... Wait?!? What? 20 T-states? INI doesn't take 20 T-states, it takes 16 T-states and is effectively slowed by the GA!
So a total of 6 µs per byte. 1 byte / 6 µs = ~162.76 KiB/s

And if there were no WAIT states? 4 T-states and 16 T-states...
4 MHz / 20 T-states = ~195.31 KiB/s

So at least when you are patronizing, try to use the right values... And the right term: a wait state IS a wait state, not an interrupt...

And please, stop this fucking 3.5 MHz bullshit, the Z80 RUNS AT 4 MHz; the fact that it is slowed by other peripherals changes nothing about this fact... it's the same on MSX, it's the same on Spectrum, it's the same on PC, and this value is not even right...
For example, in this case the CPC has the equivalent speed of a 3.33 MHz Z80, not 3.5 MHz. If you want to do timing calculations on the CPC, forget the T-states and just use a CPC NOP timing table.
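The three numbers in this post (about 162.76 KiB/s on a real CPC, about 195.31 KiB/s with no wait states, and an equivalent clock of 3.33 MHz) can be checked with a short sketch. The two small tables below hold only the timings quoted in the post; they are a stand-in for a real CPC NOP timing table, and all names are illustrative.

```python
# Timings quoted in the post above (not a complete timing table).
CPC_MICROSECONDS = {"INI": 5, "INC B": 1}   # as measured on a CPC
Z80_T_STATES = {"INI": 16, "INC B": 4}      # as listed in the Z80 manual

CLOCK_HZ = 4_000_000
LOOP = ["INI", "INC B"]                      # one byte per pass

cpc_us = sum(CPC_MICROSECONDS[op] for op in LOOP)
raw_t_states = sum(Z80_T_STATES[op] for op in LOOP)

cpc_rate_kib = 1_000_000 / cpc_us / 1024        # with GA wait states
ideal_rate_kib = CLOCK_HZ / raw_t_states / 1024  # hypothetical, no wait states
effective_mhz = raw_t_states / cpc_us            # T-states retired per microsecond

print(f"CPC: {cpc_rate_kib:.1f} KiB/s, unslowed: {ideal_rate_kib:.1f} KiB/s")
print(f"equivalent clock for this loop: {effective_mhz:.2f} MHz")
```

The equivalent clock is specific to this instruction mix; a different loop would give a different figure, which is the point being made about using µs timings rather than a single "effective MHz".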


robcfg

Guys, calm down please.


Nobody is perfect and we all try to do our best. So let's learn more about the current subject.


I'd say we should have a benchmark tool to test this kind of transfers, and check the results against the tool's code.


I don't think it will be able to reach the theoretical maximum speed but we may well discover interesting facts...

GUNHED

Quote from: arnoldemu on 09:50, 03 January 18
I believe @GUNHED is using Yarek's 8255IDE (see here http://8bit.yarek.pl/interface/yamod.ide8255/ ). He also has an internal 4mb from yarek. Yarek's design has 4 ports so you can do INI 4 times before reloading B.


Thank you. That's exactly it.  :) 

rpalmer

Quote from: CloudStrife on 12:57, 03 January 18
INI doesn't take 20 T-states, it takes 16 T-states
I stand corrected on this as I misread the Z80 PDF. Sorry people  :doh: . This does not alter my argument to GUNHED about transfer speed and how he stated it in the first place.

Quote from: CloudStrife on 12:57, 03 January 18
Yes the GA insert WAIT state that slow down the Z80 on certain instruction...

So the GA can be stopped by not executing these CERTAIN instructions then  :o
Seriously, the GA interrupts ANY instruction execution via the WAIT signal (to delay it) and not certain instructions.


Quote from: CloudStrife on 12:57, 03 January 18
So at least when you are patronizing, try to use the right value... And the right term, a wait state IS a wait state, not an interrupt...

While I may have misused the term "interrupt", to interrupt means "stop the continuous progress of (an activity or process)" according to Google.
So the insertion of a forced WAIT state by the GA, to stop the execution of a CPU instruction before releasing it, is what again?
As a caveat, the insertion of a wait state might be seen as a delay in instruction execution (which it technically is overall), but when forced at the start of a process it is effectively an interrupt.

Quote from: CloudStrife on 12:57, 03 January 18
And please, stop this fucking 3.5 MHz bullshit, the Z80 RUNS AT 4 MHz; the fact that it is slowed by other peripherals changes nothing about this fact... it's the same on MSX, it's the same on Spectrum, it's the same on PC, and this value is not even right... For example, in this case the CPC has the equivalent speed of a 3.33 MHz Z80, not 3.5 MHz. If you want to do timing calculations on the CPC, forget the T-states and just use a CPC NOP timing table.

Why swear to emphasise a point?
Yes, the Z80 is fed a 4 MHz clock, but with interruptions to instruction execution (again via the WAIT signal from the GA) it is NOT effectively the unhindered 4 MHz Z80 that many people assume the CPC to be. So ANY calculation based on the Z80 instruction manual without consideration of these interruptions has to be "qualified", which I HAVE NOT SEEN STATED ANYWHERE BY ANYONE when figures for a given speed are quoted.

This is why I said "approximately": it is the equivalent unhindered Z80 speed, and by "approximately" I mean there is room for variance from the exact value one might experience in the real world (be it 3.5, 3.33, 3.75 or whatever it happens to be at the time).

rpalmer

Bryce

Quote from: rpalmer on 22:44, 03 January 18
Seriously, the GA interrupts ANY instruction execution via the WAIT signal (to delay it) and not certain instructions.

That's not quite correct. The /WAIT signal only tells the CPU that the buses aren't available. The CPU only stops executing commands when it encounters a command that needs the buses. Purely internal commands would continue to be executed. So GUNHED's statement is correct.

Bryce.

rpalmer

Quote from: Bryce on 22:55, 03 January 18
The /WAIT signal only instructs the CPU that the buses aren't available

Sorry bryce, but that is not correct either.

The Z80 CPU manual states:

/WAIT. Wait (input, active low). /WAIT indicates to the CPU that the addressed memory or I/O devices are not ready for data transfer. The CPU continues to enter the Wait state as long as the signal is active. Extended /WAIT periods can prevent the CPU from properly refreshing dynamic memory.

There is no mention of buses.

rpalmer


andycadley


It may not say it explicitly, but that's the practical impact of what it is saying. /WAIT signals to the CPU that an external device is in the process of responding, and so the Z80 can't read from (or write to) the address/data buses at that time. Hence the reference to the Z80 potentially being unable to refresh dynamic RAM (since it can't assert the values of IR on the bus as usual). Purely internal operations of the Z80 can continue though, and you can see this in the way instructions get stretched on the CPC if you break down the low-level timing of each state.
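To illustrate the kind of stretching described above, here is a deliberately simplified model, not a description of the real hardware: it assumes every machine cycle is padded up to a multiple of four T-states (one microsecond at 4 MHz). The machine-cycle lengths are the ones listed in the Z80 manual; the rounding rule is the assumption, and it is known not to hold for every instruction, so a proper CPC timing table should be preferred for real work.

```python
import math

# T-states per machine cycle, as listed in the Z80 CPU manual.
M_CYCLES = {
    "INC B": [4],
    "LD A,(HL)": [4, 3],
    "INI": [4, 5, 3, 4],
}


def stretched_t_states(m_cycles: list[int]) -> int:
    """Pad each machine cycle to a multiple of 4 T-states (simplified model)."""
    return sum(math.ceil(t / 4) * 4 for t in m_cycles)


def cpc_microseconds(m_cycles: list[int]) -> int:
    """Duration in microseconds at 4 MHz under the simplified model."""
    return stretched_t_states(m_cycles) // 4


for name, cycles in M_CYCLES.items():
    print(f"{name}: {sum(cycles)} T-states nominal, "
          f"{cpc_microseconds(cycles)} us under the model")
```

For these three instructions the model agrees with the timings quoted earlier in the thread (1 µs, 2 µs and 5 µs), which is all it is meant to show.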


This is all a bit moot to the original question though. ROM will always be faster than an SSD I/O peripheral on the CPC, simply because the Z80 can execute directly from ROM and thus there isn't any "load time" as such. Of course, a modern SSD is almost certainly fast enough that you could memory-map it as if it were ROM and the two would become indistinguishable, but that would seem an odd thing to do given the size limitations it would inevitably impose.

Bryce

Quote from: rpalmer on 23:40, 03 January 18
Sorry bryce, but that is not correct either.

The Z80 CPU manual states:

/WAIT. Wait (input, active low). /WAIT indicates to the CPU that the addressed memory or I/O devices are not ready for data transfer. The CPU continues to enter the Wait state as long as the signal is active. Extended /WAIT periods can prevent the CPU from properly refreshing dynamic memory.

There is no mention of buses.

rpalmer

Sorry for not using the exact wording of the manual, but: /WAIT indicates to the CPU that the addressed memory or I/O devices are not ready for data transfer = The buses can't be used.
If the CPU has other commands in the queue that don't require the bus it will continue executing them. I have designed and built hardware that relies on this fact. The wait pin was originally implemented to allow the use of slower memory devices, hence it only stalls IO and not internal execution.
Bryce.

rpalmer

Quote from: Bryce on 09:05, 04 January 18
/WAIT indicates to the CPU that the addressed memory or I/O devices are not ready for data transfer = The buses can't be used.

The clever designers of the CPC did allow the buses (both address and data) to be used for video data access during the "WAIT" states of the CPU, and this may be the reason why it is so difficult (if not impossible) to develop a DMA circuit to transfer data to and from the internal memory.

How was this bus usage achieved, you might well ask? Well, ICs 104, 105, 109 and 113 are MUXs which toggle between Z80 CPU addresses and CRTC-generated addresses. When the GA detects that the CPU is reading an op-code (the /M1 signal goes low), a wait state can be inserted by the GA (via detection of the DISPEN signal from the CRTC) to handle the video data access. During the WAIT state the GA toggles the MUXs for the video data access and triggers a read of the main memory, at which point the data from the read is latched internally by the GA for further processing. The GA then releases the WAIT signal, leaving the CPU to continue on its merry way. All this happens within 1 cycle of the 4 MHz clock fed into the CPU... wonderful and elegant design ;D .

rpalmer

Bryce

Ok, I need to be even MORE specific (although I think you are deliberately trying to mis-understand at the moment) :)

/WAIT indicates to the CPU that the addressed memory or I/O devices are not ready for data transfer = The buses can't be used BY THE Z80.

The CPC wasn't the only piece of hardware to make use of the wait signal. It was a commonly known method, not a piece of cleverness on the part of the Amstrad designers.

Bryce.

rpalmer

Quote from: Bryce on 10:33, 04 January 18
/WAIT indicates to the CPU that the addressed memory or I/O devices are not ready for data transfer = The buses can't be used BY THE Z80.

I was going to include that in my response, but chose to leave it out so as not to be patronizing (or condescending) to you or be seen as that by anyone else  ::)

Quote from: rpalmer on 09:48, 04 January 18
The clever designers of the CPC did allow the buses

I probably should have said "The design of the CPC incorporated an existing method to interlace video access with normal CPU access......" which would have been more correct and specific to the thread's line of discussion.

rpalmer

Bryce

Top marks. We have clearly nailed this subject now and possibly got ourselves a place in the "Pedantic hall of fame" (if that exists) :D

Btw: Didn't comment on it up to now, but you are correct that this is one of the reasons why DMA would be a nightmare to implement on a CPC.

Bryce.

