Author Topic: CPC Z80 Commands and how long they take...  (Read 5908 times)


Offline opqa

  • CPC664
  • ***
  • Posts: 73
  • Country: es
  • Liked: 82
Re: CPC Z80 Commands and how long they take...
« Reply #40 on: 13:39, 11 January 15 »
Now for the second part: how I think all of the above "engages" with the GA. Some of what follows is just hypothesis and assumption.

The GA holds the wait signal active for 3 out of every 4 cycles. I'll call G1 the only state where the wait signal is inactive, and G2, G3 and G4 the other three.

I strongly suspect that the GA accesses memory during G3 and G4, and that it isolates the Z80 from the memory during G2, G3 and G4. At least partially, but it might be the case that complete isolation is not performed until G3.

The key point in the CPC6128 schematic is that the data bus is latched in the memory -> Z80 direction, and that this latch is controlled by the same READY signal that is connected to the WAIT pin of the Z80. So if a read is performed in G2, it reads the value that was latched in G1, and the value actually held by the memory during G2 might be different.

Anyway, none of the above affects timing. With my notation, the basic timing would be the following:

Code: [Select]
GA-state  G1 G2 G3 G4
Wait          |  |  |

Let's see how this engages naturally with an opcode fetch for instance.

Code: [Select]
GA-state  G1 G2 G3 G4
T-state   T2 R1 R2 T1
Wait          |  |  |
Read       |

Note that none of the wait signals above has any real effect on the Z80: none of them occurs during T2, and WAIT is simply ignored in any other state. Also note that this is the natural synchronization scheme: if we try any other alignment, we'll end up with this one. For instance, let's suppose we begin with T1 in G1:

Code: [Select]
GA-state  G1 G2 G3 G4 G1 G2 G3 G4
T-state   T1 T2 Tw Tw Tw R1 R2 T1
Wait          |  |  |     |  |  |
Read                   |

If we start with T1 in G3:

Code: [Select]

GA-state  G3 G4 G1 G2 G3 G4
T-state   T1 T2 Tw R1 R2 T1
Wait       |  |     |  |  |
Read             |

And so on...
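The resynchronisation described above can be checked with a small sketch. It assumes the notation of this post (G1 is the only GA state with WAIT inactive, and the fetch stays in T2/Tw until one of those states falls in G1). The function name `fetch_states` is made up for illustration.

```python
def fetch_states(t1_ga_state):
    """List (GA state, T-state) pairs for one opcode fetch.

    t1_ga_state is the GA state (1..4) in which T1 happens.
    Assumption: WAIT is inactive only in G1, so the Z80 leaves
    T2/Tw only after a state that coincides with G1.
    """
    ga = t1_ga_state
    states = [(ga, "T1")]
    ga = ga % 4 + 1
    states.append((ga, "T2"))
    while ga != 1:                 # WAIT still active: insert a wait state
        ga = ga % 4 + 1
        states.append((ga, "Tw"))
    for label in ("R1", "R2"):     # the two refresh states finish the fetch
        ga = ga % 4 + 1
        states.append((ga, label))
    return states


for start in (1, 2, 3, 4):
    row = fetch_states(start)
    print("GA-state ", " ".join(f"G{g}" for g, _ in row))
    print("T-state  ", " ".join(t for _, t in row))
    print()
```

Whatever the starting state, R2 ends up in G3 under this model, so the next T1 falls in G4 and the next T2 in G1.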

In all cases the memory access takes place during G1. But let's see what happens when an opcode fetch is followed by a memory read operation (a typical case):

Code: [Select]
GA-state  G4 G1 G2 G3 G4  G1 G2 ...
T-state   T1 T2 R1 R2 T1  T2 T3 ...
Wait       |     |  |  |      |
Read          |               |

Now the memory read operation is performed during G2! That's why I said before that the z80 is "allowed" to access memory during 2 out of 4 cycles. It can be either G1 or G2 depending on the current and previous operation.

But, to be honest, and as I said before, what the z80 is really sampling during this clock is the latched sample of the bus that took place in the previous one during G1. So we can really consider that the real memory read takes place always at G1.

In the third and last part, why OUT (c),r doesn't fit in just 3 nops...
« Last Edit: 21:12, 11 January 15 by opqa »

Online arnoldemu

  • Supporter
  • 6128 Plus
  • *
  • Posts: 5.112
  • Country: gb
    • Unofficial Amstrad WWW Resource
  • Liked: 1952
Re: CPC Z80 Commands and how long they take...
« Reply #41 on: 13:47, 11 January 15 »
As stated in SOFT968, both memory and I/O have wait applied, so this stretches out the operation.
I believe out (c) has a single T-state where wait can be applied for slow devices, and I think this is where the delay happens.

The PCW docs say that I/O doesn't have wait applied, so on the PCW it should not take as long.


Offline opqa

  • CPC664
  • ***
  • Posts: 73
  • Country: es
  • Liked: 82
Re: CPC Z80 Commands and how long they take...
« Reply #42 on: 14:12, 11 January 15 »
@arnoldemu

If you mean that the CPC hardware is inserting additional wait states (apart from the ones from the GA), I'm almost sure that this is not the case. They aren't needed at all to explain the timings. Take a moment to read my posts and maybe you'll agree with me.


So let's analyse what happens with out (c),r (or with in r,(c)). This instruction consists of two opcode fetches followed by an I/O port write (or read, in the case of in r,(c)). As stated before, the I/O port access takes 4 cycles: T1, T2, Tw (automatically inserted), and T3.

So the timings, starting from the second opcode fetch are:

Code: [Select]
GA-state G4 G1 G2 G3 G4 G1 G2 G3 G4 G1 ...
T-state  T1 T2 R1 R2 T1 T2 Tw Tw Tw T3 ...
Wait      |     |  |  |     |  |  |


Here we have a first wait state after T2 that is inserted automatically by the Z80, not by the CPC hardware, followed by a second and a third that are inserted by the GA because of its fixed timing. So this is where the extra NOP comes from.

Now let's analyse why this doesn't happen with out (n),A. This operation consists of an opcode fetch, a parameter fetch, and an i/o port write.

Parameter fetches have the same timing as regular memory accesses, so just 3 t-states T1, T2 and T3. The complete sequence including all machine cycles is:

Code: [Select]
GA-state G4 G1 G2 G3 G4 G1 G2 G3 G4 G1 G2 ...
T-state  T1 T2 R1 R2 T1 T2 T3 T1 T2 Tw T3 ...
Wait      |     |  |  |     |  |  |     |

So this instruction fits within just 3 NOPs, because the shorter timing of the parameter fetch gives enough space to the i/o operation to complete within the next NOP.
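A rough way to check both results is to simulate the machine cycles against a WAIT line that is released once every 4 clocks. This is only a sketch under stated assumptions: each machine cycle is modelled as (length, index of the T-state that samples WAIT), with the I/O cycle sampling during its automatic Tw, and the clock phase chosen for the WAIT gap (`GAP`) is arbitrary. The names `first_check_times` and `nops` are invented for the example.

```python
GAP = 1                                   # assumed clock phase (mod 4) with WAIT inactive
IF, RW, IO = (4, 1), (3, 1), (4, 2)       # (T-states, index of the WAIT-sampling state)


def first_check_times(program):
    """Clock at which each instruction's opcode-fetch WAIT check succeeds."""
    t = 0
    marks = []
    for instr in program:
        for n, (length, check) in enumerate(instr):
            for i in range(length):
                if i == check:
                    while t % 4 != GAP:   # stall until the GA releases WAIT
                        t += 1
                    if n == 0:
                        marks.append(t)
                t += 1
    return marks


def nops(instr):
    """Effective duration of instr in NOP units (4 clocks), measured
    from its own fetch to the fetch of a following NOP."""
    a, b = first_check_times([instr, [IF]])
    return (b - a) // 4


print("OUT (C),r :", nops([IF, IF, IO]))
print("OUT (n),A :", nops([IF, RW, IO]))
print("NOP       :", nops([IF]))
```

Measuring from one fetch to the next (rather than to the end of the last T-state) includes the stall that the following instruction absorbs, which is how the timing appears in practice.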

And that's all. What do you think?
« Last Edit: 14:18, 11 January 15 by opqa »

Offline ralferoo

  • Supporter
  • 6128 Plus
  • *
  • Posts: 1.071
  • Country: gb
  • Liked: 573
Re: CPC Z80 Commands and how long they take...
« Reply #43 on: 19:59, 11 January 15 »
It took me ages too until I believed that the WAIT generation was as simple as it is. I was convinced it used the M1 and IORQ signals to do something more complicated, but no, it really is that simple... Hopefully I'll show you why now... :) This is largely the same as opqa's explanation but explained slightly differently...

For reference, this is what I mean by the Z80 user manual: http://tlienhard.com/zx81/z80um.pdf and the important pages start at page 29 (11 in the official numbering).

Instruction Fetch: T1 T2 (checks WAIT) T3 T4 (4+)
Read or Write: T1 T2 (checks WAIT) T3 (3+)
IO: T1 T2 TW (checks WAIT) T3 (4+) - later I'll refer to this as just T1 T2 T3 (checks WAIT) T4

For now, forget the labels T1, T2 etc and just consider the states that check the WAIT signal. So, we now have:
IF: -*--
RW: -*-
IO: --*-

OUT (c),r is described on page 298 (280 official numbering): 4,4,4 (IF, IF, IO)
Writing this out in terms of WAIT states, we have: -*-- -*-- --*-
Aligning this up with the CPC's wait states (I'll add a NOP also, -*--, but any instruction is the same):
                                                                           
                    vv the NOP's fetch T state is stretched by 1 T-state until the next gap
-*-- -*-- --__ _*-- _*--                                                       
1234 1234 12__ _341 _234                                                       
W.WW W.WW W.WW W.WW W.WW                                                       
            ^^ ^^ this T state is stretched by 3 T-states until the next gap   
         
As you can see, 4 extra cycles have been inserted, but it's not as simple as just adding 1us to the IO cycle (even though that is the visible effect, it's actually 2 separate stalls)...

We can also see the difference to OUT (n), A which you might expect to take 4us not 3us...

Page 297 (279 official), shows OUT (n),A as: 4,3,4 (IF, RW, IO) - note the shorter 2nd M cycle... ;)
Writing this out in terms of WAIT states, we have: -*-- -*- --*-

Aligning this up with the CPC's wait states (I'll add a NOP also, -*--, but any instruction is the same):
                                                                           
               vv the NOP's fetch T state is stretched by 1 T-state until the next gap
-*-- -*-- -*-- _*--                                                             
1234 1231 2341 _234                                                             
W.WW W.WW W.WW W.WW
           ^ note that this T state does align perfectly                       
     
So, even though the wait check occurs 1 T state later in the IO M-cycle compared to the others, because it follows a 3 T-state M cycle, it starts 1 T state earlier and so it aligns perfectly.
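The two aligned rows above can be regenerated mechanically from the `-*--` style patterns. The sketch below assumes, as in the diagrams, that the `.` of `W.WW` is the second clock of each group of four and that a `*` state is held (shown as `_`) until it lands on that gap. The helper name `align` is invented for the example.

```python
GAP = 1  # position of '.' within each "W.WW" group (0-based), as drawn above


def align(patterns):
    """Stretch M-cycle patterns so every '*' falls on the WAIT gap.

    Each pattern is a string such as "-*--" where '*' marks the
    T-state that samples WAIT. Stall clocks are emitted as '_'.
    """
    out = []
    for pattern in patterns:
        for state in pattern:
            if state == "*":
                while len(out) % 4 != GAP:   # hold until WAIT is released
                    out.append("_")
            out.append(state)
    row = "".join(out)
    return " ".join(row[i:i + 4] for i in range(0, len(row), 4))


IF, RW, IO = "-*--", "-*-", "--*-"
print(align([IF, IF, IO, IF]))   # OUT (C),r followed by a NOP
print(align([IF, RW, IO, IF]))   # OUT (n),A followed by a NOP
```

Counting the `_` characters in each row gives the number of inserted clocks: 4 for OUT (C),r plus NOP, and 1 for OUT (n),A plus NOP.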

Hopefully, that makes things clearer. If I've confused you more, I can try to explain it differently... :)
« Last Edit: 20:15, 11 January 15 by ralferoo »

Offline Executioner

  • Supporter
  • 6128 Plus
  • *
  • Posts: 790
  • Country: au
  • WinAPE Developer
    • WinAPE
  • Liked: 374
Re: CPC Z80 Commands and how long they take...
« Reply #44 on: 03:37, 12 January 15 »
I think that's almost exactly what I was saying. It really is as simple in the CPC as EVERY memory or I/O read/write operation is aligned with cycle n (0..3) of every 4 cycles. The latest JEMU source code proves that a Z80 implementation designed using only the fetch, mem read, mem write, I/O read and I/O write operations with timing exactly as per the Zilog user manual can be made to have exactly the same timing as a real CPC by aligning the /WAIT to one of the 4 cycles (I'm not sure which one it's actually high on; you'd have to either test with a CRO from reset or read the GA logic. It could also be determined by the exact position of palette changes etc.).

If you look at the way it actually works, it is possible for a NOP (or any other 4 T-State instruction) to take 7 T-States to complete, if the previous instruction took 5, 9, 13, 17 or 21 T-States.
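The simplest case of this (a 5 T-State instruction followed by a NOP) can be reproduced with a few lines. This is a sketch, not the JEMU code: it assumes the previous instruction is a single opcode-fetch cycle extended to N T-States, that /WAIT is sampled in T2, and that /WAIT is high on an arbitrarily chosen phase `GAP`. Longer instructions with several machine cycles stall internally as well and are not modelled here.

```python
GAP = 1  # assumed clock phase (mod 4) on which /WAIT is high


def run_cycle(t, length, check=1):
    """Advance clock t through one machine cycle of `length` T-States,
    stalling at the state that samples /WAIT until the gap comes round."""
    for i in range(length):
        if i == check:
            while t % 4 != GAP:
                t += 1
        t += 1
    return t


def nop_duration_after(prev_length):
    """Clocks taken by a NOP that follows a single-cycle instruction
    of prev_length T-States, starting from an aligned fetch."""
    t = run_cycle(0, prev_length)
    start = t
    return run_cycle(t, 4) - start


for n in (4, 5, 6, 7):
    print(n, "T-State instruction -> following NOP takes", nop_duration_after(n))
```

In this model the two instructions together always occupy a multiple of 4 clocks; what changes is how much of the padding is charged to the NOP.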

Offline Bryce

  • The Hardware Guy.
  • Supporter
  • 6128 Plus
  • *
  • Posts: 9.686
  • Country: wf
  • It's not broken, it just hasn't been fixed yet.
    • index.php?action=treasury
  • Liked: 2974
Re: CPC Z80 Commands and how long they take...
« Reply #45 on: 10:31, 12 January 15 »
Although I asked the question, I stopped understanding the posts in this thread back on page one :D

Bryce.

Offline opqa

  • CPC664
  • ***
  • Posts: 73
  • Country: es
  • Liked: 82
Re: CPC Z80 Commands and how long they take...
« Reply #46 on: 11:07, 12 January 15 »
"It really is as simple in the CPC as EVERY memory or I/O read/write operation is aligned with cycle n (0..3) of every 4 cycles."
Well, I disagree a little bit about this. As I explain at the end of this post, from the Z80's point of view the read operations can take place on either the first or the second cycle out of every 4. Opcode fetches always take place on the first cycle, but memory reads and I/O inputs take place one cycle later. Anyhow, this small detail doesn't change the overall timing.

Offline Executioner

  • Supporter
  • 6128 Plus
  • *
  • Posts: 790
  • Country: au
  • WinAPE Developer
    • WinAPE
  • Liked: 374
Re: CPC Z80 Commands and how long they take...
« Reply #47 on: 11:45, 12 January 15 »
"Well, I disagree a little bit about this, as I explain at the end of this post, from the z80 point of view, the read operations can take place either on first or second cycle out of every 4. Opcode fetches will take place always on first cycle, but memory reads and i/o inputs will take place one cycle later. Anyhow this small detail doesn't change the overall timing."

Actually, opcode fetches occur in the same cycle as /WAIT goes high, whereas memory reads and I/O occur on the next cycle after /WAIT goes high, so the data has to be available 1 T-State after /WAIT goes high. This does suggest that the GA doesn't do memory reads during those two cycles and that the Z80 can, but the /WAIT signal still only goes high for 1 cycle. It's only an assumption, but I'd think the internal operation is something like (using your terminology):

G1: /WAIT high, address and data bus multiplexed for Z80 use
G2: /WAIT low, address and data bus still for Z80 use
G3: /WAIT low, CRTC address on address bus, memory read into GA shift register
G4: /WAIT low, CRTC address + 1 on address bus, memory read into GA shift register

Offline Optimus

  • 464 Plus
  • *****
  • Posts: 348
  • Country: gr
  • Liked: 149
Re: CPC Z80 Commands and how long they take...
« Reply #48 on: 17:27, 12 January 15 »
Or alternatively, "Those who can, do; those who can't, teach."


I recently heard the same quote from a Greek friend. Too many teachers and professors here. But few who know how to code or do/did something practical in their career.

Offline Optimus

  • 464 Plus
  • *****
  • Posts: 348
  • Country: gr
  • Liked: 149
Re: CPC Z80 Commands and how long they take...
« Reply #49 on: 17:47, 12 January 15 »
"I remember Odiesoft's sources. Like a lot of people, he used MAXAM in ROM (I still prefer it that way), and his source was a collection of lines with dozens of Z80 instructions in every line."


Strange. I am doing that too. I even hate it when an assembler doesn't support this with semicolon.
It's kinda like grouping many opcodes that do one thing in my mind.
Else, you have those listings with one opcode below the other (and with TABs), but then one single routine could be five pages long, instead of summarized in a single page.
Or when I unroll loops manually, it's one line with many opcodes separated by semicolon. So I copy this line many times. Imagine if every next line of this was a new opcode..


As for the discussion, it reminds me of that little joke that says: Pick any two of: Elegant, Fast and Small.