Topic: The firmware is interrupting my code, POPping my return address – and losing it!

Offline db6128

  • 464 Plus
  • *****
  • Posts: 316
  • Country: gb
  • We don’t speak 8080 in this house.
  • Liked: 72
  • Likes Given: 44
Edit: D’oh! It looks like this problem was caused by my altering C' by using the alternate registers to change the border – specifically, the one-time event of targeting its INK using EXX:LD BC,&7F10:OUT (C),C. As I did already know but somehow managed not to think about for the purposes of this, C' is reserved by the firmware for storing the current ROM selection status.

So, altering C' in this way meant that bit 7 was no longer set, which meant that the ROM configuration could no longer be updated by the jumpblock and/or interrupt service routine, which presumably meant that at least some of the intended calls to ROM (via my replacing straight CALLs to ROM in the original ROM-based code with CALLs to the corresponding jumpblock entries in RAM) were instead hitting RAM and probably rolling across NOPs until they somehow ended up at usable code.

I’m quite baffled that the program ever worked at all with that! I seem to have fixed the problem, sorted out the stack, etc. by just using a register other than C' to initially target the border. (I thought for a while that another problem was my not having enabled the lower ROM before using the jumpblock, but the RST-based jumps take care of that.)
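
To make the difference concrete, here is a minimal sketch of the two ways of targeting the border. The second fragment is only an illustration of the fix: the colour byte &54 (hardware black) is an arbitrary example value, not something taken from the attached file.

Code: [Select]
; What went wrong: selecting the border through the alternate set
        exx
        ld      bc,&7f10        ; B' = Gate Array port, C' = &10 (select border)
        out     (c),c           ; works, but C' no longer holds the firmware's ROM byte (bit 7 clear)

; What works: the same OUT through the main set, leaving C' alone
        ld      bc,&7f10        ; B = Gate Array port, C = &10 (select border)
        out     (c),c
        ld      a,&54           ; example colour byte: hardware black
        out     (c),a           ; border changes; B'C' are untouched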

I’ll leave my original (misguided) exposition of the problem below my more general discussion of the tape-loading code, for anyone who is really bored. :D



Thanks to arnoldemu’s disassembly of the CPC 6128’s ROM – which, by the way, is a great resource to have made and must have taken ages! – I’ve been able to create a slightly altered version of CAS READ (the primitive that loads headerless files from tape), which includes a coloured border and the ability to fill the screen’s lines from top to bottom (rather than the character-based/‘venetian blind’ pattern associated with loading linearly).

So far, so good – I think! It doesn’t crash or anything… Well, not always! ;) Feel free to check the (unattractively formatted) ASM file attached to this post, and please also let me know if I’m tempting fate by messing up timings, and how I might fix them if so. I freely admit to having a terrible grasp of tape handling and maths in general, so I’m quite surprised this has worked at all.

Possibly very sloppy coding: I didn’t do anything to account for changes in timings introduced by the added border-setting and address-changing code, as I assumed the section that sets the edge-timer from R (which itself is zeroed after each bit, and now before my border-changing code) would do that for me; I hope that’s correct?

Still, it seems to work (most of the time…) and load screens as intended, so either way, I guess I’ve not pushed the limits of tolerance of the timing algorithm too far, at least for an emulator.

arnoldemu: I also got line-by-line loading of the screen working with your amended version of Toposoft’s loader from Blue Angel, so thanks for publishing that, too. I did have to actively account for timing in that one, as opposed to in CAS READ, but that was quite simple. However, my subsequent editing to support a 1.5× data rate was very quick, clumsy, and possibly totally wrong :P – so, again, I might be pushing my luck with the timings, and I haven’t tested it on my actual 6128 yet.



Anyhow, the main problem with my new CAS READ isn’t related to timings but instead occurs right at the end, when the routine is supposed to be cleaning up and exiting, just after it re-enables interrupts:
Code: [Select]
29d0 d1        pop     de        ;[Get previous state of PPI port C into D]
29d1 f5        push    af
[…]
29dc fb        ei                        ;; enable interrupts

[Interrupt handler takes over here and does an RST &38, pushing PC onto the stack but losing that address before it can ever return]

29dd 7a        ld      a,d
29de cdc12b    call    $2bc1            ;; CAS RESTORE MOTOR
29e1 f1        pop     af
29e2 c9        ret
The interrupt handler’s RST &38 takes over immediately after the EI, thus pushing my return address (the LD A,D after the EI, equivalent to &29DE in the ROM)…
Code: [Select]
03e7 f3        di     
03e8 08        ex      af,af'
03e9 3833      jr      c,$041e          ; detect external interrupt
…checks the alternate carry – and, finding it to be set, jumps to the routine for externally sourced interrupts…
Code: [Select]
;; handle external interrupt
041e 08        ex      af,af'
041f e1        pop     hl
0420 f5        push    af
0421 cbd1      set     2,c                ; disable lower rom
0423 ed49      out     (c),c            ; set rom config/mode etc
0425 cd3b00    call    $003b            ; LOW: EXT INTERRUPT. Patchable by the user
0428 18cf      jr      $03f9            ; return to interrupt processing.
…which swallows my return address via POP HL. Yet I could not see anywhere a PUSH HL or JP HL that would make my return address available for proper usage, and it seems that eventually HL is corrupted anyway.

This means that, when the overall interrupt ends and tries to RET, it actually ‘returns’ to my PUSHed AF from shortly before interrupts were re-enabled. This is usually some low address that starts a NOP-slide back into the new CAS READ, but in any case, it’s unintended behaviour, and I’m at a loss to figure it out.

I’ve probably missed something simple related to handling registers and otherwise playing nicely with the firmware, but I’d appreciate if someone could suggest a cause for this. I presume that there’s supposed to be a PUSH HL or JP HL somewhere that would put my return address back into play? Is this to do with the alternate carry being set when it shouldn’t be, or what?

In any case, I’m baffled enough by the tape-reading code, even though that’s working! So, this unexpected need to consider the finer points of interrupt handling is a bit too complex for me to understand at the moment. Thanks in advance for any help!
« Last Edit: 05:22, 27 December 12 by db6128 »
[The owner of one of the few existing cartridges of Chase HQ 2] mentioned to me that unless someone could find a way to guarantee the code wouldn't be duplicated to anyone else, he wouldn't be interested.
Did he also say things like "My treasureeeeee" and is he a little grey guy?

Offline SyX

  • 6128 Plus
  • ******
  • Posts: 1.129
  • Country: br
  • Liked: 1121
  • Likes Given: 1871
I'm having déjà vu today :P

Take a look at Appendix 11 in the firmware guide ;)

Offline db6128

Quote
I'm having déjà vu today :P
Why, did the same thing happen to you before? ;)

Quote
Take a look at Appendix 11 in the firmware guide ;)
Thanks; this is too simple to need any of those workarounds, but the other information – such as the confirmation that A', H'L', and D'E' can be used (although I’ll have to be careful with carry') – is useful. I don’t know how I forgot about C' for so long. Having forgotten that, the symptoms didn’t point directly to it as being the cause. I suppose any kind of crazy results could have happened due to the ROM selector being corrupted – it probably would have been easier if it caused a more obvious problem like a crash, haha.

I’m probably going to strip my routine down to be firmware-independent, anyway; at the moment, it’s almost an exact translation to RAM just with a few sections added, whereas I’d like to get it as small as possible. I imagine I’ll keep it compatible with the original version: for one thing, I don’t see any point in changing the encoding to what would end up being a slight variation of the official one, as Amstrad’s encoding seems very clever and allows variable speed, which is the main thing – and also, all the maths needed to change things would hurt my head. :D

Offline SyX

Quote
Why, did the same thing happen to you before? ;)
Sure ;) but the reason is that I've already answered that question twice today, hehehe :P

Offline TFM

  • Visit the mysteries of the CPC at www.futureos.de
  • Supporter
  • 6128 Plus
  • *
  • Posts: 9.899
  • Country: aq
  • Space Chicken for FutureOS is free!
    • FutureOS - The revolution on CPC!
  • Liked: 1983
  • Likes Given: 4650
Quote
Edit: D’oh! It looks like this problem was caused by my altering C' ...
Yes, that's the problem with the firmware or CP/M: you can't use the second register set, so you're left with only half a Z80. It's a pity.
 
TFM of FutureSoft
Also visit the CPC and Plus users favorite OS: FutureOS - The Revolution on CPC6128 and 6128Plus

Offline rpalmer

  • 6128 Plus
  • ******
  • Posts: 553
  • Country: au
  • Liked: 353
  • Likes Given: 18
From what I have learned, it is best not to use the alternate register set for anything.
I understand that the Alternate Register Set (ARS) was assigned to the OS, while the Normal Register Set (NRS) was for user-developed programs.

This has the effect that the OS does not need to store the registers in memory when it needs registers for its own functions.

Offline TFM

This would only make sense if the Z80 were switched between application and OS several thousand times every second. With the CPC's 300 interrupts per second, this is clearly not the case, so the second set is just wasted.
You argue like a classical 8080 programmer, but that way a lot of power is wasted.

In my programs I use all registers, and using the second register set gives me a steep increase in processing speed.
 

Offline Prodatron

  • 6128 Plus
  • ******
  • Posts: 833
  • Country: de
  • Back on the Z80
    • SymbOS SYmbiosis Multitasking Based Operating System
  • Liked: 1061
  • Likes Given: 556
Quote
You argue like a classical 8080 programmer, but that way a lot of power is wasted.
The 8080 doesn't have a second register set. The main reason the Z80 introduced it was to speed up context switching (especially for IRQ routines).

Quote
Yes, that's the problem with the firmware or CP/M. You can't use the second register set. So you're left with only half a Z80. It's a pity.
Why is an additional feature a pity? The CPC firmware provides quite powerful interrupt handling routines, which is a great achievement compared to many other 8-bit machines of the same time. If you want to use or keep them, you can protect the second register set (you can still use it, but you have to restore it and lock INTs during usage). If you don't need the firmware routines, you can lock the INTs and/or patch #38, which isn't any effort at all (it's always much simpler to restrict features than to introduce new ones).
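
As a rough sketch of the first option (protecting the second set while borrowing it), something like this should do. It is only an outline: it assumes the work in between is short, and if EX AF,AF' is used as well, AF' would need the same save and restore.

Code: [Select]
        di                      ; lock INTs so the firmware never sees a borrowed BC'
        exx
        push    bc              ; save the firmware's B'C'
        ; (free use of B'C', D'E' and H'L' goes here)
        pop     bc              ; put the firmware's B'C' back
        exx
        ei

And the second option, replacing the handler at #38 with a stub that just re-enables interrupts and returns. This assumes that RAM is what the CPU sees at #38 when the interrupt arrives (lower ROM switched out) and that no firmware routine relying on interrupts is called afterwards.

Code: [Select]
        di
        ld      hl,&c9fb        ; &FB = EI, &C9 = RET (stored low byte first)
        ld      (&0038),hl      ; #38 now holds EI : RET
        ei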

CU,
Prodatron

GRAPHICAL Z80 MULTITASKING OPERATING SYSTEM

Offline db6128

SyX: Haha, what a coincidence!

Re the alternate registers: Quite interesting discussion. I have no problem with how the firmware uses them; I appreciate the fact that the Z80 has them at all, and if I really need them, I can disable interrupts or be careful in other ways (such as the various suggestions in SOFT 968). In that case, they can be a massive benefit to programming.

Back to tapes, something I’m really curious about: Is the R register not used for RAM refreshing at all in the CPC? I guess not, because otherwise, I can’t see how CAS READ can use it to time edges, as that involves resetting it to 0 after each previous edge. That behaviour suggests it is not used; otherwise, large chunks of RAM would be skipped as R is clipped to a small set of values (e.g. &1A after short pulses in fast loaders). Someone on PUSH'n'POP said the CRTC is used instead, and the document on Kevin’s site implies similarly but is a bit confusing/confused and seems to imply that the CRTC is only there as a ‘backup’ refresher.
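
For reference, the edge-timing idea I mean looks roughly like this. It is a simplified sketch, not the ROM's actual code: the loop has no timeout and the label name is made up. It assumes the cassette input is read from bit 7 of PPI port B (&F5xx) and that interrupts are disabled.

Code: [Select]
        ld      b,&f5           ; PPI port B; bit 7 is the cassette data input
        in      a,(c)
        and     &80
        ld      d,a             ; remember the current level
        xor     a
        ld      r,a             ; restart the count from 0
wait_edge:
        in      a,(c)
        and     &80
        cp      d
        jr      z,wait_edge     ; loop until the level changes
        ld      a,r             ; R has counted opcode fetches since the reset

Since R counts opcode fetches rather than clock cycles (and only in its low 7 bits), the value is just a relative measure of the pulse length, which seems to be enough for telling short pulses from long ones.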
« Last Edit: 01:30, 29 December 12 by db6128 »

Offline arnoldemu

  • Supporter
  • 6128 Plus
  • *
  • Posts: 5.336
  • Country: gb
    • Unofficial Amstrad WWW Resource
  • Liked: 2275
  • Likes Given: 3478
R register is not used for refresh.

The CRTC generates addresses. The Gate-Array reads 2 bytes for each CRTC address and converts it to pixel data.

It is the action of the Gate-Array, and the Z80 reading the memory that keeps it refreshed.

So, to ensure the memory is refreshed correctly, the screen needs to be set up so that the Gate-Array will refresh the memory correctly.

This fact is mentioned in the Amstrad Plus documents in a cryptic way (split screen, ensuring A1-A8 are not "disturbed" in order to ensure ram refresh).

In addition the actual RAM chips used (1 bit on CPC or 4 bit on Plus), and how it is refreshed play a part too.

I believe reading a single memory address may cause an entire row in the RAM chip to be refreshed...

The whole memory refresh thing is not exactly documented I don't think....

btw, the R register can be used this way on the Spectrum too, and I think the Spectrum uses R for refresh? Or perhaps it doesn't.

EDIT: I remember NWC telling me that if you do this:

Code: [Select]
LD BC,&BC06    ; select CRTC register 6 (vertical displayed)
OUT (C),C
LD BC,&BD00    ; write 0 to it, so no character rows are displayed
OUT (C),C

for too long, then the RAM can get corrupted.

I'll try and find my docs about it.
« Last Edit: 15:59, 29 December 12 by arnoldemu »
My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

Offline db6128

Yeah, I’m sure that the GA’s accesses could cover all memory, depending on how the refresh actually works; it’s really the address-decoding that baffles me. :D It does seem from the Z80-general description on your site that an access of a given address causes its entire row to be refreshed. So, although I don’t know exactly how it works, obviously the GA’s use of 16 kB is sufficient to refresh all the rows.

My guess (which is probably already documented somewhere) is that the CPC just takes the LSB of the current address on-screen and sends that on the RFSH pins, which I imagine would cycle from 0–255 and thus cover all the rows in a single 16 kB chip (as would the 0–127 of R); some other logic must be used to cover the other three.

The Spectrum does use R for refresh; one of the sites that’s come up on several of my searches for info has a BASIC program that will loop and hold R around a certain value, thus causing corruption/degradation of data in RAM.

It’d be interesting if you find any more info, so thanks for having a look!
« Last Edit: 16:20, 29 December 12 by db6128 »

Offline TFM

Quote
The 8080 doesn't have a second register set. The main reason the Z80 introduced it was to speed up context switching (especially for IRQ routines).
Why is an additional feature a pity? The CPC firmware provides quite powerful interrupt handling routines, which is a great achievement compared to many other 8-bit machines of the same time. If you want to use or keep them, you can protect the second register set (you can still use it, but you have to restore it and lock INTs during usage). If you don't need the firmware routines, you can lock the INTs and/or patch #38, which isn't any effort at all (it's always much simpler to restrict features than to introduce new ones).
Right, the 8080 has no second register set. Therefore it's a shortcoming to program a Z80 like an 8080. :)

Well, different producers and users have different reasons for doing things. But when it comes to a CPU, we can judge on the basis of logic, independent of the reasons people had.

Sure, the CPC's interrupt system is great. It would be even greater if it did not use the second register set. If the second register set weren't tied up, the interrupt system would be less than 1% slower, but programs could use the second register set and would gain 20-30% in speed!

And I have to clarify this: when I talk about using the second register set, I mean using it in a meaningful way, that is, without having to push and pop it every time. As soon as I have to save and restore the second register set, there is no point in using it any longer.

IMO the point of the second register set is to give the OS and applications a way to switch registers quickly and often, so it basically doubles the number of registers (the quickly accessible ones, ignoring slow registers like IX and IY).

Finally, having twice as many registers freely available speeds up programs enormously. Reserving the second register set for the interrupt speeds up the interrupt routine by an amount so small that it is not significant, but it slows down normal programs enormously. As an example, take the math libraries for the Z3Plus system.
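
A small illustration of the kind of thing I mean (just a sketch; the register assignment is an arbitrary choice, and it assumes nothing else, such as the firmware's interrupt handler, claims the second set while this runs): a 32-bit addition with the low words in the main set and the high words in the second set.

Code: [Select]
; H'L':HL = H'L':HL + D'E':DE
        add     hl,de           ; add the low words; the carry goes to the flag
        exx                     ; switch sets; EXX leaves the flags alone
        adc     hl,de           ; add the high words plus the carry
        exx                     ; back to the main set

Without the second set, the high words would have to live in memory, on the stack or in the index registers.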
 

Offline db6128

But, IIRC, the main reason the second set was implemented at all was to allow fast switching, primarily for interrupts. So, although an extremely useful side-effect is the ability to use them as additional registers in normal programs, we probably would not have them at all if interrupts hadn’t existed to encourage their invention in the first place. So, I think you’re arguing that their use for interrupts is a hindrance, but that’s the main reason they exist at all. Just saying that’s slightly inverted, not disagreeing with you about their usefulness or anything. :)

Offline TFM

As I mentioned, it does not matter for which reason the second register set was introduced.
 
It matters only how to use them the most efficient way.
 
Learn to think like a machine  ;)
 
 
(p.s.: Where is it written down that they were introduced for interrupts? I think that's just an urban legend. Otherwise, how can it be explained that the Z380 has four sets of registers?)
(p.p.s.: "The EXX op-code was ugly to write and read, but was the reason why everybody doing number crunching in 8-bit liked the Z80 over the 8085." This can be found at: Z80 Number Cruncher)
 
« Last Edit: 09:20, 31 December 12 by TFM/FS »

Offline ralferoo

  • Supporter
  • 6128 Plus
  • *
  • Posts: 970
  • Country: gb
  • Liked: 583
  • Likes Given: 222
Quote
Yeah, I’m sure that the GA’s accesses could cover all memory, depending on how the refresh actually works; it’s really the address-decoding that baffles me. :D It does seem from the Z80-general description on your site that an access of a given address causes its entire row to be refreshed. So, although I don’t know exactly how it works, obviously the GA’s use of 16 kB is sufficient to refresh all the rows.

My guess (which is probably already documented somewhere) is that the CPC just takes the LSB of the current address on-screen and sends that on the RFSH pins, which I imagine would cycle from 0–255 and thus cover all the rows in a single 16 kB chip (as would the 0–127 of R); some other logic must be used to cover the other three.
With DRAM, you just need to select each row once every *mumble* ms (it might be as slow as 32ms IIRC).

Anyway, the CPC does a complete RAS/CAS strobe for every byte transferred, even though it does 2 video bytes per 1MHz cycle (the gate array forces the CPU to operate its T1 state on a 4MHz clock edge that's a multiple of 4). So, by making the LSB the DRAM row instead of the column, we cycle through every row as long as we cycle through screen RAM in the normal fashion. If you deliberately enabled a short display and forced that not to happen, then you could also get data loss in the RAM if the CPU never accessed RAM at those addresses either.

The Z80 refresh register is only 7 bits wide, so it doesn't work if you have 64K of DRAM unless you have a trick like the CPC's. The Spectrum, by contrast, has the MSB for the row and the LSB for the column, which means it can select 2 adjacent bytes without changing the row again. As a consequence, it also cannot use 64K×1 chips like the CPC does, which is why it's actually implemented as a 16KB and a 32KB RAM for a total of 48K.
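
The 7-bit point is easy to see from code, for what it's worth (a quick sketch, nothing CPC-specific): only the low 7 bits of R count, and bit 7 just keeps whatever LD R,A last put there.

Code: [Select]
        ld      a,&80
        ld      r,a             ; bit 7 of R set by hand
        nop
        ld      a,r             ; low 7 bits have counted on; bit 7 is still set
        and     &80             ; A = &80 here

So R on its own can only ever step through 128 row addresses.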