Implementing protected memory

martin464 · 04:01, 07 March 23

I was wondering about whether protected memory could be done on the CPC, in theory (maybe for fun)

An application already can run in its own 64k address space in the C2 mode. That's a basic building block for it
But it won't handle a crashing application disabling interrupts, changing IM mode, or paging ram itself and going crazy where it shouldn't

This is where the hardware comes in
Maybe something could monitor the gate array, storing the ram mode, and prevent ram paging unless done in C0 mode and so scupper illegal attempts to access the gate array

For interrupts, it could fire an NMI interrupt which can't be disabled, or somehow prevent DI from working and also detect the normal interrupt and page C0 ram in so interrupts only run on the base 64k

If such a device were crafted I think you'd get a protected memory architecture, but is it possible? Maybe all the bases couldn't be covered and there's some reason it couldn't be done

andycadley · 09:33, 07 March 23

I think the problem is that such a device would need to be implemented not only in the gate array, but also every memory expansion unit. And that's before you even think about trying to stop DI working (or the enormous concurrency problems that might occur if you could).

I also suspect the practical benefit would be near zero. Even 16-bit machines like the Mac, ST and Amiga managed all sorts of multitasking without protected memory.

martin464 · 12:26, 07 March 23

haha yes i avoided talking about practical benefits of this!
it's just interesting that something that seemed completely impossible might be possible, in theory

i was thinking that the device would just monitor the gate array via the expansion bus (since registers are read only) and maintain a copy of the ram mode, then yes, i didn't think it could over-ride or hack the gate array itself, leaving everything intact on the cpc internals - but a custom ram expansion would implement some logic. like when interrupt is active select C0
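The "maintain a copy of the ram mode" idea can be sketched as a small model of the device. This is only an illustration: the class and method names are made up, and the port decode (A15 low with data bits 7 and 6 both set, as in OUT &7Fxx,&C2) is stated here as an assumption rather than taken from the thread.

```python
class RamModeShadow:
    """Hypothetical bus-snooping model that mirrors the CPC RAM configuration."""

    def __init__(self):
        self.config = 0  # 0 stands for mode C0 (base 64k only)

    @staticmethod
    def is_ram_config_write(port, data):
        # Assumption: a RAM configuration write has A15 low and data bits 7..6 set.
        return (port & 0x8000) == 0 and (data & 0xC0) == 0xC0

    def on_io_write(self, port, data):
        """Return True if the policy would let this write through."""
        if not self.is_ram_config_write(port, data):
            return True  # not a paging write; this sketch has no opinion on it
        if self.config != 0:
            return False  # paging attempted outside C0: refuse and keep the old copy
        self.config = data & 0x07
        return True

    def force_base(self):
        # What the device itself would do when an interrupt arrives: select C0.
        self.config = 0
```

In this sketch only code running in C0 can change the mode, and the only way back to C0 from an application bank is the device forcing it, which matches the "when interrupt is active select C0" rule.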

stopping DI would only be when it sees C2 mode is paged in. Only the O/S running in base 64k should ever do this. If the device couldn't re-enable them through some trickery (like paging in a blank 64k containing EI who cares what PC is!) an NMI is the other option but it might still not be possible to work around some code changing IM mode i guess

andycadley · 13:30, 07 March 23

The problem is you really need a CPU with a kernel mode to pull this kind of thing off because you need to trap a whole bunch of instructions.

You need to not only prevent DI, but you also need to prevent changing interrupt mode. And any kind of writes to the paging ports.

And, even if you accomplish that, you can still only have one program running at a time, which begs the question "What are you protecting memory from?" You can't really add pre-emptive scheduling without trapping a whole bunch more hardware functionality to prevent, e.g., colour changes not sanctioned by the OS (and the OS would have to co-ordinate them anyway since it is the only thing that can disable interrupts and you'd need writes to be atomic since palette changes take multiple steps).

Prodatron · 14:04, 07 March 23

Quote from: martin464 on 04:01, 07 March 23
An application already can run in its own 64k address space in the C2 mode. That's a basic building block for it

In many cases this is already enough to prevent a buggy application from crashing the whole system. At least this is the experience in SymbOS, where you can usually kill a broken/frozen application with the Task Manager and still have the system in a healthy state.

Apart from that, did you see this?
https://www.cpcwiki.eu/forum/general-discussion/the-z80-s-secret-feature-discovered-after-40-years!/msg221596/#msg221596
"The Z80 has a protected mode"

martin464 · 14:12, 07 March 23

ok, let's look at those issues

Interrupt mode might be workaroundable. i assume it's not possible to tell what mode it is in from the bus
but if it's set to 1 the rst 38 catches it. if it's set to 0, doesn't it go to rst 38 also by default? if 2 then the 16bit vector would have to reside in all possible combinations of i, so it would mean a special 64k block just to try and catch it away from screen ram, so there would be a 64k "interrupt ram bank" (slight mod to what i said originally)

as for colour changes and so on, the device would monitor the io ports on the bus. if it detected activity from a ram bank not the o/s one, it would fire an nmi and the o/s could stop the offending process before the cpu had time to process it?

i'll admit this is pretty crazy stuff, i personally can't think of a use for it but maybe someone can come up with one, nope. i just tried again, still nothing.

andycadley · 14:51, 07 March 23

Theoretically you could monitor interrupt mode by looking for instruction reads on the bus. IIRC there is a signal to indicate that it's the start of an instruction fetch and all the IM x instructions are just two bytes (an ED prefix plus an opcode), so doable in theory.

IM0 doesn't go to #38 by default, it just executes whatever random instruction gets placed on the bus - the CPC hardware isn't compatible with it so it could be ignored, as there's no reason to use it other than crashing the machine.

IM 2 complicates matters because the value placed on the bus isn't necessarily even so it's a bit more tricky than just a bank full of addresses. However, unless it's a Plus machine, the value on the bus is "floating" so the hardware could presumably put a known 8 bit vector on the bus instead and your 64K interrupt space could handle all combinations of I register, so it's probably solvable.
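To make the IM 2 arithmetic concrete: the CPU reads a 16-bit pointer from address (I << 8) | bus byte and jumps to it. The sketch below shows one way a whole 64K bank could cope with every I value and with odd bus bytes, by filling the bank with a single byte. The fill value &01 and the function name are assumptions for illustration, not something proposed in the thread.

```python
def im2_handler_address(i_reg, bus_byte, memory):
    """Address the Z80 jumps to in IM 2: a 16-bit pointer read from (I << 8) | bus_byte."""
    pointer = ((i_reg & 0xFF) << 8) | (bus_byte & 0xFF)
    low = memory[pointer]
    high = memory[(pointer + 1) & 0xFFFF]  # the read wraps at the top of the address space
    return (high << 8) | low


FILL = 0x01  # hypothetical fill value; any byte works, the handler lands at FILL * &0101
int_bank = bytes([FILL]) * 0x10000

# Every combination of I register and bus byte resolves to the same handler address.
handlers = {
    im2_handler_address(i, bus, int_bank)
    for i in range(256)
    for bus in range(256)
}
```

Because both pointer bytes are always the fill byte, it no longer matters whether the bus value is even or odd, which is the complication described above.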

But monitoring the bus for IO activity means you're too late. By the time you raised an NMI, the IO instruction has already happened and the hardware will have already carried out whatever was requested of it. At best your "OS" could try to undo it or track it, I suppose, but again to what real end? You've kind of created a more complex version of the Multiface, which does something similar by paging its "OS" in when the NMI is generated by pressing the button (and otherwise keeps track of hardware state by bus sniffing).

You're still stuck running a single 64K program. It can't trash any other memory or do anything to alter the hardware configuration, but there isn't anything else to trash either, and you've had to add an enormous bunch of hardware to, in effect, cripple what the only running program can do.

If the goal is to provide some functionality for running multiple programs at once, you might as well use an approach like SymbOS and just accept that bad programs can bring the system down. If the goal is to switch between games quickly, there are probably better approaches that flashload data rather than worrying about providing an OS.

There's an intriguing thought process around "Could a Z80 based computer have 16-bit like OS functionality, using only the base CPU" (see the video linked above) but I suspect the real answer is "probably not" and certainly if it did, it wouldn't look like a CPC architecturally as you'd design the hardware layout around the multitasking requirements rather than the other way around, if you see what I mean.

Animalgril987 · 18:22, 07 March 23

During an I/O write, /WR goes low after /IORQ, so there will be an extremely short window before the hardware changes anything, but probably far too short to stop the Z80 completing the write.

andycadley · 18:26, 07 March 23

Quote from: Animalgril987 on 18:22, 07 March 23
During an I/O write, /WR goes low after /IORQ, so there will be an extremely short window before the hardware changes anything, but probably far too short to stop the Z80 completing the write.

Yes, especially because even if you trigger an NMI the Z80 won't notice until it's completed the IO instruction (since it only checks for an interrupt before starting an instruction). At best you'd be trying to override the signals on the bus to force the hardware not to see the write, but I suspect it'd be problematic.

martin464 · 10:04, 08 March 23

in practice it might not matter that such an operation was caught after the event, most IO operations select a register first. you could probably set it up so any untoward write to a port would be harmless. the only thing that really matters is the gate array and crtc. if you can set it up to be harmless (if) it doesn't matter..... this level of detail is needed, otherwise people will say it's not a proper protected mode, if there's some edge case that could break it

the idea i had for the interrupt ram bank was when irq detected - page in the 64k which is filled almost entirely with IM 1 and EI instructions, eventually the PC will find code at &38 that switches bank and runs real interrupt. it might be foolproof (might)
also, no special ram expansion would be needed cause they would be plugged into the device and use whatever signals it cooked up

hackers are getting more sophisticated, how long before they find a way to hack our CPCs? Now with the new patented protected memory architecture they will be denied and our CPCs will be more secure than the Bank of England

andycadley · 11:06, 08 March 23

Well protected memory doesn't really protect you from hackers, they'd be running code in-process and so can trash things as they like. It's really about stopping bugs in one program crashing another.

I'm not sure you could make IO "safe" following that method either. You could detect a register select and intercept it, but what then? Your interrupt handler would have to inspect memory where the PC was and try to figure out what register it was writing to. I don't think there is a "safe" register you could select instead to effectively dispose of future writes, so instead you'd probably need to scan ahead and try and re-write code to remove further OUTs. Which is likely to fail in most cases because you just can't know how many (if any) writes you'd see so it's not obvious how many instructions ahead you'd need to scan.

And if you did, you've still now got the problem that a program can't e.g. change screen colours. And can't communicate with the OS, because the OS is all paged out and the program is supposed to have all 64K to do with as it pleases....

Bread80 · 11:40, 08 March 23

I'd suggest attaching a microcontroller to the bus. If this monitors every instruction being read it can listen for illegal instructions and act as necessary. I.e. it could assert ROMDIS or RAMDIS and inject an instruction of its own. Either a NOP to skip the instruction or, preferably, raise an exception with the task manager.
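As a rough illustration of "listen for illegal instructions", here is a minimal sketch of the classification step only. It assumes the monitor sees just the bytes read during opcode-fetch (M1) cycles, so operand bytes never appear in the stream, and that after a DD/FD CB sequence the displacement and final opcode are not fetched with M1. Only documented opcodes are listed; the table contents and function name are illustrative choices, not an existing design.

```python
# Documented Z80 opcodes that an application should not be allowed to execute.
PRIVILEGED_PLAIN = {0xF3: "DI", 0x76: "HALT", 0xD3: "OUT (n),A"}
PRIVILEGED_ED = {
    0x46: "IM 0", 0x56: "IM 1", 0x5E: "IM 2",
    0x41: "OUT (C),B", 0x49: "OUT (C),C", 0x51: "OUT (C),D", 0x59: "OUT (C),E",
    0x61: "OUT (C),H", 0x69: "OUT (C),L", 0x79: "OUT (C),A",
    0xA3: "OUTI", 0xAB: "OUTD", 0xB3: "OTIR", 0xBB: "OTDR",
}


def flag_privileged(m1_bytes):
    """Return the names of privileged instructions seen in a stream of opcode-fetch bytes."""
    flagged = []
    state = "plain"
    for byte in m1_bytes:
        if state == "ed":
            if byte in PRIVILEGED_ED:
                flagged.append(PRIVILEGED_ED[byte])
            state = "plain"
        elif state == "cb":
            state = "plain"  # bit/rotate group: never privileged
        elif byte == 0xED:
            state = "ed"
        elif byte == 0xCB:
            # Assumption: after DD/FD, the rest of a CB instruction is not M1-fetched.
            state = "plain" if state == "index" else "cb"
        elif byte in (0xDD, 0xFD):
            state = "index"
        else:
            if byte in PRIVILEGED_PLAIN:
                flagged.append(PRIVILEGED_PLAIN[byte])
            state = "plain"
    return flagged
```

Tracking the prefix state is what keeps a byte such as &F3 inside a CB-prefixed instruction from being mistaken for DI. Acting on a match (asserting RAMDIS and injecting a NOP in time) is a separate hardware timing problem that this sketch does not address.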

BTW RAM banking is handled by the GAL, not by the gate array. If you're mostly concerned about crashed software, rather than malicious code, you could add some extra hardware to validate writes to the GAL in some way.

And again, if recovery from crashes is the focus, how about a simple button? Pressing it forces execution to a specific address and RAM setting? Again, might need a microcontroller to handle all that but not too complex.

martin464 · 12:50, 08 March 23

Andy, all good points you raise!
I think any reading ahead of code is a bodge too far, even for me. I see that as a non-starter

But monitoring executing code that Bread suggests (and I think you suggested it too) sure, so long as it was possible to properly identify it as code not data or there be any confusion from multi-byte instructions. This would be even better, like a shell around the cpu almost as if it had kernel only instructions

But wouldn't the device know the port from its monitoring address/data bus? To pass that onto the o/s for rectifying
A 'safe' register counts as any that is able to be disposed of, eg the gate array could be left on register 0, so an unwanted write would only trash the border colours and could be corrected

Ah, when you say left with a program that can't change screen colours or communicate with the OS, you're asking me how does that work

The idea is, the program can call the o/s by other nefarious means. Such as page 0 being replaced by the device as if it were a 256 byte rom that couldn't be changed. The privilege is code executing from page 0 (I'm making this up as I go along aren't I?) is allowed to switch banks and call the o/s to do things it wants done

Alternatively the device could have its own port just for making o/s calls and the program sets up a data structure for parameters or collect them up and store internally. The device would know an output to this port means start sending the CPU instructions to jump to a certain address in bank 0 and do it that way

Yes a bullet proof O/S is probably the main attraction
But imagine, say a CPC on-line with an o/s that supports a multi-user command line environment accessed via a web-page, then the users could compile code or do anything safely. Along comes the hacker, aha, I will crash this cpc and they are foiled, the CPC sends out one last message before disconnecting them, perhaps an appropriate piece of ascii art

andycadley · 13:10, 08 March 23

Quote from: martin464 on 12:50, 08 March 23
But imagine, say a CPC on-line with an o/s that supports a multi-user command line environment accessed via a web-page, then the users could compile code or do anything safely.

The problem is that there are so many much more complex problems to solve once you have multiple users/programs that memory protection starts to be the least of your worries. The lack of easy code relocation, for example, is already forcing you to allocate 64K to every program and now you're having to try and intercept calls into the bottom part of the address space to trap OS calls so you can't run any existing software (even BASIC) and the design just starts to get clunkier.

Like I said previously, if you wanted a Z80 based machine that could do this, you'd do a lot differently. Probably a ROM in the lower few K and IO ports that are only considered paged in when a RST instruction has executed, forcing all IO access to go via firmware routines. Not being able to have easily relocatable code or hardware memory protection would still really limit what can be achieved on a purely Z80 system though and it certainly wouldn't look like a CPC architecture.

GUNHED · 13:34, 08 March 23

Quote from: martin464 on 04:01, 07 March 23
I was wondering about whether protected memory could be done on the CPC, in theory (maybe for fun)

That is actually very easy!

All you need is a device like SF2, FlashGordon, MegaFlash, X-MEM or the M4 expansion.
They all provide ways to unlock ROMs, to write to them, and to relock them again.

So, just have your software running between &C000-&FFFF and you have protected memory.

In this setup it's useful to move the screen RAM from &C000 to &4000 using RAM mode &C3.

zhulien · 18:47, 08 March 23

If you want real protected memory, then ROM is the way to go. For almost-protected memory, use a ROM or RAM that can be set by the CPC itself - remember, if you can do it on CPC, then others can too. If you only mean to make software better behaved (i.e. in a multitasking environment), you could do something like some Amiga utilities do, which is to track memory manually and raise a fault if your background memory tracker finds something wrong. If someone was able to help me get the hardware of the Raspberry Pi bus interface working - then we can monitor all memory access of the CPC on another display in realtime. This could also be done with a microcontroller - anyone want to make a CPC debug M4 card? Must have additional display, the ability to track memory and I/O accesses and other bus activity and the ability for CPC software to also write to it for display.

zhulien · 18:51, 08 March 23

If you look at my MCP implementation which currently runs 16kb tasks in other 16kb blocks - it can both resume from a CPC reset and recover too. The purpose of this idea was specifically for development.

martin464 · 23:21, 09 March 23

I was going to leave it at this... in case anyone was curious... i can't get much further with it
But i got a bit further and am going to write a spec for how this might be done which I will attach, cause i think the cpc can do this
In no way am I trying to say this is a good idea, but it's also not a bad one!

Quote: The lack of easy code relocation, for example, is already forcing you to allocate 64K to every program and now you're having to try and intercept calls into the bottom part of the address space to trap OS calls so you can't run any existing software (even BASIC) and the design just starts to get clunkier.

It's not the lack of code relocation that means 64k allocation. It's that the 16 bit address bus provides a handy sized chunk for giving applications their own ram which won't contain other applications' data in it. This makes things a lot simpler than a PC with a larger address bus where the full allocation would be too big to do this... for once this limitation becomes an advantage! Without this it could never work and there's plenty of 64k chunks in a 2/4mb CPC

The call intercept is a hack but i think there's a better way, that's less kludgey to call the O/S. Will chuck all that in the fag packet spec

But this whole idea is for a server type o/s or it's kind of pointless, not imagining for running BASIC or any existing applications.
It's something specialist for anyone wanting to do multi-user stuff maybe

Prodatron · 00:29, 10 March 23

Quote from: andycadley on 13:10, 08 March 23
The lack of easy code relocation, for example, is already forcing you to allocate 64K to every program

At least WinApe (since 2006) and SjAsmPlus (since a few years) are supporting the generation of relocation tables for SymbOS apps. Sounds strange to waste full 64K only for one single app, when you can run maybe even 5-10 inside such an address space.

martin464 · 10:16, 10 March 23

Prod, true, but that's the trade off. To have completely secure applications there's no other way to isolate ram from one another. But, wow, it is possible! To even be able to do this is pretty amazing I feel

With 4mb ram that's 64 of these blocks. That's probably about right for what the Z80 can handle in terms of CPU power, so the waste may turn out not to really matter. Other limits are hit first

OK Here's the fag packet spec. I really will leave it at that cause i'm not a hardware person i couldn't do anything with it!

CPC Protected memory architecture

The goal of protected memory is to guarantee no application can ever crash the system or access data outside of its allocated memory

This is achieved on the cpc by:
1. Taking advantage of the 16bit address bus of 8 bit architecture
All applications run in their own 64k address space (RAM mode C2)
So RAM is allocated in 64k blocks to an application and belongs to it alone
There is no data from other applications in the same address space

2. A new protected memory interface device

This expansion device provides the rest of the missing functionality for protected memory
It works with all ram expansions which would be connected to it

One main task is to make dangerous CPU instructions harmless so no application code can crash the system
It also ensures interrupts are handled in the base 64k ram as interrupt code must not be in any application 64k block unlike the usual C2 mode
It also supports applications calling the o/s
A potential feature might be to enable alternative ram banking configurations, C2 mode with 1 bank from another expansion bank - this is not to support protected memory but to enable applications to more efficiently work with more than 64k

The device, like the multiface, tracks the gate array and knows the ram paging state at all times
The o/s will know which application is currently in process and controls ram paging except where noted below

The device combines its knowledge of the gate array ram config + its own logic to select ram banks in the expanded memory. A bit like a 2nd gate array

OUT instruction + calling the O/S:
Any IO output to devices is transformed into an O/S call also preventing an application working with the ports (eg changing ram banks/accessing o/s ram)
The device monitors the IO port and if it detects a port output when in C2 mode an NMI is fired together with paging in base 64k ram
This immediately escapes to the OS for handling

Note: It is true this is detected *after* the out to the port has occurred
But 1) The unsafe code is immediately exited
2) No harm can be done because all motherboard devices (Gate array/CRTC) are preset to register selects that render a single OUT harmless, eg Gate Array register 0 or CRTC register for the light pen

Also it would not pass on the IO bus lines to any devices plugged into the expansion in this case

This is also the mechanism used by an application to call the o/s. A predefined memory location containing data structures for passing/receiving the call after NMI
So now IO devices are isolated from the application code and the o/s can be called
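The OUT handling described in this part of the spec boils down to one decision per I/O write. A minimal sketch of that decision, assuming the device already tracks the current RAM configuration as a number (2 for C2, 0 for C0); the `Action` type and its field names are invented for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    fire_nmi: bool              # escape to the o/s
    page_base_ram: bool         # select the base 64k before the NMI handler runs
    forward_to_expansion: bool  # pass the I/O cycle on to devices plugged into the device


def on_io_write(ram_config):
    """Decide what the device does with an I/O write, given the tracked RAM configuration."""
    if ram_config == 2:
        # Application context (C2): trap to the o/s and hide the write from expansion devices.
        return Action(fire_nmi=True, page_base_ram=True, forward_to_expansion=False)
    # Anything else is treated as o/s context in this sketch and passes through untouched.
    return Action(fire_nmi=False, page_base_ram=False, forward_to_expansion=True)
```

The write still reaches motherboard devices in the C2 case, which is why the spec relies on preset register selects to make it harmless.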

IM: The last thing wanted is the IM mode changed and it must be handled
This is done by the device paging RAM every interrupt to a special 64k interrupt handling bank (int-bank)
Normally IM 1 will cause PC to &38 - the usual state
IM 0 will go to an unknown location and so will IM 2 as we don't know what I register will contain
In this case it's ok because all this ram is filled with RST 38's
When the device detects the address bus reading from address 38 it knows PC is there and switches to base 64k ram
Then the real interrupt handler kicks in which does IM 1
Although a workaround it would be pretty efficient and IM0/IM2 can no longer crash the system

DI: A regular NMI occurs, every 300ms interleaved with the maskable one
This does EI, also checks data structures to see if application wants to pass something to the o/s asynchronously
It is very possible the interrupt code can detect an interrupt was missed from DI in the application code and flag a suspect process

HALT: If interrupts are disabled then the NMI interrupt will enable them again so the system won't stop
Again it should be possible to detect this by counting interrupts and flag the process
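The "counting interrupts" idea for DI and HALT could look something like the sketch below. It assumes at least one maskable interrupt is expected between two consecutive NMIs (the "interleaved" arrangement above); the class name and the process identifiers are hypothetical.

```python
class InterruptWatchdog:
    """Hypothetical bookkeeping the o/s could do to spot a process that masked interrupts."""

    def __init__(self):
        self.irqs_since_nmi = 0
        self.suspects = set()

    def on_maskable_interrupt(self):
        # Called from the normal interrupt handler.
        self.irqs_since_nmi += 1

    def on_nmi(self, current_process):
        """Called from the NMI handler; returns True if a maskable interrupt was missed."""
        missed = self.irqs_since_nmi == 0
        if missed:
            self.suspects.add(current_process)
        self.irqs_since_nmi = 0
        return missed
```

A process that executes DI (or DI followed by HALT) shows up as an NMI with no maskable interrupt since the previous one, so the o/s can flag it without needing to see the instruction itself.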

andycadley · 11:43, 10 March 23

I'm still not entirely convinced this will work in all cases. When the int-bank is enabled, where do writes go? I assume it has to be to the last paged in bank, otherwise there will be no way of recording the correct return address (you can't wind back the stack pointer because you don't know if a RST 38 executed or not)

And then there's the whole issue of relocating the stack anyway. I suspect you'd run into the issue that where the stack is located for a process creates issues trying to get back to it after the OS call - it can't really rearrange the bank combination to C2 and then return, because now the OS code has all been paged out. Unless the OS is in ROM, but then the stack might be under the ROM so now you need two ROMs (upper and lower) and then duplicate the restore code in both.

And I think the OS would still need some portion of RAM in every process even if only for the restore process. You can't protect against that being overwritten, so you'd have to restore it on every OS call/trap and now you're starting to make things expensive on a CPU that isn't exactly overpowered already...

martin464 · 14:16, 10 March 23

ah, i see you found some bugs

When the NMI fires the return address is written to the application's stack. The device would have to give time for that before paging in the int-bank. But it could detect that write and know when to page
The int-bank itself would throw away writes. You're right the RST 38 would meddle with the SP i hadn't thought of that
Instead the int-bank could be filled with JP 0's. Then a trail of NOPs leading up to &38. Then would get the same effect without moving SP. Of course normally it would never go through this, just the edge case of an IM0 or 2
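The JP 0 plus NOP trail layout is easy to check with a toy model. The sketch below builds one possible int-bank (NOPs from &0000 up to &38, then a repeating JP &0000 pattern to the end of the bank) and steps a program counter through it using only those two instructions. The exact layout and the helper names are assumptions made for the example; it does not model IM 0, where the instruction comes from the bus rather than from memory.

```python
NOP, JP = 0x00, 0xC3
TRAP = 0x38  # the address the device watches for before switching to the base 64k


def build_int_bank():
    """One possible layout: NOPs up to &38, then C3 00 00 (JP &0000) repeated to &FFFF."""
    bank = bytearray(0x10000)  # starts as all NOPs
    for addr in range(TRAP + 1, 0x10000):
        if (addr - (TRAP + 1)) % 3 == 0:
            bank[addr] = JP  # the two following bytes stay 00 00
    return bytes(bank)


def run_to_trap(bank, pc, max_steps=200):
    """Execute NOP/JP only; return the number of steps to reach &38, or None on failure."""
    for step in range(max_steps):
        if pc == TRAP:
            return step
        opcode = bank[pc]
        if opcode == NOP:
            pc = (pc + 1) & 0xFFFF
        elif opcode == JP:
            pc = bank[(pc + 1) & 0xFFFF] | (bank[(pc + 2) & 0xFFFF] << 8)
        else:
            return None
    return None
```

Landing in the middle of a JP just executes one or two NOPs and then the next JP, so every start address funnels to &38, and neither instruction touches SP. With this layout the last byte of the bank is a JP opcode whose operands wrap round to &0000 and &0001, which are NOP bytes, so it still decodes as JP &0000.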

Yes also returning would need the device to manage the paging, you'd have to have an OUT that told it to monitor for a subsequent RET and then page the application back in, something like this:
OUT &device - prepare device for interrupt return
OUT &gatearray - switch to C2 - but the device ignores it, captures it and waits for RET
EI
RET - device now switches to C2
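On the device side, that return sequence amounts to a three-state machine. A minimal sketch, assuming the device can see opcode fetches and already decodes its own "prepare" port; the class, method and state names are invented for illustration:

```python
RET_OPCODE = 0xC9  # Z80 RET


class ReturnSequencer:
    """Hypothetical model of the device deferring a RAM switch until RET is fetched."""

    def __init__(self):
        self.state = "idle"
        self.pending_config = None
        self.active_config = 0  # 0 stands for C0, the base 64k

    def on_prepare_write(self):
        # OUT to the device's own port: arm the sequence.
        self.state = "armed"

    def on_ram_config_write(self, config):
        # OUT to the gate array: captured, not applied yet.
        if self.state == "armed":
            self.pending_config = config
            self.state = "waiting_for_ret"

    def on_opcode_fetch(self, opcode):
        # EI and anything else before RET leaves the pending switch untouched.
        if self.state == "waiting_for_ret" and opcode == RET_OPCODE:
            self.active_config = self.pending_config
            self.pending_config = None
            self.state = "idle"
```

Switching on the RET opcode fetch means the return address is then popped from the application's bank, which is where the NMI pushed it. Whether real hardware could switch banks cleanly within that window is not something this sketch can show.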

To handle expanded ram beyond 512k it would also store the expanded ram ports not just gatearray

Having any OS ram in the processes wouldn't be viable, yes for the reason you mentioned it couldn't work that way and defeats the object for sure, but i think not needed. I suppose the device could prevent read and writes to page 0 by logic except during an interrupt and then the code there does something similar, an OUT to prepare and a RET except this time it's the page 0 hiding that is triggered by the RET. That would work also

One thing I noticed about this was the device doesn't do very much computation or store much, it could be done with logic and not a microprocessor needed maybe i don't know. Isn't that what a CPLD is? It sounds like it's theoretically possible, that's all i was trying to note really!
