Title: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 15:32, 24 February 24

Read about it on my website (https://pulkomandy.tk/_/_Development/_Porting%20Smalltalk%20to%20the%20Amstrad%20CPC:%20a%20failed%20experiment)

Well, it didn't work this time. I may retry this later with a different approach...

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: BSC on 21:00, 24 February 24

Interesting write-up. Just one question regarding your memory allocator, since you said it can't free memory. Have you considered the possibility that you would not run out of memory if your allocator would support freeing memory (and doing the necessary book-keeping and reordering)? I mean I am aware that I don't know what exactly that parser is doing and why it needs to allocate memory, but it seems counter-intuitive that this eats all of your free memory.

Another thing: Do you think it is possible to write the image directly to disk instead? It sounds like you are not doing it, but building the image in memory. Even then it might be relatively easy to build the image inside of the extra 64 (or more)k of memory.

Anyhow, keep it up! I am looking forward to the next iteration.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 00:42, 25 February 24

Would be a nice project for SOS, it already has overlapping windows and all that. :)

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 09:50, 25 February 24

The original code has no calls to free(). It is designed to have all the classes and methods in RAM, so they can cross-reference each other, and then in a second step, store it all to a file.

I think it really needs at least all the class descriptions to be in memory. Maybe it will be able to compile the methods one by one and stream them to the output file? Or maybe all methods from one class at a time? That could be made to fit in RAM. But I think it will need some bigger changes to the compiler.

I am not out of ideas, for example, I could move most of the code into the C000-FFFF memory range to free more low RAM for the heap. Either put it in a bank, or remove the printing to screen and use the printer port for debug output (with an emulator that sends the output to a textfile, or with a real CPC and printer for more dramatic effect :laugh:).

But still, in the end, the generated "image" file, which contains all the classes and methods, will be larger than the CPC main RAM (about 100K). So, yes, maybe I can make some changes to the compiler and get it to run through. But then, I have to also get the bytecode interpreter running, otherwise, this is quite useless.

The goal of this experiment was to see if it was reasonable to get Little Smalltalk running on a 64K machine, maybe as a BASIC replacement/alternative. I stopped when I got the answer (NO). The interpreter code for the virtual machine would fit, but the interpreted Smalltalk image would not.

I am not giving up on Smalltalk for the CPC yet, but now that I know that even the smallest existing implementation would need memory banks, I may as well go with the full Smalltalk-80 version anyway. And I need to design my VM in a way that it can manage the Smalltalk image being loaded into multiple banks. That will certainly be an interesting project for when I have more time. And I will need to do it from the ground up for the VM, since none of the existing implementations work that way, as far as I know. I don't know if that would still result in a somewhat usable system, or if it would be way too slow.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: asertus on 11:12, 25 February 24

What about a 128kb version, given that "normal" CPCs with disk drive are the 6128?

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 11:26, 25 February 24

With 100K for the image, let's say 16K for the interpreter (it's larger currently, but it's sdcc compiled code), and 16K for the screen (not counting the amsdos and the user written code), I don't think even this minimal version of Smalltalk can fit on a 128K machine without any hardware expansions. That's why we can call this a failed experiment.
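
To make the budget above concrete, here is a quick back-of-the-envelope sketch. The figures are the rough estimates from this post (AMSDOS and user code are left out, as in the post); the variable names are made up for illustration.

```python
# Rough memory budget from the post above, in kilobytes.
IMAGE_KB = 100        # Smalltalk image (all classes and methods)
INTERPRETER_KB = 16   # optimistic guess; the SDCC build is currently larger
SCREEN_KB = 16        # CPC screen memory
TOTAL_RAM_KB = 128    # stock CPC 6128, no expansion

needed_kb = IMAGE_KB + INTERPRETER_KB + SCREEN_KB
shortfall_kb = needed_kb - TOTAL_RAM_KB

print(f"needed: {needed_kb}K, available: {TOTAL_RAM_KB}K, short by {shortfall_kb}K")
```

Even with these optimistic numbers the total lands a few K above what a 6128 has.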

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 18:28, 25 February 24

Why not go to 576 KB of RAM? (Or even higher if the RAM is there). Most 'not-only-game-playing-users' probably have at least a 512 KB RAM expansion today - imho. :)

In your detailed description (link see first post) it's explained that Smalltalk uses a Virtual Machine. So on CPC it could be implemented by providing such a VM by any OS with banking capabilities. :)

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: andycadley on 22:17, 25 February 24

The problem, I assume, is not using more than 64K in general. It's managing the memory model such that calls between routines still work. You could do with 24 bit pointers to all data and subroutines but that would suffer a massive performance penalty. So you very rapidly start needing a compiler that is smart enough to allocate and manage memory in the most optimal way.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 22:38, 25 February 24

It's not really a problem to get it running, since Smalltalk is a virtual machine. The virtual machine "just" needs to know in which bank the called method is, and page it in. So, yes, 24-bit pointers and a vm that knows what to do with them.
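
A minimal sketch of that idea, written in Python only to keep it short (the real thing would be C or assembler). The 3-byte layout (bank number, then a 16-bit address), the `BankedMemory` class and all the names are assumptions for illustration, not the actual VM code.

```python
BANK_SIZE = 0x4000  # one 16K bank, as mapped at &4000-&7FFF on the CPC


def pack_far(bank, address):
    """Encode a reference as 3 bytes: bank number, then a 16-bit address."""
    return bytes([bank, address & 0xFF, address >> 8])


def unpack_far(raw):
    """Decode 3 bytes back into (bank, address)."""
    return raw[0], raw[1] | (raw[2] << 8)


class BankedMemory:
    """Toy model where only one bank is visible at a time."""

    def __init__(self, bank_count):
        self.banks = [bytearray(BANK_SIZE) for _ in range(bank_count)]
        self.current = 0
        self.switches = 0  # counts how often a bank had to be paged in

    def _page_in(self, bank):
        if bank != self.current:
            self.current = bank
            self.switches += 1

    def read(self, raw_ptr):
        bank, address = unpack_far(raw_ptr)
        self._page_in(bank)
        return self.banks[bank][address]

    def write(self, raw_ptr, value):
        bank, address = unpack_far(raw_ptr)
        self._page_in(bank)
        self.banks[bank][address] = value


memory = BankedMemory(bank_count=4)
method_ptr = pack_far(bank=2, address=0x1234)
memory.write(method_ptr, 0x42)
print(memory.read(method_ptr), memory.switches)
```

The `switches` counter is there because that is the cost to watch: every access to an object in a different bank pays for a page-in.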

But then you have to care about performance. In Smalltalk-76, they didn't have the luxury of memory banks. Instead the image was stored on a harddisk, with the ram used as a cache for the most recently used objects. And they managed to get something usable out of it. So, with banks, it should be even easier?

To make it optimal, you can organize the memory banks so that methods and objects that are frequently used together eventually end up in the same bank. This can be handled as part of the memory allocator and garbage collector, if the vm is implemented in a way that moving objects around in memory is possible (that means an extra level of indirection when accessing them, essentially).
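
The extra level of indirection could look roughly like this. It is a sketch in Python for brevity; `ObjectTable` and its methods are hypothetical names, not something from Little Smalltalk.

```python
class ObjectTable:
    """Maps a stable handle to the object's current (bank, address)."""

    def __init__(self):
        self.slots = []

    def register(self, bank, address):
        self.slots.append((bank, address))
        return len(self.slots) - 1  # the handle that other objects store

    def resolve(self, handle):
        return self.slots[handle]

    def move(self, handle, bank, address):
        # Only the table entry changes; every object holding the handle
        # keeps working without being touched.
        self.slots[handle] = (bank, address)


table = ObjectTable()
point = table.register(bank=1, address=0x0100)
rectangle_fields = [point]  # a "rectangle" that references the point
table.move(point, bank=3, address=0x2000)
print(table.resolve(rectangle_fields[0]))
```

The price is one more lookup on every access, plus the RAM taken by the table itself.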

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 02:56, 26 February 24

So, would it help to have the VM on CPC running?

If yes, what does the VM need to be able to do? (I had a good read of quite some text, but a link or short 'list of features' would help). Sorry in case I am asking too much here, but it's a new topic for me.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 08:44, 26 February 24

The VM is like an emulator for a very simple CPU. For Little Smalltalk V4 there isn't a lot of documentation (the author never finished the new version of the book that goes with it). For Little Smalltalk V1 there is the "A Little Smalltalk" book that goes into a bit more detail.

A list of opcodes:

- Push Instance: puts an object reference on the stack
- Push Argument: puts a method argument on the stack
- Push Temporary: puts a temporary variable on the stack
- Push Literal: puts a literal value on the stack
- Push Constant: puts a constant on the stack (numbers 0 to 9, true, false, or nil/NULL)
- Push Block: creates an execution context for a "block". This allows a piece of code to be called later with this context in use, so it can reference other objects from there
- Assign Instance: set a value in an object instance variable (taking a value from the stack)
- Assign temporary: set a value (from the stack) in one of the temporary variables
- Mark Arguments: pop as many values as needed from the stack and put them in the "arguments" array to pass them to a method
- Send Message: call a method (with the arguments above)
- Send Unary, Send Binary: optimized cases for some methods with no parameters or only one parameter (+, <, <=, isNil, ...)
- Primitives: call some native code. Print a character on screen, read a char from keyboard, basic math functions, and some accessors to data (for example: get the class corresponding to an object, or get the size of an object, create a new object). File IO is also implemented here.
- "Special" operations: return from a method or a block and clean up the stack, duplicate an element on the stack, branches and conditions

The bytecode is usually encoded as 4 bit opcode + 4 bit argument. When the argument does not fit in 4 bits, instead, the opcode is encoded on 8 bits (with the 4 high bits being 0000, which is a reserved opcode so it doesn't conflict with the 4 bit version) and the argument is encoded on the next byte or bytes (I don't remember).
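
As a sketch of that encoding (in Python for brevity): this assumes the extended argument is exactly one byte, which I am not sure about, and the opcode numbers in the sample are placeholders rather than the real Little Smalltalk values.

```python
EXTENDED = 0  # reserved opcode value that marks the long form


def decode(bytecode, pc):
    """Return (opcode, argument, next_pc) for the instruction at pc."""
    byte = bytecode[pc]
    opcode, argument = byte >> 4, byte & 0x0F
    pc += 1
    if opcode == EXTENDED:
        # Long form: the low nibble carries the real opcode and the
        # argument follows in the next byte (assumed to be one byte).
        opcode = argument
        argument = bytecode[pc]
        pc += 1
    return opcode, argument, pc


def decode_all(bytecode):
    """Decode a whole byte string into a list of (opcode, argument)."""
    pc, result = 0, []
    while pc < len(bytecode):
        opcode, argument, pc = decode(bytecode, pc)
        result.append((opcode, argument))
    return result


# Short form, long form (argument 0x40 does not fit in 4 bits), short form.
program = bytes([0x23, 0x02, 0x40, 0x51])
print(decode_all(program))
```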

The VM implementation can decide how exactly to store its internal data. But more importantly it has to manage dynamic memory. There are opcodes to create new objects, but the bytecode doesn't explicitly track when an object is not needed anymore. This has to be implemented either with reference counting, or garbage collection. Previous versions of Little Smalltalk used reference counting, but V4 uses garbage collection.

Before we can get to executing bytecode, the interpreter needs to load it from a file. The file is a compacted representation of a tree of objects, classes and methods. The interpreter parses it and creates the corresponding objects, classes and methods, and then calls the "bootMethod", and from there, it starts running bytecode.

The C sourcecode isn't very long, you can read it here: https://github.com/crcx/littlesmalltalk/tree/master/lst4/source

interp.c: the bytecode interpreter
memory.c: the garbage collector
main.c: ties it all together.

The initial objects and classes are defined in text form here: https://github.com/crcx/littlesmalltalk/blob/master/lst4/ImageBuilder/imageSource
The ImageBuilder tool parses this and creates the binary "image" file that the interpreter needs to start.

After taking a closer look, I have found out two things:

- The binary "image" file is defined such that the "Method" objects contain not only the bytecode, but also the ASCII sourcecode for the method. This should not be needed, I think it can be removed to make the image smaller. I have tested this quickly and I managed to parse 16% of the source file on the CPC (vs 10% before this change).
- The binary "image" file is not platform dependent as I thought originally. So I don't really need to generate it on CPC. I can do it on a computer with large linear RAM, and transfer it afterwards.

The image has about 4000 "objects" (that counts objects, but also classes, methods, integers, ... everything is an object in Smalltalk). Each object needs at least 4 bytes in RAM: a pointer to the class, and a size. But it needs more if it does something at all (methods need space for their bytecode, objects need space for their fields, integers need space for their value, classes need space for their method list, etc).
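
Working out the floor this gives, as a rough sketch. The object count and the 4-byte header come from the paragraph above; splitting the header into two 16-bit fields is my assumption.

```python
OBJECT_COUNT = 4000     # approximate number of objects in the image
CLASS_POINTER_BYTES = 2  # assumed 16-bit pointer to the class
SIZE_FIELD_BYTES = 2     # assumed 16-bit size

header_bytes = CLASS_POINTER_BYTES + SIZE_FIELD_BYTES
min_bytes = OBJECT_COUNT * header_bytes

print(f"headers alone: {min_bytes} bytes, before any bytecode or fields")
```

So roughly 16000 bytes go to headers before a single field or bytecode is stored.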

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 19:04, 26 February 24

Thanks for the detailed explanation. It's quite some stuff, but should be doable on the CPC. :)

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 22:39, 27 February 24

Ok it got me thinking about garbage collectors and memory banking, so, I wrote another article:

https://pulkomandy.tk/_/_Development/_Ideas%20for%20a%20garbage%20collector%20for%20memory-banked%20systems

Let me know what you think (if it makes any sense... maybe I'll re-read it tomorrow and notice I wrote something stupid that can't work).

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 01:24, 02 March 24

I read the journey and wonder if you put the stack at let's say... #3fff. And instead of using the heap as a single contiguous block, treat it as an array of 16kb blocks between 4000 and 7fff. This is one way.

The other way is more akin to how cpm+ works: put all code to be interpreted into a 2nd 64kb. If you are using an upper rom for the actual smalltalk itself, then you can easily have 48kb available (minus the first 100 bytes or so). Or with a little cpm+ style gymnastics or Ramlam... close to 64kb without going to further banks.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 01:30, 02 March 24

BTW you can use 16bit packed far addresses instead of 24 bit pointers if it helps, just make sure your pointers are pointing to aligned memory. A 4mb cpc expansion gives you 256 x 16kb blocks that are quite easy to manage if you have a class limitation that a class cannot exceed 16kb.
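
One possible way to pack such an address, as a sketch: 8 bits of bank number plus 8 bits of offset, which forces every object to start on a 64-byte boundary inside its 16kb block. That split is an assumption for illustration; other splits trade alignment against the number of banks.

```python
BANK_SIZE = 0x4000            # 16kb block
BANK_COUNT = 256              # 256 x 16kb = 4mb
ALIGN = BANK_SIZE // 256      # 64-byte alignment so the offset fits in 8 bits


def pack(bank, offset):
    """Pack (bank, offset) into a 16-bit value; offset must be aligned."""
    if not 0 <= bank < BANK_COUNT:
        raise ValueError("bank out of range")
    if not 0 <= offset < BANK_SIZE or offset % ALIGN != 0:
        raise ValueError("offset must be inside the bank and 64-byte aligned")
    return (bank << 8) | (offset // ALIGN)


def unpack(packed):
    """Recover (bank, offset) from a 16-bit packed address."""
    return packed >> 8, (packed & 0xFF) * ALIGN


packed = pack(bank=5, offset=0x0140)
print(hex(packed), unpack(packed))
```

The catch is the wasted space: with 64-byte alignment, a very small object still occupies a full 64-byte slot.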

Or alternatively, being a vm, it makes it relatively simple to have the 64 banks of 64kb which could allow greater than 16kb per class or heap allocations.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 01:39, 02 March 24

One other note is that you don't have to work in 16kb blocks on the cpc, you can work in 64kb banks, for the majority of things that are not BASIC. Or if you call firmware from an external bank be sure to bank switch first. The cpc is quite flexible for its bank switching and even for multitasking... although I am told msx and enterprise 128 are even more flexible.

You can context switch on a cpc for example just by swapping entire 64kb banks at the right time (push registers, save stack, swap 64k, restore stack, restore registers) and the cpc is very happy and it's super fast.

And... if you can make the code ROMable you get all that bank available for the VM... basically an array of 64kb banks with a ROM that can ram lam the ram under it.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 09:15, 02 March 24

The objects in smalltalk are usually extremely small, a max size of 256 would not be too much of a constraint. But there are a lot of objects (a string is an object, an integer is an object, a method bytecode is an object, ...) and also a lot of references between them. If the memory management isn't good enough, and objects are spread apart everywhere, every operation will require a bankswitch.

The current state is: I have noticed that the smalltalk image I was trying to build includes not only the bytecode, but also the sourcecode for all methods. I removed that and now the imagefile is a more reasonable 35 kilobytes. I have not tried to load it yet (I think the memory usage will be a bit higher). That's still a bit large for working without banks, but maybe I can get something to run. Which would be easier for me: once it runs in a very simple way, I can start optimizing it and making it better. But I don't feel confident writing a super complicated thing from the start.

The first version will probably be unable to free any memory at all. Believe it or not, this is how some people preferred to use their LISP machines back when that was a thing: save to disk frequently, and when the system runs out of memory and crashes, reload the last savestate. That was apparently better than having the garbage collector slow everything down.
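
An allocator that never frees is about as simple as it gets. A sketch in Python (the `BumpAllocator` name and the 16K heap size are just for illustration):

```python
class BumpAllocator:
    """Hands out memory by advancing a pointer; nothing is ever freed."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0

    def alloc(self, nbytes):
        if self.next_free + nbytes > self.size:
            # No garbage collector to call: the only way out is a reload.
            raise MemoryError("heap exhausted, reload the last saved state")
        address = self.next_free
        self.next_free += nbytes
        return address


heap = BumpAllocator(size=16 * 1024)
first = heap.alloc(100)
second = heap.alloc(50)
print(first, second, heap.size - heap.next_free)
```

Allocation is a compare and an add, which is why it is tempting as a first step.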

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 15:58, 02 March 24

Also, I think I understand the possibilities of the various banking schemes pretty well, given that I designed and built a memory expansion myself. So my problem here is not that I don't have ideas for how to make it work, it's rather that there are a lot of options and I have not yet decided what is the best one.

Moreover, if I find the existing banking schemes too limiting, I could easily design a more flexible memory extension. In fact I already have the Nova which opens a few more possibilities (so it is even harder to decide what to do now).

I will try first to make it fit without banking. But the C code built with SDCC is too big to fit in a ROM I think. So I already have to optimize it quite a bit if I want to try that...

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 18:30, 02 March 24

Can you build the c code at 0100h and run it from the 2nd 64k bank... then you have close to 64kb for that, and about 4pkb from the main bank for code and data. Yes some bank switching but given 64kb in the 2nd bank you might be able to allocate 16kb for buffers to allow for fast transfers between the two.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: m_dr_m on 22:20, 11 March 24

Fantastic project! (I was tempted myself (https://www.cpcwiki.eu/forum/programming/porting-smalltalkjoynimj-on-cpc/))

Would be useful for prototyping and tools (hmm, for that, maybe something like free pascal would be more indicated).
As a side note, I use a limited VM (well, ARM-like bytecode interpreter with some access modes dedicated to object programming) in "Emotion Trouble" and "Ayane Ayane LaCarree" (which was almost a bad idea, as it's more difficult to debug for now).

Let us know how we can help!

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 23:31, 11 March 24

I have not given up yet on this, just briefly distracted by other projects.

The current status: I removed a lot of things from the C code (the garbage collector, most of the opcodes from the virtual machine) to get the code under 16k so I can fit it in a rom. I hope I can then fit the initial smalltalk image in main ram to start with something "simple"

Then optimize the code a bit, make space for one more feature, and so on, until I get something usable.
I don't know yet if that will work. But I am unable to cut this project into smaller chunks that I can test one at a time, so I hope this will fit.

I have seen your experiments, it saved me some time on other languages I knew would not be worth trying. I will come back when I have news...

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: m_dr_m on 21:47, 12 March 24

It may deserve its own thread:
* Have you tried to port SDCC itself? (since crossdev is aberrant in general and abhorrent in my book!)
* Have you tried the latest LLVM-Z80 to see how it compares?

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 23:35, 12 March 24

Sdcc already needs too many resources to run on a modern pc. No chance of running it on cpc at all.

Cloudstrife has tried llvm to compile C++ for the cpc. It works, and we used it to build a player for reality adlib tracker 2 music for the willy/opl3lpt soundcard. The player is very slow, but it runs.

If you want something that runs on the cpc, the best choice for a currently developed language would probably be David Given's Cowgol. I have not tried it yet, but it has a goal of being self hosting on 8 bit machines.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: stevensixkiller on 21:40, 13 March 24

I thought there was no working Z80 backend for LLVM. Did he use https://github.com/grapereader/llvm-z80 ?

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 23:37, 13 March 24

I don't remember which version it was. The generated code might have required some patching (occasional use of non-existing opcodes or something like that) and we also changed the compiled code a bit (making variables static instead of stack allocated, etc). Given the speed of execution, it was not really worth digging further...

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 21:57, 19 March 24

Next blogpost in the series (http://pulkomandy.tk/_/_Development/_Porting%20Smalltalk%20to%20the%20Amstrad%20CPC:%20some%20progress)

I got the VM to run ;D

Unfortunately, it only runs a handful of methods before running out of space in the new indirection table that I added :(

Time to let this rest for a few days and think about the best way to move from there.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 12:16, 07 October 24

Well it was a little more than a few days...

Someone mentioned this in another thread so it's time for an update :)

I resumed work on the project last week. As mentioned in the previous post, I have removed the indirection table again, and gone for objects directly referencing each other. This means:

- The pointers/references are 3 bytes (bank + address)
- The garbage collector will have a bit more work to do to update all references when moving an object
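
To illustrate the extra work, here is a naive sketch of moving one object when references are stored directly. It is Python for brevity, and the heap layout (a dict from (bank, address) to a list of fields) is invented for the example; a real collector would avoid scanning everything for each move.

```python
def move_object(heap, old_ref, new_ref):
    """Relocate one object and patch every reference that pointed at it.

    heap maps a (bank, address) reference to the list of fields of the
    object stored there; a field is either another reference or None.
    Returns the number of fields that had to be patched.
    """
    heap[new_ref] = heap.pop(old_ref)
    patched = 0
    for fields in heap.values():
        for index, field in enumerate(fields):
            if field == old_ref:
                fields[index] = new_ref
                patched += 1
    return patched


heap = {
    (0, 0x4000): [(1, 0x4010), None],         # refers to the object below
    (1, 0x4010): [None],                      # the object we will move
    (2, 0x4000): [(1, 0x4010), (0, 0x4000)],  # also refers to it
}
patched = move_object(heap, old_ref=(1, 0x4010), new_ref=(0, 0x4100))
print(patched)
```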

I also updated to the latest nightly build of SDCC which has improved code generation a bit. This allows fitting more things in the ROM. The generated code is still not great, and compilation times are ridiculous (at least 45 minutes to build 16K of code, up to 5 hours if I ask the compiler to optimize a bit more). Modern software...

The current result is:

- The interpreter runs far enough to display the smalltalk prompt
- I can enter an expression in the prompt
- The compiler starts converting the expression into bytecode, but it runs out of memory (512K of banks + 16K of main memory are reserved for the object heap) before managing to compile and execute it (I just tried to run something simple like 5 + 3)

There is no garbage collection yet and the interpreter creates a lot of temporary objects, which normally would be almost immediately deleted by the garbage collector. So the memory usage will be massively reduced when I implement that.

But I prefer to get "5 + 3" fully working before I move on to implementing garbage collection, so that I have a simple test scenario to test with. I have some ideas to achieve this. Also I should start rewriting the code in assembler directly (one function at a time) to make it smaller and more efficient. And even before doing that I have some optimizations to do in the C code that I apply manually (each new development requires finding a way to save space in the ROM for it).

At least initially, I may move the garbage collector to a separate ROM. The interface between the two parts of the code is currently 4 functions, but it can probably be reduced to 1 or 2 with extra parameters. So that provides a nice interfacing point to split the ROM in two. And maybe later I can optimize things enough to consider merging them again.

You can see some screenshots and day-by-day notes and milestones on my Fediverse account:
https://mastodon.tetaneutral.net/@pulkomandy/112107254263308680

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 10:53, 13 October 24

After some bugfixes and optimizations, I managed to get it to compute 4 + 3 and print the result, and then go back to the prompt. This means the interpreter is fully working.

It is also very slow and it needs almost all of the 512K of memory to compute this.

The next step is writing a garbage collector that works with banked memory so that I can avoid eating all the RAM so quickly. I have some ideas about it.

Then, in no specific order:

- Rewrite it all in assembler so it is faster
- Wire in the function to call an editor (maybe orgams, maybe protext, maybe something else?) so you can actually edit methods, add new classes, etc
- Include the disassembler method from Andy Valencia's version of Little Smalltalk so you can edit existing methods (disassemble bytecode, edit, reassemble bytecode)
- Implement file output so you can save your working session and reload it later
- Write some classes to access CPC firmware and hardware
- Write some example programs and so on

Currently it fits in a single ROM, but I will need a second one for the garbage collector (at least initially, until everything is rewritten in assembler) and the "image" file is under 50KB. Memory usage to get to the prompt is also around 50-60KB, so maybe it can work on a stock CPC 6128, but it will be quite bad once you start actually doing things.

In theory it's also possible to use mass storage as swap and load things in/out as needed. But I hope with 512K of RAM it is not needed for most projects, and I'm more interested in making it reasonably fast than making it use more space.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: Prodatron on 11:09, 13 October 24

I never had a look at SmallTalk, but I have a little question, I hope it isn't too stupid:
AFAIK the first SmallTalk-80 was developed on the Xerox Alto. This machine had a 5,8MHz 16bit TTL-CPU and 128-512K ram.
As it sounds like it is really not easy to get SmallTalk running on a machine with seemingly similar technical data, I wonder if they used a different SmallTalk for the Alto, or was its CPU very different and optimized for running it?

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 11:58, 13 October 24

Not stupid at all, indeed that is what I'm trying to understand. How does Smalltalk manage to run so well (including a GUI) on the Alto, which doesn't seem much more powerful than the CPC?

First of all, the CPU. It runs at around 6MHz but it does a lot more than a regular CPU. It is microcoded, meaning assembler instructions are implemented using an even simpler low-level language. The microcode runs not only a CPU as we understand it, but also other tasks such as feeding data to the display, managing the disk drive, etc (things that on the CPC are offloaded to the FDC and CRTC). So, in reality the CPU does not run so fast.

However, an interesting feature is that the Microcode is loaded in a special RAM, and is reprogrammable. As a result, Smalltalk can make some things faster by implementing them in microcode. The display drawing also takes advantage of this, by having a "blitter" implemented in microcode (to perform various copies and transformations of display data, including text rendering and so on).

Next, the memory. There is 128K of RAM in a base Alto. Half of it is for the display. Once you have the Smalltalk interpreter running (including the low level OS layer to access the disk, network, etc), there is only about 20K of RAM free for the Smalltalk objects. This is not much. What they did is constantly swap objects in and out of memory and write them back to disk. If you watch some demonstrations running on the real hardware, you can hear the disk seeking all around every time they do anything with the OS. This also means they have to keep a table in RAM to know if an object is loaded in RAM, or otherwise, where to find it on disk. This table is several kilobytes big and uses up a lot of the available RAM. So, really, it's as if the RAM is used as a cache, and the real RAM is the hard disk.
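
The "RAM as a cache for the disk" idea can be sketched like this. It is a toy model in Python: the least-recently-used eviction policy and all the names are my assumptions, I don't know exactly which policy the Alto code used.

```python
from collections import OrderedDict


class ObjectCache:
    """Keeps a few objects in 'RAM' and swaps the rest to a 'disk' dict."""

    def __init__(self, disk, capacity):
        self.disk = disk               # dict standing in for the hard disk
        self.capacity = capacity       # how many objects fit in RAM
        self.resident = OrderedDict()  # object id -> data, oldest first
        self.disk_reads = 0

    def get(self, oid):
        if oid in self.resident:
            self.resident.move_to_end(oid)  # mark as recently used
            return self.resident[oid]
        self.disk_reads += 1                # miss: the disk has to seek
        data = self.disk[oid]
        self.resident[oid] = data
        if len(self.resident) > self.capacity:
            evicted_id, evicted = self.resident.popitem(last=False)
            self.disk[evicted_id] = evicted  # write the old object back
        return data


disk = {1: "Point", 2: "String", 3: "Array"}
cache = ObjectCache(disk, capacity=2)
for oid in (1, 2, 1, 3):
    cache.get(oid)
print(list(cache.resident), cache.disk_reads)
```

Here `resident` plays the role of the table that tracks which objects are in RAM, which is the part that eats memory on the real machine.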

In some cases they even turn the display off to free more time for the CPU to run faster as well.

Overall, it runs quite slowly on the Alto. But, compared to other environments available at the time, the main advantage is you can edit things live while the system is running. You don't have to recompile or even restart your program. So, overall, the development cycle is still faster, but it is not exactly a fun platform to use.

Finally, what was released out of Xerox (Smalltalk-80) is a later version that did not run on the Alto, but on later machines like the Dorado. These are quite a bit bigger and faster. So, to get a good impression of what it was like on the Alto, you have to look at Smalltalk-76.

Interestingly, there is a Smalltalk-78 that was ported to an 8086 based machine with 256K of RAM. This one does not use object swapping (there were only floppy disks, no hard drive, so that would be too slow). There is an "image" of it available, however, this version was never released publicly back in the days, and is not very well documented. But there certainly are interesting ideas there to save memory. Also, it has another problem: Smalltalk-80 was the first version to be converted to ASCII. The previous ones use all kinds of strange symbols ("looking eye", "open colon", "pointing finger", "up arrow" and "left arrow", ...) which I find a bit confusing. Also, the machine used 3 8086 CPUs to do all its tasks, and it was still quite slow.

That's why I went with Little Smalltalk instead. This is an ASCII version of Smalltalk but removing all of the GUI, it ends up being quite a bit smaller than Smalltalk-80. My current code is not optimized at all, I wanted to get it running first. Since the interpreter is very "hot" code (run over and over again), optimizations in key areas, even saving just a microsecond or two, can have a huge impact. This is both for low-level optimizations (rewriting things in assembler) and also high level ones (adding a "method cache" so that if you call a method several times, the lookup of the method in the class and parent classes isn't done each time ; avoiding allocating memory when you don't need it, ...).
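
A method cache could be sketched like this. It is Python rather than the real C code, and the class and selector names are only examples; the cache key (class name, selector) is an assumption.

```python
class SmalltalkClass:
    def __init__(self, name, parent=None, methods=None):
        self.name = name
        self.parent = parent
        self.methods = methods or {}


class MethodLookup:
    def __init__(self):
        self.cache = {}      # (class name, selector) -> method
        self.slow_walks = 0  # how many times the hierarchy was walked

    def find(self, cls, selector):
        key = (cls.name, selector)
        if key in self.cache:
            return self.cache[key]
        self.slow_walks += 1
        current = cls
        while current is not None:  # walk the class, then its parents
            if selector in current.methods:
                self.cache[key] = current.methods[selector]
                return self.cache[key]
            current = current.parent
        raise LookupError(f"{cls.name} does not understand {selector}")


object_class = SmalltalkClass("Object", methods={"isNil": "Object>>isNil"})
integer_class = SmalltalkClass("SmallInt", parent=object_class,
                               methods={"+": "SmallInt>>+"})

lookup = MethodLookup()
for _ in range(3):
    found = lookup.find(integer_class, "isNil")
print(found, lookup.slow_walks)
```

Since methods can be edited live, such a cache has to be flushed whenever a method is added or changed.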

Finally, on memory usage: Smalltalk on the Alto used 16 bit object references. I went with 24 bit (bank number + 16 bit pointer) so all my objects are 50% larger than on the Alto. Also, my code is constantly bank switching when accessing different objects, it may make sense to cache a few of them in main RAM. For example: the bytecode of the currently running method, its local variables, and its execution stack, as well as the current object and class. But then, of course, this introduces cache invalidation problems and I have to make sure the cached main RAM version is in sync with the original. But it is probably worth the effort.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: Prodatron on 16:09, 13 October 24

Thanks a lot for this great explanation!
I was already wondering what advantages these 70s TTL/bitslice CPUs could get from using tons of microcode.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: HAL6128 on 17:51, 13 October 24

Xerox invented a lot of wonderful ideas back then. Impressive.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: Prodatron on 20:10, 13 October 24

Yes, as the father of the GUI and Ethernet already in 1973, but also somehow of the personal computer, etc., the Xerox Alto was one of the most impressive IT projects I've ever seen.
For me it's very fascinating, that @PulkoMandy is working on the Alto-related Smalltalk project.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 23:02, 14 October 24

Quick update: the garbage collector is working. So now the RAM is not getting filled up as fast as before.

Benchmarks:
- 30 seconds from starting to getting the prompt
- 2 minutes to compile and execute "4 + 3"

Now it's time to rewrite the C code in assembler to make it acceptably fast.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 23:14, 14 October 24

Great! This wonderful thing is proceeding! Good luck with your further work. :) :)

I would like to see it one day also running for FutureOS. Since you seem to code close to hardware, it should not be a big deal to convert the version for native OS.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 07:28, 15 October 24

The firmware is used for text output and keyboard scanning, as well as amsdos for disk access. I think this should be easy to convert to another system.

I will later also need to launch a text editor to edit some text loaded in RAM, and get the edited result back.

I have no idea about Future OS memory layout to see if that would cause me any problems.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 17:36, 15 October 24

Quote from: PulkoMandy on 07:28, 15 October 24
The firmware is used for text output and keyboard scanning, as well as amsdos for disk access. I think this should be easy to convert to another system.

I will later also need to launch a text editor to edit some text loaded in RAM, and get the edited result back.

I have no idea about Future OS memory layout to see if that would cause me any problems.

Yes, for text output FutureOS functions can be of help. However, they deal more with pages of text (not archaic command lines and text flow like in CP/M output), but that should be no problem.

Text editor: FutureOS contains a function for editing text (defined length, defined lower and upper character). Also FutureTex can be involved as app.

The usable memory is the following: 0-&B7FF. For disc I/O buffers are needed. RAM from &0000 to &AFFF can always be used. And &B000 to &B7FF can be used as buffers.
And for expansion RAM (&4000-&7FFF) management there are a lot of functions, using up to 4 MB (if connected).

To do an implementation for FutureOS I can do all that part. In your source code there would be just some 'IF' and 'ENDIF' commands, to be able to assemble for a different target OS.

A further advantage is that FutureOS leaves all RSTs for you and you can use all registers (2nd register set!).

Let me know the time you think is right, and I can do the implementation.

Meanwhile I really wish you luck and success with this great and ambitious project! :) :) :)

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 05:04, 14 November 24

If you used packed far addresses, you would be able to fit 50% more object pointers in your 16k object heap (2 bytes each instead of 3).

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 08:28, 14 November 24

I'm not sure what packed format you would use. Currently this needs at least 5, probably 6, 16K pages of memory, so I can't store the page number in 2 bits alongside the 14-bit address. And that's just with the initial set of objects loaded, not having written any Smalltalk code myself yet. Any application will need a bit more.

Also, the bank number byte is used for a few other things: set to 0 for integer values, 1 for individual character values (not strings, they are stored in a more compact way), and I may need some more of these later.

So there's no way I would fit all this information into only two bytes
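As a rough illustration of how the bank byte can double as a type tag, here is a small C sketch. The tag values 0 (integer) and 1 (character) come from the post above; the names `Cell`, `CellKind`, `cell_kind` and the `make_*` helpers are made up for this example, and storing the integer or character value in the 16-bit part is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_INTEGER   0  /* bank byte 0: the cell holds an integer */
#define TAG_CHARACTER 1  /* bank byte 1: the cell holds a single character */

/* One 3-byte cell: bank byte + 16-bit part (padding aside in this C model). */
typedef struct {
    uint8_t  bank;
    uint16_t value;
} Cell;

typedef enum { CELL_INTEGER, CELL_CHARACTER, CELL_OBJECT } CellKind;

/* Classify a cell by looking at the bank byte only. */
CellKind cell_kind(Cell c) {
    if (c.bank == TAG_INTEGER)   return CELL_INTEGER;
    if (c.bank == TAG_CHARACTER) return CELL_CHARACTER;
    return CELL_OBJECT;  /* any other value is a real bank number */
}

/* Assumption: the 16-bit part carries the integer value itself. */
Cell make_integer(uint16_t n) {
    Cell c = { TAG_INTEGER, n };
    return c;
}

/* Assumption: the 16-bit part carries the character code. */
Cell make_character(unsigned char ch) {
    Cell c = { TAG_CHARACTER, ch };
    return c;
}

/* A pointer cell must not use a bank value reserved as a tag. */
Cell make_object(uint8_t bank, uint16_t pointer) {
    assert(bank > TAG_CHARACTER);
    Cell c = { bank, pointer };
    return c;
}
```

With this layout, checking whether a value is a pointer costs one byte compare, which is the kind of cheap test that gets lost when everything is squeezed into 16 bits.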

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 14:54, 15 November 24

The RAM tab here outlines two schemes that might be possible to use for 16-bit packed far addresses:

https://docs.google.com/spreadsheets/d/1XgRVlh27K_C0-gMtroMhN8lK9mAQxVQg1x-3M42kBYo/edit?usp=drivesdk

You should be able to support 4mb within 16bit packed addresses if you align to 64byte boundaries.

"OPTION 2:

address: xxxxxxyyyyyyyyyy
capacity: x = 64 x 64kb = 4mb (note: 1kb wasted per 64kb) so effective almost 4mb
alignment: y = 256 x 64byte blocks per 16kb (1st block is header)

stats:

64kb has 1024 blocks, 4 entries in the MainRAT since there are 4 16kb banks
256kb has 4096 blocks, 16 entries in the MainRAT since there are 16 16kb banks
512kb has 8192 blocks, 32 entries in the MainRAT since there are 32 16kb banks
4mb has 65536 blocks, 256 entries in the MainRAT since there are 256 16kb banks

pros:

address fits within 16 bits
still works with IX and IY registers
lots of blocks for certain types of applications

cons:

slower to translate due to 64 byte alignment
ram allocation table (RAT) is larger

RAT per 16kb:"

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 15:40, 15 November 24

An address in a 16K bank needs 14 bits if you don't do any alignment.
With 512K of banks, you need 5 bits to store the bank number (2^5 = 32 banks of 16K)
That would be 19 bits. So you need to remove 3 bits from the address and align everything to 8 bytes.

OK, sure, that can work. But it will make everything much slower (a lot of bitshifting needed to decode things into a usable address) and more complicated, and waste up to 7 bytes per object.

May be fine for a generic memory allocator, but for smalltalk, a typical object size is 2 to 10 cells at most (and I can save one byte from that)

With 3 byte cells as I have now, this is 5 to 29 bytes.
With 2 byte cells and rounding up to 8 bytes, it would be 8 to 24 bytes.
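These sizes can be reproduced with a couple of lines of C. This is only a sketch of the arithmetic: the names `round_up` and `object_bytes` are made up here, and applying the one saved byte to both layouts is an assumption (it does not change the rounded results).

```c
#include <assert.h>
#include <stdint.h>

/* Round a size up to the next multiple of the alignment. */
uint16_t round_up(uint16_t bytes, uint16_t alignment) {
    assert(alignment != 0);
    return (uint16_t)(((bytes + alignment - 1) / alignment) * alignment);
}

/* Size of an object made of `cells` cells of `cell_size` bytes each,
   with one byte saved, then rounded up to the allocator alignment. */
uint16_t object_bytes(uint16_t cells, uint16_t cell_size, uint16_t alignment) {
    assert(cells > 0 && cell_size > 0);
    return round_up((uint16_t)(cells * cell_size - 1), alignment);
}
```

For example, `object_bytes(2, 3, 1)` and `object_bytes(10, 3, 1)` give 5 and 29, while `object_bytes(2, 2, 8)` and `object_bytes(10, 2, 8)` give 8 and 24. So the smallest objects get bigger with packed pointers and only the larger ones shrink.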

And also, as I mentioned, the extra bits in the 3 byte address are used for other things. So I would probably need one or two bits more, because I need that to store some things that are not pointers (integers, chars, possibly more in the future). That would increase the allocation alignment requirement to 16 or even 32 bytes. That is not a good choice for Smalltalk, which does a lot of very small allocations and a lot of pointer dereferencing. It would work for many other cases where the developer has control of the memory layout, and will probably do larger chunks of allocation and store related data together inside them, possibly with normal pointers for most of it except when it points to another allocation block.
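The bit budget in this post boils down to one small calculation, sketched here in C. The function name `required_alignment` and the two constants are made up for the example. The idea is simply: 14 offset bits plus bank bits plus tag bits must fit in 16, and every bit over budget has to be dropped from the low end of the offset, which doubles the alignment.

```c
#include <assert.h>

#define OFFSET_BITS 14  /* address inside one 16K bank */
#define PACKED_BITS 16  /* size of the packed pointer */

/* Alignment (in bytes) forced by squeezing bank and tag bits into 16 bits. */
unsigned required_alignment(unsigned bank_bits, unsigned tag_bits) {
    unsigned needed = OFFSET_BITS + bank_bits + tag_bits;
    if (needed <= PACKED_BITS) {
        return 1;  /* everything fits, no alignment needed */
    }
    unsigned dropped = needed - PACKED_BITS;  /* low offset bits to remove */
    assert(dropped <= OFFSET_BITS);
    return 1u << dropped;
}
```

With 5 bank bits (512K) this gives 8 bytes, and 16 or 32 bytes with one or two extra tag bits. With 8 bank bits (4 MB) and no tag bits it gives the 64-byte alignment mentioned earlier in the thread.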

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 07:11, 17 November 24

But 512k gives 8192 allocations with the above... 4mb 65536. Is that still too little for a typical Smalltalk application? Of course byte aligning wastes space, but that is a lot of memory for a Z80.

What would a typical number of objects be within a smalltalk application? If it is so memory block intensive does a smalltalk application spend more time allocating memory than program logic?

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 09:05, 17 November 24

Just loading the base image creates more than 2000 objects/allocations, filling up about 50K of memory. Then, running the interpreter creates a lot more, since every method call requires allocating at least a context (and a few more things if there are arguments), and every operation (including things like adding two integers) is a method call. So, yes, currently most of the work is in the allocator.

There is a way to improve this by having some objects allocated elsewhere. Method contexts in particular can, most of the time, be handled like a stack, and are short lived. But there are some cases where this doesn't work due to some quirks in the language, so sometimes these objects will first be allocated on the stack, and then moved to the heap. This stack can be in main memory, which has the advantage that accessing these objects will not need any bankswitching, so the interpreter can benefit from this.
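Here is a rough C sketch of that stack-first idea, not the actual interpreter code. The `Context` fields, the stack depth of 16, and the function names are all hypothetical, and `malloc` stands in for the banked heap allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CONTEXT_STACK_DEPTH 16  /* hypothetical limit */

/* Hypothetical, heavily simplified method context. */
typedef struct {
    uint16_t method;  /* which method is running */
    uint16_t ip;      /* position in its bytecode */
} Context;

/* Contexts live in main RAM, so no bank switching is needed to reach them. */
Context context_stack[CONTEXT_STACK_DEPTH];
int context_top = 0;

/* Method call: take the next free slot, no heap allocation. */
Context *context_push(uint16_t method) {
    assert(context_top < CONTEXT_STACK_DEPTH);
    Context *c = &context_stack[context_top++];
    c->method = method;
    c->ip = 0;
    return c;
}

/* Method return: the slot is free again immediately. */
void context_pop(void) {
    assert(context_top > 0);
    context_top--;
}

/* Rare case: something keeps a reference to the context after the
   method returns, so it is copied to the heap before being popped. */
Context *context_promote(const Context *c) {
    Context *copy = malloc(sizeof *copy);
    if (copy != NULL) {
        *copy = *c;
    }
    return copy;
}
```

In the common case a call and return cost one increment and one decrement, and only the contexts that escape ever reach the garbage-collected heap.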

In the current situation, at any given time there are few objects really in use, but the system goes quickly through all the memory while allocating and immediately releasing objects.

The main question is: how long does an assembler routine take to "unpack" such a 16 bit pointer, back into a bank number to write in the gate array and an offset in the bank? And how long to perform the reverse operation? In smalltalk, these will happen a lot, and would further slow things down. And right now, for the thing to be practical, I need speed more than I need extra memory space/more compact objects. So I went with the larger but faster way

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: Prodatron on 15:44, 17 November 24

Quote from: PulkoMandy on 09:05, 17 November 24
The main question is: how long does an assembler routine take to "unpack" such a 16 bit pointer, back into a bank number to write in the gate array and an offset in the bank?

I was just curious:

;input      HL=8byte aligned 512K address (rrrbbaaa aaaaaaaa)
;output    #4000-#7fff = 16K block, HL=address in 16K block
;destroyed  AF,BC

        ld a,h          ;1      a=rrrbb...
        rrca:rrca      ;2      a=..rrrbb.
        ld c,a          ;1
        and #38        ;2
        or #c4          ;2
        ld b,a          ;1      b=11rrr100
        ld a,c          ;1
        rrca            ;1
        and #03        ;2      a=000000bb
        or b            ;1      a=11rrr1bb
        ld b,#7f        ;2
        out (c),a      ;4
        add hl,hl      ;3
        add hl,hl      ;3
        add hl,hl      ;3      hl=..aaaaaaaaaaa000
        res 7,h        ;2
        set 6,h        ;2      hl=01aaaaaaaaaaa000 -> 33 NOPs
        ret

33 NOPs for unpacking and bankswitching, maybe someone can optimize this.
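For anyone who wants to check the bit layout without an assembler, here is a C model of the routine above, plus the reverse operation. It says nothing about speed. The function names are made up for this sketch; the layout (rrrbbaaa aaaaaaaa in, 11rrr1bb to the gate array, address 01aaaaaaaaaaa000 out) is the one from the comments in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Value written to the gate array port: 11rrr1bb. */
uint8_t bank_select_byte(uint16_t packed) {
    uint8_t page = (packed >> 13) & 0x07;  /* rrr: which 64K page */
    uint8_t bank = (packed >> 11) & 0x03;  /* bb: which 16K bank in it */
    return (uint8_t)(0xC4 | (page << 3) | bank);
}

/* Address inside the #4000-#7fff window: 01aaaaaaaaaaa000. */
uint16_t window_address(uint16_t packed) {
    return (uint16_t)(0x4000 | ((packed & 0x07FF) << 3));
}

/* Reverse operation: build the packed pointer from its parts.
   The address must be 8-byte aligned and inside the window. */
uint16_t pack_address(uint8_t page, uint8_t bank, uint16_t address) {
    assert(page < 8 && bank < 4);
    assert(address >= 0x4000 && address < 0x8000 && (address & 7) == 0);
    return (uint16_t)(((unsigned)page << 13) | ((unsigned)bank << 11)
                      | ((address & 0x3FFFu) >> 3));
}
```

For example, the packed value `0x2801` (page 1, bank 1, block 1) gives the port value `0xCD` and the address `0x4008`, and `pack_address(1, 1, 0x4008)` gives `0x2801` back.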

You are probably right.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: zhulien on 02:57, 18 November 24

For me, I would pull out the rrrbb bits and shift them to look up a bank table to read some bank info, but that's more to do with how I do my bank switching rather than calculating the bank. It can get a little complicated calculating it if you want to support 4mb, to me anyway. Then of course byte align the other half as you did.

I also was thinking the heap could be treated a (tiny) bit like a stack with a pointer to the next 64byte offset and move it as required for allocations. In any case it might speed up allocations but it won't speed up the translation of the packed addresses.

I would say it could be worth a POC at 33 NOPs if it gives more memory and allows Smalltalk to work.

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: PulkoMandy on 08:49, 18 November 24

Smalltalk is already working, but very slow. I don't really have a ram usage problem with a 512K machine (I decided to not handle more than that for simplicity).

Now I will be trying to make it faster. The code is open if someone wants to try making it slower but smaller :)

Title: Re: I tried to port Smalltalk to the CPC. It didn't work out in the end.
Post by: GUNHED on 18:15, 18 November 24

Or bigger and quicker ;D

Great to see you advancing in this interesting project! :) :) :)

CPCWiki forum

General Category => Programming => Topic started by: PulkoMandy on 15:32, 24 February 24