CPC Basicator - A python tool to create BAS AMSDOS Files from PC Text files

Started by SagaDS, 12:56, 18 July 25


SagaDS

Hello,

I have created a BASIC tokeniser (with AMSDOS header generation) for a big BASIC project. I know WinApe can do this by 'typing' pasted text, but that is slow (there is a x10 speed mode, but you need to switch it on and off) and it cannot handle several files. With this tool, I can easily add several BAS files to a DSK file.
It works for me and seems to produce the same tokens as my tests with WinApe. The only missing piece is a floating point tokeniser. If you know how to create the 5 bytes from a string, feel free to tell me (I have some documentation and a link to the ROM assembly, but I can't figure out how to do it in Python).

Link to project


SagaDS

Yes, I know this one (I put a link to it in the doc directory of the project).
It is just not so easy to reimplement (even with the readable ASM).

https://github.com/Bread80/CPC6128-Firmware-Source  look for  REAL_5byte_to_real

Also, my project was not using floating point numbers so no real need for them  :D

McArti0

@SagaDS

if floatnumber=0 then

  exponent=0
  mantissa=0

else

  exponent=-127
  norm=abs(floatnumber)/(2^exponent)
  while norm>=1 or norm<0.5
    exponent=exponent+1
    norm=abs(floatnumber)/(2^exponent)
  wend
  if sgn(floatnumber)=1 then norm=norm-0.5
  mantissa=int(norm*(2^32))
  exponent=exponent+128

endif

4 bytes Little endian mantissa and 1 byte exponent
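
For readers who want to try the pseudocode above from Python, here is a minimal sketch of the same normalisation loop. The function name `cpc_float_bytes` and the range check are assumptions added for illustration and are not part of the post. Like the pseudocode, it truncates the mantissa instead of rounding it, so the last bit may differ from what the CPC itself produces.

```python
def cpc_float_bytes(value: float) -> bytes:
    """Encode value as 4 little-endian mantissa bytes followed by 1 exponent byte."""
    if value == 0:
        return bytes(5)  # zero is stored as five zero bytes
    # Assumption: reject values whose biased exponent would not fit in one byte.
    if not 2.0 ** -128 <= abs(value) < 2.0 ** 127:
        raise ValueError("value outside the range this sketch handles")
    exponent = -127
    norm = abs(value) / 2.0 ** exponent
    # Raise the exponent until the mantissa is normalised into [0.5, 1).
    while norm >= 1 or norm < 0.5:
        exponent += 1
        norm = abs(value) / 2.0 ** exponent
    if value > 0:
        norm -= 0.5  # positive: clear the top bit, which doubles as the sign bit
    mantissa = int(norm * 2 ** 32)  # truncates, as in the pseudocode
    return mantissa.to_bytes(4, "little") + bytes([exponent + 128])


print(cpc_float_bytes(10).hex())  # 0000002084
```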
CPC 6128, Whole 6128 and Only 6128, with .....
NewPAL v3 for use all 128kB RAM by CRTC as VRAM
One chip driver for 512kB(to640) extRAM 6128
TYPICAL :) TV Funai 22FL532/10 with VGA-RGB-in.

lightforce6128

I wrote a small program in Locomotive BASIC to do the conversion. It should not be a problem to convert any part of it to Python or something else.

100 MODE 2 : ZONE 16
110 float!=-123.456
120 PRINT "Number: ",float!
130 PRINT
140 IF float!<>0 THEN GOTO 170
150   PRINT "Special case: Set all bytes to zero."
160   GOTO 350
170 sign=SGN(float!)
180 intDigits=INT(LOG(sign*float!)/LOG(2))
190 exponent=128+intDigits+1
200 mantissa=(sign*float!)/(2^intDigits)
210 PRINT "Sign: ",sign
220 PRINT "Exponent: ",exponent,
230 PRINT "&";RIGHT$("0"+HEX$(exponent),2)
240 mantissa=mantissa-1
250 mantissa=mantissa/2
260 PRINT "Mantissa:"
270 FOR i=1 TO 4
280   mantissa=mantissa*256
290   intPart=INT(mantissa)
300   mantissa=mantissa-intPart
310   IF i=1 AND sign<0 THEN intPart=intPart+128
320   PRINT i;": ",intPart,
330   PRINT "&";RIGHT$("0"+HEX$(intPart),2)
340 NEXT
350 PRINT
360 PRINT "Memory:"
370 FOR i=0 TO 4
380   byte=PEEK(@float!+i)
390   PRINT i;": ",byte,
400   PRINT "&";RIGHT$("0"+HEX$(byte),2)
410 NEXT
420 PRINT

Some explanation:
  • Line 110: In modern languages often there is some kind of parsing function to convert a string to a floating point number. Here I use the BASIC interpreter to do this work. Afterwards the floating point number is deconstructed to its parts.
  • Line 140: Value 0.0 is a special case stored as five bytes with &00.
  • Line 180: Instead of searching for the first bit, it can be calculated with a logarithm.
  • Line 190: Because the result of the logarithm is rounded down, the exponent is increased by one.
  • Line 200: The mantissa value will always start with 1.xxx (it lies between 1 and 2).
  • Line 240: Remove the leading 1. The mantissa is now 0.xxx.
  • Line 250: Reserve one bit for the sign. Shift the mantissa right. It is now 0.0xxx in binary.
  • Line 280: Shift the mantissa left by 8 bits. Extract these 8 bits.
  • Line 310: Only for the first mantissa byte: Add the sign bit.
  • Line 370: As can be seen, in memory the values are stored in reverse.

Finally, I have to say that I only ran a few tests. There could be some corner cases left where unexpected things happen. Also, using a floating point number itself for the calculations could introduce rounding errors. This effect is minimized on modern systems that use 64-bit floating point numbers, whose 53-bit mantissa is well beyond the 32 bits needed.
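
As an illustration of how the steps above might translate to Python, here is a hedged sketch. It is not a line-by-line port: the logarithm of line 180 is swapped for `math.frexp`, which returns the binary exponent exactly and so sidesteps rounding in the logarithm. The function name `cpc_float_bytes_steps` is made up for this example, and the mantissa is truncated exactly as in the BASIC program.

```python
import math


def cpc_float_bytes_steps(value: float) -> bytes:
    """Follow the BASIC listing step by step and return the 5 bytes in memory order."""
    if value == 0:
        return bytes(5)                      # line 140: all-zero special case
    sign = -1 if value < 0 else 1
    frac, exp2 = math.frexp(sign * value)    # |value| = frac * 2**exp2, 0.5 <= frac < 1
    int_digits = exp2 - 1                    # same value line 180 computes with LOG
    exponent = 128 + int_digits + 1          # line 190
    mantissa = frac * 2                      # line 200: 1.xxx
    mantissa = (mantissa - 1) / 2            # lines 240-250: drop the 1, make room for the sign
    parts = []
    for i in range(4):                       # lines 270-340: peel off 8 bits at a time
        mantissa *= 256
        int_part = int(mantissa)
        mantissa -= int_part
        if i == 0 and sign < 0:
            int_part += 128                  # line 310: sign bit in the top mantissa byte
        parts.append(int_part)
    # Line 370: memory holds the mantissa bytes in reverse, then the exponent.
    return bytes(reversed(parts)) + bytes([exponent])


print(cpc_float_bytes_steps(0.75).hex())  # 0000004080
```

An exponent outside 0 to 255 makes `bytes([exponent])` raise `ValueError`, which is a crude but visible way to flag numbers the format cannot hold.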

GUNHED

This works on every CPC and emulator:

- Load the ASCII file (every line begins with a number)
- Save it with SAVE"xzy

Now you have a BASIC program on your disc / cassette
http://futureos.de --> Get the revolutionary FutureOS (Update: 2024.10.27)
http://futureos.cpc-live.com/files/LambdaSpeak_RSX_by_TFM.zip --> Get the RSX-ROM for LambdaSpeak :-) (Updated: 2021.12.26)

SagaDS

Quote from: GUNHED on 15:11, 20 July 25
This works on every CPC and emulator:

- Load ASCII file (every line begins with an number)
- Save it with SAVE"xzy

Now you got a BASIC program on you disc / cassette

There are several ways to generate BASIC files.
My purpose here was to generate them directly on a DSK produced on a PC.

SagaDS

Quote from: McArti0 on 21:35, 19 July 25
@SagaDS

if floatnumber=0 then

exponent=0
mantissa=0

else

exponent =-127
norm= abs(floatnumber)/(2^exponent)
while norm>=1 or norm<0.5
  exponent=exponent+1
norm= abs(floatnumber)/(2^exponent)
wend
if sgn(floatnumber)=1 then norm=norm-0.5
mantissa=int(norm*(2^32))
exponent=exponent+128

endif

4 bytes Little endian mantissa and 1 byte exponent

Thanks for the proposed algorithm (@lightforce6128 too).
I will give it a go sometime in the future.

I just hope that Python's float precision will not change the result...
That is why I was looking for a text parser (hence the ROM information) instead of a conversion from a float...
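
One way to keep binary float precision out of the picture is to parse the text with exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction`, which accepts decimal strings such as "-123.456" or "7e-6" exactly. The function name `cpc_float_bytes_from_text` is hypothetical, and rounding to the nearest 32-bit mantissa is an assumption: the ROM's own parser may round differently in the last bit.

```python
from fractions import Fraction


def cpc_float_bytes_from_text(text: str) -> bytes:
    """Convert a decimal string to 5 bytes without going through a binary float."""
    value = Fraction(text)              # exact rational value of the decimal text
    if value == 0:
        return bytes(5)
    sign_bit = 0x80000000 if value < 0 else 0
    magnitude = abs(value)
    # Find the exponent such that 0.5 <= magnitude / 2**exponent < 1.
    exponent = 0
    while magnitude >= Fraction(2) ** exponent:
        exponent += 1
    while magnitude < Fraction(2) ** (exponent - 1):
        exponent -= 1
    # Assumption: round to nearest; the CPC ROM may behave differently.
    mantissa = round(magnitude / Fraction(2) ** exponent * 2 ** 32)
    if mantissa == 2 ** 32:             # rounding carried into the next power of two
        mantissa //= 2
        exponent += 1
    # Replace the always-set top mantissa bit with the sign bit.
    mantissa = (mantissa & 0x7FFFFFFF) | sign_bit
    return mantissa.to_bytes(4, "little") + bytes([exponent + 128])


print(cpc_float_bytes_from_text("10").hex())  # 0000002084
```

Comparing a handful of outputs against what the emulator stores for the same literals would show whether the rounding assumption holds.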


SagaDS

I have implemented lightforce6128's algorithm.

I had to modify one thing when testing with more values in Python (it was working in BASIC):

        intDigits=int(math.log(sign*floatnumber)/math.log(2))
        if intDigits<0:
            intDigits-=1
        exponent=128+intDigits+1

New version v1.0 pushed.
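
A note on why the adjustment is needed: Locomotive BASIC's INT rounds down, while Python's `int()` truncates toward zero. The sketch below compares the adjustment quoted above with a rounding-down variant based on `math.frexp`. Both helper names are made up for this comparison. In this sketch the two disagree for inputs between 0.5 and 1, such as 0.75, where the truncated logarithm is 0 and no correction is applied.

```python
import math


def int_digits_truncated(x: float) -> int:
    """The adjustment from the post: int() plus a correction for negative results."""
    digits = int(math.log(x) / math.log(2))
    if digits < 0:
        digits -= 1
    return digits


def int_digits_floor(x: float) -> int:
    """Round down like BASIC's INT, using the exact binary exponent."""
    return math.frexp(x)[1] - 1


for x in (0.3, 0.75, 3.0):
    print(x, int_digits_truncated(x), int_digits_floor(x))
```

Using `math.floor` on the logarithm instead of `int()` would be another way to mirror the BASIC behaviour.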

mv

It took me a while to figure it out in TypeScript. I hope it's correct...

https://github.com/benchmarko/CPCBasicTS/blob/8496dd96ecc1a2585626637fc14b6d23e3c0952f/src/CodeGeneratorToken.ts#L375C1-L394C3

private static floatToByteString(number: number) {
  let mantissa = 0,
    exponent = 0,
    sign = 0;

  if (number !== 0) {
    if (number < 0) {
      sign = 0x80000000;
      number = -number;
    }
    exponent = Math.ceil(Math.log(number) / Math.log(2));
    mantissa = Math.round(number / Math.pow(2, exponent - 32)) & ~0x80000000;
    if (mantissa === 0) {
      exponent += 1;
    }
    exponent += 0x80;
  }
  return CodeGeneratorToken.convInt32ToString(sign + mantissa) + CodeGeneratorToken.convUInt8ToString(exponent);
 }


And the reverse (bytes to number):
https://github.com/benchmarko/CPCBasicTS/blob/8496dd96ecc1a2585626637fc14b6d23e3c0952f/src/BasicTokenizer.ts#L98C1-L114C3
...
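
For a quick round-trip check in Python, a decoder along these lines may be handy. This is a sketch written for this thread, not a translation of the linked TypeScript. The name `cpc_bytes_to_float` is hypothetical, and it assumes the layout described earlier in the topic: four little-endian mantissa bytes with the sign in the top bit, an exponent byte biased by 128, and an exponent byte of zero meaning 0.

```python
def cpc_bytes_to_float(data: bytes) -> float:
    """Decode 5 bytes (4 mantissa bytes, little-endian, then exponent) to a float."""
    if len(data) != 5:
        raise ValueError("expected exactly 5 bytes")
    exponent = data[4]
    if exponent == 0:
        return 0.0                        # assumption: zero exponent encodes 0
    raw = int.from_bytes(data[:4], "little")
    sign = -1.0 if raw & 0x80000000 else 1.0
    mantissa = raw | 0x80000000           # restore the implied leading 1 bit
    # The mantissa is a 32-bit fraction in [0.5, 1), hence the extra -32.
    return sign * mantissa * 2.0 ** (exponent - 128 - 32)


print(cpc_bytes_to_float(bytes.fromhex("0000002084")))  # 10.0
```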

cpcitor

Thanks @SagaDS for writing and sharing this! Love the name, by the way! :)

Alternatives to CPC Basicator

As others noticed, one can let Locomotive BASIC read ASCII and produce binary. It is its own reference implementation. Of course, this is only interesting if it is all done without any manual step: have in your toolchain a continuous integration step that drives an open-source emulator to read an ASCII text into the BASIC interpreter and have it save a binary file. CPC Basicator does it in one step, very quickly, without launching an emulator.

Even more, do we actually need a binary BASIC program? One can simply save the BASIC program as a text file on the tape/disc and call it a day. That's what I did in color-flood-for-amstrad-cpc. Job done!

If that was all, then one might question the value of CPC Basicator.

The real benefit CPC Basicator can have

To me the real benefit of such a program would be to create binary files that the firmware BASIC is incapable of providing!

Consider this:

* a program that, given an ASCII input that the regular BASIC interpreter would accept, always produces exactly the same binary as the regular BASIC interpreter, byte for byte -- that's what CPC Basicator currently aims at
* yet, given some ASCII input that the regular interpreter would reject or partly ignore (like a line that starts with REM or ' and has no line number), produces a binary with something more interesting, which the regular BASIC interpreter would be incapable of producing

Some real world use for such a program

Some prods have a mixed BASIC/binary loader. The point is to have a file that the firmware recognizes as a BASIC program, so the firmware is not reset when it runs, yet its payload is actually Z80 binary code, all in one file. This is typically done by hiding the compiled Z80 code in comment lines.

To do this, I have seen somewhere an ASM source that, interspersed with actual Z80 instructions, hard-codes some bytes so that the assembler output is a file that the firmware recognizes and can load and run as a BASIC program. Even when commented, such assembly source is at best readable, not practical to write. One might imagine wanting to write long BASIC programs with many short assembly parts.

One could imagine that CPC Basicator is extended to make such a program easier to make.

Let's get wild now

I see two ways:

* modify the output so that CPC Basicator generates not a binary program but assembly source code (with a choice of Z80 syntax), to be later interspersed with actual assembly source code and processed by a regular pre-existing assembler to make the actual BASIC binary file. Not obvious how to put the pieces together. Would allow links between various assembled parts (like, Z80 code hidden in line 100 could reference Z80 code and data hidden in line 120 or even BASIC structures).
* expanding the syntax accepted by CPC Basicator with useful constructs, and have it call an external assembly program to compile each part. Each assembly part would be independent and could not refer to each other.

Examples:

* define a pointer label that is the address of any part
100 print $(PTR: my_address) "Hello"
110 print peek( $(@ my_address) )

* lines given as an inline binary stream:
100 $(BIN: C0 20 43 50 43 20 72 75 6c 65 73 0a)
100 '$(BIN: 20 43 50 43 20 72 75 6c 65 73 0a)

* (let's get wild) insert assembly code anywhere:
100 $(PTR: clrscr) '$(ASM
ld hl, # 0xC000
ld (hl), a
ld de, # 0xc001
ld bc, # 0x3fff
ldir
ret
)
110 call $(@ clrscr)

One could even imagine referencing, from the Z80 ASM, the addresses of lines or even of individual tokens, to change strings or even adjust code at run time. Crazy? Yes. Useful? I don't know. If someone finds a use, then it is useful.

* autonumbering mode, use labels instead of lines in your source, get a regular BASIC program with generated lines. This can be activated locally.

$(set autonumber on)
PRINT "This line will automatically get a number and prints once."
$(LINENO: loop) PRINT "Hello CPC Basicator (in a loop)"
GOTO $(# loop)
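
To make the autonumbering idea concrete, here is a rough two-pass sketch in Python. Everything in it is hypothetical: the `$(LINENO: name)` and `$(# name)` forms are the ones proposed above and are not an existing CPC Basicator feature, and the function name `autonumber` and its `start` and `step` defaults are arbitrary choices. For brevity it treats the whole input as autonumbered and simply drops `$(set ...)` directive lines.

```python
import re

LABEL_DEF = re.compile(r"\$\(LINENO:\s*(\w+)\)\s*")   # $(LINENO: loop) at line start
LABEL_REF = re.compile(r"\$\(#\s*(\w+)\)")            # $(# loop) anywhere in a line


def autonumber(source: str, start: int = 10, step: int = 10) -> str:
    """Number every line and replace label references with line numbers."""
    lines = [
        ln.strip()
        for ln in source.splitlines()
        if ln.strip() and not ln.strip().startswith("$(set ")
    ]
    labels = {}
    bodies = []
    # Pass 1: give each line a number and remember where labels are defined.
    for index, line in enumerate(lines):
        number = start + index * step
        match = LABEL_DEF.match(line)
        if match:
            labels[match.group(1)] = number
            line = line[match.end():]
        bodies.append((number, line))
    # Pass 2: substitute references now that every label is known.
    return "\n".join(
        f"{number} " + LABEL_REF.sub(lambda m: str(labels[m.group(1)]), body)
        for number, body in bodies
    )


example = '''$(set autonumber on)
PRINT "This line will automatically get a number and prints once."
$(LINENO: loop) PRINT "Hello CPC Basicator (in a loop)"
GOTO $(# loop)'''
print(autonumber(example))
```

A real implementation would also need to leave `$(...)` text inside string literals alone and report undefined labels more gracefully than the `KeyError` raised here.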

Other ideas:

* given as input a binary file containing a valid BASIC program, turn it into text again ("CPC DeBasicator")
* given as input a binary file containing a hacked BASIC program (using any of the known hacks: out-of-order line numbers, line number zero, comment with binary values), turn it into text again but with some extra information, like hidden lines, decode binary into proper $(BIN ...) instructions, etc

(Did anyone say "transpiler"?)

Note that the CPC BASIC interpreter accepts (nearly?) anything you throw at it, provided it has a valid line number and is not too long. This means that, technically, the $(...) syntax is problematic: while it is not valid at run time, the regular interpreter does generate a valid (as in "you can save and load it") binary BASIC file from it. So this would need a decision: adopt $() considering that no sane BASIC program would use it? Define some different syntax?

Or maybe I just got too crazy and this is of no use. Your turn to imagine things now.  8)
Had a CPC since 1985, currently software dev professional, including embedded systems.

I made in 2013 the first CPC cross-dev environment that auto-installs C compiler and tools: cpc-dev-tool-chain: a portable toolchain for C/ASM development targetting CPC :) later forked into CPCTelera. Also author of intro 2021 a CPC Odyssey :) game Just Get 9 :) and demo Sunny Day :)
