IDEA: Compression/Decompression of all Strings in BASIC Listing

Offline SRS

Still stuck with the "BIG" BASIC games/listings that use a lot of text, I have the following idea, but no solution I can implement right now ...

Given a BASIC source A, the code should be scanned for text (identified by ""), which is written to a file / memory. Then all the text should be compressed (maybe with Huffman or SMAZ, a compression scheme for very small strings), and a new listing should be generated that replaces every found text with its compressed counterpart (changing the "PRINT" command to a new BASIC command like "PCT" (print compressed text)).

For example, the original is
Code:
10 PRINT " I am a simple example"
and the new one is
Code:
10 PCT " IUANI/HGJ"
Is it possible? And is it useful? (There could also be a "twin" for C sources and cpctelera) ....
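A minimal sketch of the scan-and-rewrite step, assuming it runs on a PC over the ASCII listing before the program is transferred to the CPC. The names `rewrite_listing` and `demo_encode` are hypothetical, and `demo_encode` is only a stand-in (it reverses the text) so the rewrite can be seen; a real tool would plug a Huffman or SMAZ encoder in there. It only handles `PRINT` followed directly by one string literal, and it does not skip `PRINT "..."` that happens to appear inside a REM.

```python
import re

# Assumption: only statements of the form PRINT "literal" are rewritten.
PRINT_LITERAL = re.compile(r'PRINT\s*"([^"]*)"', re.IGNORECASE)

def rewrite_listing(lines, encode, new_cmd="PCT"):
    """Replace PRINT "text" with new_cmd "encoded text".

    Returns the rewritten lines and the list of texts that were found,
    in the order they appeared.
    """
    texts = []

    def swap(match):
        text = match.group(1)
        texts.append(text)
        return f'{new_cmd} "{encode(text)}"'

    new_lines = [PRINT_LITERAL.sub(swap, line) for line in lines]
    return new_lines, texts

def demo_encode(text):
    # Stand-in only: reverses the text, it does NOT compress anything.
    return text[::-1]

if __name__ == "__main__":
    listing = ['10 PRINT " I am a simple example"', '20 a$="keep":GOTO 10']
    rewritten, found = rewrite_listing(listing, demo_encode)
    for line in rewritten:
        print(line)
```

Collecting the texts separately means the whole set can be analysed first (for example to build one shared Huffman table) before any line is rewritten.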


Offline arnoldemu

Interesting idea.

The quoted strings are stored directly in the BASIC program with 1 byte per char.
They can contain any byte, so filling them with compressed data is OK. You may need to escape quotes.
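One way the quote escaping could look, as a sketch only. The escape byte `ESC = 0x01` and the XOR-with-&80 trick are my own choices for illustration, not anything BASIC defines. The forbidden set defaults to the quote (&22) plus the zero byte; including zero is an assumption that a 0 byte would also end the string in the stored program.

```python
ESC = 0x01  # hypothetical escape marker, chosen for this sketch

def escape(data: bytes, forbidden=(0x22, 0x00)) -> bytes:
    """Replace each forbidden byte (and ESC itself) by ESC, byte XOR 0x80."""
    out = bytearray()
    for b in data:
        if b == ESC or b in forbidden:
            flipped = b ^ 0x80
            if flipped == ESC or flipped in forbidden:
                raise ValueError("escaped form would still be forbidden")
            out += bytes([ESC, flipped])
        else:
            out.append(b)
    return bytes(out)

def unescape(data: bytes) -> bytes:
    """Inverse of escape(): the byte after ESC is XORed back with 0x80."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ESC:
            out.append(next(it) ^ 0x80)
        else:
            out.append(b)
    return bytes(out)

if __name__ == "__main__":
    raw = bytes([0x22, 0x41, 0x00, 0x01, 0xFF])
    packed = escape(raw)
    print(packed.hex(), unescape(packed) == raw)
```

Every escaped byte costs one extra byte, so a compressor whose output rarely contains the forbidden values loses very little here.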

I'm not sure how you would get BASIC to accept a new token, so it may need to be an RSX if you want to go with this method.

Another approach would be to split the strings into words. Then you could join words into sentences using a string of bytes, one index per word.
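The word-splitting idea can be sketched like this, again as an offline step. `build_word_table` and `decode` are hypothetical names; each sentence becomes a byte string of word indices, so this sketch is limited to 256 distinct words and treats every run of whitespace as a single space.

```python
def build_word_table(sentences):
    """Return (words, encoded) where encoded[i] holds one index byte per word."""
    words = []    # index -> word
    index = {}    # word -> index
    encoded = []
    for sentence in sentences:
        codes = []
        for word in sentence.split():
            if word not in index:
                index[word] = len(words)
                words.append(word)
            codes.append(index[word])
        # bytes() raises ValueError once an index exceeds 255.
        encoded.append(bytes(codes))
    return words, encoded

def decode(words, codes):
    """Rebuild a sentence, inserting a single space between words."""
    return " ".join(words[c] for c in codes)

if __name__ == "__main__":
    words, encoded = build_word_table(["you see a door", "you open a door"])
    print(words)
    print([list(e) for e in encoded])
    print(decode(words, encoded[1]))
```

The saving comes from repeated words: each repeat costs one byte instead of its full length plus a space.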

AMSDOS does something similar to generate its strings. AMSDOS also has special control codes where the drive and user are substituted, so this could be a possibility.

I have seen some data embedded in REM statements before. This could be used to store data and would save the need for DATA statements to poke the binary code into memory.

I know this is not related but have you tried BEEBUG's basic toolkit? I believe it has a compress command that puts a lot on one line and reduces the number of lines.

http://www.cpc-power.com/index.php?page=detail&num=7261

Ideally you could do with a way to keep it in BASIC.

If you had words stored in RAM you could probably do this:

Code:
10 REM Thisisstring
20 a$=""
30 w=1:GOSUB 5000:w=2:GOSUB 5000:END
5000 REM print word w; remaddr is assumed to point at a table of 3-byte entries (length, address low, address high)
5010 n=PEEK(remaddr+(w*3)+0):lo=PEEK(remaddr+(w*3)+1):hi=PEEK(remaddr+(w*3)+2)
5020 POKE @a$+0,n
5030 POKE @a$+1,lo:POKE @a$+2,hi:PRINT a$
5040 RETURN

If you did it word by word you could avoid needing spaces too. You could add spaces only when needed and you could control word wrap automatically by counting the number of chars output so far.
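The space-and-wrap logic described above could work roughly as follows. This is a sketch in Python for readability rather than something to run on the CPC; `wrap_words` is a hypothetical name, and the default width of 40 assumes MODE 1.

```python
def wrap_words(words, width=40):
    """Join words with single spaces, starting a new line when the next
    word would not fit. A word longer than width gets a line of its own."""
    lines = []
    current = ""
    for word in words:
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width:
            # Space is added only between words, never stored with them.
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines

if __name__ == "__main__":
    for line in wrap_words(["you", "see", "a", "door"], width=9):
        print(line)
```

Counting characters as they are output is all the state that is needed, so the same loop would translate to BASIC with one counter variable.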

Just some ideas. Sorry I didn't have answers.


Offline AMSDOS

I've seen it being done with M/C (machine code) transformed into UUEncoded format (I've attached an image below from the Wiki), so perhaps a series of strings could be converted into UUEncoded format and then written to an array. The only limitation is the amount of time it takes to decode that format. I recall that particular author writing a text editor program in that format as well, but it had some M/C poked in before getting to the UUEncoded bit, which I think was to reduce the time needed to poke it into memory.



Offline SRS

IIRC UUencode makes 4 bytes out of every 3 bytes of source, so this would be counterproductive?
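The 4-for-3 expansion is easy to check on a PC. A small sketch using Python's standard `binascii.b2a_uu`, which encodes one line of at most 45 bytes; `uu_payload_size` is a hypothetical helper that just does the arithmetic.

```python
import binascii

def uu_payload_size(n):
    """Encoded characters needed for n raw bytes: 4 per group of 3, rounded up."""
    return 4 * ((n + 2) // 3)

if __name__ == "__main__":
    raw = bytes(range(45))          # one full uuencode line
    line = binascii.b2a_uu(raw)     # length char + payload + newline
    print(len(raw), uu_payload_size(len(raw)), len(line))
```

So 45 raw bytes become 60 payload characters, plus a length character and a newline per line, which is roughly a third bigger before any compression is taken into account.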

Offline AMSDOS

It seems that way. When I studied that program, I thought it might have been compressed and then converted into UUEncode. The initial code begins just below &4000 and is about &600 bytes, though extra data is sent to &6000 and &8000.