Author Topic: Text manipulation tool? (cross-dev)  (Read 1369 times)

Offline Targhan

  • Supporter
  • 6128 Plus
  • *
  • Posts: 1.116
  • Country: fr
  • Liked: 1048
  • Likes Given: 152
Text manipulation tool? (cross-dev)
« on: 10:46, 31 January 18 »

I'm looking for a cross-platform (PC, Mac, Linux) way to parse a text file and apply simple transformations to it. Ideally, it would be a command-line tool. Example:
textModifier <input file> <output file> <transformation script>


I would like to be able to remove sections of source files. For example, I want to convert this:
Code: [Select]
ld hl,123
inc a
if PLATFORM_CPC
;code for CPC
else if PLATFORM_SPECTRUM
;code for SPECTRUM
endif
...


into this:
Code: [Select]
ld hl,123
inc a
;code for CPC
...


So, if PLATFORM_CPC is found, keep only that section. Later, do the opposite in another output file: keep only the PLATFORM_SPECTRUM section.
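One way to sketch this transformation is a small line-based filter. This is only an illustration under assumptions: the function name `keep_platform` is made up, the directives are assumed to sit alone on their lines and to look exactly like the example above (`if PLATFORM_x`, `else if PLATFORM_x`, `endif`), and nested conditionals are not handled.

```python
import re

# Assumed directive shapes, modelled on the example above (not a real assembler grammar).
IF_RE = re.compile(r"^\s*(?:else\s+)?if\s+(PLATFORM_\w+)", re.IGNORECASE)
ENDIF_RE = re.compile(r"^\s*endif\b", re.IGNORECASE)


def keep_platform(source: str, wanted: str) -> str:
    """Copy common lines and the branch guarded by `wanted`; drop the directives."""
    out = []
    inside = False   # True between an `if PLATFORM_x` and its `endif`
    keeping = True   # True when the current branch should be copied
    for line in source.splitlines():
        match = IF_RE.match(line)
        if match:
            # `if X` or `else if X`: start a branch, keep it only if X is wanted.
            inside = True
            keeping = match.group(1) == wanted
        elif inside and ENDIF_RE.match(line):
            # End of the conditional: back to common code.
            inside = False
            keeping = True
        elif keeping:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "ld hl,123\ninc a\nif PLATFORM_CPC\n;code for CPC\n"
        "else if PLATFORM_SPECTRUM\n;code for SPECTRUM\nendif\n...\n"
    )
    print(keep_platform(example, "PLATFORM_CPC"), end="")
```

Calling it a second time with `"PLATFORM_SPECTRUM"` would give the other output file from the same input.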


But I would also like to be able to change, for example, how labels are encoded:


Code: [Select]
.Mylabel: ld hl,0

into:
Code: [Select]
Mylabel ld hl,0
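The label rewrite above can be sketched as a single regular-expression substitution. The pattern is an assumption about the label syntax (a leading dot, an identifier, a trailing colon, at the start of a line); `convert_labels` is a hypothetical name, and a real source may need a stricter pattern.

```python
import re

# Assumed label syntax: optional indentation, ".", an identifier, then ":".
LABEL_RE = re.compile(r"^([ \t]*)\.([A-Za-z_]\w*):", re.MULTILINE)


def convert_labels(source: str) -> str:
    """Rewrite `.Mylabel:` at the start of a line as `Mylabel`."""
    # \1 is the indentation, \2 is the label name without dot and colon.
    return LABEL_RE.sub(r"\1\2", source)


if __name__ == "__main__":
    print(convert_labels(".Mylabel: ld hl,0\n"), end="")
```

Only labels at the start of a line are touched, so operands later in a line are left alone.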

So, is there any tool that does things like that in a simple way? I know there are tools like SED, but its scripts are simply terrible to read and to maintain. I *could* also write something myself, of course, but I would like to avoid reinventing the wheel.


Thanks!
Targhan/Arkos

Arkos Tracker 2 - alpha 9 now released! - Follow the news on Twitter!
Disark - A cross-platform Z80 disassembler/source converter
FDC Tool 1.1 - Read Amsdos files without the system

Imperial Mahjong
Orion Prime

Offline pelrun

  • Supporter
  • 6128 Plus
  • *
  • Posts: 611
  • Country: au
    • index.php?action=treasury
  • Liked: 305
  • Likes Given: 185
Re: Text manipulation tool? (cross-dev)
« Reply #1 on: 15:13, 31 January 18 »
You could use the standard C preprocessor ("cpp"). It's a pure text transformation process; it doesn't actually care what language (if any) is used.

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #2 on: 17:55, 31 January 18 »

Thanks for the suggestion, but it won't work, because I still want the original code to be compiled with my assembler. If I use #define here and there, it won't compile anymore.
Targhan/Arkos


Offline andycadley

  • Supporter
  • 6128 Plus
  • *
  • Posts: 898
  • Liked: 430
  • Likes Given: 72
Re: Text manipulation tool? (cross-dev)
« Reply #3 on: 22:20, 31 January 18 »
What do you mean by "the original code"? Are you talking about the pre-transformed text? Because if you are then surely your assembler is going to need to fully understand whatever pre-transform language you use. Normally the only thing that an assembler would see is the post-transformed version and so pretty much any existing translation tool from cpp to T4 should be able to do what you want, it's really a matter of syntactic preference.

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #4 on: 22:25, 31 January 18 »
The original code contains all the code, like this:


Code: [Select]
if AMSTRAD
out (c),c
...
endif


if MSX
out (c),a
...
endif


I want to create two files from it: the first one with only the AMSTRAD block (and not MSX), the second one with only the MSX block (and not AMSTRAD). These two files will have the if/endif removed.
The three files will still compile, yet the two output files will directly target the Amstrad or MSX, without the useless code of the other platform.

Targhan/Arkos


Offline andycadley

Re: Text manipulation tool? (cross-dev)
« Reply #5 on: 22:31, 31 January 18 »
In that case, your assembler is going to have to be able to parse (and skip) all the directives itself (because it has to know the if/endif bits aren't code) so you're either going to have to support the syntax of a full transformation tool or use something bespoke. Unless you can find something that just so happens to use the same symbol for its directives as your assembler is using for comments.

Offline krusty_benediction

  • CPC664
  • ***
  • Posts: 143
  • Country: fr
  • Liked: 104
  • Likes Given: 37
Re: Text manipulation tool? (cross-dev)
« Reply #6 on: 22:31, 31 January 18 »
I guess you would have to modify an existing assembler that is compatible with your syntax, so that it outputs the Z80 source text instead of the binary. I don't know of any tool that can do that directly.
Otherwise, as already said, cpp can be a good fit if you use sed to replace if with #ifdef (and likewise for the other directives).
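A rough sketch of that directive conversion, written in Python rather than sed for readability. `to_cpp_syntax` is a hypothetical helper, and it emits `#if` rather than `#ifdef` so that a symbol defined as 0 disables its block; that choice is an assumption, not part of the suggestion above.

```python
import re

# Assembler directives assumed to appear at the start of a line (after indentation).
DIRECTIVE_RE = re.compile(
    r"^([ \t]*)(else[ \t]+if|if|else|endif)\b", re.IGNORECASE | re.MULTILINE
)


def to_cpp_syntax(source: str) -> str:
    """Prefix assembler conditionals with '#' so a C preprocessor can read them."""

    def replace(match: re.Match) -> str:
        indent, word = match.group(1), match.group(2).lower()
        if word.startswith("else") and word.endswith("if"):
            word = "elif"  # cpp spells "else if" as a single directive
        return f"{indent}#{word}"

    return DIRECTIVE_RE.sub(replace, source)


if __name__ == "__main__":
    print(to_cpp_syntax("if AMSTRAD\nout (c),c\nendif\n"), end="")
```

The converted text could then be run through something like `cpp -P -DAMSTRAD=1 -DMSX=0`. Whether cpp accepts the rest of the assembler syntax (for instance apostrophes in comments) would need checking on the real source.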

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #7 on: 23:08, 31 January 18 »
Maybe I'm not expressing myself clearly, I'm sorry :).
The assembler is not the problem. It can assemble the original file, the second (Amstrad only), and the third (MSX only). I want an external tool to produce the second and third files. I don't understand why you talk about "modifying an assembler".


I want to "ask" the external tool to create a file with all the lines of the original file, WITHOUT the lines between "if MSX" and the following "endif". Then I will "ask" it: produce the third file, still from the original file, but without the lines between "if AMSTRAD" and the following "endif".


Then the generated second and third files can be assembled normally by any assembler. But the user will only see the code he's interested in, if he opens the file related to his platform.
Targhan/Arkos


Offline robcfg

  • Supporter
  • 6128 Plus
  • *
  • Posts: 2.310
  • Country: se
  • 8-Bit Technomancer
    • index.php?action=treasury
  • Liked: 1030
  • Likes Given: 2451
Re: Text manipulation tool? (cross-dev)
« Reply #8 on: 23:23, 31 January 18 »
You want to split the original asm file into two other files, one with Amstrad code and another one with MSX code, so users of either platform only see the code related to their platform, right?


That is at source-file level, before assembling anything.


Are you sure the preprocessor doesn’t do what you need? You should try calling it with the appropriate symbols defined, so you get either an Amstrad source or an MSX source.

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #9 on: 23:27, 31 January 18 »
Quote
You want to split the original asm file into two other files, one with Amstrad code and another one with MSX code, so users of either platform only see the code related to their platform, right?

Yes. But keep in mind there is also common code, which should not be modified.

Quote
Are you sure the preprocessor doesn’t do what you need? You should try calling it with the appropriate symbols defined, so you get either an Amstrad source or an MSX source.

Do you mean the C++ preprocessor, as Krusty suggested? I can't use that, because adding #if / #ifdef inside a Z80 source will make it unreadable by any assembler.
Targhan/Arkos


Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #10 on: 00:40, 01 February 18 »
I guess SED will do the job, though I would have preferred something more readable :).
Targhan/Arkos


Offline andycadley

Re: Text manipulation tool? (cross-dev)
« Reply #11 on: 01:17, 01 February 18 »
It's pretty much going to be your only choice, because any generic pre-processor is going to be expecting its instructions marked up in some specific format and not just 'if AMSTRAD ... endif'

Offline pelrun

Re: Text manipulation tool? (cross-dev)
« Reply #12 on: 06:21, 01 February 18 »
Yeah, if you insist that it handle the specific directives that your assembler already recognises, then you'll need to write it yourself. Or rethink your build chain.

Offline krusty_benediction

Re: Text manipulation tool? (cross-dev)
« Reply #13 on: 10:19, 01 February 18 »
Quote
Maybe I don't express myself clearly, I'm sorry :) .
The assembler is not the problem. It can assemble the original file, the second (Amstrad only), the third (msx only). I want an external tool to produce the second and third file. I don't understand why you talk about "modifying an assembler".
I want to "ask" the external tool to create a file with all the lines of the original file, WITHOUT the lines between "if MSX" and the following "endif". Then I will "ask" it: produce the third file, still from the original file, but without the lines between "if AMSTRAD" and the following "endif".
Then the generated second and third files can be assembled normally by any assembler. But the user will only see the code he's interested in, if he opens the file related to his platform.

Yes ;) my answer corresponds exactly to this request.
I had understood it well.

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #14 on: 10:39, 01 February 18 »
Quote
Yeah, if you insist that it handle the specific directives that your assembler already recognises, then you'll need to write it yourself. Or rethink your build chain.

Well, no, because that's the purpose of a "text manipulation tool". I didn't ask for a preprocessor. I just need a tool that will look for the strings "if AMSTRAD" / "endif". SED can do that, but it's not very user-friendly.
Targhan/Arkos


Offline robcfg

Re: Text manipulation tool? (cross-dev)
« Reply #15 on: 10:48, 01 February 18 »
Yes, but you're not manipulating any text, you're working with code and that is where the preprocessor kicks in.


Anyway, it would be very easy to create a small tool in any language that searches for #if, #else, #endif blocks and keeps only the desired ones.

Offline arnoldemu

  • Supporter
  • 6128 Plus
  • *
  • Posts: 5.335
  • Country: gb
    • Unofficial Amstrad WWW Resource
  • Liked: 2264
  • Likes Given: 3478
Re: Text manipulation tool? (cross-dev)
« Reply #16 on: 11:09, 01 February 18 »
Quote
Well, no, because that's the purpose of a "text manipulation tool". I didn't ask for a preprocessor. I just need a tool that will look for the strings "if AMSTRAD" / "endif". SED can do that, but it's not very user-friendly.

a pre-processor does exactly what you want.

it omits bits you don't want, it will paste text together and it will do text substitutions.

for the c/c++ preprocessor it recognises a specific syntax and works on that.

#if AMSTRAD
;; amstrad stuff
#endif

you can also pre-process the labels

define them like this for example:
LABEL(text)

#if CPC
#define LABEL(x) \
   .x
#endif

#if MSX
#define LABEL(x) \
  x:
#endif

run through pre-processor and get the result you want.

the pre-processor is used in games and programs to do exactly as you want for exactly the reasons you want.

in games we have #if DEBUG, #if PACKAGE
#define NICE_COMPARISON_VALUE 3

etc..

EDIT: Ok you want to be able to assemble the version with if in it, then output two cleaner versions which can also be assembled.
Yes something else would be needed. OR can you use the assembler to generate a listing for you for the others?

« Last Edit: 11:23, 01 February 18 by arnoldemu »
My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #17 on: 11:28, 01 February 18 »
I understand perfectly what a preprocessor is. But this would not work, because it makes the original code NOT compilable by the assembler. So this is not an option for me.
Targhan/Arkos


Offline robcfg

Re: Text manipulation tool? (cross-dev)
« Reply #18 on: 13:18, 01 February 18 »
Could you please post sample files so we can see what you are trying to achieve?


Because I have the feeling that we are missing something. I thought you wanted a tool to separate the CPC and MSX versions of the code for assembling them, but obviously, the original file will have both versions and you need to specify which regions are for one platform or another.


How do you distinguish between CPC and MSX code? If you cannot use preprocessor definitions, do you define labels?

Offline Targhan

Re: Text manipulation tool? (cross-dev)
« Reply #19 on: 13:47, 01 February 18 »
Ok, here is precisely what I want to do. As you may know, I'm working on Arkos Tracker 2. The player can be used on various platforms (CPC, MSX, Spectrum, etc.). The player uses conditional assembly to target the platform, because addressing the PSG obviously differs from one platform to another. So I have one main source (called "the original file"):


Code: [Select]
PLY_PLATFORM_CPC equ 1
PLY_PLATFORM_MSX equ 0
PLY_PLATFORM_SPECTRUM equ 0


org #1000
ld hl,1234
inc hl
... blablalba ...


;Sends the values to the PSG, according to the platform
if PLY_PLATFORM_CPC
   ld b,#f4
   out (c),0
   ...
endif


if PLY_PLATFORM_MSX
    ld b,#ff     ;Random stuff, I don't remember how to do this on the MSX
    out (c),a
    ...
endif


ret


So as you can see, when developing, I can assemble for any platform. However, the source is bloated: when the user is going to use it, he will get the code for each platform. I say "the source" is bloated, not the assembled code. What I want is to provide three source files to the user, one for each platform, with only the code it requires.
So PlayerAmstrad.asm would look like:



Code: [Select]
org #1000
ld hl,1234
inc hl
... blablalba ...


;Sends the values to the PSG, according to the platform
   ld b,#f4
   out (c),0
   ...


ret
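The per-platform output shown above could be produced by a short script that reads the platform flags from the original file and builds one source per flag. This is a sketch under assumptions: the flags are taken to be the `PLY_PLATFORM_* equ` lines, each platform block is a plain `if PLY_PLATFORM_x` ... `endif` with no nesting and no `else`, and `split_by_platform` is a made-up name.

```python
import re

# Assumed shapes of the flag definitions and of the conditional directives.
FLAG_RE = re.compile(r"^\s*(PLY_PLATFORM_\w+)\s+equ\s+\d+", re.IGNORECASE)
IF_RE = re.compile(r"^\s*if\s+(PLY_PLATFORM_\w+)", re.IGNORECASE)
ENDIF_RE = re.compile(r"^\s*endif\b", re.IGNORECASE)


def split_by_platform(source: str) -> dict:
    """Return {flag name: source text containing only that platform's code}."""
    lines = source.splitlines()
    # Every `PLY_PLATFORM_x equ n` line declares one platform.
    platforms = [m.group(1) for m in map(FLAG_RE.match, lines) if m]
    result = {}
    for wanted in platforms:
        out, inside, keeping = [], False, True
        for line in lines:
            if FLAG_RE.match(line):
                continue  # the flags are not needed in a single-platform file
            match = IF_RE.match(line)
            if match:
                inside, keeping = True, match.group(1) == wanted
            elif inside and ENDIF_RE.match(line):
                inside, keeping = False, True
            elif keeping:
                out.append(line)
        result[wanted] = "\n".join(out) + "\n"
    return result


if __name__ == "__main__":
    demo = (
        "PLY_PLATFORM_CPC equ 1\nPLY_PLATFORM_MSX equ 0\norg #1000\n"
        "if PLY_PLATFORM_CPC\n   ld b,#f4\nendif\n"
        "if PLY_PLATFORM_MSX\n    ld b,#ff\nendif\nret\n"
    )
    for name, text in split_by_platform(demo).items():
        print(f"--- {name} ---\n{text}", end="")
```

Each value of the returned dictionary could then be written to its own file (PlayerAmstrad.asm and so on); the mapping from flag name to file name is left out on purpose.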


In order to do this, I want an external tool to parse my file and remove everything that is between tags I would have given it ("if PLY_PLATFORM_MSX" / "endif", for example). I think that putting section markers as comments will make things easier:

Code: [Select]
if PLY_PLATFORM_CPC       ; [SECTION_CPC_START]
   ld b,#f4
   out (c),0
   ...
endif       ; [SECTION_CPC_END]
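With comment markers like these, the filter no longer needs to understand the directives at all. A minimal sketch, assuming every marker sits on the same line as its `if` or `endif` and follows the `[SECTION_<name>_START]` / `[SECTION_<name>_END]` pattern shown above; `select_section` is a hypothetical name.

```python
import re

# Marker comments as in the example above: "; [SECTION_CPC_START]" etc.
MARKER_RE = re.compile(r";\s*\[SECTION_(\w+)_(START|END)\]")


def select_section(source: str, keep: str) -> str:
    """Keep common code and the `keep` section; drop other sections and all marker lines."""
    out = []
    skipping = None  # name of the section currently being dropped, if any
    for line in source.splitlines():
        match = MARKER_RE.search(line)
        if match:
            name, kind = match.groups()
            if kind == "START" and name != keep and skipping is None:
                skipping = name
            elif kind == "END" and name == skipping:
                skipping = None
            continue  # marker lines carry the if/endif, so they are never copied
        if skipping is None:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    demo = (
        "ld a,1\n"
        "if PLY_PLATFORM_CPC       ; [SECTION_CPC_START]\n   ld b,#f4\n"
        "endif       ; [SECTION_CPC_END]\n"
        "if PLY_PLATFORM_MSX       ; [SECTION_MSX_START]\n    ld b,#ff\n"
        "endif       ; [SECTION_MSX_END]\n"
        "ret\n"
    )
    print(select_section(demo, "CPC"), end="")
```

Because the markers are ordinary comments, the original file still assembles unchanged.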


Am I clearer?
Targhan/Arkos


Offline andycadley

Re: Text manipulation tool? (cross-dev)
« Reply #20 on: 14:19, 01 February 18 »

Yes. But the usual way to do that would be to mark up the main file with preprocessor directives (#if, #define, #endif, etc.) and then use that to generate all three variants you want.
Otherwise yes, you're basically limited to running regular expressions over the source with something like sed, but as you say, that's not a very user-friendly approach.

Offline robcfg

Re: Text manipulation tool? (cross-dev)
« Reply #21 on: 15:07, 01 February 18 »
That's exactly the problem.


The preprocessor would be the right tool if the assembler could use the #if, #else, #endif statements, which, as I understand it, it cannot. So Targhan cannot work with his main file directly.


I still think that it would be possible to create a small tool that copies the file to another file while ignoring the required blocks of code.


Something like, check that the current char in the file is not the beginning of "if PLY_xxxx", then copy char to the new file, else skip chars until the end of the next endif.
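The copy-and-skip idea described above might look like this in Python. It is a deliberately naive sketch: `drop_blocks` is a made-up name, the opener and closer are matched as plain substrings (so an "endif" inside a comment would fool it), and blocks that are kept retain their if/endif lines.

```python
def drop_blocks(text: str, opener: str, closer: str = "endif") -> str:
    """Copy `text`, skipping everything from each `opener` to the end of the next `closer` line."""
    out = []
    pos = 0
    while True:
        start = text.find(opener, pos)
        if start == -1:
            out.append(text[pos:])   # no more blocks: copy the remainder
            break
        out.append(text[pos:start])  # copy everything before the block
        end = text.find(closer, start)
        if end == -1:
            break                    # unterminated block: the rest is dropped
        newline = text.find("\n", end)
        # Resume copying on the line after the closer.
        pos = len(text) if newline == -1 else newline + 1
    return "".join(out)


if __name__ == "__main__":
    demo = "ld a,1\nif PLY_PLATFORM_MSX\n out (c),a\nendif\nret\n"
    print(drop_blocks(demo, "if PLY_PLATFORM_MSX"), end="")
```

Running it once per unwanted platform on the original file would give each platform-specific source.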