Text manipulation tool? (cross-dev)

Started by Targhan, 09:46, 31 January 18


Targhan


I'm looking for a cross-platform (PC, Mac, Linux) way to parse a text file and apply simple transformations to it. Ideally, a command line tool would be nice. Example:
textModifier <input file> <output file> <transformation script>


I would like to be able to, for example, remove sections of source files. For example, I want to convert this:
ld hl,123
inc a
if PLATFORM_CPC
;code for CPC
else if PLATFORM_SPECTRUM
;code for SPECTRUM
endif
...



into this:
ld hl,123
inc a
;code for CPC

...


So, if PLATFORM_CPC is requested, keep only that section. And later, do the opposite in another output file: keep only the PLATFORM_SPECTRUM section.
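For illustration, here is a minimal Python sketch of that kind of section filter. It assumes the directives are spelled exactly as in the example above (`if NAME`, `else if NAME`, `endif`), that every `if` in the file is a platform test, and that blocks are not nested. The function name `keep_platform` is hypothetical.

```python
import re

# Assumed directive spellings, taken from the example above; adjust to your assembler.
# An optional trailing ";comment" is tolerated on directive lines.
IF_RE = re.compile(r"^\s*if\s+(\w+)\s*(;.*)?$", re.IGNORECASE)
ELSE_IF_RE = re.compile(r"^\s*else\s+if\s+(\w+)\s*(;.*)?$", re.IGNORECASE)
ENDIF_RE = re.compile(r"^\s*endif\s*(;.*)?$", re.IGNORECASE)


def keep_platform(source: str, platform: str) -> str:
    """Keep common lines plus the branch guarded by `platform`; drop the directives."""
    out = []
    keeping = True  # True for common code and for the wanted branch
    for line in source.splitlines():
        branch = IF_RE.match(line) or ELSE_IF_RE.match(line)
        if branch:
            # Entering a branch: copy its lines only if it is the wanted platform.
            keeping = branch.group(1) == platform
        elif ENDIF_RE.match(line):
            # Leaving the conditional block: back to common code.
            keeping = True
        elif keeping:
            out.append(line)
    return "\n".join(out) + "\n"
```

Calling `keep_platform(text, "PLATFORM_CPC")` and then `keep_platform(text, "PLATFORM_SPECTRUM")` on the same input would give the two output files. Nested conditionals or non-platform `if` tests would need extra handling.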


But I would also like to be able to change, for example, how labels are encoded:


.Mylabel: ld hl,0


into:
Mylabel ld hl,0
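As a rough sketch, this label rewrite can be expressed as a single regular-expression substitution in Python. It assumes a label always starts in the first column with a dot, is a plain identifier, and ends with a colon; `convert_labels` is a hypothetical name.

```python
import re

# Assumption: ".Name:" at the very start of a line is a label definition.
LABEL_RE = re.compile(r"^\.([A-Za-z_]\w*):", re.MULTILINE)


def convert_labels(source: str) -> str:
    """Rewrite '.Mylabel:' at the start of a line as 'Mylabel'."""
    return LABEL_RE.sub(r"\1", source)
```

Anything that does not start a line with a dot and end the name with a colon is left untouched, so operands and comments are not affected.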


So, is there any tool to do things like that, in a simple way? I know there are tools like sed, but it's simply terrible to look at and to maintain. I also *could* write something myself of course, but I would like to avoid reinventing the wheel.


Thanks!
Targhan/Arkos

Arkos Tracker 2.0.1 now released! - Follow the news on Twitter!
Disark - A cross-platform Z80 disassembler/source converter
FDC Tool 1.1 - Read Amsdos files without the system

Imperial Mahjong
Orion Prime

pelrun

You could use the standard C preprocessor ("cpp"). It's a pure text transformation process, it doesn't actually care what language (if any) is used.

Targhan


Thanks for the suggestion, but it won't work, because I still want the original code to be compiled with my assembler. If I use #define here and there, it won't compile anymore.

andycadley

What do you mean by "the original code"? Are you talking about the pre-transformed text? Because if you are then surely your assembler is going to need to fully understand whatever pre-transform language you use. Normally the only thing that an assembler would see is the post-transformed version and so pretty much any existing translation tool from cpp to T4 should be able to do what you want, it's really a matter of syntactic preference.

Targhan

The original code contains all the code, like this:



if AMSTRAD
out (c),c
...
endif


if MSX
out (c),a
...
endif



I want to create two files off it: the first one with only the AMSTRAD block code (and not MSX), the second file with only MSX, and without AMSTRAD. These two files will have the if/endif removed.
The three files will still compile, yet the two output files will directly target the Amstrad or MSX, without the useless code of the other platform.


andycadley

In that case, your assembler is going to have to be able to parse (and skip) all the directives itself (because it has to know the if/endif bits aren't code) so you're either going to have to support the syntax of a full transformation tool or use something bespoke. Unless you can find something that just so happens to use the same symbol for its directives as your assembler is using for comments.

krusty_benediction

I guess you would have to modify an existing assembler that is compatible with your syntax, so that it outputs Z80 source text instead of binary code. I know of no tool able to do that directly.
Otherwise, as already said,
cpp can be a good fit if you use sed to replace if with #ifdef (and likewise for the other directives).

Targhan

Maybe I don't express myself clearly, I'm sorry :).
The assembler is not the problem. It can assemble the original file, the second (Amstrad only), and the third (MSX only). I want an external tool to produce the second and third files. I don't understand why you talk about "modifying an assembler".


I want to "ask" the external tool to create a file with all the lines of the original file, WITHOUT the lines between "if MSX" and the following "endif". Then I will "ask" it: produce the third file, still from the original file, but without the lines between "if AMSTRAD" and the following "endif".


Then the generated second and third files can be assembled normally by any assembler. But the user will only see the code he's interested in, if he opens the file related to his platform.

robcfg

You want to split the original asm file into two other files, one with Amstrad code and another one with MSX code, so users of either platform only see the code related to their platform, right?


That is, at the source-file level, before assembling anything.


Are you sure the preprocessor doesn't do what you need? You could try calling it with the appropriate symbols defined, so you get either an Amstrad source or an MSX source.

Targhan

Quote: You want to split the original asm file into two other files, one with Amstrad code and another one with MSX code, so users of either platform only see the code related to their platform, right?

Yes. But keep in mind there is also common code, which should not be modified.

Quote: Are you sure the preprocessor doesn't do what you need? You could try calling it with the appropriate symbols defined, so you get either an Amstrad source or an MSX source.

Do you mean the C++ preprocessor, as Krusty suggested? I can't use that, because adding #if / #ifdef inside a Z80 source will make it unreadable by any assembler.

Targhan

I guess SED will do the job, though I would have preferred something more readable :).

andycadley

It's pretty much going to be your only choice, because any generic pre-processor is going to be expecting its instructions marked up in some specific format and not just 'if AMSTRAD ... endif'

pelrun

Yeah, if you insist that it handle the specific directives that your assembler already recognises, then you'll need to write it yourself. Or rethink your build chain.

krusty_benediction

Yes ;) my answer addresses exactly this request.
I had understood it correctly.


Quote from: Targhan on 22:08, 31 January 18
Maybe I don't express myself clearly, I'm sorry :).
The assembler is not the problem. It can assemble the original file, the second (Amstrad only), and the third (MSX only). I want an external tool to produce the second and third files. I don't understand why you talk about "modifying an assembler".


I want to "ask" the external tool to create a file with all the lines of the original file, WITHOUT the lines between "if MSX" and the following "endif". Then I will "ask" it: produce the third file, still from the original file, but without the lines between "if AMSTRAD" and the following "endif".


Then the generated second and third files can be assembled normally by any assembler. But the user will only see the code he's interested in, if he opens the file related to his platform.

Targhan

Quote: Yeah, if you insist that it handle the specific directives that your assembler already recognises, then you'll need to write it yourself. Or rethink your build chain.

Well, no, because that's the purpose of a "text manipulation tool". I didn't ask for a preprocessor. I just need a tool that will look for the strings "if AMSTRAD" / "endif". SED can do that, but it's not very user-friendly.

robcfg

Yes, but you're not manipulating any text, you're working with code and that is where the preprocessor kicks in.


Anyway, it would be very easy to create a small tool in any language that searches for #if, #else, #endif blocks and keeps only the desired ones.

arnoldemu

Quote from: Targhan on 09:39, 01 February 18
Well, no, because that's the purpose of a "text manipulation tool". I didn't ask for a preprocessor. I just need a tool that will look for the strings "if AMSTRAD" / "endif". SED can do that, but it's not very user-friendly.
a pre-processor does exactly what you want.

it omits bits you don't want, it will paste text together and it will do text substitutions.

for the c/c++ preprocessor it recognises a specific syntax and works on that.

#if AMSTRAD
;; amstrad stuff
#endif

you can also pre-process the labels

define them like this for example:
LABEL(text)

#if CPC
#define LABEL(x) \
   .x
#endif

#if MSX
#define LABEL(x) \
  x:
#endif

run through pre-processor and get the result you want.

the pre-processor is used in games and programs to do exactly as you want for exactly the reasons you want.

in games we have #if DEBUG, #if PACKAGE
#define NICE_COMPARISON_VALUE 3

etc..

EDIT: Ok you want to be able to assemble the version with if in it, then output two cleaner versions which can also be assembled.
Yes something else would be needed. OR can you use the assembler to generate a listing for you for the others?

My games. My Games
My website with coding examples: Unofficial Amstrad WWW Resource

Targhan

I understand perfectly what a preproc is. But this would not work, because it makes the original code NOT compilable by the assembler. So this is not an option for me.

robcfg

Could you please post sample files so we can see what you are trying to achieve?


Because I have the feeling that we are missing something. I thought you wanted a tool to separate the CPC and MSX versions of the code for assembling them, but obviously, the original file will have both versions and you need to specify which regions are for one platform or another.


How do you distinguish between CPC and MSX code? If you cannot use preprocessor definitions, do you define labels?

Targhan

Ok, here is precisely what I want to do. As you may know, I'm working on Arkos Tracker 2. The player can be used on various platforms (CPC, MSX, Spectrum, etc.). The player uses conditional assembly to target the platform, because addressing the PSG obviously differs from one platform to another. So I have one main code (called "the original file"):



PLY_PLATFORM_CPC equ 1
PLY_PLATFORM_MSX equ 0
PLY_PLATFORM_SPECTRUM equ 0


org #1000
ld hl,1234
inc hl
... blablalba ...


;Sends the values to the PSG, according to the platform
if PLY_PLATFORM_CPC
   ld b,#f4
   out (c),0
   ...
endif


if PLY_PLATFORM_MSX
    ld b,#ff     ;Random stuff, I don't remember how to do this on the MSX
    out (c),a
    ...
endif


ret



So as you can see, when developing, I can assemble for any platform. However, the source is bloated: when the user is going to use it, he will get the code for each platform. I say "the source" is bloated, not the assembled code. What I want is to provide three source files to the user, one for each platform, with only the code it requires.
So PlayerAmstrad.asm would look like:




org #1000
ld hl,1234
inc hl
... blablalba ...


;Sends the values to the PSG, according to the platform
   ld b,#f4
   out (c),0
   ...


ret



In order to do this, I want an external tool to parse my file and remove everything that is between tags I would have given it ("if PLY_PLATFORM_MSX" / "endif", for example). I think that putting section markers as comments will make things easier:


if PLY_PLATFORM_CPC       ; [SECTION_CPC_START]
   ld b,#f4
   out (c),0
   ...
endif       ; [SECTION_CPC_END]



Am I clearer?
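A possible sketch of the marker-based approach in Python, assuming the markers look exactly like `[SECTION_<NAME>_START]` and `[SECTION_<NAME>_END]` inside a comment on the `if` and `endif` lines, and that sections are not nested. The function name `extract_platform` is hypothetical.

```python
import re

# Assumed marker format, following the example above: "; [SECTION_CPC_START]".
MARKER_RE = re.compile(r";\s*\[SECTION_(\w+?)_(START|END)\]")


def extract_platform(source: str, platform: str) -> str:
    """Keep common code and the section named `platform`; drop other sections."""
    out = []
    current = None  # name of the section we are inside, or None for common code
    for line in source.splitlines():
        marker = MARKER_RE.search(line)
        if marker:
            name, kind = marker.groups()
            current = name if kind == "START" else None
            continue  # marker lines carry the if/endif themselves, so drop them
        if current is None or current == platform:
            out.append(line)
    return "\n".join(out) + "\n"
```

With this, `extract_platform(text, "CPC")` would produce the Amstrad file and `extract_platform(text, "MSX")` the MSX one. Lines outside any section, such as the `equ` definitions at the top, are kept as common code unless they are also wrapped in markers.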

andycadley


Yes. But the usual way to do that would be to mark up the main file with preprocessor directives (#if, #define, #endif, etc.) and then use that to generate all three variants you want.
Otherwise yes, you're basically limited to running regular expressions over the source with something like sed, but as you say that's not a very user-friendly approach.

robcfg

That's exactly the problem.


The preprocessor would be the right tool if the assembler could use the #if, #else, #endif statements, which, as I understand, it does not. So Targhan cannot work with his main file directly.


I still think that it would be possible to create a small tool that copies the file to another file while ignoring the required blocks of code.


Something like, check that the current char in the file is not the beginning of "if PLY_xxxx", then copy char to the new file, else skip chars until the end of the next endif.
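Roughly, and working line by line rather than character by character, that idea might look like the Python sketch below. It assumes blocks are not nested and that directives are the first words on their line; `copy_without_block` is a hypothetical name, and the in-memory streams only stand in for real files.

```python
import io
from typing import TextIO


def copy_without_block(src: TextIO, dst: TextIO, flag: str) -> None:
    """Copy src to dst, skipping everything from `if <flag>` to the next `endif`."""
    skipping = False
    for line in src:
        words = line.split(";", 1)[0].split()  # ignore any trailing comment
        if not skipping and words[:2] == ["if", flag]:
            skipping = True  # start of the unwanted block: drop this line
        elif skipping:
            if words[:1] == ["endif"]:
                skipping = False  # end of the unwanted block: drop this line too
        else:
            dst.write(line)


# Small demonstration on in-memory text instead of real files.
demo = io.StringIO("inc hl\nif MSX\nout (c),a\nendif\nret\n")
result = io.StringIO()
copy_without_block(demo, result, "MSX")
print(result.getvalue(), end="")
```

With real files, the same function can be called on two handles from `open(...)`. Blocks guarded by other flags are copied unchanged, including their if/endif lines, so the output still assembles as long as those flags stay defined.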
