ripadsk - a utility to automate code archiving.

Started by copychr$, 22:22, 01 August 13

copychr$

Please download a working version of ripadsk from post #17: --> ripadsk Update
The posts between here and there are a walk-through, just like for any other game.
Give it the old once-over and come back for clarification of any detail.
Post #11 has a set-up to get CATs of DSK files on the fly. It is a standalone, minimal install and does not require ripadsk.
A complementary download of modified ACU 1989-1992 dsks is found in post #15:
--> Download not_acu DSK set
---------------------------------------------------------------------------------------------------

This is an initial presentation, showing some output for evaluation.

The work is done by CPCxfs to bring out BAS programs, and by 2 versions of BASLIST to render the code.
Automation leads to a "File Central" for Locomotive code.
DOS was sufficient to start experimenting and to put up at least a reasonable layout.

Every treated BAS file generates 3 text and 3 html files.

As text:
Basic Code in ASCII form with unbroken lines.
Mode 2 Listings, WIDTH 80.
Mode 1 Listings, WIDTH 40.
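
To illustrate how the three text formats relate, here is a minimal Python sketch (not part of ripadsk, which runs under DOS). It assumes the Mode 2 and Mode 1 listings are simply the unbroken line cut at the column edge, the way the CPC screen does it. The function name and the sample line are made up for the example.

```python
def wrap_listing(line: str, width: int) -> list[str]:
    """Cut one unbroken BASIC line into rows of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Assumption: the break falls at the column edge, not at a word boundary.
    rows = [line[i:i + width] for i in range(0, len(line), width)]
    return rows or [""]  # an empty line still occupies one row


# Hypothetical sample line, not taken from any real program.
code = '10 PRINT "HELLO WORLD, THIS IS A FAIRLY LONG LINE OF LOCOMOTIVE BASIC":GOTO 10'
mode2 = wrap_listing(code, 80)  # WIDTH 80 listing
mode1 = wrap_listing(code, 40)  # WIDTH 40 listing

for row in mode1:
    print(row)
```

Joining the rows back together gives the unbroken line again, which is why only the unbroken form is useful as live code.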

From each of these text files an .HTM page is prepared.
These pages are HTML 4.01 strict and error free in CSE HTML Validator.
The final choice of encoding and font has not been made yet.

It would certainly help to hear comments from all perspectives.
So, please step in with ideas or specifics, especially if you have big IDEs ;-)
Notepad only goes so far and dos is dos.

Baslist versions, used to render Basic code to text, are still under development next door:
BASlist Java Tool to list BASIC files

The raw file output can serve in archiving projects and downloads on websites, adding a lot of quick value to hard work already done:

Inserting this material: [attachurl=2]
in an existing type-in page like Galaxian Revenge - CPCWiki

Getting "homey":

[attach=3]
cpcvol4no1288022.gif

[ ... ]

copychr$

After the gentle come along, the inevitable hard sell.

Sample output from a collection of programs verified to "comply" with both versions of Baslist.

Various text files in the top folder hold relevant logs.

The files are presented twice:
- once by format; code, mode 2 and mode 1 listings.
- and once clustered by name, with the original BAS program.

Please, toggle type and use the view pane in explorer to run through the files.

Gryzor

I feel this is important, but for someone not knowing what it's about it's all lost... I myself don't fully understand what's going on :)

copychr$

#3
Quote from: Gryzor on 17:36, 04 August 13
... I myself don't fully understand what's going on

Sorry there. I've been posting more extensively on the BASlist Java Tool to list BASIC files thread.
So I provided too little context at the start of this thread, but will make that up.

Two versions of Baslist.exe are under development by Kevin and Markus.
They can each extract bas code as ascii text from programs, but are still being tested.
One can find a working version of ripadsk there. It is called ripabug, has documentation, and is meant to identify Baslist problems.

The ripadsk project is over 10 years old. The aim is to show BAS code in a selection of useful text formats.
Kevin published Baslist at the time, and with the use of CPCxfs it was possible to start extracting from dsk.

Both those programs run in batch mode, so DOS is used to get them to work in tandem.
At the time this was a reasonable way to proceed, now it looks very old fashioned.
Sticking to Dos is not by choice, but due to my technical limitations.

Without a doubt, once Baslist versions are good to go, they will be deployed properly, as on Java CPC.

For now, using their output already allows for preparation of code files in a repository and logs useful for archiving.

Here is an old picture I did not mean to show, but it does lay out the scheme.
Stepping manually through these operations, nothing unfamiliar is encountered.

Ripadsk is the Dos version of the functions needed to automate these steps.

[attach=2]

In the right-hand column, broken code is the one way street, useful for publishing and reading.
The rest is live code, that can be stored, compared, modified, put back to dsk as ascii.

A working version of ripadsk will be up in a short while.
Once I'm done looking at all the output ;-)

[attach=3]

meteor.bas - acu1992

copychr$

#4
A complete ripadsk archive for evaluation.

- "rip_not1989a.zip" contains the archive.

- "from acu1989.zip" contains the original material and documentation used in the ripadsk archive.
   One file, arnjewel.bas, was rejected because of a Baslist glitch. It is perfectly good otherwise.

The remaining 33 programs are reproduced exactly by both versions of Baslist.
They have also been checked against the original ASCII version from the CPC.

By making this pre-selection, we can consider the ripadsk output "as if" Baslists were perfect already.
And that could interest pretty much anybody who handles or enjoys Basic code.

I've chosen to show a complete archive, before putting up the software.
That's what it's all about, and one can step through the log files and the layout before "hassling" with any Dos stuff.
* Please see post #8 for further details on those subjects.

The rip is named [not1989a], to avoid any confusion with the real thing and for evaluation only.

To access the code quickly, one can use the view pane in Explorer, but the snazziest look is in the browser.
Here is the most comfortable way to proceed:

1. Double-click "_Top Index.html" in the top folder and adjust the window.
    Choose a format.
2. Click on a program from the list.
3. Drag down the new tab with the Program Name and release.

[attachimg=3]  [attachimg=4]  [attachimg=5]

4. Adjust the new window as needed.

[attachimg=6]

From now on every file selected from the list will appear in this window.
Click back in each window, to change formats or see code already loaded in a tab.

Ctrl + A in the browser window will select only valid text or code.
That is the reason nothing else is visible there, not even the Program Name itself.

Devilmarkus

When you put your ear on a hot stove, you can smell how stupid you are ...

Amstrad CPC games in your webbrowser

JavaCPC Desktop Full Release

copychr$


Devilmarkus

Quote from: copychr$ on 21:55, 17 August 13
Summer to Fall in Fantasy Forest :: Add-ons for Firefox

It keeps coming back. Must be curious ... There's a squirrel about also ;-)

Cute... Installed ;)

Now let's get back to topic :D

copychr$

#8
This is a summary run-through of the top folder content and the file layout.
It is meant to complement the archive presented in post #4

Before any activity there are only three folders:

A container; holding two sub folders and anything else you wish to have handy, dsk files etc.
The ripadsk folder; receives the active dsk files, all of which will be ripped in one pass.
Folder z; ?

For every rip, a complete named archive is created in the containing folder.

[attachimg=1]

This is the situation immediately after the "not1989a.dsk" rip.
All files are present twice:

- grouped by format, with all the BAS programs in folder bas.
- clustered; each Bas program with the various text and html files in a named folder.

These 2 layouts are fundamentally different but complementary, depending on the needs of the moment.

The following working files are generated in the archive, top to bottom:
(the underscores are just to bring a bit of order to the sort)

_clean.bat.txt - by default it will remove all folders except the cluster, no doubles left.
                         set up to be easily edited, so it can remove any combination required.

_Current File List.txt - a tree view of all the folders and files in the archive.

_maketree.bat - generates the File List, can be run anywhere on the computer.
                          run it after modifications to the file structure or before storing final versions.

_mcat.txt - all CATs from all DSKs ripped, all file names are present.
                   it can be useful for file name searches in large archives, or to find doubles;
                   same named files can be identical or may need to be renamed, etc.

_peek.txt - the text version of the ripadsk DOS Window after the last run.
                  the fastest take on just what happened, dsk files present and programs extracted.

_Top Index.html - access to browser views of all html files. please see post #4
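
For readers without DOS at hand, the kind of tree view that _maketree.bat writes to the File List can be approximated in Python. This is only a sketch of the idea, not the batch file's actual method or output format. The folder and file names in the demo are placeholders.

```python
import os
import tempfile


def tree_lines(root: str) -> list[str]:
    """Return an indented listing of every folder and file below `root`."""
    root = os.path.normpath(root)
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub folders in a stable, sorted order
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        lines.append("    " * depth + os.path.basename(dirpath) + os.sep)
        for name in sorted(filenames):
            lines.append("    " * (depth + 1) + name)
    return lines


# Demo on a throw-away folder; the names are placeholders only.
with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "rip_demo")
    os.makedirs(os.path.join(archive, "bas"))
    open(os.path.join(archive, "_peek.txt"), "w").close()
    open(os.path.join(archive, "bas", "meteor.bas"), "w").close()
    listing = tree_lines(archive)

print("\n".join(listing))
```

Files are listed before sub folders at each level. That is a choice made for this sketch, not a claim about the real File List.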

Files prefixed with the rip's name:

not... Rip Report  - overview of events, aggregate of file sizes, accepted or rejected status.
not... dsk_cat.txt - a separate CAT view for each dsk ripped. (DSKs would normally also be present)
not... _sort.txt    - a CSV file holding structured information. it is the data source for:
not... _sort.xls    - a prepared spread sheet that can import this information at the end of every run.
                             the data can be sorted, filtered, copied and pasted into collections.

x_*.ver - ripadsk version number, using a naming format by date.

copychr$

#9
Gear up for battle ... more like an easter egg hunt.

During a rip six applications are accessed:
- CPCxfs.exe
- Baslist.exe by ArnoldEmu - renamed to baslist_kev.exe
- Java BASList.exe by Devilmarkus - renamed to baslist_jav.exe
The latest versions of those programs are present in folder z.

- Notepad
- Excel - (open office not tested)
- GNU Sed - a download from: GnuWin32 Packages
          scroll down to "Sed" and click the Set Up link.
          download sed-4.2.1-setup.exe for Windows, and follow the default installation instructions.

This contraption then, gets "powered" by DOS.
In the Dos Window only one command can be given; rip foo
No switches or otherwise confusing options ;-)

In order for things to be "hands off" under Dos, a few props have to be set.
Nothing spectacular and, if you walk along, a little useful application will pop up.

The only essential thing is to give the Spread Sheet a fully qualified path to find its data source.
It will not latch on to the CSV present in the same folder, but imports a copy from a fixed location.

The location must not require administrative rights.
After deep reflection the old C:\Temp came to mind. I've had one forever as a sort of limbo.
Currently the hard-coded path for ripadsk write-to use is "C:\Temp\Cpc".

With that done, another use of this fixed location can be made:

- Double-clicking a *.DSK file in Explorer, in order to open its CAT in a text file.

In the zip is a prepared "Cpc" folder. It should be copied into "C:\Temp".
[attachurl=2]

Next we'll get GNU Sed to work, and set up to quickly access dsk files.

copychr$

#10
GNU Sed needs to be installed and on the PATH to function. The installer does not add it to the PATH.
GnuWin32 Packages

My default path for sed.exe is C:\Program Files (x86)\GnuWin32\bin

The hard way:
To edit the PATH, open Control Panel - System - Advanced system settings - click the Environment Variables button.

The easy way:
Copy the 4 files from GNUWin32\bin into the Windows folder.

After proceeding either way, a test can be performed:  [attachurl=2]
After unzipping the folder, double-click "test_sed.bat".

The empty outfile.txt should now contain a confirming message.
On success, delete the folder. If all files are blank sometime, check if Sed is still on the path.
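
If Python happens to be installed, there is another quick way to see whether sed can be found on the PATH. This is a sketch only and does not replace test_sed.bat. The helper name is made up.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable found on the PATH, or None."""
    return shutil.which(name)


sed_path = find_tool("sed")
if sed_path:
    print("sed found at " + sed_path)
else:
    print("sed is not on the PATH")
```

On Windows, shutil.which also tries the usual executable extensions, so "sed" will match sed.exe.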

copychr$

#11
CAT files in Explorer:

After double clicking a *.dsk file, two windows open up.
One is a Dos Box, the other is Notepad showing the CAT image.

At the same time a text file appears, prefixed with the dsk's name.

To keep this file on the HD, Close the Dos Window first.
If one Closes Notepad first, it will be deleted.

[attach=2]

These CAT files are identical to the ones generated during any rip.
The difference is that one can check out a dsk any time, anywhere, and keep a record right next to it.
Also, if CPCxfs says: No image loaded!, it means the dsk can not be accessed with ripadsk.
Good to know beforehand. There are many of those in the "databox" collection, 42 tracks ...

- For things to work cpcxfs and cat.bat must be in "C:\Temp\Cpc\Cat".
   The Cpc.zip download is in post #9.

Right-click any *.DSK file and select Open with...
Then Choose default program, and browse for "cat.bat".
If you find it in either pane, select it as Default and OK.

Test by double clicking some DSK files. Chances are high it will work first time around.
There are no particular conflicts with other apps already on the .dsk extension.
Once it has been default, it will stick around forever.

This CAT thingy is quite handy, so let me know if it won't go.

copychr$

#12
Well it's that time again.
My spouse has taken to throwing food down the stairs.
That's OK. She cooks very well and trying to catch it is good exercise.

Quickly then, a last test on XP.

Have to bring in:
- The Container with ripadsk and its z folder.
- A copy of the "Cpc" folder into C:\Temp
** Edit; The most recent and updated Download of these files is found in post #17: --> ripadsk Update
             Post #15 has a set of prepared dsk files to try out immediately.

- GnuWin32 sed-4.2.1-setup.exe and install it.
   GnuWin32 Packages
   Dump sed.exe and the 3 DLLs from the GnuWin32\bin into the Windows directory (quick path fix).

- Copy prepared DSK file(s) from post #15 into the ripadsk folder.
   Run #_RIP.bat from there & read some important info.

  At the >z prompt type rip foo
  The output folder is rip_foo

Things are proportionally slower on an older machine, still quite acceptable.
Running ripadsk off an old USB key on a USB 2.0 slot proved possible, but way too slow.

Don't ask how many DSKs you can rip in one go.
I stopped at 41, that being the supply at hand.

The complete 1989-1992 set has been sifted to produce only good code.
I expect to post it soon. In the mean-time, please take into account that all other output is questionable.

Still, it's a good way to check back on your own stash of old files and get logs and cats.

Next, some more on using the spread sheet and arranging the Dos Window.

copychr$

#13
Setting a Dos Box:

[attachimg=1]

[attachimg=2]

Right-click the title bar of the ripadsk Box, click properties and set up to your liking.
These are my preferred settings:

[attachimg=3]

[attachimg=5]  [attachimg=4]

The Box will come up this way until you change it again.
Set width to 120-140 if there are many files.

copychr$

#14
 - Set a default mono spaced font for log files:

All text files require the use of a mono spaced font, set in Notepad!
Courier New 10 will do, but the choice is open of course.
Consolas 11, the first choice for browser rendering, reads well and has barred zeros.

- The Spread Sheet:

The sort.txt file and the spread sheet record every Bas program found.
It is the best way to find same named files. Those can be identical (use the bytes info) or different.

In any case, the LAST Bas file processed with a same name will over-write all others.
It is possible to have a series of links that all point to a single file.
Harmless if these programs were identical, otherwise one will need to rename.
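
The same check can be sketched outside the spread sheet. The Python below assumes each record carries the dsk name, the program name and its size in bytes. The real column layout of sort.txt is not shown in this thread, so the record shape and all sample values are invented for the example.

```python
from collections import defaultdict


def find_same_named(records):
    """Group (dsk, name, size) records by name and report names seen more than once."""
    by_name = defaultdict(list)
    for dsk, name, size in records:
        by_name[name.upper()].append((dsk, size))

    report = {}
    for name, hits in by_name.items():
        if len(hits) > 1:
            sizes = {size for _dsk, size in hits}
            # Equal byte counts are only a hint that the files may be identical.
            report[name] = ("same size, possibly identical" if len(sizes) == 1
                            else "different sizes, rename")
    return report


# Invented sample data, for illustration only.
records = [
    ("one.dsk", "MENU.BAS", 1024),
    ("two.dsk", "MENU.BAS", 2048),
    ("one.dsk", "INTRO.BAS", 512),
    ("two.dsk", "INTRO.BAS", 512),
    ("one.dsk", "METEOR.BAS", 4096),
]
clashes = find_same_named(records)
for name, verdict in clashes.items():
    print(name, "-", verdict)
```

A same-size pair still deserves a real file comparison before one copy is allowed to over-write the other.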

By default the spread sheet is ON. When tooling around that can be a needless distraction and it can be set to OFF.
If one needs to sort things out or keep some serious records, switch back to ON state.
This very simple operation is described in the info.txt file, which can also be called up from the command line by typing z> info

Things have been set up with some care (by you), so that when the sheet opens and the data connection is enabled, the C:\Temp\Cpc folder opens.
The data source file, prefixed by the rip's name, should be visible and upon validation the data is imported. Save and close it then, please.
If one needs to do more work, close only the active window and leave Excel minimized, for faster re-opening of the next spread sheets.

It is possible, but quite a bother, to retrieve information from the original sort.txt in the archive.
If a data import was missed, it is easier to redo the run.

The .xls extension activates the default spread sheet. Open Office not tested though.
For a rip named "spreadsheet", one would see this:

[attachimg=1]
[attachimg=2]
[attachimg=3]

copychr$

#15
Download of modded ACU dsk set for use with ripadsk.

The material presented here has its source in the type-ins archive at: ftp://ftp.nvg.ntnu.no/pub/cpc/typeins/
The files correspond to the 1989-1992 ACU collection, 7 dsks in all.

Mr Nicholas Campbell has authored that long-time collection, and also gratified us with very complete documentation on every program.
That original documentation, along with the corresponding identification files, has been added to the download, but not the dsks themselves.

I failed to contact Mr Campbell until indecently late to inform him of my intentions.
He has not had time to give any consent or opinion on the matter.
If he wishes so, this modded dsk set will be withdrawn and a more random file selection made.
My hope is, that the addition of readily available code to the existing documentation and file set, will appear complementary.

Some 275 unique Bas programs can be found in the collection.
After verification against original ASCII files, 216 Bas programs are rendered correctly by both current versions of Baslist.
The remaining 59 programs have been withheld, mostly for very minor discrepancies.
They are of course perfectly good but can not yet be used under Baslist.

A few files had to be slightly renamed; same named but different files or & and + characters replaced.
All excluded files are still present, but have been given the extension .BAN
Otherwise, nothing has been deleted or changed in the original file structures.
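
For anyone preparing their own dsk set along the same lines, the two renaming rules might look like this in Python. The replacement character (an underscore here) is an assumption, since the post does not say what & and + were replaced with. The function names and the first sample file name are hypothetical.

```python
import os


def safe_name(filename: str, replacement: str = "_") -> str:
    """Replace the & and + characters in a file name (replacement is assumed)."""
    return filename.replace("&", replacement).replace("+", replacement)


def banned_name(filename: str) -> str:
    """Give an excluded program the .BAN extension, keeping the rest of the name."""
    stem, _ext = os.path.splitext(filename)
    return stem + ".BAN"


print(safe_name("CUT&RUN.BAS"))     # hypothetical file name
print(banned_name("ARNJEWEL.BAS"))  # excluded programs stay present under a new extension
```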

[attachurl=2]

Gryzor

Just read the explanation and the rest of the thread. How very interesting :)

copychr$

#17
Thank you, Gryzor, for your interest, always encouraging.
And a thank you to Nich. He has no problems with seeing this "not" set on the wiki to help illustrate the uses of Baslist.
He does point out that many of these type-ins have been "bug-fixed" and may not be exact copies of the ACU source.

Until Baslist programs are ready, this lot should be enough to get an idea of the possibilities.
Also, there will be only one selection like this, it's too much work for any temporary use.

So, what next then? An update of course:

[attachurl=2]

The "Cpc" folder is bundled. Please copy it into "C:\Temp".

A lot of old code has been wasted and replaced with a line or two of proper Dos, or whatever it's called now.
That gets rid of much overhead and HD access and things are up to speed, sort of.

The program file names seemed too cryptic to be of use; they have been made more explicit.

An important change:
For best results in HTML it is necessary to use the iso-8859-1 character set, which mimics the default Windows ANSI source.
The unicode utf-8 charset used before gets dumped. It simply has too many bits for its own good (joke, Markus, joke ;-)
That subject will get a separate work-up, involving invisible mystery characters  8)
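
To make the charset point concrete, here is a minimal Python sketch of a listing page that declares iso-8859-1 and is written out in that same encoding. It is not the code ripadsk uses (that is done with DOS and Sed). The page skeleton and the function name are assumptions for illustration.

```python
import html

DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
           '"http://www.w3.org/TR/html4/strict.dtd">')


def listing_to_html(title: str, listing: str) -> bytes:
    """Wrap a code listing in a small HTML 4.01 strict page, encoded as iso-8859-1."""
    page = "\n".join([
        DOCTYPE,
        "<html>",
        "<head>",
        '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">',
        "<title>" + html.escape(title) + "</title>",
        "</head>",
        "<body>",
        # Only the listing is visible, so Ctrl + A selects nothing but code.
        "<pre>" + html.escape(listing, quote=False) + "</pre>",
        "</body>",
        "</html>",
    ])
    # Characters outside iso-8859-1 become numeric references instead of failing.
    return page.encode("iso-8859-1", errors="xmlcharrefreplace")


page = listing_to_html("meteor.bas", '10 IF a<b THEN PRINT "caf\xe9"')
print(page.decode("iso-8859-1"))
```

The declared charset and the bytes on disk have to agree; a mismatch between the two is the usual source of stray characters in the browser.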

Also, you may now trash all your precious archives, which is always worth a laugh.

copychr$

#18
As long as things are on hold, one can really only play around.
Ideally, Baslist output should be identical to the original Bas program AND should run when put back to the CPC.
That acid-test can be passed on most occasions. Even when the output shows some quirks, many of these self-correct in the CPC environment.
A few problems remain and lead to failure to run.

When using ripadsk on a random dsk, all programs that don't have identical Baslist output are filtered to separate folder "rejected".
Many times these differences are trivial, as can be seen there.
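
The accept or reject decision can be pictured with a few lines of Python. This is a sketch of the principle only. Ignoring the line-ending style and a trailing newline is my assumption here; the post itself only asks for identical output from baslist_kev and baslist_jav.

```python
def classify(kev_output: str, jav_output: str) -> str:
    """Return 'accepted' when both Baslist outputs match, otherwise 'rejected'."""
    def normalise(text: str) -> str:
        # Assumption: CR/LF versus LF and a final newline are not real differences.
        return text.replace("\r\n", "\n").rstrip("\n")

    return "accepted" if normalise(kev_output) == normalise(jav_output) else "rejected"


print(classify("10 PRINT 1\r\n", "10 PRINT 1\n"))   # same code, different line endings
print(classify("10 CALL&BC02", "10 CALL &BC02"))    # a one-space difference is enough
```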

What will still show up in current ripadsk files are:
sticky stuff:
CALL&BC02 for CALL &BC02, MEMORY&7FFF for MEMORY &7FFF, POKE&A701 for POKE &A701 and similar.

irregular rounding:
(15.666000001*x) for (15.666*x), g=(c+1)*31.899999999 for g=(c+1)*31.9, D+(9.109999999*K) for D+(9.11*K)

Both versions of Baslist show the above output, but they normalize again in the CPC. You may find other discrepancies of course.
Scientific notation is a problem. I've seen 1E-38 rendered as 0 by both and also found such code completely missing.
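
When comparing Baslist text against the original ASCII from the CPC, the sticky keywords and the irregular rounding could be smoothed over before the comparison. The Python below is a rough sketch under two assumptions: that only the three keywords quoted above are affected, and that nine significant digits is the precision the CPC settles on. It does not handle scientific notation, and it does not skip string literals.

```python
import re

KEYWORDS = ("CALL", "MEMORY", "POKE")  # only the keywords quoted in this post


def unstick(line: str) -> str:
    """Put back the space between a keyword and a following & hex value."""
    return re.sub(r"\b(" + "|".join(KEYWORDS) + r")&", r"\1 &", line)


def tidy_numbers(line: str, digits: int = 9) -> str:
    """Round decimal constants to `digits` significant digits (assumed precision)."""
    def fix(match):
        return "%.*g" % (digits, float(match.group()))
    return re.sub(r"\d+\.\d+", fix, line)


for sample in ("CALL&BC02", "MEMORY&7FFF", "POKE&A701"):
    print(unstick(sample))
for sample in ("(15.666000001*x)", "g=(c+1)*31.899999999", "D+(9.109999999*K)"):
    print(tidy_numbers(sample))
```

This is meant for comparison only. The stored text should stay exactly as Baslist produced it, since the CPC normalizes these cases by itself.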

If one were to meld the present Baslist programs and retain only their best output, two problems would remain:
Scientific notation and non-ASCII or illegal control characters.

Of the first I know wot, but some obfuscation in the Basic source seems to be a headache.

Of the second, irregular character rendering, one finds a beautiful example in minicalc.bas: [attachurl=2]
This is the "ascii" text version from the CPC: [attachurl=3]
Kevin's Baslist gives exactly the same result.

Short of elaborate programming and a special font set, one can not display such an original listing outside of the CPC.
Nor can one expect Baslist to do so, and it looks like "identical" listing will remain impossible here.
However, putting the text back to the CPC, gives a perfect listing and a program that runs!

[attach=4]

On the whole this is rather fun to walk through. It will be detailed in a sister thread, where Ripadsk gets cloned as Ripascii.
The same text and html files are produced, but the source will only be original ASCII from the CPC.
