Extracting files from Amiga disk magazine > Script needed
by GeordieJedi from LinuxQuestions.org on (#6GHGE)
Hi there guys.
I was hoping that you could help me with an obscure problem that I'm trying to solve.
Background:
Like a lot of people my age, I grew up with, and love the Amiga range of computers.
(Linux is the closest feeling I get to using an Amiga these days).
(For those who may be unaware: BITD, before we had the internet, we used disk magazines
to store, distribute and display data and articles on floppy disks.)
One of these was called Grapevine and it was one of the best diskmags ever made.
Some others and I are trying to convert these articles back into plain text / PDFs
so that they can be read much more easily and without having to run an emulator.
However they were crunched down with Amiga archiving software called Powerpacker,
and then altered slightly to hide this, and obfuscate the archiving mechanism.
(So I can't just use Powerpacker on an Amiga emulator to recreate the files).
Issue:
I'm trying to -
2.1. Scan these files and strip out the individual articles.
2.2. Unpack (de-crunch) some files.
I believe that once I have been able to extract the articles from the files
I will be able to put them into Powerpacker on the Amiga emulator,
unpack the files, then save each article individually.
Troubleshooting:
I have tried the following -
1. OCR using OCR reader = Terrible results.
2. OCR using gscan2pdf = Reasonable results. However you still need to do a lot of manual
clean up on each of the text files.
This would constitute a massive amount of work for the thousands of articles
that are in the complete set of Grapevine disks.
3. Looked at the files in a HEX editor
One of the guys on the EAB Amiga Forum has taken a look using a hex editor
and noticed the following details about the article files
Code:
$2 bytes   - Number of articles
[repeat for each article]
$2c bytes  - Article title
$4 bytes   - Article offset in file
$4 bytes   - Article file size
rest       - Data files packed with Powerpacker, with PP20 replaced by TXT!

He has also very kindly -
4.1. Created a Windows script/console app to do this (created using C#).
4.2. Provided the source code for this script/app (see attachment).
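For anyone who wants to try this without the Windows app, the index layout described above could be read with something like the following Python sketch. It is not a port of the attached C# source. It assumes the integers are big-endian (usual on the 68k Amiga) and that titles are NUL- or space-padded Latin-1 text; neither has been confirmed against a real Grapevine file, and the name read_index is made up for illustration.

```python
import struct

TITLE_LEN = 0x2C  # $2c = 44 bytes per article title, per the layout above


def read_index(data: bytes):
    """Return a list of (title, offset, size) tuples from the file index."""
    # Assumption: all integers are big-endian (">"), as on the 68k Amiga.
    (count,) = struct.unpack_from(">H", data, 0)  # $2 bytes: article count
    entries = []
    pos = 2
    for _ in range(count):
        raw_title, offset, size = struct.unpack_from(
            f">{TITLE_LEN}sII", data, pos
        )
        # Assumption: the title is padded with NULs or spaces, Latin-1 text.
        title = raw_title.split(b"\0", 1)[0].decode("latin-1").strip()
        entries.append((title, offset, size))
        pos += TITLE_LEN + 4 + 4  # title + offset + size
    return entries
```

Printing the result for one file is a quick way to check whether the assumed byte order gives sensible offsets and sizes before extracting anything.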
As a result I then tried -
5. Using the Windows console app but using -
5.1. A 32-Bit Wineprefix
5.2. A 32-bit Play on Linux install
Neither of which worked.
Questions:
Can you help create a bash script that would -
1.1. Scan the grapevine articles files.
1.2. Split out the articles from the file.
1.3. Change the header from TXT! back to PP20 (to denote that it's a Powerpacker file)
This should allow me to then use powerpacker to save each text file individually
so that we can obtain the separate text files, once again.
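As a rough illustration of steps 1.2 and 1.3, here is a minimal sketch in Python rather than bash, since slicing binary data is simpler there. It assumes each article's offset is counted from the start of the file and that every packed article begins with the 4-byte marker TXT!; both are guesses based on the hex-editor notes. The function names restore_article and save_article are hypothetical.

```python
from pathlib import Path

FAKE_MAGIC = b"TXT!"  # marker found in the Grapevine files
REAL_MAGIC = b"PP20"  # marker it is said to have replaced


def restore_article(data: bytes, offset: int, size: int) -> bytes:
    """Cut one article out of the file and put the PP20 marker back."""
    # Assumption: offset is measured from the start of the file.
    chunk = data[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("article runs past the end of the file")
    if chunk[:4] != FAKE_MAGIC:
        raise ValueError(f"expected {FAKE_MAGIC!r}, found {chunk[:4]!r}")
    # Only the first four bytes change; the packed data is left untouched.
    return REAL_MAGIC + chunk[4:]


def save_article(data: bytes, offset: int, size: int, out_path: str) -> None:
    """Write the restored article to out_path for unpacking elsewhere."""
    Path(out_path).write_bytes(restore_article(data, offset, size))
```

The check on the first four bytes is deliberate: if the assumed offsets or byte order are wrong, it fails loudly instead of writing out files that Powerpacker would reject.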
I was hoping that providing the source code would make this a whole lot easier
for the programmers amongst you.
Useful extra details:
Source code - For the Windows CLI/app (above)
Stored as a .txt file
I also tried to upload the Windows console app
and an actual Grapevine bin file containing the archived articles,
but the upload system would not recognize them as valid files,
and I didn't want to rename them just to get them
past the filter.
A MASSIVE TIA - for any help or advice.
Attached Files
gv_extractor_source_c_sharp.txt (1.2 KB) |