A Labor of Love – Doc Savage


I recently posted about finally finding a workflow to properly convert a cache of old Doc Savage books to ePub files.

In the last installment, I used Calibre to convert the original BBeB format (Sony) files into rich text files. Then I used Nisus Writer Pro to clean them up, and finally copied the text into Apple Pages for the final tweaks.
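For anyone who would rather script that first step than click through the Calibre GUI, here is a rough sketch using Calibre's command-line tool, ebook-convert, which takes an input file and an output file. The `.lrf` extension for the BBeB files, the folder names, and the helper function names are my assumptions for illustration, not the exact setup I used.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: Path, out_dir: Path) -> list[str]:
    """Build the ebook-convert call for one book (hypothetical helper).

    Assumes the BBeB file ends in .lrf and that we want an .rtf
    with the same base name in out_dir.
    """
    target = out_dir / (source.stem + ".rtf")
    return ["ebook-convert", str(source), str(target)]


def convert_all(src_dir: Path, out_dir: Path) -> None:
    """Convert every .lrf file in src_dir; needs Calibre installed and on PATH."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(src_dir.glob("*.lrf")):
        # check=True stops the batch on the first failed conversion
        subprocess.run(build_convert_command(source, out_dir), check=True)


if __name__ == "__main__":
    # Folder names are placeholders; point them at your own files.
    convert_all(Path("bbeb"), Path("rtf"))
```

The command is built separately from the loop so that it can be printed and checked before letting it loose on a whole folder.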

This was workable, but I learned a few things, and improved my processes:

First – OCR Conversion issues

I'm not sure whether this was caused by the original conversion from print or by the translation into the BBeB format, but instead of proper CR/LFs the text was littered with soft newline characters. Perhaps in the late 1990s, when the books were originally ripped, this was state of the art.

This caused some really weird things in the resulting ePub: weird text display and faulty chapter breaks. Oddness.

Sadly, I figured this out after completing 20 or so conversions.

Lame.

Fixing the newlines into proper paragraph breaks

Fortunately, there is a really kickass utility called WordService, freeware from DEVONtechnologies (the DEVONthink people), that adds a bunch of really killer services. Installing it adds a slew of options to the “Services” submenu under the application-name menu (sorry, Mac only).

The “Reformat” option goes through and replaces the newlines with CR/LF as is proper. Boom. And it works in all programs that allow you to create or edit text.
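If you are not on a Mac, the same kind of fix can be sketched in a few lines of Python. This is not how WordService does it internally; it is a minimal sketch that assumes the soft newlines show up in plain text as the Unicode line separator (U+2028) or a vertical tab, and that you want CR/LF as the paragraph break. The function name is made up for this example.

```python
# Characters treated as "soft" newlines. This set is an assumption;
# inspect your own files to see what they actually contain.
SOFT_BREAKS = ("\u2028", "\x0b")


def fix_soft_newlines(text: str, paragraph_break: str = "\r\n") -> str:
    """Replace every soft newline character with a real paragraph break."""
    for ch in SOFT_BREAKS:
        text = text.replace(ch, paragraph_break)
    return text


if __name__ == "__main__":
    sample = "Doc Savage stood up.\u2028Monk grinned."
    # repr() makes the line endings visible
    print(repr(fix_soft_newlines(sample)))
```

Pass `paragraph_break="\n"` if your editor or ePub tool prefers plain line feeds.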

Multiple line breaks

I'm not sure if this was an artifact of the original conversion, but there was a blank line between every pair of paragraphs. Perhaps the original conversion to BBeB didn’t have any formatting options to set line spacing.

Whatever, it made the amount of text per page really sparse.

WordService to the rescue again. It has an option to remove multiple line feeds.
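Again, for the non-Mac crowd, here is a hedged Python sketch of the same idea: squeeze every run of blank lines down to a single line break. It assumes plain text and normalizes all line endings to line feeds first; the function name is hypothetical, and this is my approximation of the effect, not WordService's actual code.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Remove blank (or whitespace-only) lines between paragraphs."""
    # Normalize CR/LF and bare CR to LF so one pattern covers everything.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline followed by one or more empty or whitespace-only lines
    # becomes a single newline. Indentation on the next real line is kept.
    return re.sub(r"\n(?:[ \t]*\n)+", "\n", text)


if __name__ == "__main__":
    sample = "Chapter I\n\nThe bronze man waited.\n\n\nNothing moved."
    print(collapse_blank_lines(sample))
```

Run this after fixing the soft newlines, otherwise the blank lines may not be recognized as line breaks at all.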

Perfect.

End Result

In about 3 minutes, I was able to take a munged rich text file, fix all the newlines, remove the multiple spaces, and check each chapter.

Towards the end of the 181 books I fixed, I was getting quite efficient.

The original conversion

Speaking of a “labor of love”, whoever originally took the printed pulps, cut the bindings, scanned them, fixed the OCR fluff, created proper books with chapters, and formatted them for BBeB was dedicated.

I am astounded by how few typos or fluff items I had to fix. They even captured some commonly used (but authentic) word choices. It makes the 181 passes through my workflow seem trivial. I really wish I knew who did that original work.

Summary

Three weekends, probably 30 hours total, 181 books, about 15 redone as I improved my workflow. I now have a very clean collection of all 181 original Doc Savage stories.

If anybody would like a copy, drop me an email and I will make a link available.

About the author

gander

Product Manager in Tech. Guitar player. Bicycle Rider. Dog rescuer. Techie.
