Ethan Magoc, Student/Intern
Erie | PA | United States | Posted: 11:22 PM on 03.28.11
Hello SS,
As my senior project, I'm currently digitizing every published issue of my college's student newspaper, dating back to 1929. I'm using a Canon 7D to photograph each page, so there's plenty of resolution even after scaling the images down to keep PDF file sizes reasonable (25 to 30 MB each). I then run the pages through ABBYY FineReader Express for Mac for OCR, which I'd estimate is about 80 to 90 percent accurate in its PDF output.
I had initially planned to use Issuu.com to house all the issues. Its on-screen reader is among the best I've found, and it also picks up all the embedded, OCR'd text in the uploaded document. The only problem is that I'm guessing I'll eventually run into some sort of storage limit (though the free version is rumored to have unlimited storage). I have estimated somewhere in the 30 GB range for all 80-some years.
Approx. 1,200 issues x 30 MB = 36 GB
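For anyone who wants to check that estimate, here is a rough sketch of the arithmetic in Python. It assumes about 1,200 issues, the 25 to 30 MB per-issue range mentioned above, and decimal units (1 GB = 1,000 MB); the function name is just something I made up for illustration.

```python
def archive_size_gb(issue_count, mb_per_issue, mb_per_gb=1000):
    """Return the estimated archive size in GB for a given per-issue PDF size."""
    return issue_count * mb_per_issue / mb_per_gb


ISSUES = 1200  # approximate issue count, an estimate rather than an exact tally

low = archive_size_gb(ISSUES, 25)   # smaller PDFs, about 25 MB each
high = archive_size_gb(ISSUES, 30)  # larger PDFs, about 30 MB each

print(f"Estimated archive size: {low:.0f} to {high:.0f} GB")
```

Either way the total lands in the 30-something GB range, so any host would need to allow at least that much.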
There are a few selected issues (non-OCR versions from the 1920s, 60s and 70s) up as tests on our 2010-11 issue archive: http://issuu.com/themerciad
My question, then, for all of you: has anyone ever tried to set up a digital archive with similar PDF searching and viewing capabilities? I had been thinking about hiring one of our more talented web students to build something like this, but he's going to be far too busy for the duration of spring term.
This post is mostly a shot in the dark here (hence the OT), but I'd be most grateful to anyone who has any advice to offer.