OCR Challenges and Hopes for the Web

With the year 2011 rapidly coming to an end, I am thinking about how to proceed with the Hilgenberg Archive. Access has always been our main objective.

UPDATE: I have applied for a trial version of the ABBYY Historic OCR software. I am very excited to hear from the IT Manager at MdHS as to how we can get started. Please see the comments section of this post.

I was hoping that the wonderful program that the Library of Congress is leading, the National Digital Newspaper Program (NDNP), would open itself to German-language titles, but so far these are the criteria:

Newspaper titles that document a significant minority community at the state or region level during the target time period (1836-1922) should be considered as a means to balance content. Only English, French, Italian, and Spanish language titles may be converted during this NDNP phase.

Other than the fact that DDC is a German-language title, it would fit the NDNP quite well.

After speaking with the IT Manager at MdHS, we have considered the best options for creating a German Heritage Archive website which would prominently feature the Hilgenberg Archive. This would allow visitors to view the newspaper images (most likely as PDFs, but this has not yet been determined) and to find out more about the project. I would also include other German-related items in the MdHS collection. This will take careful planning, as we would like a website that can be modified later on, is well designed, is easy to use, and acts as a forum for all interested visitors.

This is what we will be considering for the next few weeks. After we have an understanding of what we would like to see on the website, we will request quotes from web designers.

  1. Hi Michael, I don’t recall receiving this link! Thank you so much for passing it along as well as the discussion forum from Archive.org.

    I’ve applied for a trial which will enable me to test 50 pages of the newspaper. I have to involve the IT Manager because I cannot use the trial on my computer. He is looking into which machine would be best to set this up. This may be a great option for us!

    Thanks again,

