OCR Tips and Tricks from eMOP
Matt Christy, IDHMC Lead Programmer and self-dubbed "code monkey," has posted a series of short reflections on eMOP's experiments with Tesseract. We hope you'll visit the page for a look at the lessons we've learned, the goals we're trying to achieve, and the tips and tricks we have to offer the OCR community.
Reflecting on a number of hurdles the eMOP project has overcome in the past few months, Matt shares customized instructions for installing Tesseract on a Mac, our process(es) for training Tesseract, a compilation of the tests our team has run to find Tesseract settings suited to particular OCR circumstances, and the eMOP naming conventions for data output.
Take a look at the eMOP software page for current and future posts; the IDHMC's eMOP team will continue to post in the coming weeks as we approach some major milestones in our Mellon grant schedule.