Tuesday, September 20, 2011

What A Great Idea #15

Okay, so I know there are some tech-y types who read this blog, so please don't rip into me too hard when I lay this one out. I'm relatively certain it won't work based solely on the amount of knowledge I'm missing regarding it, but here goes.

So I use this software associated with Adobe Acrobat to recognize text in PDFs and convert it to something I can manipulate. For those of you not familiar with PDFs, they are essentially images of documents that can be sent around without fear of them being altered too much or messed with. (If I got that wrong, feel free to correct me.)

So this OCR software is nice and all, but there are a bunch of situations where it doesn't work: if there's a picture in the PDF, or if there is renderable text, etc. But when I look at the screen, it's very obviously an "a" right there looking at me. At some point in the PDF-to-monitor process, something in there recognized that certain pixels needed to be lit up so that my eye would recognize the shape of an "a".

So my great idea is more of a question. Would it be possible to access whatever part of the OS is determining what I see on the monitor and scan that data for certain arrangements of pixels? If they are sending out these couple hundred pixels, laid out just so, couldn't there be a program that saw that as a letter and processed it as such for me?
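To make the question concrete, here is a rough sketch of what "scanning for certain arrangements of pixels" could look like. Everything in it is made up for illustration: the 3x3 letter shapes in `GLYPHS`, the little `SCREEN` grid, and the function names `find_glyphs` and `read_text`. It assumes the screen has already been boiled down to "lit" (`#`) and "unlit" (`.`) pixels and that letters match their stored shapes exactly, which real fonts with smoothing and scaling would not do.

```python
# Hypothetical 3x3 letter shapes; real fonts are much larger and anti-aliased.
GLYPHS = {
    "T": ("###",
          ".#.",
          ".#."),
    "L": ("#..",
          "#..",
          "###"),
}

# A pretend screen: '#' is a lit pixel, '.' is background.
SCREEN = (
    ".........",
    ".###.#...",
    "..#..#...",
    "..#..###.",
    ".........",
)

def find_glyphs(screen, glyphs=GLYPHS):
    """Return sorted (row, col, letter) for every exact glyph match."""
    hits = []
    for letter, bitmap in glyphs.items():
        height, width = len(bitmap), len(bitmap[0])
        # Slide the glyph over every position where it fits on the screen.
        for r in range(len(screen) - height + 1):
            for c in range(len(screen[0]) - width + 1):
                if all(screen[r + i][c:c + width] == bitmap[i]
                       for i in range(height)):
                    hits.append((r, c, letter))
    return sorted(hits)

def read_text(screen):
    """Join matched letters left to right (assumes a single line of text)."""
    hits = sorted(find_glyphs(screen), key=lambda hit: hit[1])
    return "".join(letter for _, _, letter in hits)

print(read_text(SCREEN))
```

Exact matching like this is the easy part. It falls apart as soon as the letter is a different size, a different font, or slightly blurred, which is presumably where the real work is.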

I understand that there may need to be some kind of filtering system, or a determination of font color, or there's the high likelihood of getting every single letter of text on the entire screen, but is it possible? If so, it would make my and a lot of other people's jobs a lot easier. I can't tell you the number of times I've had to type page after page of stuff into a new Word doc just because the OCR couldn't recognize it, and when I tried to convert it, it went all wonky (that's a technical term) and was essentially gibberish.
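The "determination of font color" step might look something like the sketch below. It is only a guess at one simple approach: treat any pixel close enough to a chosen font color as text and everything else as background. The function name `binarize`, the `tolerance` value of 60, and the sample pixel grid are all assumptions made up for the example.

```python
def binarize(pixels, font_color=(0, 0, 0), tolerance=60):
    """Turn rows of (r, g, b) pixels into strings of '#' (text) and '.' (background)."""
    def is_text(pixel):
        # Sum of per-channel differences from the assumed font color.
        distance = sum(abs(a - b) for a, b in zip(pixel, font_color))
        return distance <= tolerance
    return tuple(
        "".join("#" if is_text(pixel) else "." for pixel in row)
        for row in pixels
    )

BLACK, WHITE, DARK_GRAY = (0, 0, 0), (255, 255, 255), (15, 15, 15)

# A tiny made-up patch of screen showing a "T" in near-black on white.
PIXELS = [
    [BLACK, DARK_GRAY, BLACK],
    [WHITE, BLACK, WHITE],
    [WHITE, DARK_GRAY, WHITE],
]

for row in binarize(PIXELS):
    print(row)
```

Picking the font color and tolerance is the catch: guess wrong and either the letters vanish or the background turns into noise.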

1 comment:

  1. There are programs that attempt this, but it takes a certain level of human interaction to get it right every time. The computer can only make a few assumptions, and if there's a gray area, it might get it right, and it might not. So you're really still in the same boat. Those "human verification" images rely on the same idea: they're always funky squashed letters with extra lines and such to keep such programs from discerning the proper text.
