Optical Character Recognition

Discussion in 'The Barracks' started by von Poop, Oct 14, 2009.

  1. von Poop

    von Poop Adaministrator Admin

    OCR's a hit & miss business, when it works it's bloody marvellous, but when it doesn't it drives you mad.

    I've tried a few freeware ones like 'TopOCR' & 'SimpleOCR'; both are good for sharp, clear text but fall over completely on even quite good-quality photocopies or original documents.

    Been thinking of getting a scanner upgrade anyway, but has anyone used any particularly good OCR programmes?
    Wondering also if anyone perhaps has a proprietary one that came bundled with a scanner that's proved good on the more difficult documents?

    ~A
     
  2. Paul Reed

    Paul Reed Ubique

    My Canon scanner is now 5 or 6 years old, but it was bundled with Pictbridge, which I have found pretty good. What it must be like now, 6 years later, in its latest version, I can only guess!
     
  3. Ron Goldstein

    Ron Goldstein WW2 Veteran WW2 Veteran

    Adam

    I've been using http://www.abbyy.com/finereader_sprint/ for years now but can't remember which printer it came bundled with :(

    I see that they do a free trial....... worth a shot ?

    I see you say "OCR's a hit & miss business, when it works it's bloody marvellous, but when it doesn't it drives you mad."


    Ain't that the truth :)

    Regards

    Ron
     
  4. militarycross

    militarycross Very Senior Member

    VP,
    I, like Ron, have been using the ABBYY FineReader that came with my Canon 4400. I liked it so much, I bought a second one to use at work. It will do slides. The OCR is teachable. I love the 1200 dpi it does with pictures. This unit is a few years old and there are probably better ones out now, but I think a dedicated scanner, instead of one of these combined scanner/printer/etc. units, is the ticket.

    cheers,
    phil
     
  5. Ron Goldstein

    Ron Goldstein WW2 Veteran WW2 Veteran

    It just occurred to me that there may be some members on the forum who have yet to succumb to the delights/exasperation of OCR.

    There are plenty of explanations of OCR (Optical Character Recognition) on the internet but if you want a simple and quick intro read on:

    If you have an image on your PC (say, a .jpg) that shows a document, then it will always be just an image and you cannot edit the text.

    What OCR does is to scan the image, and every time it identifies a character it stores that info in its correct place. On completion the program will reproduce the whole image as, for example, a Word document, which can then be edited exactly like any other document.
    There is much more to OCR than that, of course, but that's enough for you to get interested.
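
    For anyone curious what "identifying a character and storing it in its correct place" could look like, here is a toy sketch in Python. It is only an illustration under big assumptions: the three tiny 3x5 letter bitmaps, the fixed one-column spacing, and the names TEMPLATES, hamming, recognise and render are all made up for this example, and real OCR programs work in far more sophisticated ways.

```python
# Toy illustration only: real OCR engines are far more sophisticated.
# Each glyph is a 3x5 bitmap written as five strings of '#' (ink) and '.' (paper).
TEMPLATES = {
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "C": ["###", "#..", "#..", "#..", "###"],
    "R": ["##.", "#.#", "##.", "#.#", "#.#"],
}

GLYPH_W, GLYPH_H = 3, 5


def hamming(a, b):
    """Count the pixels at which two same-sized glyphs differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognise(image):
    """Read a one-line image glyph by glyph, left to right.

    `image` is a list of GLYPH_H strings; glyphs are GLYPH_W wide and
    separated by exactly one blank column (an assumption of this toy).
    """
    text = []
    step = GLYPH_W + 1
    for x in range(0, len(image[0]), step):
        cell = [row[x:x + GLYPH_W] for row in image]
        # Pick the known letter with the fewest differing pixels.
        best = min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], cell))
        text.append(best)
    return "".join(text)


def render(word):
    """Draw `word` as an image using the templates (for the demo only)."""
    return [".".join(TEMPLATES[ch][y] for ch in word) for y in range(GLYPH_H)]


clean = render("OCR")

# Flip one blank pixel inside the "C" to mimic a speck of dirt on the scan.
noisy = clean[:]
noisy[2] = noisy[2][:5] + "#" + noisy[2][6:]

print(recognise(clean))
print(recognise(noisy))
```

    The result is ordinary text rather than a picture, which is why it can then be edited. With only three letters to choose from, one stray speck still lands on the right letter; with a full alphabet and a poor copy, the nearest match is wrong far more often.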
     
     
  6. m kenny

    m kenny Senior Member

    Some times it works and *7Rw2 JkkR;' d87*6-244,;0= It dON'T!
     
  7. von Poop

    von Poop Adaministrator Admin

    "Some times it works and *7Rw2 JkkR;' d87*6-244,;0= It dON'T!"

    Don't I just know that feeling...

    Cheers for the suggestions so far chaps, I suppose like most computer things it'll get better and better. Kind of staggers me that we can even do a crude version of it at home these days at all.

    ~A
     
  8. Peter Clare

    Peter Clare Very Senior Member

    Been using OmniPage Pro 14 for some time now in conjunction with my Lexmark All In One printer/scanner, and up to now cannot fault it, even though there are more up-to-date versions available.

    Regards
    Peter
     
  9. Za Rodinu

    Za Rodinu Hot air manufacturer

    ABBYY FineReader works well for me, but I had to buy it. The secret with them all, I suppose, is scanning at high resolution to decrease the chance of error, and then selecting the right recognition language.

    FR will read fresh scans, saved image files, PDF files, etc.
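
    A quick back-of-envelope sketch of why resolution matters, in Python. The only fact used is that a typographic point is 1/72 inch; the function name glyph_height_px is made up for this example, and it says nothing about how FineReader itself decides what it can read.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def glyph_height_px(font_size_pt, dpi):
    """Approximate height in pixels of a font's em box at a given scan resolution."""
    return font_size_pt / POINTS_PER_INCH * dpi


# How many pixels does 10 pt text get at common scanner settings?
for dpi in (75, 150, 300, 600):
    print(f"{dpi:>4} dpi: 10 pt text is about {glyph_height_px(10, dpi):.1f} px tall")
```

    Doubling the resolution doubles the height of every letter in pixels, so the program has four times as many dots per character to work with.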
     
  10. Slipdigit

    Slipdigit Old Hickory Recon

    I assiduously avoid OCR software, my blood pressure can't handle it.
     
  11. Steve G

    Steve G Senior Member

    Ron; Cheers for the synopsis. I'd been avoiding this thread simply because I didn't know what it was about.

    Having looked, yours is indeed the only post which makes it any clearer :)
     
  12. PsyWar.Org

    PsyWar.Org Archive monkey

    Must agree about ABBYY FineReader, it's a great OCR programme. I'm using the Pro version, which has a number of different languages built in. Funnily enough it seems to recognise German text better than English (as long as it is not in a Fraktur typeface, of course).

    The thing that really drives me crazy with OCR software is when they do a good job of reading a bad original and then make stupid mistakes in an excellent original. I've never figured that one out!?

    But to be honest, if your original text is not perfect then you're still better off learning to touch type then waste your time with OCR; epsecially if there is a lot of formatting that needs to be kept or changed.

    For example if I have twelve photocopied pages of an old typewritten document, I can re-type them faster and more accurately than scanning them, doing the OCR and then correcting all the mistakes.

    Where it works well and is a benefit is when the original documents are professionally printed, first generation copies.

    Lee
     
  13. Ron Goldstein

    Ron Goldstein WW2 Veteran WW2 Veteran

    Hi Lee

    Nice to see you keeping in touch.

    I'm terribly sorry, but on spotting your contribution I just couldn't resist posting this !

    "But to be honest, if your original text is not perfect then you're still better off learning to touch type then waste your time with OCR; epsecially if there is a lot of formatting that needs to be kept or changed"


    I glad to see your touch typing is no better than mine :)

    Best regards

    Ron
     
  14. PsyWar.Org

    PsyWar.Org Archive monkey

    :icon_smile_blackeye You got me there Ron, that kind of destroys my argument! :D
     
  15. sapper

    sapper WW2 Veteran WW2 Veteran

    Mine came with a cheap printer/scanner that a friend gave me. (He already had a better one, and it came with the new computer.)
    It works very well indeed reading and deciphering worn-out wartime documents.

    What I want is voice-recognition-to-text software that works. I have "Dragon" and it don't!
    Sapper
     
  16. Owen

    Owen -- --- -.. MOD

    I tried to use my OCR program yesterday after reading this.
    Using my Lexmark 2600 series printer/scanner.
    I scanned a letter from a bank as it was nearest thing to hand.
    I must say it was rubbish & then I couldn't edit it as I don't know my 25-digit code to get Word working & can't find it anyway.
    Still, if I (or rather the kids for school) need a WP prog I can use Works WP, & the scanner usually works fine, as you can see from a lot of my posts on here.
     
  17. Smudger Jnr

    Smudger Jnr Our Man in Berlin

    I have never possessed a programme as outlined on this thread, but the idea sounds really good if it functions.

    It really sounds from members' personal experience that a lot still do not work as promised.

    Perhaps I will stick with my steam driven methods for the time being at least :D

    Regards
    Tom
     
  18. roodymiller

    roodymiller Senior Member

    you can use abbyy for free (10 pages per day) online. never tried it myself, but could be worth a go...

    ABBYY FineReader Online
     
  19. roodymiller

    roodymiller Senior Member

    just tried it...

    absolutely rubbish (nearly swore then!!).

    tons of mumbo jumbo.
     
  20. m kenny

    m kenny Senior Member

    Keep trying. Only by experience do you get the skill to recognise what will copy and what won't. Low-res wording simply will not register, so you might have to mess about with the contrast on the original. Sometimes I have to scan a jpg, alter the contrast and maybe re-size it, then print it out before re-scanning it to make a copy that is good enough for OCR! I have to say 75% of the time it works just fine, but even the best have a number of errors and you HAVE to check it with a line-by-line read.
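
    A rough sketch of the "mess about with the contrast" step, in Python. It treats the scan as a plain grid of grey levels (0 = black, 255 = white); the function names stretch_contrast and binarise, the threshold of 128 and the sample numbers are all assumptions for illustration, not what any particular scanner program does.

```python
def stretch_contrast(pixels):
    """Linearly rescale grey levels so the darkest becomes 0 and the lightest 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in pixels]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in pixels]


def binarise(pixels, threshold=128):
    """Turn a greyscale image into pure black (0) and pure white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in pixels]


# A faded copy: the "ink" is grey 140 and the "paper" is grey 180.
faded = [
    [180, 140, 180],
    [180, 140, 180],
    [180, 180, 180],
]

print(binarise(faded)[0])      # the ink is lighter than the threshold and vanishes
stretched = stretch_contrast(faded)
print(stretched[0])            # ink pushed to black, paper to white
print(binarise(stretched)[0])  # the stroke now survives
```

    On the faded original everything counts as "paper"; after stretching, the ink and the paper end up at opposite ends of the scale and the stroke is kept.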
    You cannot scan a document that has more than one column. If a page has 2 columns then it must be done one column at a time. Page titles and numbers are also to be avoided.
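
    Doing it one column at a time can be sketched in Python as well. This is a minimal illustration that assumes a perfectly straight page drawn as rows of '#' (ink) and '.' (blank paper), with a clean vertical strip of blank paper between the columns; the names find_gutter and split_columns are made up for the example.

```python
def find_gutter(page, min_width=2):
    """Return (start, end) of the widest run of blank columns, or None.

    `page` is a list of equal-length strings where '#' is ink and '.' is
    blank paper. Runs touching the page edges are treated as margins and ignored.
    """
    width = len(page[0])
    blank = [all(row[x] == "." for row in page) for x in range(width)]
    best = None
    x = 0
    while x < width:
        if blank[x]:
            start = x
            while x < width and blank[x]:
                x += 1
            is_margin = start == 0 or x == width
            if not is_margin and x - start >= min_width:
                if best is None or x - start > best[1] - best[0]:
                    best = (start, x)
        else:
            x += 1
    return best


def split_columns(page):
    """Split a two-column page at its gutter; return a list of column images."""
    gutter = find_gutter(page)
    if gutter is None:
        return [page]
    start, end = gutter
    return [[row[:start] for row in page], [row[end:] for row in page]]


page = [
    ".##.#...#.##.",
    ".#.##...##.#.",
    ".###....#.##.",
]

# Print each column separately, ready to be recognised on its own.
for column in split_columns(page):
    print("\n".join(column), end="\n\n")
```

    Each piece can then be put through the OCR separately, so the text of one column is never run into the other.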
     
