• ivanafterall ☑️@lemmy.world
    6 days ago

    This has been an immensely helpful feature of both Claude AI and ChatGPT. I have tons of historical sources, and suddenly I’m not fighting with OCR options that don’t work. It’s pretty great.

      • ivanafterall ☑️@lemmy.world
        6 days ago

        I don’t have a specific figure for you. My use case: I’m trying to write a non-fiction book, and I’ve got a ton of old newspaper articles in PDF format. The Library of Congress’s built-in OCR is very helpful but often falls short: in some cases it misses large swaths of a page or generates gibberish that requires painful cleanup. I’ve had similar results from every other OCR tool I’ve tried.

        Thus far, in using Claude/ChatGPT to transcribe a few dozen articles, I’ve only had to fix a single stray word a few times. It’s been very close to perfect in my limited testing, with accuracy somewhere in the high 90s percent. Impressively, with old newspaper articles where words have worn away or are otherwise very hard to make out even for me, it has done a great job of inferring the text where OCR would start generating gibberish. I haven’t tried handwriting and suspect that’s a different beast, but I know tools have cropped up for that.