Wednesday, June 23, 2010

Google Docs adds OCR, converts images and PDFs to text

Filed under: Text, Web services, Google

Google Docs continues to make the case for dumping your desktop work apps, this time with a useful new text recognition feature that converts PDFs or images into plain, editable text. This new OCR feature -- that's optical character recognition -- is quite accurate, and worked pretty well on some old college textbook scans I had lying around on my hard drive. Things get a bit tricky when you've got a page with multiple columns -- your words might not end up in the right order, but they'll all be there, accurately recorded.

To use OCR, look for the "Convert text from PDF or image files to Google Docs documents" checkbox when you're uploading a file. The file will show up in Google Docs as a text document instead of its original format, so if you want to share the image, you'll have to upload it again with the box unchecked.

Google Operating System tested the new feature and didn't find it quite as accurate as I did. I agree with them that the loss of formatting is a problem, but in my tests the OCR did better than the 90% accuracy they noted in theirs. Your mileage, obviously, may vary. The typeface, font size and scan quality of your PDF will all affect the results, but it should definitely be easier than re-typing the whole thing by hand.

We also previously covered Google Docs' OCR feature when it was still an experiment.

Google Docs adds OCR, converts images and PDFs to text originally appeared on Download Squad on Mon, 21 Jun 2010 20:30:00 EST. Please see our terms for use of feeds.
