The open source way to tackle this task usually involves the pdftotext command-line tool from the poppler-utils package (that is the package name in Debian Linux; see http://poppler.freedesktop.org for source code). This invocation works well for me:

    pdftotext -nopgbrk -layout input.pdf output.txt

The resulting file, here called output.txt, contains plain text with the original layout approximately intact. Now you can (manually or otherwise) save the tables from this file into files with .csv, .tsv or .dat endings, and with any luck, R's read.table function or the software of your choice will accept the formatting as it is. Otherwise, you will need to do some post-processing or post-editing.
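If post-processing is needed, one possible approach is sketched below in Python. It assumes that the columns in the -layout output are separated by runs of two or more spaces and that single spaces only occur inside a cell; this will not hold for every PDF. The function names layout_text_to_rows and rows_to_csv are made up for this illustration and are not part of any tool mentioned above.

```python
import csv
import io
import re


def layout_text_to_rows(text):
    """Split each non-blank line into cells on runs of 2+ whitespace characters.

    Assumption: columns are separated by at least two spaces, and single
    spaces only appear inside a cell (e.g. "Bob Smith").
    """
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines between table rows
        rows.append(re.split(r"\s{2,}", stripped))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, quoting cells where necessary."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample resembling what "pdftotext -layout" might produce.
sample = (
    "Name          Height (cm)    Weight\n"
    "\n"
    "Alice         170            60.5\n"
    "Bob Smith     182            81\n"
)

rows = layout_text_to_rows(sample)
print(rows_to_csv(rows), end="")
```

To apply this to a real file, read the contents of output.txt into a string, pass it to layout_text_to_rows, and write the result of rows_to_csv to a .csv file. Rows with empty cells or ragged columns will still need manual checking, since an empty cell leaves no trace once the whitespace is collapsed.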
They also work on Mac, Windows, Android, iOS and BlackBerry devices. If you're looking for a low-cost, low-maintenance, offline, free-to-use, all-around OCR program, try OCR Tools Online, developed by OCR specialists and designed to help people read e-books in any browser or on any mobile device, including phones and tablets. To OCR e-books for free on your laptop, smartphone, tablet or other mobile device, use the above-mentioned OCR tools and start reading a free e-book right here on our website. The OCR tools will extract the text of each page into a separate text file, with your name and the date printed on it (if you've given the book this information), and the name of the person you've given the book to, in case your name is not recorded. You can add the author name under the “Author's name” box on the.