Significantly reduce the file size of scanned PDF documents without noticeable loss of quality.
Updated Jun 8, 2022 - Swift
A Python application that converts PDF files without machine-readable text, such as scanned documents, into Word documents containing the extracted text. It first converts each PDF page into an image, then runs text extraction on those images and writes the result to a Word document. The application provides a user-friendly interface for this workflow.
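The PDF-to-image-to-text-to-Word pipeline described above can be sketched roughly as follows. This is only an illustration, not the repository's actual code: it assumes the third-party packages `pdf2image`, `pytesseract`, and `python-docx` are installed (plus the Poppler and Tesseract binaries they rely on), and the function names `split_paragraphs`, `ocr_pages`, and `pdf_to_docx` are hypothetical.

```python
from typing import Callable, Iterable, List


def split_paragraphs(text: str) -> List[str]:
    """Split OCR output on blank lines and rejoin hard-wrapped lines."""
    paragraphs = []
    for chunk in text.split("\n\n"):
        joined = " ".join(
            line.strip() for line in chunk.splitlines() if line.strip()
        )
        if joined:
            paragraphs.append(joined)
    return paragraphs


def ocr_pages(pages: Iterable, recognize: Callable[[object], str]) -> List[str]:
    """Run the given OCR function on every page image, in order."""
    return [recognize(page) for page in pages]


def pdf_to_docx(pdf_path: str, docx_path: str, dpi: int = 300) -> None:
    """Render a scanned PDF to images, OCR them, and write a .docx file.

    Assumption: pdf2image, pytesseract, and python-docx are available.
    They are imported here so the helpers above work without them.
    """
    from docx import Document
    from pdf2image import convert_from_path
    import pytesseract

    # Step 1: one PIL image per PDF page.
    pages = convert_from_path(pdf_path, dpi=dpi)

    # Step 2: extract text from each page image.
    texts = ocr_pages(pages, pytesseract.image_to_string)

    # Step 3: write the text to a Word document, one PDF page per Word page.
    document = Document()
    for index, text in enumerate(texts):
        for paragraph in split_paragraphs(text):
            document.add_paragraph(paragraph)
        if index < len(texts) - 1:
            document.add_page_break()
    document.save(docx_path)
```

A call such as `pdf_to_docx("scan.pdf", "scan.docx")` would run the whole pipeline; the file names are placeholders. Passing the OCR function into `ocr_pages` keeps the recognition step swappable for a different engine.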
A wrapper around Python OCR tools such as pytesseract and easyocr that recognizes and extracts text embedded in images. It also converts scanned PDFs to text-searchable PDFs.
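As a rough idea of what a wrapper over several OCR engines might look like, the sketch below dispatches to pytesseract or easyocr by name and uses pytesseract to produce a searchable PDF from a page image. It is not the repository's actual API: the names `BACKENDS`, `extract_text`, and `image_to_searchable_pdf` are hypothetical, English is assumed as the recognition language, and `pytesseract`, `easyocr`, and Pillow are assumed to be installed.

```python
from typing import Callable, Dict


def _tesseract(image_path: str) -> str:
    # Assumption: pytesseract and Pillow are installed.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def _easyocr(image_path: str) -> str:
    # Assumption: easyocr is installed; English is an illustrative choice.
    import easyocr

    reader = easyocr.Reader(["en"])
    # detail=0 returns only the recognized strings, without boxes or scores.
    return "\n".join(reader.readtext(image_path, detail=0))


# Hypothetical registry mapping a backend name to a path -> text function.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "tesseract": _tesseract,
    "easyocr": _easyocr,
}


def extract_text(image_path: str, backend: str = "tesseract") -> str:
    """Extract text from an image with the chosen OCR backend."""
    try:
        engine = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown OCR backend: {backend!r}") from None
    return engine(image_path)


def image_to_searchable_pdf(image_path: str, pdf_path: str) -> None:
    """Write a PDF whose page image carries an invisible OCR text layer."""
    import pytesseract
    from PIL import Image

    pdf_bytes = pytesseract.image_to_pdf_or_hocr(
        Image.open(image_path), extension="pdf"
    )
    with open(pdf_path, "wb") as handle:
        handle.write(pdf_bytes)
```

Keeping the engines behind a plain dictionary means another backend can be added by registering one more function, without touching `extract_text`.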