Description:
A Java/.NET GUI frontend for the Tesseract OCR engine. It supports optical character recognition for Vietnamese and the other languages supported by Tesseract.
VietOCR is released and distributed under the Apache License, v2.0.
Features:
- Multi-platform (Java version only)
  - Windows
  - Solaris
  - Linux/Unix
  - Mac OS X
  - Others
- PDF, TIFF, JPEG, GIF, PNG, and BMP image formats
- Multi-page TIFF images
- Screenshots
- Selection box
- File drag-and-drop
- Paste image from clipboard
- Postprocessing for Vietnamese to improve recognition accuracy
- Vietnamese input methods
- Localized user interface for many languages (Localization project)
- Integrated scanning support
- Watch folder monitor for batch processing
- Custom text replacement in postprocessing
- Spellcheck with Hunspell
- Support for downloading and installing language data packs and appropriate spell dictionaries
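
To illustrate the custom text replacement feature listed above, here is a minimal Java sketch of how user-defined find/replace rules might be applied to OCR output. It is an assumption about the general technique, not VietOCR's actual implementation: the class name `ReplacementSketch`, the method `applyReplacements`, and the sample rules are all hypothetical. Rules are treated as literal strings and applied in the order they were added.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of VietOCR's code base.
final class ReplacementSketch {

    // Applies each literal find/replace pair to the text, in insertion order.
    // Later rules see the output of earlier rules.
    static String applyReplacements(String text, Map<String, String> rules) {
        String result = text;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            result = result.replace(rule.getKey(), rule.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the rules in the order they were added.
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("Narn", "Nam"); // hypothetical rule: "rn" misread for "m"
        rules.put("0CR", "OCR");  // hypothetical rule: zero misread for "O"

        System.out.println(applyReplacements("Viet Narn 0CR", rules));
    }
}
```

Because the rules run in sequence, their order matters: a rule can rewrite text produced by an earlier rule, so narrowly scoped rules such as whole words are safer than short fragments.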