PDF & Image OCR — OpenClaw Plugin

Extract text from scanned PDFs and images using the PDFAPIHub API. This OpenClaw plugin provides Tesseract OCR with 100+ languages, document photo enhancement, and visual document comparison.

What It Does

Read text from scanned documents and photos using OCR, clean up document photos into professional scans, and compare documents for visual similarity.

Features

PDF OCR — Rasterise and OCR scanned PDFs with configurable DPI (72-400)
Image OCR — OCR photos of receipts, documents, signs, business cards, meter readings
100+ Languages — Tesseract language packs, combine with + (e.g. eng+hin+fra)
Word-Level Bounding Boxes — Per-word positions and confidence scores
Character Whitelisting — Restrict to digits only for invoice amounts or meter readings
Image Preprocessing — Grayscale, sharpen, threshold, resize/upscale for noisy inputs
Document Scan Enhancement — Edge detection, perspective correction, brightness/contrast
Color Modes — B&W (best for text), grayscale, enhanced color
PDF Output — Export scanned documents as single-page PDFs
Document Comparison — Visual similarity scoring with feature matching, SSIM, or phash
Confidence Scores — Per-page and per-word OCR confidence percentages
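
The language, DPI, and whitelist features above can be sketched as a small helper that assembles request options. This is only an illustration: the function name and the keys `lang`, `dpi`, and `char_whitelist` are assumptions, not the plugin's documented parameter names. Only the `+` separator for language codes and the 72-400 DPI range come from the feature list.

```python
MIN_DPI, MAX_DPI = 72, 400  # DPI range stated in the feature list


def build_ocr_options(languages, dpi=300, digits_only=False):
    """Assemble a hypothetical options dict for an OCR request.

    The key names used here are illustrative assumptions.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    if not MIN_DPI <= dpi <= MAX_DPI:
        raise ValueError(f"dpi must be between {MIN_DPI} and {MAX_DPI}")

    options = {
        "lang": "+".join(languages),  # e.g. "eng+hin+fra"
        "dpi": dpi,
    }
    if digits_only:
        # Restrict recognition to digits, e.g. for meter readings.
        options["char_whitelist"] = "0123456789"
    return options


print(build_ocr_options(["eng", "hin", "fra"], dpi=300))
```

Validating the DPI before sending a request keeps an out-of-range value from reaching the API; check the API documentation for the real parameter names.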

Tools

Tool	Description
`ocr_pdf`	OCR scanned PDFs with multi-language Tesseract
`ocr_image`	OCR images with preprocessing options
`scan_enhance`	Clean up document photos into professional scans
`compare_documents`	Compare two images/PDFs for visual similarity

Installation

openclaw plugins install clawhub:pdf-ocr

Configuration

Add your API key to ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "pdf-ocr": {
        "enabled": true,
        "env": {
          "PDFAPIHUB_API_KEY": "your-api-key-here"
        }
      }
    }
  }
}

Get your free API key at https://pdfapihub.com.
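
If you prefer to script this step, the configuration entry shown above could be merged into an existing file along these lines. This is a minimal sketch that assumes the config file is strict JSON; the helper names `add_plugin_entry` and `update_config_file` are made up for illustration and are not part of OpenClaw.

```python
import json
from pathlib import Path


def add_plugin_entry(config, api_key):
    """Return a copy of config with the pdf-ocr entry enabled."""
    updated = json.loads(json.dumps(config))  # deep copy via JSON round trip
    entries = updated.setdefault("plugins", {}).setdefault("entries", {})
    entries["pdf-ocr"] = {
        "enabled": True,
        "env": {"PDFAPIHUB_API_KEY": api_key},
    }
    return updated


def update_config_file(path, api_key):
    """Read the config (if present), merge the entry, and write it back."""
    path = Path(path).expanduser()
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    merged = add_plugin_entry(config, api_key)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

Calling `update_config_file("~/.openclaw/openclaw.json", "your-api-key-here")` would leave other plugin entries in place and only add or overwrite the `pdf-ocr` entry.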

Usage Examples

Just ask your OpenClaw agent:

"Extract text from this scanned PDF"
"OCR this document in English and Hindi at 300 DPI"
"Extract only the numbers from this invoice scan"
"Read the text from this receipt photo"
"Clean up this document photo to look like a scan"
"Scan this photo then OCR the result"
"How similar are these two documents?"

Use Cases

Invoice Processing — OCR scanned invoices to extract line items and totals
Receipt Scanning — Extract text from receipt photos for expense tracking
Document Digitization — Convert legacy paper documents to searchable text
Multi-Language Documents — Process documents in Hindi, French, German, Arabic, etc.
Business Card Reading — Extract name, phone, and email from card photos
Meter Reading — Extract digits from utility meter photos with character whitelisting
Document Photo Cleanup — Turn phone photos into clean, professional scans
Fraud Detection — Compare a submitted document against a known original to flag visual differences
QA Testing — Compare rendered documents before and after changes
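
For use cases such as invoice processing or meter reading, the per-word confidence scores can be used to route uncertain values to manual review. The sketch below assumes each word arrives as a dict with `text` and `confidence` keys. That shape, the threshold of 80, and the sample values are illustrative assumptions, not the plugin's documented output.

```python
def split_by_confidence(words, threshold=80.0):
    """Separate word texts into accepted and needs-review lists.

    Assumes each word is a dict with "text" and "confidence" (0-100).
    """
    accepted, review = [], []
    for word in words:
        target = accepted if word["confidence"] >= threshold else review
        target.append(word["text"])
    return accepted, review


# Made-up sample data for illustration only.
sample = [
    {"text": "Total", "confidence": 96.0},
    {"text": "1,240.50", "confidence": 62.5},
]
accepted, review = split_by_confidence(sample)
print("accepted:", accepted)
print("review:", review)
```

A stricter threshold suits amounts and meter digits, where a single misread character changes the value.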

API Documentation

Full API docs: https://pdfapihub.com/docs

License

MIT
