@rishabhdugar

PDF & Image OCR

OCR text extraction from scanned PDFs and images. Tesseract OCR with 100+ languages, configurable DPI, word-level bounding boxes, character whitelisting, image preprocessing, and document scanning with perspective correction. Powered by PDFAPIHub.

Current version
v1.0.0
code-plugin · Community · source-linked

PDF & Image OCR — OpenClaw Plugin

Extract text from scanned PDFs and images using the PDFAPIHub API. This OpenClaw plugin provides Tesseract OCR with 100+ languages, document photo enhancement, and visual document comparison.

What It Does

Read text from scanned documents and photos using OCR, clean up document photos into professional scans, and compare documents for visual similarity.

Features

  • PDF OCR — Rasterise and OCR scanned PDFs with configurable DPI (72-400)
  • Image OCR — OCR photos of receipts, documents, signs, business cards, meter readings
  • 100+ Languages — Tesseract language packs, combine with + (e.g. eng+hin+fra)
  • Word-Level Bounding Boxes — Per-word positions and confidence scores
  • Character Whitelisting — Restrict to digits only for invoice amounts or meter readings
  • Image Preprocessing — Grayscale, sharpen, threshold, resize/upscale for noisy inputs
  • Document Scan Enhancement — Edge detection, perspective correction, brightness/contrast
  • Color Modes — B&W (best for text), grayscale, enhanced color
  • PDF Output — Export scanned documents as single-page PDFs
  • Document Comparison — Visual similarity scoring with feature matching, SSIM, or phash
  • Confidence Scores — Per-page and per-word OCR confidence percentages
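The language and DPI options above can be sketched as a small helper. This is an illustration only: the function name `build_ocr_options` and the `lang` / `dpi` keys are assumptions, not the plugin's documented parameter names. What comes from the feature list is the `+` separator for language packs and the 72-400 DPI range.

```python
def build_ocr_options(languages, dpi=300):
    """Build a dict of OCR options.

    Hypothetical shape for illustration; the real tool parameters
    may be named differently.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # The feature list gives 72-400 as the configurable DPI range.
    if not 72 <= dpi <= 400:
        raise ValueError("dpi must be between 72 and 400")
    # Multiple Tesseract language packs are combined with '+'.
    return {"lang": "+".join(languages), "dpi": dpi}


options = build_ocr_options(["eng", "hin", "fra"], dpi=300)
print(options)
```

Rejecting out-of-range DPI values up front, rather than silently clamping them, keeps a typo such as 3000 from turning into an unexpectedly slow or low-quality run.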

Tools

Tool                 Description
ocr_pdf              OCR scanned PDFs with multi-language Tesseract
ocr_image            OCR images with preprocessing options
scan_enhance         Clean up document photos into professional scans
compare_documents    Compare two images/PDFs for visual similarity
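For intuition about hash-based comparison, the sketch below shows one common way a similarity score is derived from two perceptual hashes: count the differing bits (Hamming distance) and normalise by the hash length. It is a generic illustration, not the scoring that `compare_documents` actually uses. The function name and the 64-bit hash length are assumptions.

```python
def hash_similarity(hash_a: int, hash_b: int, bits: int = 64) -> float:
    """Return a similarity in [0.0, 1.0] for two perceptual hashes.

    Assumes both hashes are unsigned integers of `bits` length.
    1.0 means identical hashes; 0.0 means every bit differs.
    """
    mask = (1 << bits) - 1
    # XOR leaves a 1 wherever the two hashes disagree.
    distance = bin((hash_a ^ hash_b) & mask).count("1")
    return 1.0 - distance / bits


print(hash_similarity(0b1010, 0b1010))  # identical hashes
print(hash_similarity(0x00, 0xFF))      # 8 of 64 bits differ
```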

Installation

openclaw plugins install clawhub:pdf-ocr

Configuration

Add your API key in ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "pdf-ocr": {
        "enabled": true,
        "env": {
          "PDFAPIHUB_API_KEY": "your-api-key-here"
        }
      }
    }
  }
}

Get your free API key at https://pdfapihub.com.
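To check that the key is where the plugin expects it, the configuration can be parsed with a few lines of Python. This is a minimal sketch that assumes the JSON layout shown above. The helper name `read_api_key` is hypothetical and not part of OpenClaw.

```python
import json


def read_api_key(config_text, plugin_id="pdf-ocr"):
    """Return the configured API key, or None if it is missing or disabled.

    `config_text` is the content of openclaw.json as a string.
    """
    config = json.loads(config_text)
    entry = config.get("plugins", {}).get("entries", {}).get(plugin_id, {})
    if not entry.get("enabled", False):
        return None
    # An empty string is treated the same as a missing key.
    return entry.get("env", {}).get("PDFAPIHUB_API_KEY") or None


sample = """
{
  "plugins": {
    "entries": {
      "pdf-ocr": {
        "enabled": true,
        "env": {"PDFAPIHUB_API_KEY": "your-api-key-here"}
      }
    }
  }
}
"""
print(read_api_key(sample) is not None)
```

To run it against the real file, read `~/.openclaw/openclaw.json` with `pathlib.Path.home() / ".openclaw" / "openclaw.json"` and pass its text to the function.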

Usage Examples

Just ask your OpenClaw agent:

  • "Extract text from this scanned PDF"
  • "OCR this document in English and Hindi at 300 DPI"
  • "Extract only the numbers from this invoice scan"
  • "Read the text from this receipt photo"
  • "Clean up this document photo to look like a scan"
  • "Scan this photo then OCR the result"
  • "How similar are these two documents?"

Use Cases

  • Invoice Processing — OCR scanned invoices to extract line items and totals
  • Receipt Scanning — Extract text from receipt photos for expense tracking
  • Document Digitization — Convert legacy paper documents to searchable text
  • Multi-Language Documents — Process documents in Hindi, French, German, Arabic, etc.
  • Business Card Reading — Extract name, phone, and email from card photos
  • Meter Reading — Extract digits from utility meter photos with character whitelisting
  • Document Photo Cleanup — Turn phone photos into clean, professional scans
  • Fraud Detection — Compare documents for visual similarity
  • QA Testing — Compare rendered documents before and after changes
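For the meter-reading and invoice cases, word-level confidence scores and a character whitelist can also be combined on the client side. The sketch below assumes each OCR word arrives as a dict with `text` and `confidence` keys. That response shape is an assumption for illustration, not the documented output of the plugin.

```python
def digits_from_words(words, min_confidence=60.0, allowed="0123456789."):
    """Keep only whitelisted characters from sufficiently confident words.

    `words` is a list of dicts with 'text' and 'confidence' keys
    (hypothetical shape).
    """
    kept = []
    for word in words:
        # Drop words the OCR engine was unsure about.
        if word["confidence"] < min_confidence:
            continue
        filtered = "".join(ch for ch in word["text"] if ch in allowed)
        # Skip words that contained no whitelisted characters at all.
        if filtered:
            kept.append(filtered)
    return kept


sample_words = [
    {"text": "Total:", "confidence": 95.0},
    {"text": "1,234.50", "confidence": 91.0},
    {"text": "8", "confidence": 20.0},
]
print(digits_from_words(sample_words))
```

Whitelisting inside the OCR engine generally gives better results than filtering afterwards, because the recogniser never considers look-alike letters such as O or l. A post-filter like this one is mainly useful as a sanity check.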

API Documentation

Full API docs: https://pdfapihub.com/docs

License

MIT

Source and Versions

Source repository

PdfApiHub/openclaw-pdf-ocr

Source commit

f53e84049851e278ec83872e0934d87080810a06

Install command

openclaw plugins install clawhub:pdf-ocr-scan

Metadata

  • Package name: pdf-ocr-scan
  • Created: 2026/04/17
  • Updated: 2026/04/17
  • Executes code:
  • Source tag: main

Compatibility

  • Built with OpenClaw: 2026.3.24-beta.2
  • Plugin API range: >=2026.3.24-beta.2
  • Tag: latest
  • File count: 7