RapidOCR - Upsonic AI

What is RapidOCR?

RapidOCR is a lightweight OCR engine built on ONNX Runtime for fast inference. It is best suited to speed-sensitive, lightweight deployments.

Usage

from upsonic.ocr import OCR
from upsonic.ocr.layer_1.engines import RapidOCREngine
# Also available: from upsonic.ocr import RapidOCREngine

# Create engine instance
engine = RapidOCREngine(languages=['en'], confidence_threshold=0.5)

# Create OCR orchestrator
ocr = OCR(layer_1_ocr_engine=engine)

# Extract text from image
text = ocr.get_text('invoice.png')
print(text)

# Process PDF
result = ocr.process_file('document.pdf')
print(f"Extracted {len(result.text)} characters from {result.page_count} pages")

Parameters

Parameter	Type	Default	Description
`languages`	List[str]	`['en']`	List of language codes
`confidence_threshold`	float	`0.0`	Minimum confidence for text blocks
`rotation_fix`	bool	`False`	Auto-detect and fix image rotation
`enhance_contrast`	bool	`False`	Enhance image contrast
`remove_noise`	bool	`False`	Apply noise reduction
`pdf_dpi`	int	`300`	DPI for PDF rendering
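
To make `confidence_threshold` and `pdf_dpi` more concrete, here is a minimal, library-independent sketch of what these two settings typically control. It is not Upsonic's implementation: `TextBlock`, `filter_blocks`, and `rendered_size` are hypothetical names used only for illustration, and the assumption that a block is kept when its score is greater than or equal to the threshold may differ from the engine's actual comparison.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextBlock:
    """Hypothetical stand-in for one recognized text region."""
    text: str
    confidence: float  # recognition score in [0.0, 1.0]


def filter_blocks(blocks: List[TextBlock],
                  confidence_threshold: float = 0.0) -> List[TextBlock]:
    """Keep blocks scoring at or above the threshold (assumed >= comparison)."""
    return [b for b in blocks if b.confidence >= confidence_threshold]


def rendered_size(width_pt: float, height_pt: float,
                  pdf_dpi: int = 300) -> Tuple[int, int]:
    """Pixel size of a PDF page rasterized at pdf_dpi (1 inch = 72 points)."""
    return (round(width_pt * pdf_dpi / 72), round(height_pt * pdf_dpi / 72))


# Made-up sample data for illustration only
blocks = [
    TextBlock("Invoice #1042", 0.97),
    TextBlock("T0tal", 0.41),
    TextBlock("Due: 2024-05-01", 0.88),
]

# With the default threshold of 0.0 every block is kept;
# raising it to 0.5 drops the low-confidence "T0tal" block.
kept = filter_blocks(blocks, confidence_threshold=0.5)
print([b.text for b in kept])

# A US Letter page is 612 x 792 points.
print(rendered_size(612, 792, pdf_dpi=300))
```

A higher `pdf_dpi` gives the recognizer more pixels per character at the cost of larger images and slower processing, while a higher `confidence_threshold` trades recall for cleaner output.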

Supported Languages

English, Chinese (simplified and traditional), Japanese, Korean, and several other scripts including Tamil, Telugu, Arabic, Cyrillic, and Devanagari.
