What is PaddleOCR?

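PaddleOCR is an OCR toolkit that Upsonic exposes through four specialized engines: general text recognition (PaddleOCREngine), document structure recognition (PPStructureV3Engine), chat-based document understanding (PPChatOCRv4Engine), and vision-language document understanding (PaddleOCRVLEngine).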

Usage

from upsonic.ocr import OCR
from upsonic.ocr.layer_1.engines import (
    PaddleOCREngine, PPStructureV3Engine,
    PPChatOCRv4Engine, PaddleOCRVLEngine
)
# Also available: from upsonic.ocr import PaddleOCREngine, PPStructureV3Engine, ...

# General OCR (PP-OCRv5)
engine = PaddleOCREngine(lang='en', ocr_version='PP-OCRv5')
ocr = OCR(layer_1_ocr_engine=engine)
text = ocr.get_text('document.pdf')

# Advanced document structure recognition
engine_structure = PPStructureV3Engine(
    use_table_recognition=True,
    use_formula_recognition=True
)
ocr_structure = OCR(layer_1_ocr_engine=engine_structure)
result = ocr_structure.process_file('research_paper.pdf')

# Chat-based document understanding
engine_chat = PPChatOCRv4Engine(
    use_table_recognition=True,
    use_seal_recognition=True
)
ocr_chat = OCR(layer_1_ocr_engine=engine_chat)

# Vision-Language document understanding
engine_vl = PaddleOCRVLEngine(
    use_layout_detection=True,
    use_chart_recognition=True,
    format_block_content=True
)
ocr_vl = OCR(layer_1_ocr_engine=engine_vl)
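
When one application handles several kinds of documents, it can help to keep the engine choice in a single place. The following is a minimal sketch, not part of Upsonic: the task labels (`plain_text`, `structure`, `chat`, `vision_language`), the `EngineSpec` class, and the `build_ocr` helper are hypothetical names introduced for illustration. The engine class names and keyword arguments are copied from the usage example above, and the Upsonic import is deferred so the mapping can be inspected without the library installed.

```python
import importlib
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class EngineSpec:
    """Hypothetical record: engine class name plus constructor kwargs."""
    engine_name: str
    kwargs: Dict[str, Any] = field(default_factory=dict)


# Hypothetical task labels; engine names and kwargs mirror the usage example.
ENGINE_SPECS: Dict[str, EngineSpec] = {
    "plain_text": EngineSpec(
        "PaddleOCREngine", {"lang": "en", "ocr_version": "PP-OCRv5"}
    ),
    "structure": EngineSpec(
        "PPStructureV3Engine",
        {"use_table_recognition": True, "use_formula_recognition": True},
    ),
    "chat": EngineSpec(
        "PPChatOCRv4Engine",
        {"use_table_recognition": True, "use_seal_recognition": True},
    ),
    "vision_language": EngineSpec(
        "PaddleOCRVLEngine",
        {
            "use_layout_detection": True,
            "use_chart_recognition": True,
            "format_block_content": True,
        },
    ),
}


def build_ocr(task: str):
    """Build an OCR object for a task label (assumes upsonic is installed)."""
    if task not in ENGINE_SPECS:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(ENGINE_SPECS)}")
    spec = ENGINE_SPECS[task]
    # Deferred imports: only needed once an engine is actually constructed.
    ocr_module = importlib.import_module("upsonic.ocr")
    engines = importlib.import_module("upsonic.ocr.layer_1.engines")
    engine_cls = getattr(engines, spec.engine_name)
    return ocr_module.OCR(layer_1_ocr_engine=engine_cls(**spec.kwargs))
```

With Upsonic installed, `build_ocr("plain_text").get_text("document.pdf")` would then correspond to the first usage example.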

PaddleOCREngine (General OCR)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| lang | str | 'en' | Language code |
| ocr_version | str | 'PP-OCRv5' | OCR version ('PP-OCRv3', 'PP-OCRv4', 'PP-OCRv5') |
| use_doc_orientation_classify | bool | None | Enable document orientation classification |
| use_doc_unwarping | bool | None | Enable document unwarping |
| use_textline_orientation | bool | None | Enable text line orientation detection |
| text_det_limit_side_len | int | None | Limit on detection input side length |
| text_rec_score_thresh | float | None | Text recognition score threshold |
| return_word_box | bool | None | Return word-level bounding boxes |
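
Because most of these parameters default to `None`, a small helper can check the settings before they reach the engine. The sketch below is an illustration under stated assumptions: `paddleocr_kwargs` is a hypothetical helper, not an Upsonic function. It checks `ocr_version` against the three values listed in the table, rejects parameter names the table does not list, and leaves out any option still set to `None` so the engine's own default applies.

```python
from typing import Any, Dict

# Values and names taken from the parameter table above.
SUPPORTED_OCR_VERSIONS = ("PP-OCRv3", "PP-OCRv4", "PP-OCRv5")
OPTIONAL_PARAMS = frozenset({
    "use_doc_orientation_classify",
    "use_doc_unwarping",
    "use_textline_orientation",
    "text_det_limit_side_len",
    "text_rec_score_thresh",
    "return_word_box",
})


def paddleocr_kwargs(lang: str = "en", ocr_version: str = "PP-OCRv5",
                     **optional: Any) -> Dict[str, Any]:
    """Hypothetical helper: validate settings and drop options left as None."""
    if ocr_version not in SUPPORTED_OCR_VERSIONS:
        raise ValueError(
            f"ocr_version must be one of {SUPPORTED_OCR_VERSIONS}, got {ocr_version!r}"
        )
    unknown = set(optional) - OPTIONAL_PARAMS
    if unknown:
        raise TypeError(f"Unknown PaddleOCREngine parameters: {sorted(unknown)}")
    kwargs: Dict[str, Any] = {"lang": lang, "ocr_version": ocr_version}
    # Keep only the options the caller actually set.
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs
```

The result can be unpacked into the constructor, for example `PaddleOCREngine(**paddleocr_kwargs(return_word_box=True))`.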

PPStructureV3Engine (Document Structure)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_table_recognition | bool | None | Enable table recognition |
| use_formula_recognition | bool | None | Enable formula recognition |
| use_seal_recognition | bool | None | Enable seal text recognition |
| use_chart_recognition | bool | None | Enable chart recognition |
| layout_threshold | float | None | Layout detection score threshold |
| lang | str | 'en' | Language code |
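
The four `use_*_recognition` flags follow one naming pattern, so they can be derived from a list of the content types a document is expected to contain. This is a sketch only: `structure_flags` and the short feature labels (`table`, `formula`, `seal`, `chart`) are hypothetical. Setting the unused recognizers explicitly to `False`, rather than leaving them at `None`, is a design choice made here for illustration and not documented Upsonic behavior.

```python
from typing import Dict, Iterable

# Short labels for the four recognizers listed in the table above.
STRUCTURE_FEATURES = ("table", "formula", "seal", "chart")


def structure_flags(features: Iterable[str]) -> Dict[str, bool]:
    """Hypothetical helper: map feature labels to use_<name>_recognition flags."""
    wanted = set(features)
    unknown = wanted - set(STRUCTURE_FEATURES)
    if unknown:
        raise ValueError(f"Unknown features: {sorted(unknown)}")
    # Every recognizer gets an explicit True/False value.
    return {f"use_{name}_recognition": name in wanted for name in STRUCTURE_FEATURES}
```

For a research paper with tables and formulas, `PPStructureV3Engine(**structure_flags(["table", "formula"]))` would enable those two recognizers and switch off seal and chart recognition.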

PPChatOCRv4Engine (Chat-based OCR)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_table_recognition | bool | None | Enable table recognition |
| use_seal_recognition | bool | None | Enable seal recognition |
| mllm_chat_bot_config | dict | None | Multimodal LLM configuration |
| retriever_config | dict | None | Retriever configuration for vector search |

PaddleOCRVLEngine (Vision-Language)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_layout_detection | bool | None | Enable layout detection |
| use_chart_recognition | bool | None | Enable chart recognition |
| format_block_content | bool | None | Format content as Markdown |
| vl_rec_backend | str | 'local' | VL recognition backend |
| temperature | float | None | Sampling temperature for VLM |

Supported Languages

PP-OCRv5 supports 40+ languages. PP-OCRv3 also provides extensive coverage of Asian, European, and Middle Eastern languages. Select the language with the lang parameter (for example, lang='en').