layout-detection

Here are 20 public repositories matching this topic...

PaddlePaddle / PaddleX

All-in-One Development Tool based on PaddlePaddle

ocr time-series deployment speech-recognition classification segmentation object-detection ai-pipelines layout-detection formula-recognition pp-chatocr pdf2markdown

Updated Jun 12, 2026
Python

Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

ocr computer-vision deep-learning object-detection document-image-processing layout-analysis document-layout-analysis detectron2 layout-parser layout-detection

Updated Aug 15, 2024
Python

vorojar / Folio-OCR

Open-source batch OCR workbench, a free, local alternative to ABBYY FineReader. Powered by Ollama + GLM-OCR + PP-DocLayoutV3, ~0.5s/page on RTX 4090. Three-panel editor, layout-aware, PDF/image batch processing, Markdown/Word export. Runs fully locally and is suited to digitizing books and documents.

privacy ocr offline book-digitization document-processing document-ocr layout-detection markdown-export pdf-ocr local-ai ollama batch-ocr glm-ocr abbyy-alternative

Updated Jun 18, 2026
Python

mbzuai-oryx / KITAB-Bench

[ACL 2025 🔥] A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

benchmark ocr vqa pdf-to-text arabic table-detection layout-detection vlms

Updated May 24, 2025
Python

gridaco / ui-dataset

A pre-labelled dataset for UI element / layout detection

ui reflect labelling data-set ui-dataset layout-detection

Updated Jun 15, 2023

sparkfish / shabby-pages

ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for use in training models to reverse distortions and recover to original denoised documents.

data-science computer-vision corpus dataset binarization denoising layout-detection born-digital

Updated Mar 12, 2025
Jupyter Notebook

jiangnanboy / layout_analysis

Chinese layout detection: YOLOv8 is used to detect the layout of Chinese document images.

layout-detection yolov8 cdla

Updated Apr 28, 2023
Python

jiangnanboy / AutoText

Intelligent automatic text processing tool. The main functions of AutoText include text error correction, image OCR, layout detection, and table structure recognition.

ocr table-structure-recognition layout-detection text-error-correction

Updated May 17, 2023
Java

pleb631 / pdfLayoutDet

pdfDet aims to simplify PDF layout detection tasks for users.

document-analysis layout-analysis pdf-document-processor layout-parser layout-detection

Updated Mar 28, 2024
Python

DCC-BS / docling-pp-doc-layout

Docling plugin to integrate PP-DocLayout-V3 model into docling to enhance layout detection capabilities

python pypi-package layout-detection docling docling-plugin

Updated Jun 18, 2026
Python

NanoNets / llm-data-converter

Convert any document format into LLM-ready data format (markdown) with advanced intelligent document processing capabilities powered by pre-trained models.

Updated Aug 14, 2025
Python

PT-Perkasa-Pilar-Utama / ppu-doclayout

A lightweight, type-safe, PaddlePaddle PP-DocLayoutV3 & V2 implementation in Bun/Node.js for document layout analysis in JavaScript environments.

bun paddlepaddle layout-analysis onnx onnxruntime layout-detection doclayout

Updated Apr 6, 2026
TypeScript

jiangnanboy / layout_detection

Loads a YOLOv8 model in C++ to detect the layout of Chinese document images.

c-plusplus onnx layout-detection yolov8

Updated Nov 17, 2023
C++

charlie6echo / VBDLDSCC

Vision-based document layout detection, segmentation, and context classification using Mask R-CNN on TensorFlow-Keras, PyTorch, and Detectron2.

pytorch document-classification bounding-boxes instance-segmentation mask-rcnn ms-coco custom-dataset document-layout-analysis focal-loss keras-tensoflow detectron2 publaynet layout-detection

Updated Jul 28, 2021
Jupyter Notebook

abhilashpanda04 / layout_parser

Document layout analysis tool for extracting structured information from documents using computer vision

ocr computer-vision document-analysis layout-detection

Updated Jul 25, 2022
Jupyter Notebook

wattkaiserviaduct / ABBYY-FineReader-PDF-15-Unlocked

ABBYY FineReader PDF 15 Full Version Download | Unlocked Build | Pre-Activated Setup

pdf ocr ocr-service banking-applications document-processing document-ocr layout-detection pdf-ocr ocr-software abbyy-finereader-pdf abbyy-finereader-server abbyy-finereader-11 abbyy-finereader-pdf-15-standard abbyy-finereader-file-conversion glm-ocr abbyy-alternative windows-software-2026

Updated Jun 22, 2026

hnextits / NextitsLM_DataPreProcessing

OCR, STT

ocr stt layout-detection

Updated Jan 16, 2026
Python

frekkoz3 / LittleReader

A lightweight hybrid system for parsing and digitizing historical newspaper pages

ocr computer-vision vlm layout-detection

Updated May 17, 2026
Python

ozefe / ytcc-pipeline

A synchronous Python library that converts an academic-thesis PDF into a structured JSON document plus a tar bundle of cropped figures, tables, and formulas.

ocr pdf-parser grobid table-recognition ai-pipelines layout-detection formula-recognition

Updated May 21, 2026
Python

tagay1n / ocr-benchmark-pipeline

Language-agnostic OCR benchmark pipeline to discover document images, review/edit layouts in a web UI, run OCR extraction, and build high-quality evaluation datasets.

benchmark ocr annotation sqlite evaluation dataset fastapi document-ai layout-detection

Updated Jun 13, 2026
Python
