Proposal: optional Pillow backend to avoid OpenCV/Shapely dependency #683

Description

@MomiJiSan

Summary

Would the RapidOCR maintainers be open to an optional Pillow/numpy/scipy image-processing backend for the legacy rapidocr-onnxruntime package, so deployments that already use ONNX Runtime can avoid pulling OpenCV and Shapely when they do not otherwise need them?

I am working on a downstream desktop app that embeds rapidocr-onnxruntime==1.4.4 as an OCR backend. In that dependency chain, OpenCV and Shapely are the only heavy transitive packages that are not otherwise needed by the app runtime.

Downstream experiment

I built a local proof-of-concept fork based on rapidocr-onnxruntime==1.4.4 with the import package name kept as rapidocr_onnxruntime for compatibility.

Dependency changes in the fork:

  • Removed: opencv-python, opencv-python-headless, shapely, six, tqdm
  • Kept: numpy, onnxruntime, Pillow, pyclipper, PyYAML, scipy

PyYAML is intentionally kept because the read_yaml() configuration path still needs it.

Replacement mapping used in the experiment

The v1.4.4 code path mostly uses OpenCV for image I/O, resizing, channel conversion, DB post-processing, perspective crop, and visualization helpers. The downstream fork replaced those with:

  • Pillow for image read/write and simple visual drawing
  • numpy for channel conversion, arithmetic, polygon area/perimeter, and perspective matrix solving
  • scipy.ndimage for dilation / coordinate mapping where needed
  • pyclipper, unchanged, for the DB unclip offset operation
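To make the numpy part of this mapping concrete, here is a minimal sketch of how polygon area/perimeter and perspective matrix solving can be done without Shapely or OpenCV. The function names (polygon_area_perimeter, perspective_matrix, map_point) are hypothetical and chosen for illustration only; they are not the names used in the fork or in upstream RapidOCR. The sketch assumes simple (non-self-intersecting) polygons and exactly four point correspondences with no three points collinear.

```python
import numpy as np


def polygon_area_perimeter(points):
    """Return (area, perimeter) of a simple polygon given as an (N, 2) array."""
    pts = np.asarray(points, dtype=np.float64)
    nxt = np.roll(pts, -1, axis=0)  # successor of each vertex, wrapping around
    # Shoelace formula: half the absolute sum of adjacent-vertex cross products.
    area = 0.5 * abs(np.sum(pts[:, 0] * nxt[:, 1] - nxt[:, 0] * pts[:, 1]))
    perimeter = float(np.sum(np.linalg.norm(nxt - pts, axis=1)))
    return float(area), perimeter


def perspective_matrix(src, dst):
    """Solve the 3x3 homography that maps four src points onto four dst points."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, with the last entry fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)


def map_point(matrix, point):
    """Apply a homography to a single (x, y) point."""
    x, y, w = matrix @ np.array([point[0], point[1], 1.0])
    return x / w, y / w
```

For example, a 4x3 axis-aligned rectangle should give an area of 12 and a perimeter of 14, and a homography from that rectangle to one twice its size should map its center (2, 1.5) to (4, 3). Checks of this kind are what the helper parity tests compare against OpenCV/Shapely.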

One important detail from the experiment: replacing cv2.dilate(..., kernel=np.ones((2, 2))) required matching OpenCV's anchor behavior for the even-sized kernel. With scipy's default origin, the dilated DB mask was shifted enough to change OCR results. Matching OpenCV's 2x2 anchor restored detected-box parity in our tests.
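The anchor issue can be illustrated with a small numpy-only sketch. It assumes that OpenCV's default anchor for a 2x2 kernel resolves to index (1, 1), so each output pixel takes the maximum over itself and its left, upper, and upper-left neighbors. That is our reading of the OpenCV behavior and should be verified against cv2.dilate before relying on it. The function name dilate_2x2_anchor_bottom_right is hypothetical, the mask is assumed to be non-negative, and this is not necessarily the exact code used in the fork.

```python
import numpy as np


def dilate_2x2_anchor_bottom_right(mask):
    """Dilate a 2-D mask with a 2x2 all-ones kernel anchored at index (1, 1).

    Each output pixel is the maximum of the input pixel, its left neighbor,
    its upper neighbor, and its upper-left neighbor. Pixels outside the image
    are treated as background (0), so the mask is assumed to be non-negative.
    """
    src = np.asarray(mask)
    # Pad one row on top and one column on the left so the shifted views exist.
    padded = np.pad(src, ((1, 0), (1, 0)), mode="constant", constant_values=0)
    return np.maximum.reduce([
        padded[1:, 1:],    # the pixel itself
        padded[1:, :-1],   # left neighbor
        padded[:-1, 1:],   # upper neighbor
        padded[:-1, :-1],  # upper-left neighbor
    ])


mask = np.zeros((5, 5), dtype=np.uint8)
mask[2, 2] = 1
dilated = dilate_2x2_anchor_bottom_right(mask)
```

Under this assumption a single foreground pixel at (2, 2) grows into the 2x2 block covering rows 2 to 3 and columns 2 to 3. A backend that grows the pixel in a different direction produces a mask offset by one pixel, which is the kind of shift that changed the detected boxes.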

Validation performed downstream

The local fork currently passes these checks in our downstream app:

  • No cv2 or shapely imports/usages remain in the fork source
  • Fork metadata excludes OpenCV/Shapely/six/tqdm and keeps PyYAML/pyclipper
  • Helper parity checks against OpenCV/Shapely for color conversion, masks, add, border, polygon area/perimeter, min-area boxes, resize, and perspective point mapping
  • Full RapidOCR smoke test with cv2 and shapely absent from the active environment
  • Synthetic OCR corpus exact text comparison against upstream rapidocr-onnxruntime==1.4.4 + opencv-python==4.11.0.86
    • 60 generated samples total
    • 20 Chinese-like text images
    • 20 Japanese-like text images
    • 10 mixed CJK/English images
    • 5 vertical text images
    • 5 tilted text images
    • Result: all OCR text outputs matched exactly in that corpus
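
As an illustration of the first check in this list, a static scan for leftover imports can be sketched with the standard library ast module. The helper name find_forbidden_imports is hypothetical and this is not necessarily how the downstream check is implemented. It only detects plain import statements, so dynamic imports (for example via importlib) would need a separate check such as running the smoke test with cv2 and shapely uninstalled.

```python
import ast


def find_forbidden_imports(source, forbidden=("cv2", "shapely")):
    """Return the sorted top-level modules from `forbidden` imported in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "shapely.geometry" -> "shapely"
            if root in forbidden:
                found.add(root)
    return sorted(found)
```

Running this over every .py file in the fork and asserting that the result is empty gives a quick regression guard alongside the runtime smoke test.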

Question

Would you be interested in a PR that introduces this as an optional backend or otherwise removes the hard OpenCV/Shapely dependency for the onnxruntime package?

If yes, I can prepare the change in the structure you prefer. If no, we will keep maintaining this as a downstream fork.
