I'm wondering why you're planning to remove the dependencies on OpenCV and Shapely. Do these two packages consume a lot of resources? Is Pillow a better alternative?
The reason we’re considering removing OpenCV and Shapely is that they are introduced through RapidOCR’s upstream dependency chain and add extra package size and native dependency complexity to our program. Based on our current usage, we don’t directly rely on OpenCV or Shapely in our own code, and the parts RapidOCR uses them for may be replaceable with lighter alternatives. Thank you for taking the time to reply.
Summary
Would the RapidOCR maintainers be open to an optional Pillow/numpy/scipy image-processing backend for the legacy `rapidocr-onnxruntime` package, so deployments that already use ONNX Runtime can avoid pulling OpenCV and Shapely when they do not otherwise need them?
I am working on a downstream desktop app that embeds `rapidocr-onnxruntime==1.4.4` as an OCR backend. In that dependency chain, OpenCV and Shapely are the only heavy transitive packages that are not otherwise needed by the app runtime.
Downstream experiment
I built a local proof-of-concept fork based on `rapidocr-onnxruntime==1.4.4` with the import package name kept as `rapidocr_onnxruntime` for compatibility.
Dependency changes in the fork:
- Removed: `opencv-python`, `opencv-python-headless`, `shapely`, `six`, `tqdm`
- Runtime dependencies after the change: `numpy`, `onnxruntime`, `Pillow`, `pyclipper`, `PyYAML`, `scipy`
`PyYAML` is intentionally kept because the `read_yaml()` configuration path still needs it.
Replacement mapping used in the experiment
The v1.4.4 code path mostly uses OpenCV for image I/O, resizing, channel conversion, DB post-processing, perspective crop, and visualization helpers. The downstream fork replaced those with:
One important detail from the experiment: `cv2.dilate(..., kernel=np.ones((2, 2)))` needed a matching anchor/origin behavior. Using scipy's default origin shifted the DB mask enough to change OCR results. Matching OpenCV's 2x2 anchor fixed the detected box parity in our tests.
Validation performed downstream
The local fork currently passes these checks in our downstream app:
- No `cv2` or `shapely` imports/usages remain in the fork source
- `cv2` and `shapely` absent from the active environment
- `rapidocr-onnxruntime==1.4.4` + `opencv-python==4.11.0.86`
Question
Would you be interested in a PR that introduces this as an optional backend or otherwise removes the hard OpenCV/Shapely dependency for the onnxruntime package?
If yes, I can prepare the change in the structure you prefer. If no, we will keep maintaining this as a downstream fork.
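For reference, the anchor mismatch described under "Replacement mapping used in the experiment" can be reproduced without OpenCV installed. The sketch below is an illustration only, not the fork's actual code: the function names are hypothetical, and it assumes the DB mask is a boolean array. It relies on two documented behaviors: `cv2.dilate` with a 2x2 kernel and the default anchor takes the maximum over the pixel itself and its upper and left neighbors, so a set pixel grows toward the bottom-right, while `scipy.ndimage.binary_dilation` with the default `origin=0` grows the same pixel toward the top-left for an even-sized structuring element.

```python
import numpy as np
from scipy import ndimage

# Hypothetical helper names; the structuring element mirrors np.ones((2, 2)).
STRUCTURE_2X2 = np.ones((2, 2), dtype=bool)


def dilate_2x2_default(mask: np.ndarray) -> np.ndarray:
    """scipy's default origin: a set pixel grows toward the top-left."""
    return ndimage.binary_dilation(mask, structure=STRUCTURE_2X2)


def dilate_2x2_opencv_like(mask: np.ndarray) -> np.ndarray:
    """origin=-1 on both axes: a set pixel grows toward the bottom-right,
    which is the direction cv2.dilate uses for a 2x2 kernel with its
    default anchor."""
    return ndimage.binary_dilation(mask, structure=STRUCTURE_2X2, origin=-1)


# A single set pixel makes the one-pixel shift visible.
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True

out_default = dilate_2x2_default(mask)
out_opencv_like = dilate_2x2_opencv_like(mask)

print(np.argwhere(out_default).tolist())      # rows/cols 1..2
print(np.argwhere(out_opencv_like).tolist())  # rows/cols 2..3
```

The two results differ by a one-pixel translation along each axis. That is small, but DB post-processing extracts contours from the dilated mask, so the shift carries through to the detected box coordinates.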