fix: preserve utf8 decoding across response chunks #5414
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #5414 +/- ##
=======================================
Coverage 93.46% 93.47%
=======================================
Files 110 110
Lines 37106 37106
=======================================
+ Hits 34682 34684 +2
+ Misses 2424 2422 -2
mcollina left a comment
I'm relatively certain that Readable implements this logic. I think the problem is in the override of setEncoding; we should also call super.setEncoding.
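For illustration, a minimal sketch of the delegation suggested in this comment. `DemoBody` is a hypothetical stand-in, not undici's `BodyReadable`, and the chunks are pushed by hand rather than coming from a socket. It only shows that Node's own `Readable#setEncoding()` carries an incomplete UTF-8 sequence over to the next chunk once the override calls `super.setEncoding()`.

```javascript
const { Readable } = require('node:stream')

// Hypothetical Readable subclass used only for this demo.
class DemoBody extends Readable {
  _read () {} // no-op: chunks are pushed manually below

  setEncoding (encoding) {
    // Delegating to Readable lets it install its internal decoder,
    // which holds back incomplete multi-byte sequences between chunks.
    if (Buffer.isEncoding(encoding)) {
      super.setEncoding(encoding)
    }
    return this
  }
}

// UTF-8 bytes of "€" (U+20AC), split so the boundary falls mid-character.
const euro = Buffer.from([0xe2, 0x82, 0xac])

const body = new DemoBody()
body.setEncoding('utf8')
body.push(euro.subarray(0, 1)) // first byte only: nothing decodable yet
body.push(euro.subarray(1)) // remaining two bytes complete the character
body.push(null)

// In paused mode, read() returns everything buffered so far as one string.
const decoded = body.read()
console.log(decoded === '\u20ac')
```

Whether this alone is sufficient for the real class depends on how its internal consumers read the buffered data, which the sketch does not model.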
Signed-off-by: 王胜 <2318857637@qq.com>
67d5488 to 83fbbf6
Thanks for the review. After rebasing, the current GitHub checks are passing across the CI matrix. Could you please take another look when you have a chance?
This relates to...
Fixes #5002
Rationale
BodyReadable#setEncoding() only stored the encoding on the readable state. That made streamed consumers decode each buffered chunk independently, so an incomplete UTF-8 sequence at a response chunk boundary produced replacement characters.
The internal body consumers still need raw Buffer chunks so body.text() and body.json() can aggregate and decode the full payload.
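The failure mode and the two safe strategies described above can be reproduced outside undici. The following is a rough sketch using only Node's built-in `Buffer` and `StringDecoder`; the chunk split is made up for the demo and none of the variable names come from the PR.

```javascript
const { StringDecoder } = require('node:string_decoder')

// UTF-8 bytes of "€" (U+20AC), split so the boundary falls mid-character.
const bytes = Buffer.from([0xe2, 0x82, 0xac])
const chunks = [bytes.subarray(0, 2), bytes.subarray(2)]

// 1. Decoding each chunk on its own: neither half is valid UTF-8 by itself,
//    so the result contains U+FFFD replacement characters.
const perChunk = chunks.map((chunk) => chunk.toString('utf8')).join('')

// 2. Streaming decode: StringDecoder keeps the incomplete bytes from one
//    write() and prepends them to the next.
const decoder = new StringDecoder('utf8')
const streamed = chunks.map((chunk) => decoder.write(chunk)).join('') + decoder.end()

// 3. Aggregate first, decode once: the approach that needs raw Buffer chunks,
//    as body.text() and body.json() are described as doing.
const aggregated = Buffer.concat(chunks).toString('utf8')

console.log({ perChunk, streamed, aggregated })
```

Strategies 2 and 3 both recover the original character, but they cannot be mixed on the same chunks: once chunks have been turned into strings, `Buffer.concat` is no longer available to the aggregating consumers.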
Changes
Features
N/A
Bug Fixes
Breaking Changes and Deprecations
N/A
Status
Verification:
Note: npm ci --ignore-scripts was used locally because npm install exited with "Exit handler never called!" after dependency extraction. The test/lint commands above were run against the installed dependency tree. Local Node.js is v22.17.0 while the package currently requires >=22.19.0.