fix: preserve utf8 decoding across response chunks by pupuking723 · Pull Request #5414 · nodejs/undici

pupuking723 · 2026-06-11T07:44:06Z

This relates to...

Rationale

BodyReadable#setEncoding() only stored the encoding on the readable state. That made streamed consumers decode each buffered chunk independently, so an incomplete UTF-8 sequence at a response chunk boundary produced replacement characters.

The internal body consumers still need raw Buffer chunks so body.text() and body.json() can aggregate and decode the full payload.
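The boundary problem described above can be reproduced without undici. The following is a minimal sketch using only Node.js built-ins, not code from this PR; the split point and the variable names are chosen purely for illustration.

```javascript
const { StringDecoder } = require('node:string_decoder')

// '€' is a 3-byte UTF-8 sequence (e2 82 ac); split it as a chunk boundary might.
const euro = Buffer.from('€', 'utf8')
const chunks = [euro.subarray(0, 1), euro.subarray(1)]

// Decoding each chunk on its own cannot see the bytes on the other side
// of the boundary, so the incomplete sequences become U+FFFD.
const naive = chunks.map((chunk) => chunk.toString('utf8')).join('')

// A StringDecoder holds back incomplete trailing bytes until the rest arrives.
const decoder = new StringDecoder('utf8')
const streamed = chunks.map((chunk) => decoder.write(chunk)).join('') + decoder.end()

console.log(naive.includes('\uFFFD')) // true
console.log(streamed === '€') // true
```

Aggregating the raw Buffer chunks first and decoding once, as body.text() and body.json() are described as doing, avoids the problem for the same reason: the decoder sees the whole sequence.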

Changes

Features

N/A

Bug Fixes

Keep raw Buffer chunks for internal body consumption.
Use a StringDecoder when BodyReadable is consumed through the Readable API after setEncoding().
Add a regression test for async iteration over a response body where a 3-byte UTF-8 character spans response chunks.
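The scenario covered by the regression test can be sketched roughly as follows. This is an assumption-laden illustration, not the test added in the PR: it uses a plain Node.js Readable in place of undici's BodyReadable, and the helper name `collect` and the payload are hypothetical.

```javascript
const { Readable } = require('node:stream')

// Hypothetical helper: gather every chunk produced by async iteration.
async function collect (readable) {
  const parts = []
  for await (const chunk of readable) parts.push(chunk)
  return parts
}

// 'a€b' is 5 bytes; cutting after byte 2 splits the 3-byte '€' across chunks.
const bytes = Buffer.from('a€b', 'utf8')
const body = Readable.from([bytes.subarray(0, 2), bytes.subarray(2)], { objectMode: false })

// With an encoding set, the stream decodes through a stateful decoder,
// so chunks arrive as strings and the split character stays intact.
body.setEncoding('utf8')

const result = collect(body).then((parts) => parts.join(''))
result.then((text) => console.log(text === 'a€b')) // true
```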

Breaking Changes and Deprecations

N/A

Status

Verification:

node --test --test-name-pattern "request multibyte (json|text) with setEncoding|async iteration and setEncoding" test/client-request.js
npx borp --timeout 180000 -p test/client-request.js
npm run lint
npm run test:typescript
git diff --check

Note: npm ci --ignore-scripts was used locally because npm install exited with "Exit handler never called!" after dependency extraction. The test/lint commands above were run against the installed dependency tree. Local Node.js is v22.17.0 while the package currently requires >=22.19.0.

codecov-commenter · 2026-06-12T09:15:29Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.47%. Comparing base (0f1f890) to head (83fbbf6).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #5414   +/-   ##
=======================================
  Coverage   93.46%   93.47%           
=======================================
  Files         110      110           
  Lines       37106    37106           
=======================================
+ Hits        34682    34684    +2     
+ Misses       2424     2422    -2

mcollina

I'm relatively certain that Readable already implements this logic. I think the problem is in the override of setEncoding; we should also call super.setEncoding.
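A rough sketch of the delegation suggested here, assuming a subclass that overrides setEncoding for its own bookkeeping. `SketchBody` is a hypothetical stand-in and does not reproduce undici's BodyReadable; the validation step is likewise an assumption added only to give the override a reason to exist.

```javascript
const { Readable } = require('node:stream')

// Hypothetical stand-in for a body stream; not undici's implementation.
class SketchBody extends Readable {
  _read () {}

  setEncoding (encoding) {
    // Illustrative extra step (an assumption): reject unknown encodings.
    if (!Buffer.isEncoding(encoding)) {
      throw new TypeError(`Unknown encoding: ${encoding}`)
    }
    // Delegating lets Readable install its own stateful decoder, which
    // buffers incomplete multibyte sequences between pushed chunks.
    return super.setEncoding(encoding)
  }
}

const body = new SketchBody()
body.setEncoding('utf8')

// Push '€' (e2 82 ac) in two pieces to simulate a chunk boundary.
const euro = Buffer.from('€', 'utf8')
body.push(euro.subarray(0, 2))
body.push(euro.subarray(2))
body.push(null)

const received = []
body.on('data', (chunk) => received.push(chunk))
const done = new Promise((resolve) => {
  body.on('end', () => resolve(received.join('')))
})
done.then((text) => console.log(text === '€')) // true
```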

metcoder95

also PR has conflicts

Signed-off-by: 王胜 <2318857637@qq.com>

pupuking723 · 2026-06-24T07:45:15Z

Thanks for the review. After rebasing, BodyReadable#setEncoding() now delegates to super.setEncoding() in the current implementation, so this PR only keeps the focused regression test covering async iteration after body.setEncoding("utf8") across multibyte chunk boundaries.

The current GitHub checks are passing across the CI matrix. Could you please take another look when you have a chance?

metcoder95 requested a review from ronag June 12, 2026 08:59

metcoder95 approved these changes Jun 12, 2026

metcoder95 mentioned this pull request Jun 12, 2026

fix: decode split chunks after readable setEncoding #5394

Open

mcollina requested changes Jun 15, 2026

metcoder95 reviewed Jun 18, 2026

Comment thread lib/api/readable.js Outdated

fix: preserve utf8 decoding across response chunks

83fbbf6

Signed-off-by: 王胜 <2318857637@qq.com>

pupuking723 force-pushed the fix/utf8-response-decoder-boundaries branch from 67d5488 to 83fbbf6 Compare June 22, 2026 03:50

pupuking723 wants to merge 1 commit into nodejs:main from pupuking723:fix/utf8-response-decoder-boundaries

4 participants
