fix(core): handle multibyte UTF-8 characters in socket message consumption #34151

Merged
AgentEnder merged 5 commits into nrwl:master from Chanki-Min:fix/multibyte-socket-message-handling
Feb 2, 2026
@Chanki-Min
Contributor

Current Behavior

When socket data chunks split a multibyte UTF-8 character (e.g., CJK characters like Korean, Chinese, Japanese) at an arbitrary byte boundary, Buffer.toString() decodes incomplete byte sequences as replacement characters (�), causing message corruption.

This can occur when:

  • File paths contain non-ASCII characters
  • Project names include multibyte characters
  • Any JSON message contains international text
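
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not code from the PR: the sample path, the split point, and the helper name `decodeNaively` are all made up for illustration. It decodes each chunk on its own with `Buffer.toString()`, which is the behavior the description identifies as the problem.

```typescript
// Hypothetical helper: decodes every chunk independently, the way a naive
// socket handler would if it called toString() on each 'data' event.
function decodeNaively(chunks: Buffer[]): string {
  return chunks.map((chunk) => chunk.toString('utf8')).join('');
}

// Illustrative JSON message containing a Korean path segment (assumed sample data).
const sampleMessage = Buffer.from('{"path":"/프로젝트/app"}', 'utf8');

// Force a chunk boundary one byte into the first 3-byte Korean character.
const splitIndex = sampleMessage.indexOf(Buffer.from('프', 'utf8')) + 1;
const splitChunks = [
  sampleMessage.subarray(0, splitIndex),
  sampleMessage.subarray(splitIndex),
];

const naiveResult = decodeNaively(splitChunks);
const expectedText = sampleMessage.toString('utf8');

// The incomplete byte sequences on both sides of the boundary are decoded
// as U+FFFD, so the reassembled text no longer matches the original.
console.log(naiveResult === expectedText); // false
console.log(naiveResult.includes('\uFFFD')); // true
```

Because the corrupted text is no longer the JSON that was sent, a consumer that parses it will see a wrong path or name rather than a transport error.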

Expected Behavior

Multibyte UTF-8 characters should be decoded correctly even when they are split across multiple socket data chunks. The fix uses the Node.js StringDecoder, which buffers an incomplete multibyte sequence until the remaining bytes arrive.
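
To make the buffering behavior concrete, here is a small sketch using `StringDecoder` from Node's `string_decoder` module. It is not the code from this PR; the function name `decodeChunks` and the sample project name are assumptions chosen for illustration.

```typescript
import { StringDecoder } from 'node:string_decoder';

// Hypothetical helper: feeds every chunk through one shared decoder so that
// a character split across two chunks is emitted only once it is complete.
function decodeChunks(chunks: Buffer[]): string {
  const decoder = new StringDecoder('utf8');
  let text = '';
  for (const chunk of chunks) {
    // write() returns only fully decoded characters and keeps any trailing
    // partial sequence in the decoder's internal buffer.
    text += decoder.write(chunk);
  }
  // end() flushes whatever is still buffered (empty for well-formed input).
  text += decoder.end();
  return text;
}

// Illustrative message with a Korean project name (assumed sample data).
const sampleBytes = Buffer.from('{"name":"한글-app"}', 'utf8');

// Split one byte into the first 3-byte character.
const cutIndex = sampleBytes.indexOf(Buffer.from('한', 'utf8')) + 1;
const decodedText = decodeChunks([
  sampleBytes.subarray(0, cutIndex),
  sampleBytes.subarray(cutIndex),
]);

console.log(decodedText === sampleBytes.toString('utf8')); // true
```

The important detail is that the decoder instance lives as long as the stream does. Creating a new decoder per chunk would discard the buffered bytes and bring the corruption back.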

Related Issue(s)

Fixes socket message corruption for paths/names containing multibyte characters.

@Chanki-Min Chanki-Min requested a review from a team as a code owner January 20, 2026 10:20
@Chanki-Min Chanki-Min requested a review from Cammisuli January 20, 2026 10:20
@netlify

netlify bot commented Jan 20, 2026

👷 Deploy request for nx-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit cbfc1ba

@vercel

vercel bot commented Jan 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
nx-dev Ready Ready Preview Feb 1, 2026 3:04pm

@nx-cloud
Contributor

nx-cloud bot commented Jan 22, 2026

View your CI Pipeline Execution ↗ for commit cbfc1ba

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx affected --targets=lint,test,test-kt,build,e... | ✅ Succeeded | 46m 35s | View ↗ |
| nx run-many -t check-imports check-lock-files c... | ✅ Succeeded | 1m 45s | View ↗ |
| nx-cloud record -- nx-cloud conformance:check | ✅ Succeeded | 8s | View ↗ |
| nx-cloud record -- nx format:check | ✅ Succeeded | 1s | View ↗ |
| nx-cloud record -- nx sync:check | ✅ Succeeded | <1s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-02-01 20:28:53 UTC

…ption

Use StringDecoder to properly decode UTF-8 data that may be split across
socket chunks. This prevents corruption when multibyte characters (such
as CJK characters) are split at arbitrary byte boundaries.
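
As a rough picture of how a socket message consumer could apply this, the sketch below keeps one `StringDecoder` per connection and splits the decoded text on a delimiter. It is not the implementation merged here: the newline delimiter `MESSAGE_END`, the function `createMessageConsumer`, and the sample messages are all assumptions for illustration.

```typescript
import { StringDecoder } from 'node:string_decoder';

// Assumed message terminator for this sketch; the real protocol may differ.
const MESSAGE_END = '\n';

// Hypothetical factory: returns a handler suitable for a socket 'data' event.
function createMessageConsumer(
  onMessage: (message: string) => void
): (chunk: Buffer) => void {
  // One decoder per connection, so partial characters survive between chunks.
  const decoder = new StringDecoder('utf8');
  let pending = '';

  return (chunk: Buffer): void => {
    pending += decoder.write(chunk);
    // Emit every complete message currently in the buffer.
    let end = pending.indexOf(MESSAGE_END);
    while (end !== -1) {
      onMessage(pending.slice(0, end));
      pending = pending.slice(end + MESSAGE_END.length);
      end = pending.indexOf(MESSAGE_END);
    }
  };
}

// Simulate a socket delivering two messages in arbitrary 5-byte chunks,
// which splits several multibyte characters.
const receivedMessages: string[] = [];
const consumeChunk = createMessageConsumer((message) => {
  receivedMessages.push(message);
});
const wireBytes = Buffer.from('{"project":"日本語"}\n{"ok":true}\n', 'utf8');
for (let offset = 0; offset < wireBytes.length; offset += 5) {
  consumeChunk(wireBytes.subarray(offset, offset + 5));
}

console.log(receivedMessages);
```

With a real `net.Socket`, the returned handler would be passed to `socket.on('data', ...)`. Calling `socket.setEncoding('utf8')` is another way to get the same buffering, since Node then decodes the stream with its own internal decoder.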
@Chanki-Min Chanki-Min force-pushed the fix/multibyte-socket-message-handling branch from 2527020 to e57a6e9 Compare January 23, 2026 01:54
@Chanki-Min Chanki-Min requested a review from AgentEnder January 23, 2026 01:54
@Chanki-Min
Contributor Author

Hello @AgentEnder. Thanks for reviewing my PR!

I just rebased onto the main branch to resolve a conflict (I should have merged instead), so I need another approval to run the CI workflow. Thanks!

Contributor

@nx-cloud nx-cloud bot left a comment

Nx Cloud has identified a flaky task in your failed CI:

Since the failure was identified as flaky, the solution is to rerun CI. Because this branch comes from a fork, it is not possible for us to push directly, but you can rerun by pushing an empty commit:

git commit --allow-empty -m "chore: trigger rerun"
git push

@AgentEnder AgentEnder merged commit e35dcd2 into nrwl:master Feb 2, 2026
15 of 16 checks passed
FrozenPandaz pushed a commit that referenced this pull request Feb 3, 2026
…ption (#34151)
(cherry picked from commit e35dcd2)
@github-actions
Contributor

github-actions bot commented Feb 8, 2026

This pull request has already been merged/closed. If you experience issues related to these changes, please open a new issue referencing this pull request.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 8, 2026