
Refactor JSON Path Navigation and Improve Performance #9

Merged
mhr3 merged 5 commits into master from speedups
Jun 6, 2025

mhr3 commented Jan 11, 2025

Summary

This PR introduces significant performance and safety improvements to the jsoniter library, focusing on JSON path navigation, Unicode handling, and iterator bounds checking. The changes eliminate callback-based iteration in favor of direct iteration patterns and optimize Unicode hex digit parsing.

Key Changes

🚀 Performance Improvements

Path Navigation Refactoring

  • Replaced callback-based iteration with direct iteration using ReadObjectRaw() and ReadArray()
  • Eliminated intermediate byte array allocations by removing SkipAndReturnBytes() + ResetBytes() pattern
  • Renamed functions for clarity: locateObjectField → findObjectField, locateArrayElement → findArrayElement
  • Changed return types from []byte to bool for more efficient existence checking

Unicode Hex Parsing Optimization

  • Simplified hex digit parsing using bitwise operations (c |= 0x20) to convert uppercase to lowercase
  • Reduced code duplication across parseU4(), readU4(), and readAndFillU4() methods
  • Improved performance by eliminating separate uppercase/lowercase handling branches
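
The case-folding trick can be sketched as follows. This is a standalone illustration, assuming a four-digit `\u` escape body; hexVal and parseHex4 are hypothetical names and do not reproduce the library's parseU4(), readU4(), or readAndFillU4().

```go
package main

import "fmt"

// hexVal returns the numeric value of one hex digit and whether c was valid.
func hexVal(c byte) (rune, bool) {
	if c >= '0' && c <= '9' {
		return rune(c - '0'), true
	}
	c |= 0x20 // sets bit 5, folding 'A'..'F' onto 'a'..'f'
	if c >= 'a' && c <= 'f' {
		return rune(c-'a') + 10, true
	}
	return 0, false
}

// parseHex4 decodes the first four bytes of b as hex digits, as they appear
// after "\u" in a JSON string escape.
func parseHex4(b []byte) (rune, bool) {
	if len(b) < 4 {
		return 0, false
	}
	var r rune
	for _, c := range b[:4] {
		v, ok := hexVal(c)
		if !ok {
			return 0, false
		}
		r = r<<4 | v
	}
	return r, true
}

func main() {
	for _, s := range []string{"0020", "4e2d", "4E2D", "658U"} {
		r, ok := parseHex4([]byte(s))
		fmt.Printf("%s -> %U ok=%v\n", s, r, ok)
	}
}
```

Only bytes in 'A'..'F' or 'a'..'f' land in the 'a'..'f' range after the OR, so a single range check covers both cases; anything else, such as 'U', is still rejected.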

Iterator Bounds Checking

  • Added bounds validation in isNextTokenBuffered(), trySkipNumber(), and skipString()
  • Prevents buffer overruns with checks: iter.head < 0 || iter.head >= iter.tail || iter.tail > len(iter.buf)
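
A minimal sketch of that guard, assuming an iterator that tracks a read position (head) and the end of valid data (tail) within a buffer. The cursor type and its methods are hypothetical and only mirror the field names quoted in the check above.

```go
package main

import "fmt"

// cursor is a hypothetical stand-in for an iterator over a byte buffer.
type cursor struct {
	buf  []byte
	head int // next byte to read
	tail int // one past the last valid byte
}

// inBounds reports whether buf[head] can be read safely. It is the negation
// of the rejection condition quoted in the PR description.
func (c *cursor) inBounds() bool {
	return !(c.head < 0 || c.head >= c.tail || c.tail > len(c.buf))
}

// peek returns the byte at head, or false if the cursor is out of bounds.
func (c *cursor) peek() (byte, bool) {
	if !c.inBounds() {
		return 0, false
	}
	return c.buf[c.head], true
}

func main() {
	c := &cursor{buf: []byte("12"), head: 0, tail: 2}
	fmt.Println(c.peek())
	c.head = 2 // exhausted: head == tail
	fmt.Println(c.peek())
}
```

Checking `tail > len(buf)` as well as `head >= tail` guards against a stale tail after the buffer has been swapped or shrunk, which a head-only check would miss.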

Benefits

  1. Performance: Direct iteration eliminates callback overhead and intermediate allocations
  2. Memory Efficiency: Reduces temporary byte array creation during path navigation
  3. Safety: Comprehensive bounds checking prevents potential buffer overruns
  4. Maintainability: Cleaner, more consistent code patterns
  5. Unicode Handling: More efficient and robust hex digit parsing

Testing

  • Added comprehensive Unicode tests in iter_str_test.go covering:
    • Basic ASCII characters (\u0020)
    • Chinese characters with lowercase hex (\u4e2d)
    • Chinese characters with uppercase hex (\u4E2D)
    • Invalid hex sequences (\u658U)

Backward Compatibility

This refactoring maintains public API compatibility while changing internal implementation details. The path navigation behavior should remain functionally equivalent.

mhr3 changed the title from "Speedups" to "Refactor JSON Path Navigation and Improve Performance" on Jun 6, 2025
mhr3 merged commit a66be24 into master on Jun 6, 2025