What's Changed
- Fix real-time streaming with proper HTTP flushing by @TianYi0217 in #8
- Add unit tests for streaming functionality
Changes
Bug Fixes
Real-time Streaming with Proper HTTP Flushing (#8)
Fixed streaming responses that were buffered and only appeared at the end instead of showing tokens in real-time.
Technical Details:
- Replaced `io.Copy()` with manual read/write loop with immediate `Flush()` calls
- Implemented `http.Flusher` interface on `responseWrapper`
- Use 512-byte buffer optimized for Anthropic SSE events (typically 100-200 bytes)
- Proper error handling for write vs read errors
- Graceful degradation when underlying ResponseWriter doesn't support flushing
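
The approach listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the fields of `responseWrapper`, the `streamCopy` and `runDemo` helpers, the `flushes` counter, and the error messages are all assumptions made for the example. Only `io.Copy`, `http.Flusher`, and the 512-byte buffer size come from the notes.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// responseWrapper is a hypothetical stand-in for the wrapper named in the notes.
type responseWrapper struct {
	http.ResponseWriter
	flushes int // illustration only: counts flushes that reached the client
}

// Flush implements http.Flusher. It degrades to a no-op when the
// underlying ResponseWriter does not support flushing.
func (w *responseWrapper) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
		w.flushes++
	}
}

// streamCopy (hypothetical name) replaces io.Copy with a manual loop that
// flushes after every successful write, so each chunk is sent immediately.
func streamCopy(dst http.ResponseWriter, src io.Reader) (int64, error) {
	flusher, canFlush := dst.(http.Flusher)
	buf := make([]byte, 512) // small buffer, per the notes
	var total int64
	for {
		n, readErr := src.Read(buf)
		if n > 0 {
			written, writeErr := dst.Write(buf[:n])
			total += int64(written)
			if writeErr != nil {
				// Write errors usually mean the client went away.
				return total, fmt.Errorf("write to client: %w", writeErr)
			}
			if canFlush {
				flusher.Flush()
			}
		}
		if readErr == io.EOF {
			return total, nil // upstream finished cleanly
		}
		if readErr != nil {
			return total, fmt.Errorf("read from upstream: %w", readErr)
		}
	}
}

// runDemo streams two fake SSE events through the wrapper into a recorder.
func runDemo() (int, string, error) {
	rec := httptest.NewRecorder() // ResponseRecorder supports flushing
	w := &responseWrapper{ResponseWriter: rec}
	upstream := strings.NewReader("data: hello\n\ndata: world\n\n")
	_, err := streamCopy(w, upstream)
	return w.flushes, rec.Body.String(), err
}

func main() {
	flushes, body, err := runDemo()
	fmt.Printf("flushes=%d bytes=%d err=%v\n", flushes, len(body), err)
}
```

Wrapping read and write errors separately lets a caller distinguish an upstream failure from a disconnected client, which is one plausible reading of the "write vs read errors" item.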
Impact:
- Streaming responses now deliver content in real-time instead of buffering until completion
- Significantly improved user experience for streaming requests
- Lower perceived latency for SSE-based responses
Testing
- Added unit tests for `responseWrapper.Flush()` implementation
- Added test coverage for graceful degradation scenarios
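
As a rough idea of what such checks might look like, the sketch below exercises `Flush()` against a writer that supports flushing and one that does not. It is not the project's test suite: `plainWriter`, `flushWithoutFlusher`, and `flushWithFlusher` are hypothetical names, and `responseWrapper` is redefined here in an assumed minimal form. In a real suite these checks would more likely live in `testing` functions.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// responseWrapper is an assumed minimal form of the wrapper under test.
type responseWrapper struct {
	http.ResponseWriter
}

// Flush forwards to the underlying writer only if it supports flushing.
func (w *responseWrapper) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// plainWriter is a hypothetical ResponseWriter that deliberately does NOT
// implement http.Flusher, to exercise the degradation path.
type plainWriter struct {
	header http.Header
	body   bytes.Buffer
	status int
}

func (p *plainWriter) Header() http.Header         { return p.header }
func (p *plainWriter) Write(b []byte) (int, error) { return p.body.Write(b) }
func (p *plainWriter) WriteHeader(code int)        { p.status = code }

// flushWithoutFlusher reports whether Flush panicked and what was written.
func flushWithoutFlusher() (panicked bool, body string) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	pw := &plainWriter{header: http.Header{}}
	w := &responseWrapper{ResponseWriter: pw}
	_, _ = w.Write([]byte("data: ok\n\n"))
	w.Flush() // should be a silent no-op
	return false, pw.body.String()
}

// flushWithFlusher reports whether Flush reached the underlying recorder.
func flushWithFlusher() bool {
	rec := httptest.NewRecorder()
	w := &responseWrapper{ResponseWriter: rec}
	w.Flush()
	return rec.Flushed
}

func main() {
	panicked, body := flushWithoutFlusher()
	fmt.Printf("no flusher: panicked=%v body=%q\n", panicked, body)
	fmt.Printf("with flusher: flushed=%v\n", flushWithFlusher())
}
```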
Contributors
Thanks to @TianYi0217 for this excellent contribution!
Full Changelog: v1.0.2...v1.0.4