Tags: javacruft/litellm
refactor(http_parsing_utils.py): streamline request body handling and memory management
- Removed the weak reference cache for request bodies and replaced it with direct storage on the request object to prevent memory leaks.
- Implemented immediate garbage collection every 100 requests to manage memory usage effectively.
- Added a cleanup function to explicitly free memory associated with request processing.
- Updated the logic for retrieving and storing parsed request bodies to enhance performance and reliability.
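As a rough illustration of the approach this commit message describes, the sketch below stores the parsed body directly on the request object, triggers a collection every 100 requests, and offers an explicit cleanup helper. The names `get_parsed_body`, `cleanup_request`, `_parsed_body`, and `GC_INTERVAL` are hypothetical and are not taken from `http_parsing_utils.py`; a `SimpleNamespace` stands in for the real request type.

```python
import gc
import json
from types import SimpleNamespace

GC_INTERVAL = 100      # hypothetical constant: collect every 100 parsed requests
_parsed_count = 0      # number of request bodies parsed so far


def get_parsed_body(request, raw_body: bytes) -> dict:
    """Return the parsed body, caching it on the request object itself."""
    global _parsed_count
    cached = getattr(request, "_parsed_body", None)
    if cached is not None:
        return cached  # already parsed for this request

    parsed = json.loads(raw_body) if raw_body else {}
    request._parsed_body = parsed  # lifetime is tied to the request object

    _parsed_count += 1
    if _parsed_count % GC_INTERVAL == 0:
        gc.collect()  # periodic collection, as the commit message describes
    return parsed


def cleanup_request(request) -> None:
    """Explicitly drop the cached body once the request has been handled."""
    if hasattr(request, "_parsed_body"):
        del request._parsed_body


# Usage with a stand-in request object (assumption: any attribute-bearing object works).
demo_request = SimpleNamespace()
demo_body = get_parsed_body(demo_request, b'{"model": "gpt-5-nano"}')
```

Because the cached value is an ordinary attribute, it is released together with the request, and no module-level cache has to be kept in sync.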
Add caching for failed deployments due to configuration errors
- Introduced a new cache to track deployments that fail due to configuration issues, preventing repeated retry attempts.
- Implemented logic to skip deployments that have previously failed due to configuration errors.
- Added a method to clear the failed deployments cache, allowing for retries of specific or all failed deployments.
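The behaviour listed in this commit message could look roughly like the following sketch: a small cache records deployments that failed with a configuration error, candidate lists are filtered against it, and entries can be cleared individually or all at once. The class `FailedDeploymentCache` and its method names are hypothetical; the actual implementation in the repository may be structured differently.

```python
from typing import Dict, Iterable, List, Optional


class FailedDeploymentCache:
    """Tracks deployments that failed because of configuration errors."""

    def __init__(self) -> None:
        # deployment id -> short description of the configuration error
        self._failed: Dict[str, str] = {}

    def mark_failed(self, deployment_id: str, reason: str) -> None:
        self._failed[deployment_id] = reason

    def is_failed(self, deployment_id: str) -> bool:
        return deployment_id in self._failed

    def filter_available(self, deployment_ids: Iterable[str]) -> List[str]:
        """Skip deployments that previously failed with a configuration error."""
        return [d for d in deployment_ids if d not in self._failed]

    def clear(self, deployment_id: Optional[str] = None) -> None:
        """Clear one deployment, or every entry when no id is given."""
        if deployment_id is None:
            self._failed.clear()
        else:
            self._failed.pop(deployment_id, None)


# Usage: "b" is skipped until its entry is cleared.
cache = FailedDeploymentCache()
cache.mark_failed("b", "missing api_base")
available = cache.filter_available(["a", "b", "c"])
```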
Implement pass-through route registration and removal to prevent duplicates and memory leaks
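One way to read this commit title is as a registry that refuses to register the same path twice and lets a route be removed when it is no longer needed. The sketch below shows that idea only; `PassThroughRouteRegistry` is a hypothetical name, and the real code presumably works against the proxy's own router rather than a plain dictionary.

```python
from typing import Callable, Dict


class PassThroughRouteRegistry:
    """Keeps at most one handler per path so repeated registration cannot pile up."""

    def __init__(self) -> None:
        self._routes: Dict[str, Callable[..., object]] = {}

    def register(self, path: str, handler: Callable[..., object]) -> bool:
        """Register a handler; return False if the path is already registered."""
        if path in self._routes:
            return False
        self._routes[path] = handler
        return True

    def remove(self, path: str) -> bool:
        """Remove a route; return False if the path was not registered."""
        return self._routes.pop(path, None) is not None

    def __len__(self) -> int:
        return len(self._routes)


# Usage: the second registration of the same path is ignored.
registry = PassThroughRouteRegistry()
first = registry.register("/v1/custom", lambda: "ok")
second = registry.register("/v1/custom", lambda: "duplicate")
```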
fix(health_check_helpers.py): set max tokens for wildcard call to 10, fixes calling gpt-5-nano via wildcard on openai (BerriAI#13482)
gpt-5-nano raises errors for max_tokens=1