Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@modev2301
Copy link
Contributor

@modev2301 modev2301 commented Oct 21, 2025

Summary

This PR adds a NetFlow source implementation to Vector, supporting NetFlow v5, NetFlow v9, IPFIX, and sFlow protocols. The implementation includes template management, enterprise field support.

Vector configuration

sources:
  netflow_data:
    type: "netflow"
    address: "0.0.0.0:2055"
    protocols: ["netflow_v5", "netflow_v9", "ipfix", "sflow"]
    max_packet_size: 65535
    max_templates: 1000
    template_timeout: 1800
    parse_enterprise_fields: true
    parse_options_templates: true
    parse_variable_length_fields: true
    buffer_missing_templates: true
    max_buffered_records: 1000

How did you test this PR?

I tested this implementation across multiple environments and scenarios:

Development Testing:

  • Unit tests for all protocol parsers (NetFlow v5, v9, IPFIX, sFlow) with various packet formats
  • Integration tests covering template management, enterprise field parsing, and error handling
  • Configuration validation testing across all supported options

Enterprise Environment Testing:

  • Deployed in a low-environment enterprise setting with real network infrastructure
  • Tested with actual Cisco network devices generating NetFlow v5 and v9
  • Tested with Silver Peak devices generating IPFIX flows
  • Validated template buffering and enterprise field parsing with vendor-specific extensions
  • Tested template cleanup and memory management under sustained load

Production Goals:

  • We plan to use this for handling 20 million NetFlow records per minute in production

The implementation has been running stable in our enterprise test environment for several weeks while making some minor tweaks along the way, processing a good chunk of flow records daily from various network vendors without issues.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Closes: #7386

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

modev2301 added 30 commits July 19, 2025 23:11
- Add comprehensive NetFlow v5, v9, IPFIX, and sFlow parsing
- Implement template caching for template-based protocols
- Add defensive parsing to prevent infinite loops on malformed packets
- Include extensive test coverage with 15+ test cases
- Add configuration examples and documentation
- Support multicast groups and configurable protocols
- Add raw data inclusion for debugging

This source supports all major flow protocols used in network monitoring
and observability, with robust error handling and performance optimizations.
…us to UInt8 and deleted some irrelvant files needed for testing
- Add template buffering system to handle missing templates gracefully
- Reduce log flooding by changing template-missing errors to DEBUG level
- Add configuration options for buffering (buffer_missing_templates, max_buffered_records)
- Improve error messages with better context and rate limiting
- Add automatic cleanup of expired buffered records
- Add new events for buffered record processing
- Create example configurations for simple and advanced use cases
- Follow Vector best practices for error handling and configuration

Fixes issues with IPFIX template dependency and log flooding.
Makes the source more production-ready with graceful degradation.
- Add raw template data logging when template ID 1024 is received
- Add detailed debugging when enterprise field parsing fails
- Log hex dump and base64 encoded template data for analysis
- This will help identify the enterprise field structure from HPE devices
- Add missing base64::Engine import to both ipfix.rs and templates.rs
- Fixes compilation error for base64 encoding in debugging code
- Now properly imports the Engine trait needed for encode() method
- Add field length validation to handle unreasonably large field lengths
- Treat fields with length > 1000 as variable length fields
- Add graceful handling for malformed enterprise fields in template ID 1024
- Continue parsing even when enterprise field data is insufficient
- This fixes the 'Insufficient data for enterprise field' error for HPE devices
- Change ERROR level to DEBUG level for template ID 1024 debugging
- Reduce log flooding while maintaining debugging capability
- Template parsing is working correctly, just reducing noise
- This should eliminate the repeated error messages in logs
- Add missing buffer_missing_templates parameter to IPFIX parse calls in tests
- Add missing buffer_missing_templates and max_buffered_records fields to NetflowConfig in tests
- Update test expectations for buffering behavior (header events vs unparseable events)
- Fix malformed packet test to handle both success and failure cases gracefully
- Create comprehensive template analysis tool for debugging
- Real-time monitoring of NetFlow/IPFIX templates on UDP port
- Statistical analysis of field types and lengths
- Problem detection for suspicious field lengths (65535, >1000)
- Enterprise field analysis for HPE devices
- Template frequency and field type distribution reporting
- Helps debug 'unreasonably large length' warnings for field types 96, 236, etc.
- Recognize 65535 as standard IPFIX variable-length field indicator
- Remove excessive warnings for legitimate variable-length fields (types 96, 236, 32793, etc.)
- Reduce debug noise for template ID 1024 enterprise field parsing
- Based on template inspector analysis showing these are legitimate HPE enterprise fields
- Fixes 'unreasonably large length' warnings for standard IPFIX variable-length fields
- Clean up unused import that was left after removing debug logging
- Fixes compilation error with deny(warnings) lint level
- Add hex preview of packet headers for debugging
- Add --ipfix-only flag to filter out NetFlow v5/v9 packets
- Show packet version and first 16 bytes for analysis
- Helps debug why v5/v9 packets are appearing when expecting IPFIX
- Add validate_flow_record method to check for reasonable data
- Reject records with unrealistic byte counts (> 1TB)
- Reject records with unrealistic packet counts (> 1 billion)
- Require at least source or destination IP address
- Prevents malformed templates from creating garbage flow records
- Fixes issues with 5.2TB bytes and missing flow_id fields
- Add debug logging for template fields and field types
- Log raw field data for problematic field types (96, 236, 32793)
- Add debugging for large byte counts and XNET protocol detection
- Log parsed flow record fields and field counts
- Helps diagnose whether issue is malformed templates or parsing logic
- No data rejection - just debugging to understand the problem
- Add missing Value import for type checking
- Fix get_all() method call to use iter() instead
- Resolve compilation errors for debugging functionality
- Add Scope Field Count parsing for Options Templates (Set ID 3)
- Update TemplateField and Template structures to support scope fields
- Create parse_ipfix_options_template_fields() for proper Options Template parsing
- Add field name lookups for Options Template fields (346, 303, 339, 344, 341, 345)
- Add comprehensive unit tests for Template 1024 parsing
- Fix 2-byte offset issue that was causing field 32767 errors

This resolves the parsing failure for Silver Peak EdgeConnect Options Templates
that provide exporter metadata (Template ID 1024).
…ld initializations

- Add is_scope: false to all existing TemplateField initializations
- Fix unused variable warning in options template parsing
- Ensure all TemplateField structs include the new is_scope field

This resolves the compilation errors introduced by adding the is_scope field
to the TemplateField structure for Options Template support.
- Add options_template_handling config option with three modes:
  - 'discard': Ignore Options Template data (default)
  - 'emit': Emit Options Template data as separate events
  - 'enrich': Use Options Template data for enrichment only
- Separate Options Template data from regular flow data
- Add data_type field to distinguish exporter metadata from flow data
- Update IPFIX parser to handle Options Template data appropriately

This prevents Options Template metadata (like Template 1024) from being
treated as flow data and allows proper configuration control.
MAJOR CONFIG IMPROVEMENTS:
- Update defaults to production-ready values:
  * max_templates: 10 → 1000 (handles hundreds of exporters)
  * template_timeout: 400s → 1800s (30 min, matches resend intervals)
  * max_buffered_records: 100 → 1000 (reasonable production buffer)
  * max_packet_size: 1500 → 65535 (UDP max, supports all valid packets)

- Consolidate size configuration:
  * Remove max_length, max_field_length, max_message_size
  * Add single max_packet_size (65535 default)
  * Simplifies configuration and prevents confusion

- Rename options_template_handling → options_template_mode
- Change default from 'discard' → 'emit_metadata' (more flexible)

- Update all references throughout codebase
- Fix tests to use new field names

This makes NetFlow configuration much more user-friendly with
sensible defaults that work out-of-the-box for production use.
- Update all IpfixParser::new() calls in tests to include options_template_mode parameter
- Remove references to removed max_message_size field in test configurations
- All tests now use the new simplified configuration structure
- Fix remaining IpfixParser::new() calls that were missed in previous fix
- Add missing options_template_mode field to NetflowConfig test initializations
- All compilation errors now resolved
- Update test to use proper Options Template format with scope field count
- Add both scope field (observationDomainId) and option field (sourceIPv4Address)
- Fix packet length calculation to match the new structure
- Test now properly validates Options Template parsing functionality
- Correct packet length from 30 to 34 bytes to match actual data size
- IPFIX header (16) + Set header (4) + Template data (14) = 34 bytes
- This should resolve the template caching issue in the test
- Test parse_ipfix_options_template_fields function directly first
- This will help identify if the issue is in the function or the IPFIX parsing logic
- Added proper template data structure with scope and option fields
- Test validates both the direct function call and full IPFIX parsing
- Template is cached with port 2055, not test_peer_addr() port 4739
- Updated test to look for correct cache key: (192.168.1.100:2055, 1, 257)
- Added necessary imports for SocketAddr and IpAddr
- Test should now pass with correct template cache lookup
- Test parse_ipfix_options_template_fields function directly
- Test Template::new_options to verify scope field count is preserved
- Avoid protocol detection confusion between IPFIX and NetFlow v9
- Test validates core Options Template parsing functionality
- Add #[serde(default)] to all config fields except address
- Users now only need to specify address and options_template_mode
- All other fields use smart production-ready defaults
- Fixes the issue where all fields were still required

This enables the minimal config:
sources:
  netflow:
    type: netflow
    address: 0.0.0.0:9995
    options_template_mode: discard
- Add field 96: unknownField96 (appears in all templates)
- Add field 350: unknownField350 (appears in all templates)
- Add field 303: unknownField303 (Options Template field)
- Add field 339: unknownField339 (Options Template field)
- Complete coverage for PEN 23867 (HPE Aruba) fields
- All fields from your packet analysis are now handled
- Update field descriptions to match official HPE Aruba documentation
- Mark fields 96 and 350 as undocumented (not in official docs)
- Update enterprise fields (10001-10006) with official descriptions
- Fields 96 and 350 appear in production but are not documented by HPE Aruba
- All official HPE Aruba fields now have correct descriptions
- Add debug logging to track Options Template caching and retrieval
- Add debug logging to show when Options Template data is detected
- Add debug logging to show discard mode behavior
- This will help diagnose why Template 1024 data is still being processed
- Remove undocumented fields from HPE Aruba field definitions
- Simplify field descriptions to be more concise
- Improve config defaults with better serde handling
- Add custom deserialization for proper default handling
- Remove manual Deserialize derive as configurable_component macro provides it
- Keep serde default attributes for proper deserialization
- Fix default function return types to match field types
- Handle UInt32 values > i32::MAX by storing as strings
- Prevents PostgreSQL 'value out of range for type integer' errors
- Add debug logging for overflow cases
- Affects Options Template 1024 fields like overlayTunnelID and policyMatchID
…ields

- Add overflow protection for DateTimeSeconds (u32 -> i64)
- Add overflow protection for DateTimeMilliseconds (u64 -> i64)
- Add overflow protection for DateTimeMicroseconds/Nanoseconds (u64 -> i64)
- Large values are now stored as strings to prevent PostgreSQL integer overflow
- This should eliminate all remaining PostgreSQL 'value out of range' errors
- Covers all remaining data types that could cause integer overflow issues
- Change UInt32 overflow logging from debug to warn for visibility
- Add detailed field parsing logs to identify which fields cause overflow
- Log field type, length, enterprise number, scope, and raw data
- This will help identify the exact source of PostgreSQL integer overflow errors
- Add field parsing debug logs showing field name, type, length, enterprise, scope
- Add template ID to field parsing logs in IPFIX
- This will help identify exactly which field in which template is causing PostgreSQL overflow
- Combined with existing UInt32 overflow warnings, we can trace the source of the problem
- Change debug! to info! for field parsing logs to ensure they show up
- This will help identify which fields and templates are causing overflow
- Info level should be visible even without RUST_LOG=debug
- Log whether values are being inserted as strings or integers
- This will help identify if the overflow protection is working correctly
- We can see if string values are being converted back to integers somewhere
- Log every UInt32 value being parsed with overflow check
- This will help identify why overflow protection isn't triggering for value 4032056033
- Shows the actual value, i32::MAX, and whether overflow will occur
- Remove all debugging logs and overflow protection code
- Clean up UInt32, DateTime parsing to use standard integer conversion
- Remove unnecessary comparison documentation file
- Add info-level logging for Options Template record processing
- Prepare code for production use and PR submission

The PostgreSQL schema change (INTEGER -> BIGINT) resolved the overflow issues,
so the application-level overflow protection is no longer needed.
- Remove info-level logging for Options Template record processing
- Keep code completely clean without any debug/info logging
- Ready for production use
- Remove examples/netflow-advanced.yaml
- Remove examples/netflow-simple.yaml
- Remove scripts/inspect_netflow_templates.py
- Clean up codebase to match Vector standards
- Code is production-ready with proper documentation
- Implement comprehensive NetFlow, IPFIX, and sFlow protocol support
- Add template caching and buffering for NetFlow v9/IPFIX
- Support enterprise field parsing and configuration
- Add comprehensive internal events and metrics
- Include configuration example and integration tests
- Support all major flow protocols: NetFlow v5, v9, IPFIX, sFlow
- Add template management with cleanup and buffering
- Implement robust error handling and protocol detection
- Add extensive test coverage for all protocols
@modev2301 modev2301 requested a review from a team as a code owner October 21, 2025 04:37
@github-actions github-actions bot added the domain: sources Anything related to the Vector's sources label Oct 21, 2025
@github-actions github-actions bot added the domain: ci Anything related to Vector's CI environment label Oct 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain: ci Anything related to Vector's CI environment domain: sources Anything related to the Vector's sources

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Network Flow Handling

1 participant