feat: Added Calendar Based Indexing. #157
- This should stabilize manual syncing.
Walkthrough
The indexing functionality for connector content was enhanced to support optional date range selection. Backend endpoints, background tasks, and indexing functions now accept and propagate
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant Frontend
participant Backend
participant IndexingTask
User->>Frontend: Click "Index with Date Range"
Frontend->>User: Show date picker dialog
User->>Frontend: Select start and end dates, confirm
Frontend->>Backend: POST /search-source-connectors/{id}/index?start_date&end_date
Backend->>IndexingTask: Trigger indexing task with date range
IndexingTask->>Backend: Index content within date range
Backend->>Frontend: Respond with indexing status
Frontend->>User: Show success/failure notification
async def index_connector_content(
    connector_id: int,
    search_space_id: int = Query(..., description="ID of the search space to store indexed content"),
    start_date: str = Query(None, description="Start date for indexing (YYYY-MM-DD format). If not provided, uses last_indexed_at or defaults to 365 days ago"),
Missing date format validation for start_date parameter. While the description specifies YYYY-MM-DD format, there is no validation of the input string format. Invalid date strings could cause runtime errors in datetime operations later in the code.
const startDateStr = startDate ? format(startDate, "yyyy-MM-dd") : undefined;
const endDateStr = endDate ? format(endDate, "yyyy-MM-dd") : undefined;

await indexConnector(selectedConnectorForIndexing, searchSpaceId, startDateStr, endDateStr);
The code attempts to pass startDateStr and endDateStr parameters to indexConnector, but the function call at line 172 reveals that the original implementation only accepts two parameters (connectorId and searchSpaceId). This mismatch in function parameters will likely cause the date range indexing to fail as the backend API may not be prepared to handle these additional parameters.
it accepts it bro
const handleIndexConnector = async () => {
  if (selectedConnectorForIndexing === null) return;

  setIndexingConnectorId(selectedConnectorForIndexing);
The code sets indexingConnectorId before making the API call and managing error states. If the API call fails, the finally block in handleIndexConnector correctly resets it, but if there's a problem before the API call (e.g., date validation failure), the loading state won't be cleared since the finally block won't be reached. This could leave the UI in a perpetual loading state.
😱 Found 3 issues. Time to roll up your sleeves! 😱
Actionable comments posted: 4
🔭 Outside diff range comments (1)
surfsense_backend/app/tasks/connectors_indexing_tasks.py (1)
556-567: 🛠️ Refactor suggestion
Update docstring to clarify that date parameters are not used for GitHub indexing.
The function accepts start_date and end_date parameters but doesn't use them for filtering. This should be documented clearly.
      """
      Index code and documentation files from accessible GitHub repositories.

      Args:
          session: Database session
          connector_id: ID of the GitHub connector
          search_space_id: ID of the search space to store documents in
+         start_date: Accepted for API consistency but not used - GitHub indexes all files
+         end_date: Accepted for API consistency but not used - GitHub indexes all files
          update_last_indexed: Whether to update the last_indexed_at timestamp (default: True)

      Returns:
          Tuple containing (number of documents indexed, error message or None)
      """
♻️ Duplicate comments (1)
surfsense_backend/app/tasks/connectors_indexing_tasks.py (1)
759-786: Apply the same date handling simplification as suggested for Slack indexing.
This function has the same timezone comparison issues and redundant date calculation logic as index_slack_messages.
🧹 Nitpick comments (4)
surfsense_web/app/dashboard/[search_space_id]/connectors/(manage)/page.tsx (1)
269-327: Consider extracting the date picker buttons to improve maintainability.
The dual-button implementation with tooltips works well functionally, but the inline button definitions make the component quite verbose.
Consider extracting this to a separate component:
const IndexingButtons = ({ connector, isIndexing, onOpenDatePicker, onQuickIndex }) => (
  <div className="flex gap-1">
    <TooltipProvider>
      <Tooltip>
        <TooltipTrigger asChild>
          <Button
            variant="outline"
            size="sm"
            onClick={() => onOpenDatePicker(connector.id)}
            disabled={isIndexing}
          >
            {isIndexing ? (
              <RefreshCw className="h-4 w-4 animate-spin" />
            ) : (
              <CalendarIcon className="h-4 w-4" />
            )}
            <span className="sr-only">Index with Date Range</span>
          </Button>
        </TooltipTrigger>
        <TooltipContent>
          <p>Index with Date Range</p>
        </TooltipContent>
      </Tooltip>
    </TooltipProvider>
    {/* Quick index button */}
  </div>
);
surfsense_web/components/ui/calendar.tsx (1)
43-123: Comprehensive styling implementation but consider refactoring for maintainability.
The className definitions provide extensive customization but are becoming difficult to maintain due to their length and complexity.
Consider extracting complex className definitions to constants:
const calendarClassNames = {
  root: cn("w-fit", defaultClassNames.root),
  months: cn("flex gap-4 flex-col md:flex-row relative", defaultClassNames.months),
  // ... other class definitions
  day: cn(
    "relative w-full h-full p-0 text-center",
    "[&:first-child[data-selected=true]_button]:rounded-l-md",
    "[&:last-child[data-selected=true]_button]:rounded-r-md",
    "group/day aspect-square select-none",
    defaultClassNames.day
  ),
};
surfsense_backend/app/tasks/connectors_indexing_tasks.py (1)
999-1028: Good timezone handling, but consider simplifying the date logic.
This function correctly uses timezone-aware datetime objects, which is better than the other indexing functions. However, the date calculation logic could still be simplified.
Consider extracting the date conversion logic into a helper function to reduce duplication:
def convert_to_iso_date(date_str: str, timezone_obj=timezone.utc) -> str:
    """Convert YYYY-MM-DD string to ISO format with timezone."""
    if date_str:
        return datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone_obj).isoformat()
    return None
This would make the code more maintainable and consistent across all indexing functions.
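A self-contained sketch of the helper suggested above, written so it can be run on its own. The `Optional` annotations and the parameter name `tz` are adjustments made for this illustration, not part of the PR; the behavior (midnight in the given timezone, `None` when no date is passed) follows the review suggestion.

```python
from datetime import datetime, timezone
from typing import Optional


def convert_to_iso_date(date_str: Optional[str], tz: timezone = timezone.utc) -> Optional[str]:
    """Convert a YYYY-MM-DD string to an ISO 8601 timestamp at midnight in tz.

    Returns None when no date string is supplied.
    """
    if not date_str:
        return None
    parsed = datetime.strptime(date_str, "%Y-%m-%d")  # raises ValueError on bad input
    return parsed.replace(tzinfo=tz).isoformat()


print(convert_to_iso_date("2025-05-14"))  # 2025-05-14T00:00:00+00:00
print(convert_to_iso_date(None))          # None
```

Because the return value can be `None`, callers would still need to supply their own default range when no date is given.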
surfsense_backend/app/routes/search_source_connectors_routes.py (1)
323-327: Use ternary operator for conciseness.
As suggested by static analysis, simplify the conditional:
-    if end_date is None:
-        indexing_to = today_str
-    else:
-        indexing_to = end_date
+    indexing_to = today_str if end_date is None else end_date
🧰 Tools
🪛 Ruff (0.11.9)
323-326: Use ternary operator indexing_to = today_str if end_date is None else end_date instead of if-else-block
Replace if-else-block with indexing_to = today_str if end_date is None else end_date
(SIM108)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
surfsense_web/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (6)
- surfsense_backend/app/routes/search_source_connectors_routes.py (9 hunks)
- surfsense_backend/app/tasks/connectors_indexing_tasks.py (12 hunks)
- surfsense_web/app/dashboard/[search_space_id]/connectors/(manage)/page.tsx (6 hunks)
- surfsense_web/components/ui/calendar.tsx (1 hunks)
- surfsense_web/hooks/useSearchSourceConnectors.ts (1 hunks)
- surfsense_web/package.json (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
surfsense_backend/app/routes/search_source_connectors_routes.py (2)
surfsense_backend/app/db.py (1)
- SearchSourceConnectorType (55-63)
surfsense_backend/app/tasks/connectors_indexing_tasks.py (5)
- index_slack_messages (23-286)
- index_notion_pages (288-546)
- index_github_repos (548-714)
- index_linear_issues (716-952)
- index_discord_messages (954-1225)
🪛 Ruff (0.11.9)
surfsense_backend/app/routes/search_source_connectors_routes.py
323-326: Use ternary operator indexing_to = today_str if end_date is None else end_date instead of if-else-block
Replace if-else-block with indexing_to = today_str if end_date is None else end_date
(SIM108)
🔇 Additional comments (13)
surfsense_web/package.json (1)
53-53: ✅ Verification successful
Verify the latest version of react-day-picker.
The dependency addition looks good and aligns with the calendar-based indexing feature. However, please verify that version ^9.7.0 is the latest stable version available.
🌐 Web query:
What is the latest stable version of react-day-picker npm package?
💡 Result:
The latest stable version of react-day-picker npm package is 9.7.0, which was released on May 14, 2025[1][7][8]. This version includes improvements to accessibility, styling, internationalization, and performance compared to previous versions. The package is designed for creating date pickers, calendars, and date inputs for React web applications, and it's written in TypeScript with compilation to CommonJS and ESM formats[1][8].
React DayPicker offers numerous features including customizable calendars, support for various selection modes (single day, multiple days, or ranges), localization capabilities, and compatibility with different calendar systems including ISO 8601, Persian, and broadcast calendars[1]. It also complies with WCAG 2.1 AA requirements for accessibility[1][8].
It's worth noting that there have been some compatibility issues reported with React 19, so developers using the latest React version may encounter deployment challenges[9]. The package requires React 16.8 or later for proper functionality[1].
Citations:
- 1: https://www.npmjs.com/package/react-day-picker
- 2: https://github.com/gpbl/react-day-picker
- 3: https://daypicker.dev/start
- 4: https://daypicker.dev/v8/start
- 5: https://daypicker.dev/upgrading
- 6: https://github.com/gpbl/react-day-picker/blob/master/CHANGELOG.md
- 7: https://www.npmjs.com/package/react-datepicker
- 8: https://www.npmjs.com/package/react-day-picker/v/9.0.0-rc.4
- 9: shadcn-ui/ui#6271
- 10: https://npm-compare.com/react-day-picker
react-day-picker version confirmed as latest (9.7.0)
Confirmed that version 9.7.0 is the current stable release of react-day-picker (released May 14, 2025). No further changes are needed.
surfsense_web/hooks/useSearchSourceConnectors.ts (2)
256-261: Well-implemented function signature enhancement.
The addition of optional startDate and endDate parameters maintains backward compatibility while extending functionality for the calendar-based indexing feature.
269-279: Correct query parameter construction.
The URLSearchParams usage is appropriate and handles optional date parameters correctly. The implementation properly constructs the query string and includes it in the fetch URL.
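For anyone calling the endpoint from a script rather than the web hook, the same optional-parameter pattern could be sketched in Python roughly as follows. The function name `build_index_url` and the `base_url` value are hypothetical; the path and the `search_space_id`, `start_date`, and `end_date` parameter names come from the sequence diagram and route signature quoted in this review.

```python
from typing import Optional
from urllib.parse import urlencode


def build_index_url(
    base_url: str,
    connector_id: int,
    search_space_id: int,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> str:
    """Build the indexing URL, adding date parameters only when they are set."""
    params = {"search_space_id": str(search_space_id)}
    if start_date:
        params["start_date"] = start_date
    if end_date:
        params["end_date"] = end_date
    return f"{base_url}/search-source-connectors/{connector_id}/index?{urlencode(params)}"


# base_url is a placeholder for wherever the backend is served.
print(build_index_url("http://localhost:8000", 3, 7, start_date="2025-01-01"))
```

Omitting a date leaves the corresponding key out of the query string entirely, so the backend falls back to its own defaults.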
surfsense_web/app/dashboard/[search_space_id]/connectors/(manage)/page.tsx (5)
12-12: Good import organization for new calendar functionality.
The new imports properly support the calendar-based indexing feature. The addition of Calendar icon and date formatting utilities aligns well with the enhanced UI.
Also applies to: 49-63
107-111: Proper state management for date picker functionality.
The new state variables are well-named and appropriately typed for managing the date picker dialog and selected dates.
134-166: Well-structured date-based indexing handler.
The implementation correctly handles date formatting, state management, and error handling. The cleanup logic ensures proper state reset after operations.
168-184: Good separation of quick vs date-range indexing.
Providing separate handlers for quick indexing (without dates) and date-range indexing improves user experience and maintains the existing workflow.
390-516: Comprehensive date picker dialog implementation.
The dialog provides excellent user experience with calendar popovers, preset ranges, and proper validation. The date constraints prevent invalid selections and the preset buttons offer convenient options.
surfsense_web/components/ui/calendar.tsx (3)
14-26: Well-structured Calendar component interface.
The component properly extends DayPicker props and adds custom buttonVariant option. The TypeScript typing is accurate and the prop defaults are sensible.
28-42: Good usage of react-day-picker foundation.
The component properly leverages the DayPicker base with appropriate defaults and custom formatters. The month dropdown formatting enhancement is a nice touch.
172-208: Well-implemented custom day button with proper focus management.
The CalendarDayButton component correctly handles focus states, data attributes, and styling variants. The ref management and useEffect for focus is properly implemented.
surfsense_backend/app/routes/search_source_connectors_routes.py (2)
311-319: Clarify the logic for adjusting the start date when last indexed today.
The special case where the start date is set to yesterday when last_indexed_at is today could be confusing and may lead to re-indexing the same data.
Consider:
- Document why this adjustment is necessary
- Make it configurable or remove it if not essential
- Consider using timestamps instead of dates to avoid this issue
-    if connector.last_indexed_at.date() == today:
-        # If last indexed today, go back 1 day to ensure we don't miss anything
-        indexing_from = (today - timedelta(days=1)).strftime("%Y-%m-%d")
-    else:
-        indexing_from = connector.last_indexed_at.strftime("%Y-%m-%d")
+    # Use the exact last indexed timestamp
+    indexing_from = connector.last_indexed_at.strftime("%Y-%m-%d")
+    logger.info(f"Using last_indexed_at date: {indexing_from}")
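To make the behavior under discussion concrete, here is a small sketch of how the default range appears to be derived, pieced together from the parameter descriptions and the removed lines above. The function name `default_index_range` and its signature are hypothetical; the route's actual code is only partly visible in this review.

```python
from datetime import date, datetime, timedelta
from typing import Optional, Tuple

DATE_FMT = "%Y-%m-%d"


def default_index_range(
    last_indexed_at: Optional[datetime],
    today: date,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (indexing_from, indexing_to) as YYYY-MM-DD strings."""
    if start_date is not None:
        indexing_from = start_date
    elif last_indexed_at is None:
        # Never indexed: default to the last 365 days.
        indexing_from = (today - timedelta(days=365)).strftime(DATE_FMT)
    elif last_indexed_at.date() == today:
        # The special case questioned above: step back one day.
        indexing_from = (today - timedelta(days=1)).strftime(DATE_FMT)
    else:
        indexing_from = last_indexed_at.strftime(DATE_FMT)
    indexing_to = today.strftime(DATE_FMT) if end_date is None else end_date
    return indexing_from, indexing_to


today = date(2025, 5, 20)
print(default_index_range(datetime(2025, 5, 20, 9, 0), today))  # ('2025-05-19', '2025-05-20')
```

With day-level granularity, the one-day step back means content from the previous day is requested again on every same-day re-index, which is the re-indexing concern raised in the comment.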
410-634: Well-structured background task updates!
The consistent pattern of updating all background task functions to accept and propagate the date parameters is clean and maintainable. The separation between session wrappers and actual task runners is a good design choice.
className={cn(
  "data-[selected-single=true]:bg-primary data-[selected-single=true]:text-primary-foreground data-[range-middle=true]:bg-accent data-[range-middle=true]:text-accent-foreground data-[range-start=true]:bg-primary data-[range-start=true]:text-primary-foreground data-[range-end=true]:bg-primary data-[range-end=true]:text-primary-foreground group-data-[focused=true]/day:border-ring group-data-[focused=true]/day:ring-ring/50 dark:hover:text-accent-foreground flex aspect-square size-auto w-full min-w-(--cell-size) flex-col gap-1 leading-none font-normal group-data-[focused=true]/day:relative group-data-[focused=true]/day:z-10 group-data-[focused=true]/day:ring-[3px] data-[range-end=true]:rounded-md data-[range-end=true]:rounded-r-md data-[range-middle=true]:rounded-none data-[range-start=true]:rounded-md data-[range-start=true]:rounded-l-md [&>span]:text-xs [&>span]:opacity-70",
  defaultClassNames.day,
  className
)}
🛠️ Refactor suggestion
Refactor the extremely long className for better maintainability.
The className string is over 1000 characters long, making it very difficult to read and maintain.
Break down the className into logical groups:
const baseClasses = cn(
  "flex aspect-square size-auto w-full min-w-(--cell-size) flex-col gap-1 leading-none font-normal",
  defaultClassNames.day,
  className
);
const stateClasses = cn(
  "data-[selected-single=true]:bg-primary data-[selected-single=true]:text-primary-foreground",
  "data-[range-middle=true]:bg-accent data-[range-middle=true]:text-accent-foreground",
  "data-[range-start=true]:bg-primary data-[range-start=true]:text-primary-foreground",
  "data-[range-end=true]:bg-primary data-[range-end=true]:text-primary-foreground"
);
const focusClasses = cn(
  "group-data-[focused=true]/day:border-ring group-data-[focused=true]/day:ring-ring/50",
  "group-data-[focused=true]/day:relative group-data-[focused=true]/day:z-10 group-data-[focused=true]/day:ring-[3px]"
);
const roundingClasses = cn(
  "data-[range-end=true]:rounded-md data-[range-end=true]:rounded-r-md",
  "data-[range-middle=true]:rounded-none",
  "data-[range-start=true]:rounded-md data-[range-start=true]:rounded-l-md"
);
className={cn(baseClasses, stateClasses, focusClasses, roundingClasses, "[&>span]:text-xs [&>span]:opacity-70")}
🤖 Prompt for AI Agents
In surfsense_web/components/ui/calendar.tsx around lines 200 to 204, the
className string is excessively long and hard to maintain. Refactor by splitting
the className into multiple logical groups such as baseClasses, stateClasses,
focusClasses, and roundingClasses, each containing related class names. Then
combine these groups using the cn function to form the final className. This
improves readability and maintainability without changing functionality.
if start_date is None or end_date is None:
    # Fall back to calculating dates
    calculated_end_date = datetime.now()
    calculated_start_date = calculated_end_date - timedelta(days=365)  # Check for last 1 year of pages

    # Use calculated dates if not provided
    if start_date is None:
        start_date_iso = calculated_start_date.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        # Convert YYYY-MM-DD to ISO format
        start_date_iso = datetime.strptime(start_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")

    if end_date is None:
        end_date_iso = calculated_end_date.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        # Convert YYYY-MM-DD to ISO format
        end_date_iso = datetime.strptime(end_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
else:
    # Convert provided dates to ISO format for Notion API
    start_date_iso = datetime.strptime(start_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
    end_date_iso = datetime.strptime(end_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
🛠️ Refactor suggestion
Simplify date conversion logic for better readability.
The nested conditionals for date conversion are hard to follow and could be simplified.
-    # Calculate date range
-    if start_date is None or end_date is None:
-        # Fall back to calculating dates
-        calculated_end_date = datetime.now()
-        calculated_start_date = calculated_end_date - timedelta(days=365)  # Check for last 1 year of pages
-
-        # Use calculated dates if not provided
-        if start_date is None:
-            start_date_iso = calculated_start_date.strftime("%Y-%m-%dT%H:%M:%SZ")
-        else:
-            # Convert YYYY-MM-DD to ISO format
-            start_date_iso = datetime.strptime(start_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
-
-        if end_date is None:
-            end_date_iso = calculated_end_date.strftime("%Y-%m-%dT%H:%M:%SZ")
-        else:
-            # Convert YYYY-MM-DD to ISO format
-            end_date_iso = datetime.strptime(end_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
-    else:
-        # Convert provided dates to ISO format for Notion API
-        start_date_iso = datetime.strptime(start_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
-        end_date_iso = datetime.strptime(end_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
+    # Convert dates to ISO format for Notion API
+    if start_date:
+        start_date_iso = datetime.strptime(start_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
+    else:
+        start_date_iso = (datetime.now() - timedelta(days=365)).strftime("%Y-%m-%dT%H:%M:%SZ")
+
+    if end_date:
+        end_date_iso = datetime.strptime(end_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
+    else:
+        end_date_iso = datetime.now().strftime("%Y-%m-%dT%H:%M:%SZ")
Also, consider adding error handling for invalid date formats:
try:
    start_date_iso = datetime.strptime(start_date, "%Y-%m-%d").strftime("%Y-%m-%dT%H:%M:%SZ")
except ValueError as e:
    logger.error(f"Invalid start_date format: {start_date}")
    return 0, f"Invalid start_date format. Expected YYYY-MM-DD"
🤖 Prompt for AI Agents
In surfsense_backend/app/tasks/connectors_indexing_tasks.py around lines 332 to
353, the date conversion logic is overly nested and hard to read. Refactor the
code to flatten the conditional structure by first setting default dates if
start_date or end_date is None, then convert both dates to ISO format in a
straightforward manner. Additionally, add try-except blocks around the
datetime.strptime calls to catch ValueError exceptions for invalid date formats,
log an error message with the invalid input, and return an appropriate error
response.
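The flattened structure and the error handling described above could be combined along these lines. This is a sketch, not the PR's code: the function name `notion_date_range`, the injected `now` argument (added so the example is deterministic), and the `(range, error)` return shape are all assumptions made for illustration. Like the original, it appends a literal `Z` to naive datetimes, which is only accurate if those values are already in UTC.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

NOTION_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
INPUT_FORMAT = "%Y-%m-%d"


def notion_date_range(
    start_date: Optional[str],
    end_date: Optional[str],
    now: datetime,
) -> Tuple[Optional[Tuple[str, str]], Optional[str]]:
    """Return ((start_iso, end_iso), None) on success or (None, error message)."""
    try:
        start = datetime.strptime(start_date, INPUT_FORMAT) if start_date else now - timedelta(days=365)
        end = datetime.strptime(end_date, INPUT_FORMAT) if end_date else now
    except ValueError:
        return None, "Invalid date format. Expected YYYY-MM-DD"
    return (start.strftime(NOTION_FORMAT), end.strftime(NOTION_FORMAT)), None


now = datetime(2025, 5, 20, 12, 30, 0)
print(notion_date_range("2025-01-01", None, now))
# (('2025-01-01T00:00:00Z', '2025-05-20T12:30:00Z'), None)
print(notion_date_range("01/01/2025", None, now))
# (None, 'Invalid date format. Expected YYYY-MM-DD')
```

Each bound is resolved independently, so there is no need for the outer "either is None" branch that made the original hard to follow.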
if start_date is None or end_date is None:
# Fall back to calculating dates based on last_indexed_at
calculated_end_date = datetime.now()

# Check if last_indexed_at is in the future or after end_date
if last_indexed_naive > end_date:
logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 365 days ago instead.")
start_date = end_date - timedelta(days=365)
# Use last_indexed_at as start date if available, otherwise use 365 days ago
if connector.last_indexed_at:
# Convert dates to be comparable (both timezone-naive)
last_indexed_naive = connector.last_indexed_at.replace(tzinfo=None) if connector.last_indexed_at.tzinfo else connector.last_indexed_at

# Check if last_indexed_at is in the future or after end_date
if last_indexed_naive > calculated_end_date:
logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 365 days ago instead.")
calculated_start_date = calculated_end_date - timedelta(days=365)
else:
calculated_start_date = last_indexed_naive
logger.info(f"Using last_indexed_at ({calculated_start_date.strftime('%Y-%m-%d')}) as start date")
else:
start_date = last_indexed_naive
logger.info(f"Using last_indexed_at ({start_date.strftime('%Y-%m-%d')}) as start date")
calculated_start_date = calculated_end_date - timedelta(days=365) # Use 365 days as default
logger.info(f"No last_indexed_at found, using {calculated_start_date.strftime('%Y-%m-%d')} (365 days ago) as start date")

# Use calculated dates if not provided
start_date_str = start_date if start_date else calculated_start_date.strftime("%Y-%m-%d")
end_date_str = end_date if end_date else calculated_end_date.strftime("%Y-%m-%d")
else:
start_date = end_date - timedelta(days=365) # Use 365 days as default
logger.info(f"No last_indexed_at found, using {start_date.strftime('%Y-%m-%d')} (30 days ago) as start date")

# Format dates for Slack API
start_date_str = start_date.strftime("%Y-%m-%d")
end_date_str = end_date.strftime("%Y-%m-%d")
# Use provided dates
start_date_str = start_date
end_date_str = end_date

logger.info(f"Indexing Slack messages from {start_date_str} to {end_date_str}")
🛠️ Refactor suggestion
Simplify date calculation logic and fix timezone handling issues.
The date calculation logic has several issues:
- The route handler already calculates and passes the date range, making this fallback logic redundant when called from the API.
- Timezone comparison on line 76 could fail - you're comparing timezone-naive datetimes.
- The variable naming is confusing - start_date parameter vs start_date_str variable.
Consider simplifying this logic:
-    # Calculate date range
-    if start_date is None or end_date is None:
-        # Fall back to calculating dates based on last_indexed_at
-        calculated_end_date = datetime.now()
-
-        # Use last_indexed_at as start date if available, otherwise use 365 days ago
-        if connector.last_indexed_at:
-            # Convert dates to be comparable (both timezone-naive)
-            last_indexed_naive = connector.last_indexed_at.replace(tzinfo=None) if connector.last_indexed_at.tzinfo else connector.last_indexed_at
-
-            # Check if last_indexed_at is in the future or after end_date
-            if last_indexed_naive > calculated_end_date:
-                logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 365 days ago instead.")
-                calculated_start_date = calculated_end_date - timedelta(days=365)
-            else:
-                calculated_start_date = last_indexed_naive
-                logger.info(f"Using last_indexed_at ({calculated_start_date.strftime('%Y-%m-%d')}) as start date")
-        else:
-            calculated_start_date = calculated_end_date - timedelta(days=365)  # Use 365 days as default
-            logger.info(f"No last_indexed_at found, using {calculated_start_date.strftime('%Y-%m-%d')} (365 days ago) as start date")
-
-        # Use calculated dates if not provided
-        start_date_str = start_date if start_date else calculated_start_date.strftime("%Y-%m-%d")
-        end_date_str = end_date if end_date else calculated_end_date.strftime("%Y-%m-%d")
-    else:
-        # Use provided dates
-        start_date_str = start_date
-        end_date_str = end_date
+    # Use provided dates or calculate defaults
+    if start_date and end_date:
+        start_date_str = start_date
+        end_date_str = end_date
+    else:
+        # This should only happen when called directly, not from the API
+        logger.warning("Date range not provided to index_slack_messages. Using defaults.")
+        end_date_str = datetime.now().strftime("%Y-%m-%d")
+        if connector.last_indexed_at:
+            start_date_str = connector.last_indexed_at.strftime("%Y-%m-%d")
+        else:
+            start_date_str = (datetime.now() - timedelta(days=365)).strftime("%Y-%m-%d")
🤖 Prompt for AI Agents
In surfsense_backend/app/tasks/connectors_indexing_tasks.py around lines 66 to
94, simplify the date calculation by removing redundant fallback logic since the
route handler already provides the date range. Fix timezone handling by ensuring
all datetime comparisons use timezone-aware objects or convert consistently to
naive datetimes before comparing. Rename variables to clearly distinguish
between input parameters (start_date, end_date) and formatted string variables
(e.g., start_date_str) to avoid confusion. Streamline the logic to directly use
provided dates when available and only calculate defaults if both are None,
handling timezone consistently.
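One way to handle timezones consistently, as the prompt above asks, is to normalize every datetime to UTC before comparing. In Python, comparing a naive datetime with an aware one raises `TypeError`, so mixing the two is a real hazard. The helper names `as_utc` and `slack_start_date` are hypothetical, and treating naive values as UTC is an assumption that would need to match how `last_indexed_at` is actually stored.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def as_utc(dt: datetime) -> datetime:
    """Treat naive datetimes as UTC; convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def slack_start_date(last_indexed_at: Optional[datetime], now: datetime) -> str:
    """Pick the start date (YYYY-MM-DD) for a Slack indexing run."""
    now_utc = as_utc(now)
    fallback = (now_utc - timedelta(days=365)).strftime("%Y-%m-%d")
    if last_indexed_at is None:
        return fallback
    last_utc = as_utc(last_indexed_at)
    if last_utc > now_utc:
        # A last-indexed time in the future is not usable; fall back to 365 days.
        return fallback
    return last_utc.strftime("%Y-%m-%d")


now = datetime(2025, 5, 20, tzinfo=timezone.utc)
print(slack_start_date(datetime(2025, 5, 1, 8, 0), now))  # 2025-05-01
print(slack_start_date(datetime(2026, 1, 1), now))        # 2024-05-20
```

Normalizing to aware UTC values, rather than stripping `tzinfo`, keeps the comparison meaningful when the stored timestamp carries a non-UTC offset.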
start_date: str = Query(None, description="Start date for indexing (YYYY-MM-DD format). If not provided, uses last_indexed_at or defaults to 365 days ago"),
end_date: str = Query(None, description="End date for indexing (YYYY-MM-DD format). If not provided, uses today's date"),
Add date format validation for query parameters.
The endpoint accepts date strings but doesn't validate their format. Invalid dates will cause errors in the indexing functions.
Add validation:
from datetime import datetime

# After getting the parameters
if start_date:
    try:
        datetime.strptime(start_date, "%Y-%m-%d")
    except ValueError:
        raise HTTPException(
            status_code=400,
            detail="Invalid start_date format. Expected YYYY-MM-DD"
        )
if end_date:
    try:
        datetime.strptime(end_date, "%Y-%m-%d")
    except ValueError:
        raise HTTPException(
            status_code=400,
            detail="Invalid end_date format. Expected YYYY-MM-DD"
        )
🤖 Prompt for AI Agents
In surfsense_backend/app/routes/search_source_connectors_routes.py around lines
273 to 274, the start_date and end_date query parameters lack validation for the
expected YYYY-MM-DD format, which can cause errors later. Add validation after
receiving these parameters by attempting to parse them with datetime.strptime
using the "%Y-%m-%d" format inside try-except blocks. If parsing fails, raise an
HTTPException with status code 400 and a clear error message indicating the
invalid date format for each parameter.
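The two near-identical checks above could be folded into one framework-independent helper, which the route would then wrap in an HTTP 400 response. The name `parse_date_range` and the extra start-before-end check are additions for illustration and are not part of the review suggestion.

```python
from datetime import date, datetime
from typing import Optional, Tuple


def parse_date_range(
    start_date: Optional[str],
    end_date: Optional[str],
) -> Tuple[Optional[date], Optional[date]]:
    """Validate optional YYYY-MM-DD strings and return them as date objects.

    Raises ValueError with a parameter-specific message on invalid input.
    """

    def parse(name: str, value: Optional[str]) -> Optional[date]:
        if value is None:
            return None
        try:
            return datetime.strptime(value, "%Y-%m-%d").date()
        except ValueError:
            raise ValueError(f"Invalid {name} format. Expected YYYY-MM-DD") from None

    start = parse("start_date", start_date)
    end = parse("end_date", end_date)
    if start and end and start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end


print(parse_date_range("2025-01-01", "2025-02-01"))
# (datetime.date(2025, 1, 1), datetime.date(2025, 2, 1))
```

In the endpoint, a caller would catch `ValueError` and re-raise it as the `HTTPException` with status code 400 shown in the suggestion.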
feat: Added Calendar Based Indexing.
Description
feat: Added Calendar Based Indexing.
Motivation and Context
To stabilize manual syncing.
Screenshots

API Changes
Types of changes
Testing
Checklist:
Summary by CodeRabbit
New Features
Improvements
Dependencies
react-day-picker library to support the new calendar component.