Fix duplicate skeletons during labels merge #2075

gitttt-1234 · 2024-12-20T19:07:20Z

Description

This PR fixes the duplicate skeleton issue that occurs when merging labels files. After every update to the labels file, we check whether an existing skeleton matches the new Skeleton associated with each instance in a labeled frame. If no existing skeleton matches, we add the new one to the list of skeletons in the Labels object.
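
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `sleap/io/dataset.py`: `ToySkeleton`, `ToyInstance`, and `merge_skeletons` are hypothetical stand-ins, and repointing a matched instance at the existing skeleton is an assumption about the intended behavior.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(eq=False)  # identity-based equality: structural duplicates stay distinct objects
class ToySkeleton:
    """Hypothetical stand-in for a skeleton (node names plus edges)."""

    nodes: Tuple[str, ...]
    edges: Tuple[Tuple[str, str], ...] = ()

    def matches(self, other: "ToySkeleton") -> bool:
        # Structural comparison: same nodes and same edges.
        return self.nodes == other.nodes and self.edges == other.edges


@dataclass
class ToyInstance:
    """Hypothetical stand-in for an instance in a labeled frame."""

    skeleton: ToySkeleton


def merge_skeletons(
    existing: List[ToySkeleton], instances: List[ToyInstance]
) -> List[ToySkeleton]:
    """Return the skeleton list after merging, adding only unmatched skeletons."""
    merged = list(existing)
    for inst in instances:
        match = next((s for s in merged if s.matches(inst.skeleton)), None)
        if match is None:
            # No structural match: this is a genuinely new skeleton.
            merged.append(inst.skeleton)
        else:
            # Assumption: reuse the existing object instead of keeping a duplicate.
            inst.skeleton = match
    return merged
```

With one existing skeleton and two incoming instances, one of which carries a structurally identical copy, the merged list ends up with two skeletons rather than three.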

Types of changes

Does this address any currently open issues?

Found multiple skeletons after running inference #2025
Error occurs during creation of subsequent training configurations and merging predictions in SLEAP GUI #1090
Loading new skeleton from file causes labels to contain multiple conflicting skeletons #713

Outside contributors checklist

Review the guidelines for contributing to this repository
Read and sign the CLA and add yourself to the authors list
Make sure you are making a pull request against the develop branch (not main). Also, start your branch off develop
Add tests that prove your fix is effective or that your feature works
Add necessary documentation (if appropriate)

Thank you for contributing to SLEAP!

❤️

Summary by CodeRabbit

Refactor
- Enhanced the internal label management for more reliable merging and display.
New Features
- Improved the label import process, resulting in more accurate grouping and consolidation.
- Streamlined the export flow by automatically applying default filenames, removing the need for manual file selection.
Tests
- Updated tests to align with the revised label import behaviors.
- Adjusted assertions in tests to reflect changes in expected track counts.
- Removed a test function related to skeleton unification, indicating a shift in testing strategy.

coderabbitai · 2024-12-20T19:07:28Z

Walkthrough

This update refines the label update process in the dataset module and adjusts GUI command tests. In the dataset code, the merging logic for skeletons, nodes, and tracks has been reorganized for clarity and reliability. Additionally, the expected track count for DeepLabCut imports has been modified, and a test function related to skeleton unification has been removed, indicating a shift in testing focus.

Changes

| File(s) | Change Summary |
| --- | --- |
| sleap/io/dataset.py | Modified the `_update_from_labels` method in the `Labels` class to update skeletons only when empty and to add a merge block when the merge flag is set. Simplified node updates by removing merge logic, and streamlined track merging. Also includes minor code cleanup for clarity. |
| tests/gui/test_commands.py | Updated the expected track count in `test_import_labels_from_dlc_folder` (from 3 to 2). |
| tests/io/test_dataset.py | Removed the `test_dont_unify_skeletons` function, which tested the behavior of the `Labels` class regarding skeleton unification. |

Sequence Diagram(s)

sequenceDiagram
    participant L as Labels Instance
    participant S as Skeletons List
    participant N as Nodes List
    participant T as Tracks List

    L->>L: _update_from_labels(merge)
    alt Skeleton list is empty
        L->>S: Create new skeletons
    else merge flag is true
        L->>S: Check and merge duplicate skeletons
    end
    alt Nodes list is empty
        L->>N: Build nodes from skeletons
    end
    alt Tracks list is empty
        L->>T: Update and merge tracks
    end
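
The control flow in the diagram above can be approximated in plain Python. This is a sketch under simplifying assumptions: `ToyLabels` is a hypothetical class, skeletons are modeled as tuples of node names, tracks as strings, and ordinary equality stands in for the real matching logic.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyLabels:
    """Hypothetical container mirroring the guards shown in the diagram."""

    instance_skeletons: List[Tuple[str, ...]]  # one skeleton per instance
    instance_tracks: List[str]  # one track name per instance
    skeletons: List[Tuple[str, ...]] = field(default_factory=list)
    nodes: List[str] = field(default_factory=list)
    tracks: List[str] = field(default_factory=list)

    def update_from_labels(self, merge: bool = False) -> None:
        if not self.skeletons:
            # Skeleton list is empty: build it from the instances.
            # dict.fromkeys removes duplicates while preserving order.
            self.skeletons = list(dict.fromkeys(self.instance_skeletons))
        elif merge:
            # Merge flag set: add only skeletons that are not already present.
            for skel in self.instance_skeletons:
                if skel not in self.skeletons:
                    self.skeletons.append(skel)

        if not self.nodes:
            # Nodes list is empty: derive nodes from the skeletons.
            self.nodes = list(
                dict.fromkeys(n for skel in self.skeletons for n in skel)
            )

        if not self.tracks:
            # Tracks list is empty: collect the unique tracks.
            self.tracks = list(dict.fromkeys(self.instance_tracks))
```

Note that in this sketch a new skeleton arriving after the first update is ignored unless `merge=True` is passed, which is the behavior the `alt`/`else` branch in the diagram implies.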

Poem

In the code garden, I happily hop,
Updating skeletons till the bugs all stop.
Nodes and tracks align in a row,
Merging logic makes the clean code glow.
Hoppity changes from a rabbit with a techy heart 🐇💻!

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d497566 and 74455e2.

📒 Files selected for processing (2)

sleap/io/dataset.py (2 hunks)
tests/gui/test_commands.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

tests/gui/test_commands.py
sleap/io/dataset.py

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Tests (macos-14)
GitHub Check: Tests (windows-2022)
GitHub Check: Tests (ubuntu-22.04)

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

sleap/io/dataset.py (1)

486-503: Nested loops may add skeletons repeatedly or hamper performance.

This triple-nested loop can re-check skeleton matches an excessive number of times. Once a match has been found, consider breaking early to avoid redundant checks. Additionally, partial matches across multiple frames might make merges fail or run slower than necessary.
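
One way to apply the early-exit suggestion is Python's `for ... else` construct, sketched below. The function name `add_unmatched` is hypothetical, strings stand in for skeletons, and `==` stands in for whatever matching check the real code uses; the comparison counter exists only to make the saving visible.

```python
from typing import List, Sequence


def add_unmatched(existing: List[str], frames: Sequence[Sequence[str]]) -> int:
    """Append unmatched candidates to `existing`; return the comparison count."""
    comparisons = 0
    for frame in frames:
        for candidate in frame:
            for known in existing:
                comparisons += 1
                if known == candidate:
                    break  # match found: stop scanning the remaining entries
            else:
                # The inner loop finished without a break, so nothing matched.
                existing.append(candidate)
    return comparisons
```

Because the `else` branch runs only when no `break` occurred, each candidate is appended at most once and is then matched by later occurrences.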

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7785f66 and d0af4e2.

📒 Files selected for processing (3)

sleap/io/dataset.py (4 hunks)
tests/gui/test_commands.py (1 hunks)
tests/io/test_dataset.py (0 hunks)

💤 Files with no reviewable changes (1)

tests/io/test_dataset.py

🔇 Additional comments (1)

tests/gui/test_commands.py (1)

73-73: Change in the number of expected tracks from 3 to 2.

This updated assertion likely reflects the new logic that merges or removes duplicates. Confirm that the new expectation accurately represents the final track count after the improved merging routine.

sleap/io/dataset.py

codecov · 2024-12-23T16:53:08Z

Codecov Report

❌ Patch coverage is 97.14286% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.15%. Comparing base (7991f14) to head (74455e2).
⚠️ Report is 181 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sleap/io/dataset.py | 97.14% | 1 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2075      +/-   ##
===========================================
+ Coverage    75.43%   76.15%   +0.71%     
===========================================
  Files          134      134              
  Lines        24749    25050     +301     
===========================================
+ Hits         18670    19077     +407     
+ Misses        6079     5973     -106

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (5)

sleap/io/dataset.py (5)
469-470: Minor optimization suggestion for set union.

Since self.skeletons is guaranteed to be empty in this block, the union with self.skeletons is redundant. You can directly build the list of skeletons from the labeled frames:
-if len(self.skeletons) == 0:
-    self.skeletons = list(
-        set(self.skeletons).union(
-            {
-                instance.skeleton
-                for label in self.labels
-                for instance in label.instances
-            }
-        )
-    )
+if not self.skeletons:
+    self.skeletons = list(
+        {
+            instance.skeleton
+            for label in self.labels
+            for instance in label.instances
+        }
+    )
483-484: Use generator expression instead of list comprehension for sets.

Minor style improvement: you can simplify the comprehension by dropping the brackets inside set():
-set([node for skeleton in self.skeletons for node in skeleton.nodes])
+set(node for skeleton in self.skeletons for node in skeleton.nodes)
507-525: Consider consolidating track merging logic further.

Your approach for deduplicating tracks relies on checking each new track against all existing tracks using any(track.matches(t) for t in new_tracks). This works but is O(n^2) in the worst case for large sets. Also, consider whether you need to unify references in the labeled frames (like with skeleton merging). You might unify track objects similarly so that instances reference a single canonical track.
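
A dictionary keyed on the matching criterion avoids the pairwise scan and gives each instance a canonical track to point at. The sketch below assumes tracks match on name alone, which may not hold for the real `matches` implementation; `ToyTrack` and `unify_tracks` are hypothetical names.

```python
from typing import Dict, List, Tuple


class ToyTrack:
    """Hypothetical stand-in for a track, identified here by name only."""

    def __init__(self, name: str) -> None:
        self.name = name


def unify_tracks(
    existing: List[ToyTrack], instance_tracks: List[ToyTrack]
) -> Tuple[List[ToyTrack], List[ToyTrack]]:
    """Return (deduplicated tracks, canonical track for each instance)."""
    canonical: Dict[str, ToyTrack] = {}
    for track in existing:
        # Keep the first track seen for each name.
        canonical.setdefault(track.name, track)
    # setdefault returns the stored track if the name is known,
    # otherwise it registers this track as the canonical one.
    remapped = [canonical.setdefault(t.name, t) for t in instance_tracks]
    return list(canonical.values()), remapped
```

Each lookup is a hash operation, so the whole pass is roughly linear in the number of tracks, and the second return value can be used to repoint instances at a single track object per name.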

1926-1926: Avoid repeating “update nodes” logic.

You are repeating the same “collect all nodes from skeletons” pattern here (similar to lines 483-484). Consider extracting this into a helper method (e.g. _update_nodes_from_skeletons) to maintain consistency and reduce duplication.

2326-2328: Use “if not ret” instead of “if ret == False”.

It’s more Pythonic to write:
-if ret == False:
+if not ret:
This improves readability and better conveys intent.

🧰 Tools

🪛 Ruff (0.8.2)

2327-2327: Avoid equality comparisons to False; use if not ret: for false checks

Replace with not ret

(E712)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 218fe94 and 90ed917.

📒 Files selected for processing (1)

sleap/io/dataset.py (3 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Tests (macos-14)
GitHub Check: Tests (windows-2022)
GitHub Check: Tests (ubuntu-22.04)

🔇 Additional comments (1)

sleap/io/dataset.py (1)

481-481: No immediate issues.

The guarded check on len(self.nodes) == 0 is logically consistent with how skeletons are handled. It cleanly ensures nodes are updated only if they’re empty.

sleap/io/dataset.py

Fix duplicate skeletons during merge

6061951

Fix merge skeletons

12081d7

gitttt-1234 mentioned this pull request Dec 23, 2024

Found multiple skeletons after running inference #2025

Open

Fix tests

d0af4e2

gitttt-1234 requested a review from roomrys December 23, 2024 16:22

gitttt-1234 marked this pull request as ready for review December 23, 2024 16:22

coderabbitai bot reviewed Dec 23, 2024

gitttt-1234 added the 2024-hackathon and bug labels Jan 2, 2025

Fix merge tracks

65a0fa0

roomrys mentioned this pull request Mar 18, 2025

Command Line Error During Inference: OOM #2142

Open

gitttt-1234 added the april-2025-hackathon label Apr 4, 2025

emdavis02 self-assigned this Apr 7, 2025

emdavis02 added 3 commits April 7, 2025 10:27

Merge branch 'develop' into divya/fix-merge-labels

9bffbf3

fixing error in track merging

218fe94

fixed bug in iterating across existing tracks

90ed917

coderabbitai bot reviewed Apr 7, 2025

emdavis02 added 9 commits April 8, 2025 10:50

track sorting error

5aad60c

Merge branch 'develop' into divya/fix-merge-labels

bf0aab0

skeleton error

111712d

black

b93cdba

removing test_dont_unify_skeletons

ac72df9

cleaning up code

d497566

edge cases

a61a6ea

black

45e5d3c

Merge branch 'develop' into divya/fix-merge-labels

74455e2

talmo closed this Oct 1, 2025

4 participants
