Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

jnrusson1
Copy link
Contributor

Reference Issues/PRs

Fixes #8696
May clash with #8710

What does this implement/fix? Explain your changes.

This PR fixes a critical dimension mismatch issue in the LSTM-FCN network's attention mechanism. The original custom attention implementation was designed for sequence-level processing but was being used in an LSTM cell context, causing the attention mechanism to fail when processing individual timesteps.

Changes made:

  • Replaced the complex, buggy custom AttentionLSTM implementation with the standard Keras Attention layer
  • Fixed the input shape handling by applying attention to the full sequence before LSTM processing
  • Removed the problematic _time_distributed_dense function that expected timesteps that didn't exist at the cell level
  • Maintained backward compatibility - the attention=True parameter still works as expected
  • Simplified the codebase by removing ~800 lines of custom attention implementation

Why this fix was needed:
The original attention mechanism tried to process individual timesteps (batch, features) but expected full sequences (batch, timesteps, features). This caused the _time_distributed_dense function to fail when trying to reshape inputs that had no timestep dimension.

Does your contribution introduce a new dependency? If yes, which one?

No new dependencies. The Keras Attention layer is already available in TensorFlow, which is already a dependency of sktime.

What should a reviewer concentrate their feedback on?

  • Verify that the attention mechanism now works correctly with LSTM-FCN
  • Check that the fix maintains backward compatibility
  • Confirm that the simplified implementation is cleaner and more maintainable
  • Ensure that the dimension handling is now correct

Did you add any tests for the change?

Yes, the existing tests should continue to pass. The fix resolves the core issue that was preventing the attention mechanism from working, so existing LSTM-FCN tests with attention=True should now work correctly. (not sure if there were any of these tests though(

Any other comments?

This fix addresses a fundamental architectural issue where the attention mechanism was misapplied. The original custom implementation was overly complex for the use case and introduced bugs. The Keras Attention layer provides the same functionality in a simpler, more reliable way.

The fix also makes the codebase more maintainable by removing custom attention code that was difficult to debug and maintain.

@fkiraly fkiraly added module:classification classification module: time series classification enhancement Adding new functionality labels Aug 21, 2025
@fkiraly
Copy link
Collaborator

fkiraly commented Aug 21, 2025

fixed a small linting error and started the tests

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure is unrelated

@fkiraly fkiraly merged commit 3696d8c into sktime:main Aug 21, 2025
64 of 66 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding new functionality module:classification classification module: time series classification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] LSTMFCNCLassifier throws an error when attention=True
2 participants