Approximate joint diagonalization #571
Really excited to have this in TensorLy @aarmey, it's a neat method!
Looks great! Just left some notes in the code.
|  | ||
|  | ||
| def jointdiag( | ||
| X, | 
I'd name this matrices or matrices_tensor - X makes me think it's a single matrix at first glance (though, granted, one should read the docstring).
tensorly/utils/jointdiag.py (outdated)
| print(f"Sweep # 0: e = {e:.3e}") | ||
|  | ||
| # Additional output parameters | ||
| Q_total = tl.eye(D) | 
Same for these intermediate variables: it would be helpful to give them intuitive names
| for k in range(max_iter): | ||
| # loop over all pairs of slices | ||
| for p, q in combinations(range(D), 2): | ||
| # Finds matrix slice with greatest variability among diagonal elements | 
Should we just refer to them as matrices rather than matrix slices wherever we talk about an element X[:, :, j]?
| + tl.norm(Xh[all_but_pq, p]) ** 2 | ||
| + tl.norm(Xh[all_but_pq, q]) ** 2 | ||
| ) | ||
| xih = Xh[p, q] - Xh[q, p] | 
Same here, the name isn't super helpful
|  | ||
| T. Fu and X. Gao, “Simultaneous diagonalization with similarity transformation for | ||
| non-defective matrices”, in Proc. IEEE International Conference on Acoustics, Speech | ||
| and Signal Processing (ICASSP 2006), vol. IV, pp. 1137-1140, Toulouse, France, May 2006. | 
Can we make this a proper reference ([1]_) and add it to a References section of the docstring? See e.g.:
tensorly/tensorly/decomposition/_cp.py
Line 681 in d4652c8
| .. [3] Casey Battaglino, Grey Ballard and Tamara G. Kolda, | 
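To illustrate the layout being asked for, here is a rough sketch of a numpydoc-style docstring that cites the paper as [1]_ and lists it under References. The function name jointdiag_docstring_sketch and the parameter descriptions are placeholders made up for this comment, not the signature in the PR; only the citation text is taken from the diff above.

```python
def jointdiag_docstring_sketch(matrices):
    """Approximately jointly diagonalize a set of square matrices.

    Docstring layout sketch only: the name and parameter below are
    hypothetical and do not reflect the actual signature in the PR.
    The method follows the similarity-transformation approach of [1]_.

    Parameters
    ----------
    matrices : ndarray
        Hypothetical argument: the matrices to diagonalize, stacked
        along the last mode so that ``matrices[:, :, j]`` is one matrix.

    References
    ----------
    .. [1] T. Fu and X. Gao, "Simultaneous diagonalization with similarity
       transformation for non-defective matrices", in Proc. IEEE
       International Conference on Acoustics, Speech and Signal Processing
       (ICASSP 2006), vol. IV, pp. 1137-1140, Toulouse, France, May 2006.
    """
    # Placeholder body: this sketch only demonstrates the docstring format.
    raise NotImplementedError("docstring layout sketch only")
```

With this layout Sphinx renders [1]_ in the description as a link to the entry in the References section.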
| for i in range(k): | ||
| temp_diag = np.diag(rng.randn(d)) | ||
| diags[:, :, i] = temp_diag | ||
| synthetic[:, :, i] = np.linalg.inv(mixing) @ temp_diag @ mixing | 
This is actually quite useful for understanding the type of mixing allowed: it might be worth adding a short Notes section to the docstring to explain what kind of scrambling the algorithm supports and roughly what it does.
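As a starting point for such a Notes section, here is a minimal NumPy-only sketch of the structure the test builds. Every matrix is inv(mixing) @ D_i @ mixing with a shared invertible mixing matrix, so a single similarity transformation (not necessarily orthogonal) diagonalizes all of them at once. The helper off_diagonal_energy and the sizes d and k are assumptions made for illustration; this does not call the function added in the PR.

```python
import numpy as np

rng = np.random.RandomState(0)
d, k = 4, 3  # illustrative matrix size and number of matrices

# Shared mixing matrix; a Gaussian random matrix is invertible almost surely.
mixing = rng.randn(d, d)
mixing_inv = np.linalg.inv(mixing)

diags = np.zeros((d, d, k))
synthetic = np.zeros((d, d, k))
for i in range(k):
    temp_diag = np.diag(rng.randn(d))
    diags[:, :, i] = temp_diag
    # Same construction as in the test: a similarity transform of a diagonal matrix.
    synthetic[:, :, i] = mixing_inv @ temp_diag @ mixing


def off_diagonal_energy(matrices):
    """Sum of squared off-diagonal entries over all matrices[:, :, i] (hypothetical helper)."""
    total = 0.0
    for i in range(matrices.shape[2]):
        m = matrices[:, :, i]
        off = m - np.diag(np.diag(m))
        total += float(np.sum(off ** 2))
    return total


# Undo the mixing with the known transform; an algorithm would have to estimate it.
recovered = np.stack(
    [mixing @ synthetic[:, :, i] @ mixing_inv for i in range(k)], axis=2
)

print("off-diagonal energy before:", off_diagonal_energy(synthetic))
print("off-diagonal energy after: ", off_diagonal_energy(recovered))
```

Because they share one set of eigenvectors, the synthetic matrices also commute pairwise, which is a quick sanity check that exact joint diagonalization is possible for this input.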
This adds a function for deriving an approximate joint diagonalization for a set of matrices. This function is the basis of solvers that derive PARAFAC factorization by diagonalizing the Tucker decomposition. By using only a few iterations, this can also be used to improve the starting estimate for initializing PARAFAC.