Hi People,
First, thanks for your astonishing work! I am currently testing how HIPT performs in a grading task and noticed that performance is underwhelming with the features I extracted using the proposed two-stage HIPT4K (with the weights provided in this repo). I then realized that pre-extracted features are supplied here and repeated my experiments using them. Sadly, to no avail :( I think there is also another ticket with similar issues (#19). This is the scatter plot of a 2-component PCA on the slide-level (mean) TCGA-PRAD features:
Note that I selected 16 relevant features out of the total 192 by calculating the Pearson r of each feature against the Gleason score (which the labels in the scatter plot also refer to). There appears to be a bit of clustering for Gleason 7 and 9, but overall, the pretrained models do not seem to capture the properties that matter for grading. My theory is that this is because the other Gleason scores have too few examples. I have, however, already worked with SSL vision transformers for prostate cancer histopathology and found that the models had good feature extraction capabilities. I am also aware of work that confirms good SSL feature extraction capabilities when using the TCGA-PRAD data.
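For reference, the analysis described above can be sketched roughly as follows. This is a minimal sketch, not the exact script behind the plot: it uses synthetic data in place of the real slide embeddings, and the names `select_by_pearson` and `pca_2d` as well as the array shapes (200 slides, 192 features) are assumptions made for illustration.

```python
import numpy as np


def select_by_pearson(X, y, k=16):
    """Return the indices of the k columns of X with the largest |Pearson r|
    against y, together with the r value of every column."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; give them r = 0 instead of NaN.
    r = (Xc * yc[:, None]).sum(axis=0) / np.where(denom == 0, np.inf, denom)
    top = np.argsort(-np.abs(r))[:k]
    return top, r


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T


# Synthetic stand-in for slide-level (mean-pooled) embeddings. Assumption:
# one 192-dimensional vector per slide and one Gleason score per slide.
rng = np.random.default_rng(0)
n_slides, n_features = 200, 192
gleason = rng.integers(6, 11, size=n_slides).astype(float)
features = rng.normal(size=(n_slides, n_features))
features[:, :4] += 0.5 * gleason[:, None]  # inject signal into 4 columns

idx, r = select_by_pearson(features, gleason, k=16)
coords = pca_2d(features[:, idx])  # (n_slides, 2) points for the scatter plot
print(coords.shape)
```

With the real data, `features` would be replaced by the stacked slide-level embeddings and `gleason` by the matching labels.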
Therefore, I wanted to ask if I really got things right here. So:
- I am using the tensors stored in `HIPT/3-Self-Supervised-Eval/embeddings_slide_lib/embeddings_slide_lib/vit256mean_tcga_slide_embeddings`. Am I correct in assuming that these are the features of each WSI's extracted 4k patches?
- The number of tumours used differs between this repo and the paper, but the provided models HAVE indeed been pretrained on TCGA-PRAD as well, right?
Besides my technical questions, I'd really love to hear what you think about this. I'm a little short on time currently, but if you think it's worth the effort, I'd also volunteer to adapt the approach so that it achieves sufficient results on TCGA-PRAD. Maybe working at a different magnification could already do the trick, since for prostate tumours cell-level information is not of utmost importance (AFAIK). This would be of great value for the digital pathology world :)
Kind regards
M