problems about feature extraction models

Hi, tgc! I tried using torchvision's pre-trained fasterrcnn_resnet50_fpn model to extract the region features of a video, but the output I got only had shape [823, 4], which is far from the [26, 36, 2048] and [26, 36, 5] arrays in the dataset you provided. What does the extra dimension mean, or what does each of the three dimensions mean?
Is it feasible to extract the features with torchvision's fasterrcnn_resnet50_fpn model instead of Caffe's Fast R-CNN model? The features I get from the torchvision model are far smaller than what the dataset contains. How can I extract more features, with dimensions that match the requirements?
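For context on the shapes: torchvision detection models return, per image, a dict with `boxes` of shape [N, 4], plus `labels` and `scores`, so a [823, 4] array is most likely box coordinates rather than region features. A plausible reading of [26, 36, 2048] is frames × regions per frame × feature size, but that is not confirmed by the dataset author. The sketch below only illustrates the packing step under that reading: it keeps the top-scoring regions of each frame, pads to a fixed count, and stacks the frames. The helper names (`pack_regions`, `pack_video`) are hypothetical, the 5-column layout (normalised x1, y1, x2, y2 plus relative area) is an assumption, and the random arrays stand in for real detector output.

```python
import numpy as np


def pack_regions(boxes, scores, feats, frame_size, num_regions=36):
    """Keep the top-scoring regions of one frame and pad to num_regions rows.

    boxes: [N, 4] pixel coordinates (x1, y1, x2, y2)
    scores: [N] detection confidences
    feats: [N, D] one feature vector per box
    """
    width, height = frame_size
    order = np.argsort(-scores)[:num_regions]  # highest score first
    boxes, feats = boxes[order], feats[order]

    # Assumed 5-column layout: normalised x1, y1, x2, y2 and relative area.
    norm = boxes / np.array([width, height, width, height], dtype=np.float32)
    area = (norm[:, 2] - norm[:, 0]) * (norm[:, 3] - norm[:, 1])
    spatial = np.concatenate([norm, area[:, None]], axis=1)

    # Zero-pad frames that have fewer than num_regions detections.
    pad = num_regions - len(order)
    feats = np.pad(feats, ((0, pad), (0, 0)))
    spatial = np.pad(spatial, ((0, pad), (0, 0)))
    return feats.astype(np.float32), spatial.astype(np.float32)


def pack_video(per_frame, frame_size, num_regions=36):
    """Stack per-frame (boxes, scores, feats) tuples into fixed-size arrays."""
    packed = [pack_regions(b, s, f, frame_size, num_regions) for b, s, f in per_frame]
    feats = np.stack([p[0] for p in packed])
    spatial = np.stack([p[1] for p in packed])
    return feats, spatial


# Stand-in data: 26 frames with a varying number of detections per frame.
rng = np.random.default_rng(0)
frames = []
for n in rng.integers(20, 60, size=26):
    xy = rng.uniform(0, 300, size=(n, 2)).astype(np.float32)
    wh = rng.uniform(10, 100, size=(n, 2)).astype(np.float32)
    frames.append((
        np.concatenate([xy, xy + wh], axis=1),           # boxes [n, 4]
        rng.uniform(size=n).astype(np.float32),          # scores [n]
        rng.normal(size=(n, 2048)).astype(np.float32),   # features [n, 2048]
    ))

feats, spatial = pack_video(frames, frame_size=(400, 400))
print(feats.shape, spatial.shape)  # (26, 36, 2048) (26, 36, 5)
```

Note that this only fixes the array layout. The per-box feature vectors themselves still have to come from the detector's box head rather than from its final `boxes` output, and their size depends on the backbone, so a torchvision model may not give 2048-dimensional vectors without changes.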

problems about feature extraction models #31