I have a model from AMP (mixed-precision) training that I want to run with TensorRT fp16 inference under CUDA 12.8. I exported it to ONNX (fp32) under onnxruntime-gpu==1.19.2, then tried to build a TensorRT fp16 engine from the fp32 ONNX model. I tried TensorRT v9.2 and v10.0.1.6; both fail at inference, producing NaN values.
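For reference, the export step looks roughly like this. The model below is only a placeholder standing in for my real network (names, shapes, and opset are illustrative):

```python
import torch
import torch.nn as nn

# Placeholder standing in for my real AMP-trained network.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, x):
        # The real model contains unsqueeze ops like this one.
        return self.conv(x).mean(dim=1).unsqueeze(1)

model = Net().eval().float()  # weights are stored as fp32 after AMP training
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model, dummy, "model_fp32.onnx",
    opset_version=17,
    input_names=["input"], output_names=["output"],
)
```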
Under polygraphy, it reports:

```
[E] Skipping tactic 0x00000 due to exception [shape.cpp:verify_output_type:1274] Mismatched type for tensor ONNXTRT_Broadcast_106_output f16 vs expected type: f32
[E] [optimizer.cpp::computeCosts::40448] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[(Unnamed Layer * 1183)[Cast]...ONNXTRT_unsqueezeTensor_12483]})
```
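The command that produces these errors is roughly the following (the path is a placeholder; `--pool-limit` is how I raised the workspace, and `--validate` is what flags the NaN outputs):

```bash
polygraphy run model_fp32.onnx --trt --fp16 \
    --onnxrt \
    --validate \
    --pool-limit workspace:8G
```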
I tried replacing every unsqueeze op with a reshape in the model code (see the sketch below), but got the same error. Increasing the workspace value also made no difference.
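The unsqueeze-to-reshape replacement was along these lines (a minimal sketch; the real change touches the forward passes of my model):

```python
import torch

x = torch.randn(4, 224, 224)

# Before: emits an explicit Unsqueeze op in the exported graph.
y_unsqueeze = x.unsqueeze(1)                        # (4, 1, 224, 224)

# After: the same result expressed as a Reshape.
y_reshape = x.reshape(x.shape[0], 1, *x.shape[1:])  # (4, 1, 224, 224)

assert torch.equal(y_unsqueeze, y_reshape)
```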
I do not know how to map the polygraphy output back to the Python code. Is there any suggestion? What is wrong in my ONNX model?
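For what it's worth, I tried to locate the offending node by searching the ONNX graph for the tensor name from the error. Note that the `ONNXTRT_*` names are generated by the TensorRT ONNX parser, so they may not appear verbatim in the graph; the substring search below is just my best guess:

```python
import onnx

model = onnx.load("model_fp32.onnx")  # placeholder path

target = "Broadcast_106"  # substring taken from the polygraphy error
for node in model.graph.node:
    if any(target in out for out in node.output) or target in (node.name or ""):
        print(node.op_type, node.name, "inputs:", node.input, "outputs:", node.output)
```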