I have a model from AMP (mixed-precision) training that I want to run with TensorRT fp16 inference under CUDA 12.8. I exported it to ONNX (fp32) under onnxruntime-gpu==1.19.2, then tried to build a TensorRT fp16 engine from the fp32 ONNX model. I tried TensorRT v9.2 and v10.0.1.6; both fail at inference, producing NaN values.
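For reference, the export step looks roughly like this. The model below is only a placeholder standing in for my real network (names, shapes, and opset are illustrative):

```python
import torch
import torch.nn as nn

# Placeholder standing in for my real AMP-trained network.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, x):
        # The real model contains unsqueeze ops like this one.
        return self.conv(x).mean(dim=1).unsqueeze(1)

model = Net().eval().float()  # weights are stored as fp32 after AMP training
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model, dummy, "model_fp32.onnx",
    opset_version=17,
    input_names=["input"], output_names=["output"],
)
```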
Under polygraphy, it reports:

```
[E] Skipping tactic 0x00000 due to exception [shape.cpp:verify_output_type:1274] Mismatched type for tensor ONNXTRT_Broadcast_106_output f16 vs expected type: f32
[E] [optimizer.cpp::computeCosts::40448] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[(Unnamed Layer * 1183)[Cast]...ONNXTRT_unsqueezeTensor_12483]})
```
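The command that produces these errors is roughly the following (the path is a placeholder; `--pool-limit` is how I raised the workspace, and `--validate` is what flags the NaN outputs):

```bash
polygraphy run model_fp32.onnx --trt --fp16 \
    --onnxrt \
    --validate \
    --pool-limit workspace:8G
```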
I tried replacing every unsqueeze op with a reshape in the model code (see the sketch below), but got the same error. Increasing the workspace value also made no difference.
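The unsqueeze-to-reshape replacement was along these lines (a minimal sketch; the real change touches the forward passes of my model):

```python
import torch

x = torch.randn(4, 224, 224)

# Before: emits an explicit Unsqueeze op in the exported graph.
y_unsqueeze = x.unsqueeze(1)                        # (4, 1, 224, 224)

# After: the same result expressed as a Reshape.
y_reshape = x.reshape(x.shape[0], 1, *x.shape[1:])  # (4, 1, 224, 224)

assert torch.equal(y_unsqueeze, y_reshape)
```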
I do not know how to map the polygraphy output back to the Python code. Is there any suggestion? What is wrong in my ONNX model?
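For what it's worth, I tried to locate the offending node by searching the ONNX graph for the tensor name from the error. Note that the `ONNXTRT_*` names are generated by the TensorRT ONNX parser, so they may not appear verbatim in the graph; the substring search below is just my best guess:

```python
import onnx

model = onnx.load("model_fp32.onnx")  # placeholder path

target = "Broadcast_106"  # substring taken from the polygraphy error
for node in model.graph.node:
    if any(target in out for out in node.output) or target in (node.name or ""):
        print(node.op_type, node.name, "inputs:", node.input, "outputs:", node.output)
```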