
Custom YOLO model not showing bounding box predictions on ONNX-HailoRT

Completed

6 comments

  • Ayoub Assis
    • Network Optix team

    Hi Ael, 

    Thanks for the detailed report of the issue you're encountering.

    Firstly, let me answer the general questions, then I'll address the issues you're running into with your POC.

    1. What makes the Nx Demo model compatible with the Hailo Runtime while our custom model fails?
      Currently, the Nx AI Cloud doesn't have an automated process for converting an arbitrary uploaded ONNX model to Hailo's format, mainly because the Hailo Dataflow Compiler requires inputs that vary from model to model. At the moment it can only compile Teachable Machine models for Hailo chips. We're still working on ways to extend the set of model architectures that can be automatically exported to Hailo. 
      Given this limitation, the workaround for deploying models on Hailo chips is to upload the Hailo-compiled model directly to the Nx AI Cloud instead of the regular ONNX (or to compress the two together in a zip file and upload that, so the model can be deployed on CPU/GPU as well as Hailo; see the first sketch after this list).
      You can find a tutorial with code that was used to compile one of the Nx demo models for Hailo chips here.
    2. Are there specific ONNX model optimizations or configurations required for Hailo compatibility?
      The Nx demo models that we compile for Hailo don't require any optimizations to be compatible with Hailo; I can't speak for other models.
      Regarding the configuration, when compiling a model for Hailo you will probably need to specify the start_node_names and end_node_names. Please refer to the Hailo Dataflow Compiler user guide for more details about the compilation configuration; the first sketch after this list shows where these fit into the flow.
    3. Which specific versions of ONNX, Python, and ONNX runtime should we use when preparing models for Hailo?
      If you're referring to the Python and ONNX versions required to export the trained YOLOv10 model from PyTorch to ONNX as detailed here: I've just updated the requirements.txt file and tested it with Python 3.10, so let me know if things don't work as expected. There's also a minimal export sketch after this list.
      But if you're referring to compiling (or preparing, as you put it) the model for Hailo, you can check out the Hailo Developer Zone.
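
    To make points 1 and 2 more concrete, here's a rough sketch of the compile-and-bundle flow, based on the Hailo Dataflow Compiler tutorials. The node names, input shape, file names, and random calibration data are placeholders you'd replace with your model's actual values, and hailo_sdk_client ships with Hailo's DFC suite, so treat this as a starting point rather than a drop-in script:

        from hailo_sdk_client import ClientRunner  # part of Hailo's Dataflow Compiler
        import numpy as np
        import zipfile

        # Parse the ONNX into Hailo's internal representation. The start/end
        # node names are model-specific placeholders (see the DFC user guide).
        runner = ClientRunner(hw_arch="hailo8")
        runner.translate_onnx_model(
            "model.onnx",
            "my_model",
            start_node_names=["images"],
            end_node_names=["output0"],
            net_input_shapes={"images": [1, 3, 640, 640]},
        )

        # Quantize with calibration data. Random data is only a placeholder;
        # use real preprocessed frames for a usable model.
        calib = np.random.rand(64, 640, 640, 3).astype(np.float32)
        runner.optimize(calib)

        # Compile to a HEF and write it out.
        hef = runner.compile()
        with open("model.hef", "wb") as f:
            f.write(hef)

        # Workaround from point 1: bundle the regular ONNX and the compiled
        # HEF so the same upload can be deployed on CPU/GPU and on Hailo.
        with zipfile.ZipFile("model_bundle.zip", "w") as zf:
            zf.write("model.onnx")
            zf.write("model.hef")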
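
    And for the export side of point 3, the usual Ultralytics call looks roughly like the sketch below; the weights file name and opset are assumptions on my part, and the linked tutorial's requirements.txt remains the authoritative list of versions:

        from ultralytics import YOLO

        # Export a trained YOLOv10 checkpoint to ONNX. Replace the weights
        # path with your own; the opset here is only an example value.
        model = YOLO("yolov10n.pt")
        model.export(format="onnx", opset=13)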

    With that said, to help us figure out what's going wrong, could you please run the script on this page and send us the archive it generates? Also, could you share the UUID of the YOLOv10 model so we can look into why the boxes don't show up when you run it on CPU? Once the latter is resolved, you can compile the model for Hailo, upload it, and deploy it on the Hailo chip.
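
    As a quick local check while we look into the UUID, you can also confirm on your side whether the ONNX produces any detections at all on CPU, outside the Nx pipeline. Here's a minimal sketch with onnxruntime; the file name and the (x1, y1, x2, y2, score, class) output layout are assumptions based on typical YOLOv10 exports:

        import numpy as np
        import onnxruntime as ort

        # Run the exported model on CPU only.
        sess = ort.InferenceSession("yolov10.onnx", providers=["CPUExecutionProvider"])
        inp = sess.get_inputs()[0]
        print(inp.name, inp.shape)  # e.g. images, [1, 3, 640, 640]

        # Placeholder input; substitute a real preprocessed frame to get
        # meaningful scores.
        x = np.random.rand(1, 3, 640, 640).astype(np.float32)
        out = sess.run(None, {inp.name: x})[0]

        # Typical YOLOv10 exports emit [1, N, 6] = (x1, y1, x2, y2, score, class).
        print(out.shape)
        print("detections above 0.25:", int((out[0, :, 4] > 0.25).sum()))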

  • Ael Lee

    Here is the content of the ai_manager_run.txt file. There are also many other files that were generated, but I am unable to attach them to the comments:

    MODULE: 1741121280670 000000202: Notice: retrieving possible post-processing options.
    MODULE: 1741121280670 000000134: Created new SHM for 0 : 524350
    MODULE: 1741121280670 000000179: Starting inference engine with: 1 1
    MODULE: 1741121280671 000000526: Notice: Starting inference engine with pid: 134256.
    MODULE: 1741121280671 000000009: Notice: Module is waiting for ready signal from 0
    MODULE: 1741121280776 000104566: Informational: Received ready signal from 0
    MODULE: 1741121280776 000000010: Inference engine 0 ready.
    MODULE: 1741121280776 000000117: Notice: Starting thread 0
    MODULE: 1741121280776 000000021: Informational: Module socket path [/tmp/nxai_manager.sock]
    MODULE: 1741121280776 000000031: Notice: Waiting for thread 0 to finish.
    MODULE: 1741121280776 000000009: Notice: Thread 0 started.
    MODULE: 1741121280776 000000004: Notice: Sclbl module start inference.
    MODULE: 1741121280776 000000004: Informational: Checking for input message.
    MODULE: 1741121280776 000000004: Informational: Waiting for input message.
    MODULE: 1741121280809 000033699: Informational: Input received 0.
    MODULE: 1741121280810 000000277: Notice: sclbl_module_video_run_single_inference.
    Error! Failed to send signal to 0. Possibly inference engine crashed.
    MODULE: 1741121280810 000000074: Notice: Preprocessing.
    MODULE: 1741121280814 000003966: Notice: start inference.
    MODULE: 1741121280814 000000089: Notice: New SHM ID for 0 is 524355 with capacity: 1228878
    MODULE: 1741121280815 000001422: Notice: Sending signal to 0
    MODULE: 1741121280815 000000083: Error! Failed to send signal to 0. Possibly inference engine crashed.
    MODULE: 1741121280815 000000080: Notice: after inference.
    MODULE: 1741121280815 000000003: Notice: after sclbl_set_ts_model_end.
    MODULE: 1741121280815 000000003: Warning: Failed to run sclbl_module_client_inference.
    MODULE: 1741121280816 000000127: Input object released
    MODULE: 1741121280816 000000048: Informational: Waiting for socket listener to exit.
    MODULE: 1741121280838 000022295: Notice: Socket listener exited.
    MODULE: 1741121280838 000000008: Informational: Sclbl module has been stopped.
    MODULE: 1741121280838 000000150: Notice: Waiting for inference engine 134256.
    MODULE: 1741121280838 000000036: Inference engine exited 0
    MODULE: 1741121280838 000000009: Finalizing logging.
    
    
  • Nick Bedbury

    Hello, I am working with Ael on this.

    Is there any update on this request? Can we provide any more detail to move this troubleshooting forward?

  • Ayoub Assis
    • Network Optix team

    Hi Nick,

    We've asked Ael for more information privately. I'll loop you in.

  • Norman
    • Network Optix team

    This topic was converted into a support ticket. 

  • Ichiro
    • Network Optix team

    Nick came up with a solution.

    He ended up getting this working by starting with an ONNX file from the Scailable Model Zoo rather than Ultralytics. With that starting point, he got the YOLOv8s-640x640 model compiled, uploaded, and running on target hardware with a Hailo8 chip.
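
    One plausible explanation, though it wasn't confirmed in this thread, is that Ultralytics exports bake post-processing into the ONNX graph, while model zoo exports end at nodes the Hailo parser can handle. A quick way to compare what two exports actually end with, using the onnx package (the file names here are hypothetical):

        import onnx

        # Print each graph's declared outputs and its last few ops.
        for path in ("yolov8s_ultralytics.onnx", "yolov8s_model_zoo.onnx"):
            model = onnx.load(path)
            print(path)
            print("  outputs:", [o.name for o in model.graph.output])
            print("  last ops:", [n.op_type for n in model.graph.node][-5:])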


Post is closed for comments.