My plug-in caused vms to restart

Answered

  • Andrey Terentyev

    Hello,

    pushUncompressedVideoFrame() is called for every frame the Server receives. That means LoadModel_Async() is called repeatedly, causing the model to be loaded and reloaded at the rate of the current stream's FPS.

    pushUncompressedVideoFrame() is not the right place for initialization steps such as loading models or reading config files, which are supposed to be done once.

    I'd suggest moving the LoadModel_Async() call into the DeviceAgent constructor.

    Another, better option would be to load the model in the constructor of the m_app member's class.
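
    A minimal sketch of that second option, assuming hypothetical stand-in classes (Model, App and Agent are illustrative names, not Metadata SDK or plugin types): the model is loaded in the constructor of the object that owns it, so the per-frame method only uses it.

    ```cpp
    #include <memory>
    #include <string>

    // Hypothetical stand-in for an inference model; counts how often it is loaded.
    struct Model
    {
        explicit Model(const std::string& modelPath): path(modelPath) { ++loadCount(); }
        static int& loadCount() { static int count = 0; return count; }
        std::string path;
    };

    // Hypothetical stand-in for the plugin's application object (m_app in this thread).
    class App
    {
    public:
        // One-time initialization happens here, not in the per-frame path.
        explicit App(const std::string& modelPath):
            m_model(std::make_unique<Model>(modelPath))
        {
        }

        // Per-frame work only uses the already loaded model.
        bool handleFrame(int /*frameIndex*/) const { return m_model != nullptr; }

    private:
        std::unique_ptr<Model> m_model;
    };

    // Hypothetical stand-in for DeviceAgent: owns the app for its whole lifetime.
    class Agent
    {
    public:
        Agent(): m_app("model.xml") {}
        bool pushFrame(int frameIndex) { return m_app.handleFrame(frameIndex); }

    private:
        App m_app;
    };

    // Pushes `frames` frames through one agent and returns how many model loads occurred.
    int loadsAfterFrames(int frames)
    {
        const int before = Model::loadCount();
        Agent agent;
        for (int i = 0; i < frames; ++i)
            agent.pushFrame(i);
        return Model::loadCount() - before;
    }
    ```

    With this layout the number of loads per agent is one, regardless of how many frames arrive.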

  • zwq

    Thank you very much for your reply.
    In fact, pushUncompressedVideoFrame() is called only once, and LoadModel_Async() is called only once too.
    I tried moving LoadModel_Async() into the DeviceAgent constructor, but the VMS still restarts.
    I wonder whether the VMS has loaded another version of the OpenVINO library that conflicts with my OpenVINO 2022.3.

  • Andrey Terentyev

    Hi,

    > I tried moving LoadModel_Async() into the DeviceAgent constructor, but the VMS still restarts.

    Obviously, something inside the LoadModel_Async() method crashes and causes the Server to restart.

    Try commenting out the code inside the method step by step to find out which part or function call causes the crash.
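
    Another way to narrow this down, sketched below under the assumption that the failure surfaces as a C++ exception escaping the loading code, is to wrap each initialization step in try/catch and log the message instead of letting the exception terminate the process. runGuarded and failingLoad are hypothetical illustrations, not SDK or OpenVINO calls. If the cause is a C++ runtime mismatch, the handler itself may never be reached, so treat this as a diagnostic aid only.

    ```cpp
    #include <exception>
    #include <functional>
    #include <iostream>
    #include <stdexcept>
    #include <string>

    // Runs one initialization step and converts any escaping exception into a
    // logged error message. Returns an empty string on success, otherwise the
    // message that was logged. `step` is a hypothetical stand-in for a call
    // such as model loading.
    std::string runGuarded(const std::string& name, const std::function<void()>& step)
    {
        try
        {
            step();
            return {};
        }
        catch (const std::exception& e)
        {
            const std::string message = name + " failed: " + e.what();
            std::cerr << message << '\n';
            return message;
        }
        catch (...)
        {
            const std::string message = name + " failed: unknown exception";
            std::cerr << message << '\n';
            return message;
        }
    }

    // Hypothetical step that fails the way a missing model file might.
    void failingLoad()
    {
        throw std::runtime_error("model file not found");
    }
    ```

    Calling runGuarded("load", failingLoad) logs the message and returns it, so the caller can skip frame processing instead of taking the Server down.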

    > I wonder whether the VMS has loaded another version of the OpenVINO library that conflicts with my OpenVINO 2022.3.

    Out of the box, the VMS does not include the OpenVINO library.

    For troubleshooting, you could search the Server log files for messages containing the PluginManager string.
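
    A minimal sketch of such a search, assuming the log format shown in the excerpt later in this thread (a level token such as ERROR followed by PluginManager); pluginManagerErrors is a hypothetical helper name.

    ```cpp
    #include <istream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Returns the lines of a log stream that mention PluginManager at ERROR level.
    // The " ERROR " level token is an assumption based on the log excerpt in this thread.
    std::vector<std::string> pluginManagerErrors(std::istream& log)
    {
        std::vector<std::string> result;
        std::string line;
        while (std::getline(log, line))
        {
            if (line.find("PluginManager") != std::string::npos
                && line.find(" ERROR ") != std::string::npos)
            {
                result.push_back(line);
            }
        }
        return result;
    }
    ```

    The function takes any std::istream, so it can be fed a std::ifstream opened on the Server log file or a std::istringstream in a test.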

    I guess the plugin binary just can't find the OpenVINO library on the system.

    How did you install the OpenVINO library? Have you updated the project to specify the locations of the OpenVINO header files and library binaries?

    Have you read our developer guide and tried to build the OpenCV plugin?

    https://meta.nxvms.com/docs/developers/knowledgebase/200-introduction-to-creating-a-video-analytics-plugin

    Could you provide the OS version and the Metadata SDK version?

    Could you share your project code?

     

  • zwq

    Hi, sorry for the late reply; I was away for the holiday.

    > Obviously, something inside the LoadModel_Async() method crashes and causes the Server to restart.
    > Try commenting out the code inside the method step by step to find out which part or function call causes the crash.

    You are right: my plugin crashes in ov::Core::Core.
    I found a core.xxxx file in /opt/networkoptix/mediaserver/bin; the backtrace is below:

        Program terminated with signal SIGABRT, Aborted.
        #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
        50    ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
        [Current thread is 1 (Thread 0x7f453a73b700 (LWP 14693))]
        (gdb) bt
        #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
        #1  0x00007f45ee9b8859 in __GI_abort () at abort.c:79
        #2  0x00007f45eee59c20 in __gnu_cxx::__verbose_terminate_handler ()
            at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
        #3  0x00007f45eee583bd in __cxxabiv1::__terminate (handler=<optimized out>)
            at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
        #4  0x00007f45eee583ff in std::terminate ()
            at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
        #5  0x00007f45eee585e8 in __cxxabiv1::__cxa_throw (obj=0x7f455800c150, tinfo=0x7f45d47cd0c8, 
            dest=0x7f457e8ff9e0)
            at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
        #6  0x00007f457e8619cb in ?? ()
           from /opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libopenvino.so.2230
        #7  0x00007f457effe65d in ov::Core::Core(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
           from /opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libopenvino.so.2230
        #8  0x00007f45d419afdb in ?? ()
        #9  0x00007f455800a890 in ?? ()
        #10 0x00007f455800a880 in ?? ()
        #11 0x00007f45d47da480 in ?? ()
        #12 0x00007f4558009c40 in ?? ()
        #13 0x00007f453a73a730 in ?? ()
        #14 0x0000000000000000 in ?? ()

    > How did you install the OpenVINO library? Have you updated the project to specify the locations of the OpenVINO header files and library binaries?

    There are four files in the attachment:
    1 App_SmokePhone.tgz
    2 JVSmokePhone_analytics_plugin.tgz
    3 core.1683170987
    4 metadata_sdk.zip : plugin source code

    Files 1 and 2 are the plugin; you can install them as follows:
    cd /opt
    tar zxf App_SmokePhone.tgz
    tar zxf JVSmokePhone_analytics_plugin.tgz
    sudo cp JVSmokePhone_analytics_plugin /opt/networkoptix/mediaserver/bin/plugins/ -rfp
    sudo service networkoptix-mediaserver restart

    File 4 is the plugin source code; my code is in metadata_sdk/samples/App_SmokePhone.

    > Have you read our developer guide and tried to build the OpenCV plugin?
    > https://meta.nxvms.com/docs/developers/knowledgebase/200-introduction-to-creating-a-video-analytics-plugin

    No, I had not found this document before.
    I originally compiled with the default g++ 9. I switched to g++ 8 as the documentation recommends, but the problem remains.

    > Could you provide the OS version and the Metadata SDK version?

    OS: Linux mix 5.15.0-71-generic #78~20.04.1-Ubuntu SMP Wed Apr 19 11:26:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
    Metadata SDK: nxwitness-server-5.0.0.36634-linux_x64.deb

    > Could you share your project code?

    Yes.

    Attachment:

        https://pan.baidu.com/s/1RT6FlcYpKHRDJnEOs9p9nQ?pwd=nhp3 
        code:nhp3

     

  • Andrey Terentyev

    Hello,

    Could you please move your archive to another cloud storage service? To download the file, Baidu requires the Baidu network disk client to be installed, which is not something I'd like to do.

  • Andrey Terentyev

    Hello,

    I got your materials. Thanks.

    I copied the JVSmokePhone_analytics_plugin to the Server and restarted the Server. Here is what I see in

    /opt/networkoptix/mediaserver/var/log/log_file.log

    2023-05-05 16:02:25.889 1203745   INFO PluginManager(0x7f73240493a0): Considering to load Server plugin [/opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libJVSmokePhone_analytics_plugin.so]
    2023-05-05 16:02:25.890 1203745   ERROR PluginManager(0x7f73240493a0): Failed loading Server Plugin [/opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libJVSmokePhone_analytics_plugin.so] (): [cannotLoadLibrary]: Cannot load library /opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libJVSmokePhone_analytics_plugin.so: (libpugixml.so.1: cannot open shared object file: No such file or directory)

    Your plugin is missing a dependency (the libpugixml library), which should be placed in the same directory where the plugin binary resides, i.e. JVSmokePhone_analytics_plugin.

  • zwq

    Hi

    Sorry about that. My machine has the OpenVINO SDK installed, so it is not a clean environment.
    On that machine, the plugin finds libpugixml.so in /usr/lib/x86_64-linux-gnu/, where the OpenVINO SDK installed it.

    You can download the patch and install it with the following steps:
    https://drive.google.com/file/d/1lYTnRb6fZa-R24OPeeEjySPHAxxpdtQi/view?usp=share_link

    sudo tar xf JVSmokePhone_analytics_plugin.patch.tar -C /opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/
    sudo service networkoptix-mediaserver restart

    After that, the VMS restarts continuously.

  • Andrey Terentyev

    Hello,

    I have looked into your code.

    Please follow the recommendations given above:

    https://support.networkoptix.com/hc/en-us/community/posts/14135390034583/comments/14190278658711

    I'll try to explain using your code. Here it is.

    /**
     * Called when the Server sends a new uncompressed frame from a camera.
     */
    bool DeviceAgent::pushUncompressedVideoFrame(const IUncompressedVideoFrame* videoFrame)
    {
        ++m_frameIndex;
        m_lastVideoFrameTimestampUs = videoFrame->timestampUs();

        uint64_t ts_begin = jvnn::CTool::GetTimeStapUs();
        if (!m_app) {
            m_app = new jvsp::AppSmokePhone(this);
            int planeCount = videoFrame->planeCount();
            int lineSize0 = videoFrame->lineSize(0);
            int width = videoFrame->width();
            int height = videoFrame->height();
            nnDbg(NN_LOG, "creating app, frame info: wh=%d*%d, planeCount=%d, lineSize(0)=%d\n", width, height, planeCount, lineSize0);
            if (planeCount == 1 && lineSize0 == width * 3) {
                m_app->SetFrameInfo(width, height);
                m_app->LoadModel_Async();
            } else {
                nnDbg(NN_ERR, "unsupport image format, planeCount != 1 || lineSize0 != width * 3\n");
            }
        }
        m_app->HandleFrame(videoFrame->data(0), m_lastVideoFrameTimestampUs, videoFrame->width(), videoFrame->height());
        uint64_t ts_end = jvnn::CTool::GetTimeStapUs();
        nnDbg(NN_LOG, "frm: ts=%zu, cost time %zu us\n", m_lastVideoFrameTimestampUs, ts_end-ts_begin);

        return true; //< There were no errors while processing the video frame.
    }

    As the comment states, the method is called for every frame. Let's assume the frame rate is 8 frames per second. The method will then be invoked 8 times per second.

    As I understand it, you need one model per camera, meaning the model does not change from frame to frame. So it needs to be loaded only once.

    Your model works correctly only if this condition is met:

    if (planeCount == 1 && lineSize0 == width * 3)

    Here is roughly the refactoring I'd recommend.

    DeviceAgent::DeviceAgent(const nx::sdk::IDeviceInfo* deviceInfo):
        // Call the DeviceAgent helper class constructor telling it to verbosely report to stderr.
        ConsumingDeviceAgent(deviceInfo, /*enableOutput*/ true)
    {
        nnDbg(NN_LOG, "Creating the app");
        m_app = new jvsp::AppSmokePhone(this);
        if (m_app) {
            nnDbg(NN_LOG, "Loading the model");
            m_app->LoadModel();
        }
    }

    Note that the model is now loaded synchronously.

    bool DeviceAgent::pushUncompressedVideoFrame(const IUncompressedVideoFrame* videoFrame)
    {
        ++m_frameIndex;
        m_lastVideoFrameTimestampUs = videoFrame->timestampUs();

        uint64_t ts_begin = jvnn::CTool::GetTimeStapUs();
        if (!m_app) {
            return false;
        }

        int planeCount = videoFrame->planeCount();
        int lineSize0 = videoFrame->lineSize(0);
        int width = videoFrame->width();
        int height = videoFrame->height();
        nnDbg(NN_LOG, "frame info: wh=%d*%d, planeCount=%d, lineSize(0)=%d\n", width, height, planeCount, lineSize0);
        if (!(planeCount == 1 && lineSize0 == width * 3)) {
            nnDbg(NN_ERR, "unsupported image format, planeCount != 1 || lineSize0 != width * 3\n");
            return false;
        }

        m_app->SetFrameInfo(width, height);
        m_app->HandleFrame(videoFrame->data(0), m_lastVideoFrameTimestampUs, videoFrame->width(), videoFrame->height());
        uint64_t ts_end = jvnn::CTool::GetTimeStapUs();
        nnDbg(NN_LOG, "frm: ts=%zu, cost time %zu us\n", m_lastVideoFrameTimestampUs, ts_end-ts_begin);

        return true; //< There were no errors while processing the video frame.
    }

  • zwq

    Hi

    Following your suggestion, I modified the code as below, but the VMS still restarts.

    DeviceAgent::DeviceAgent(const nx::sdk::IDeviceInfo* deviceInfo):
        // Call the DeviceAgent helper class constructor telling it to verbosely report to stderr.
        ConsumingDeviceAgent(deviceInfo, /*enableOutput*/ true)
    {
        if (!m_app) {
            m_app = new jvsp::AppSmokePhone(this);
            m_app->SetFrameInfo(1280, 720);
            m_app->LoadModel();
        }
    }

    bool DeviceAgent::pushUncompressedVideoFrame(const IUncompressedVideoFrame* videoFrame)
    {
        ++m_frameIndex;
        m_lastVideoFrameTimestampUs = videoFrame->timestampUs();

        uint64_t ts_begin = jvnn::CTool::GetTimeStapUs();
        if (m_app) {
            m_app->HandleFrame(videoFrame->data(0), m_lastVideoFrameTimestampUs, videoFrame->width(), videoFrame->height());
        }
        
        uint64_t ts_end = jvnn::CTool::GetTimeStapUs();
        nnDbg(NN_LOG, "frm: ts=%zu, cost time %zu us\n", m_lastVideoFrameTimestampUs, ts_end-ts_begin);

        return true; //< There were no errors while processing the video frame.
    }

    The new plugin is available: https://drive.google.com/file/d/19qskQw5j8JrITn8CsDTPDd37SsdZLUHq/view?usp=share_link

    There is a core file in /opt/networkoptix/mediaserver/bin.
    You can open it with gdb and print the backtrace as follows:

    cd /opt/networkoptix/mediaserver/bin
    sudo chmod 777 core
    gdb mediaserver core

    (gdb) bt
    #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    #1  0x00007f92dd4b3859 in __GI_abort () at abort.c:79
    #2  0x00007f92dd954c20 in __gnu_cxx::__verbose_terminate_handler ()
        at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
    #3  0x00007f92dd9533bd in __cxxabiv1::__terminate (handler=<optimized out>)
        at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
    #4  0x00007f92dd9533ff in std::terminate ()
        at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
    #5  0x00007f92dd9535e8 in __cxxabiv1::__cxa_throw (obj=0x7f92680154e0, tinfo=0x7f92bc7cd0c8, 
        dest=0x7f928d8fd9e0)
        at /home/jenkins/conan-build-2/.conan/data/gcc-toolchain/10.2/_/_/build/f24d9d4a49445fd389b06d3f22addc2784700473/.build/x86_64-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
    #6  0x00007f928d85f9cb in ?? ()
       from /opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libopenvino.so.2230
    #7  0x00007f928dffc65d in ov::Core::Core(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
       from /opt/networkoptix/mediaserver/bin/plugins/JVSmokePhone_analytics_plugin/libopenvino.so.2230

    I suspect the cause of the crash is an incompatible version of libstdc++.

  • Andrey Terentyev

    Hello,

    > I suspect the cause of the crash is an incompatible version of libstdc++.

    That might be the case. It's a frequently encountered issue.

    However, I can't reproduce the crash on my machine with the latest binaries you've shared.

    For details, see the "Depending on libstdc++ on Linux" section in metadata_sdk/src/nx/sdk/dynamic_libraries.md.

    There are two approaches.

    1.

    Note that the plugin must use the version of `libstdc++` compatible with the one of the Server with
    which the plugin is supposed to work.

    2.

    SOLUTION: For the plugin, use Clang together with its native `libc++` instead of `libstdc++`, and 
    link to `libc++` statically to make sure that the plugin will function properly even if the Server
    will start using `libc++` at some point in the future.
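
    As a rough aid for the first approach, the sketch below reports which C++ standard library a binary was compiled against, using the libraries' own version macros. standardLibraryDescription is a hypothetical helper name. It only shows what the plugin was built with, not what the Server loads at run time, so it complements rather than replaces checking the Server's own libstdc++.

    ```cpp
    #include <string>  // Any standard header defines the library's version macros.

    // Describes which C++ standard library this translation unit was compiled against.
    // __GLIBCXX__ and _GLIBCXX_RELEASE are libstdc++ macros; _LIBCPP_VERSION is libc++'s.
    std::string standardLibraryDescription()
    {
    #if defined(_LIBCPP_VERSION)
        return "libc++ " + std::to_string(_LIBCPP_VERSION);
    #elif defined(__GLIBCXX__)
        std::string text = "libstdc++ build date " + std::to_string(__GLIBCXX__);
    #if defined(_GLIBCXX_RELEASE)
        text += ", GCC release " + std::to_string(_GLIBCXX_RELEASE);
    #endif
        return text;
    #else
        return "unknown standard library";
    #endif
    }
    ```

    Logging this string once when the plugin starts makes it easier to compare the plugin's toolchain against the one the Server was built with.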

    On my machine, the dependencies look like this

    You could try building your plugin with the same Conan profile settings the Server was built with, to ensure library compatibility. You can find the profiles in our repo on GitHub, https://github.com/networkoptix/nx_open, under "conan profiles".

    Make sure to check out the "vms_5.0" branch after cloning the repo.

     

     
