且构网

How to send a cv::Mat to Python over shared memory?

Updated: 2023-01-01 16:30:10

The general idea (as used in the OpenCV Python bindings) is to create a numpy ndarray that shares its data buffer with the Mat object, and pass that to the Python function.

Note: At this point, I'll limit the example to continuous matrices only.

We can take advantage of the pybind11::array class.

  • We need to determine the appropriate dtype for the numpy array to use. This is a simple 1-to-1 mapping, which we can do using a switch:

py::dtype determine_np_dtype(int depth)
{
    switch (depth) {
    case CV_8U: return py::dtype::of<uint8_t>();
    case CV_8S: return py::dtype::of<int8_t>();
    case CV_16U: return py::dtype::of<uint16_t>();
    case CV_16S: return py::dtype::of<int16_t>();
    case CV_32S: return py::dtype::of<int32_t>();
    case CV_32F: return py::dtype::of<float>();
    case CV_64F: return py::dtype::of<double>();
    default:
        throw std::invalid_argument("Unsupported data type.");
    }
}
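The switch works on the depth rather than on the full type code because OpenCV packs the channel count into the same integer. As a rough illustration, assuming the OpenCV 4.x encoding (depth in the 3 low bits, channel count minus one above them), the split that m.depth() and m.channels() perform can be sketched in plain C++. The names MatType, split_mat_type, kChannelShift, kDepthMask and kChannelMask are hypothetical stand-ins, not OpenCV API.

```cpp
// Stand-ins for OpenCV's channel shift and depth mask. The values assume the
// OpenCV 4.x type encoding; they are not read from the OpenCV headers here.
constexpr int kChannelShift = 3;
constexpr int kDepthMask = (1 << kChannelShift) - 1;  // low 3 bits: depth
constexpr int kChannelMask = 511;                     // room for 512 channels

struct MatType {
    int depth;     // element type code, e.g. 0 for CV_8U, 5 for CV_32F
    int channels;  // number of interleaved values per pixel
};

// Split a packed type code into its depth and channel count.
constexpr MatType split_mat_type(int type)
{
    return MatType{type & kDepthMask,
                   ((type >> kChannelShift) & kChannelMask) + 1};
}

// Under this scheme CV_8UC3 is encoded as 16: depth 0 (CV_8U), 3 channels.
static_assert(split_mat_type(16).depth == 0, "depth of CV_8UC3");
static_assert(split_mat_type(16).channels == 3, "channels of CV_8UC3");
```

So a CV_8UC3 image and a CV_8UC1 image both reach the CV_8U case of the switch; the channel count only affects the shape.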

  • Determine the shape for the numpy array. To make this behave similarly to OpenCV, let's have it map 1-channel Mats to 2D numpy arrays, and multi-channel Mats to 3D numpy arrays.

    std::vector<std::size_t> determine_shape(cv::Mat& m)
    {
        if (m.channels() == 1) {
            return {
                static_cast<size_t>(m.rows)
                , static_cast<size_t>(m.cols)
            };
        }
    
        return {
            static_cast<size_t>(m.rows)
            , static_cast<size_t>(m.cols)
            , static_cast<size_t>(m.channels())
        };
    }
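When only a shape is given (no explicit strides), pybind11 assumes a C-contiguous layout for the array, which is why the example is limited to continuous Mats. The following sketch, in plain C++ with the hypothetical helper name c_contiguous_strides, shows the byte strides such a layout implies for a given shape and element size.

```cpp
#include <cstddef>
#include <vector>

// Byte strides of a C-contiguous (row-major) array: the last axis advances by
// one element, and each earlier axis advances by the full size of the axes
// that follow it.
std::vector<std::size_t> c_contiguous_strides(
    const std::vector<std::size_t>& shape, std::size_t item_size)
{
    std::vector<std::size_t> strides(shape.size());
    std::size_t stride = item_size;
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = stride;
        stride *= shape[i];
    }
    return strides;
}
```

For a 1024 x 1024 CV_8UC3 image, c_contiguous_strides({1024, 1024, 3}, 1) gives {3072, 3, 1}: no padding is allowed between rows.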
    

  • Provide a means of extending the shared buffer's lifetime to the lifetime of the numpy array. We can create a pybind11::capsule around a shallow copy of the source Mat. Because a Mat copy shares the same reference-counted data buffer, this keeps the buffer alive for as long as the numpy array needs it.

    py::capsule make_capsule(cv::Mat& m)
    {
        return py::capsule(new cv::Mat(m)
            , [](void *v) { delete reinterpret_cast<cv::Mat*>(v); }
            );
    }
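The lifetime trick can be illustrated without OpenCV or pybind11. In this sketch a std::shared_ptr plays the role of the reference-counted Mat, and a small hypothetical Capsule class plays the role of py::capsule (an opaque pointer plus a destructor callback). It is an analogy under those assumptions, not how either library is implemented.

```cpp
#include <memory>
#include <vector>

// Stand-in for cv::Mat: a handle to a reference-counted pixel buffer.
using Buffer = std::shared_ptr<std::vector<unsigned char>>;

// Hypothetical stand-in for py::capsule: owns an opaque pointer and calls
// the supplied destructor on it when the capsule itself is destroyed.
class Capsule {
public:
    Capsule(void* ptr, void (*destroy)(void*)) : ptr_(ptr), destroy_(destroy) {}
    ~Capsule() { destroy_(ptr_); }
    Capsule(const Capsule&) = delete;
    Capsule& operator=(const Capsule&) = delete;

private:
    void* ptr_;
    void (*destroy_)(void*);
};

// Heap-allocate a second handle to the same buffer (the "shallow copy") and
// let the capsule delete that handle later, releasing its reference.
std::unique_ptr<Capsule> make_capsule_like(const Buffer& buffer)
{
    return std::unique_ptr<Capsule>(new Capsule(
        new Buffer(buffer),
        [](void* v) { delete static_cast<Buffer*>(v); }));
}
```

While the capsule exists, the buffer's use count is one higher, so the original handle can go out of scope without freeing the data; destroying the capsule drops that extra reference.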
    

  • Now, we can perform the conversion.

    py::array mat_to_nparray(cv::Mat& m)
    {
        if (!m.isContinuous()) {
            throw std::invalid_argument("Only continuous Mats supported.");
        }
    
        return py::array(determine_np_dtype(m.depth())
            , determine_shape(m)
            , m.data
            , make_capsule(m));
    }
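A Mat that fails the isContinuous() check (for example a region of interest inside a larger image) has padding between its rows. One way to still hand it over is to pack the rows into a contiguous copy first, at the cost of no longer sharing memory with the original. The sketch below shows that idea on a raw byte buffer in plain C++; is_continuous and pack_rows are hypothetical helpers, with step meaning the distance in bytes between the starts of consecutive rows.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Rows are contiguous when each row starts right where the previous one ends.
bool is_continuous(std::size_t rows, std::size_t row_bytes, std::size_t step)
{
    return rows <= 1 || step == row_bytes;
}

// Copy `rows` rows of `row_bytes` bytes each, spaced `step` bytes apart,
// into a tightly packed buffer with no padding between rows.
std::vector<unsigned char> pack_rows(const unsigned char* data,
                                     std::size_t rows,
                                     std::size_t row_bytes,
                                     std::size_t step)
{
    std::vector<unsigned char> packed(rows * row_bytes);
    if (row_bytes == 0) {
        return packed;  // nothing to copy
    }
    for (std::size_t r = 0; r < rows; ++r) {
        std::memcpy(packed.data() + r * row_bytes, data + r * step, row_bytes);
    }
    return packed;
}
```

With OpenCV itself, calling clone() on the Mat should give an equivalent continuous copy, which could then be passed to mat_to_nparray.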
    

    Let's assume we have a Python function like

    def foo(arr):
        print(arr.shape)
    

    captured in a pybind object fun. Then, to call this function from C++ using a Mat as the source, we'd do something like this:

    cv::Mat img; // Initialize this somehow
    
    auto result = fun(mat_to_nparray(img));
    

    Example program

    #include <pybind11/pybind11.h>
    #include <pybind11/embed.h>
    #include <pybind11/numpy.h>
    #include <pybind11/stl.h>
    
    #include <opencv2/opencv.hpp>
    
    #include <iostream>
    
    namespace py = pybind11;
    
    // The 4 functions from above go here...
    
    int main()
    {
        // Start the interpreter and keep it alive
        py::scoped_interpreter guard{};
    
        try {
            auto locals = py::dict{};
    
            py::exec(R"(
                import numpy as np
    
                def test_cpp_to_py(arr):
                    return (arr[0,0,0], 2.0, 30)
            )");
    
            auto test_cpp_to_py = py::globals()["test_cpp_to_py"];
    
    
            for (int i = 0; i < 10; i++) {
                int64 t0 = cv::getTickCount();
    
                cv::Mat img(cv::Mat::zeros(1024, 1024, CV_8UC3) + cv::Scalar(1, 1, 1));
    
                int64 t1 = cv::getTickCount();
    
                auto result = test_cpp_to_py(mat_to_nparray(img));
    
                int64 t2 = cv::getTickCount();
    
                double delta0 = (t1 - t0) / cv::getTickFrequency() * 1000;
                double delta1 = (t2 - t1) / cv::getTickFrequency() * 1000;
    
                std::cout << "* " << delta0 << " ms | " << delta1 << " ms" << std::endl;
            }        
        } catch (py::error_already_set& e) {
            std::cerr << e.what() << "\n";
        }
    
        return 0;
    }
    

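    Console output (first column: time to create the Mat; second column: time to convert it and call the Python function)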

    * 4.56413 ms | 0.225657 ms
    * 3.95923 ms | 0.0736127 ms
    * 3.80335 ms | 0.0438603 ms
    * 3.99262 ms | 0.0577587 ms
    * 3.82262 ms | 0.0572 ms
    * 3.72373 ms | 0.0394603 ms
    * 3.74014 ms | 0.0405079 ms
    * 3.80621 ms | 0.054546 ms
    * 3.72177 ms | 0.0386222 ms
    * 3.70683 ms | 0.0373651 ms