Extending Python with C

Last updated: 2022-08-12 16:15:58

Using C to extend Python

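  1. The PyErr_*() functions record a pending exception in the Python interpreter by setting its per-thread error indicator
  2. PyErr_Clear() clears that error indicator. Calling it is roughly what the except clause of a try: except: statement does in Python code when it swallows the exception
  3. Python's C API names follow the pattern PyMODNAME_FUNCINMOD(), for example PyList_Append() or PyDict_GetItem()
  4. The function template for a Python extension module: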


// First
static PyObject *exfunc(PyObject *self, PyObject *args) {
    ...
    // Get parameters
    /*
        PyArg_ParseTuple() returns 0 (false) when the arguments do not match the format string. It has
        already set an exception in that case, so stop and return NULL.
        According to the official documentation:
            It returns NULL (the error indicator for functions returning object pointers) if an error is detected in the argument list
    */
    if (!PyArg_ParseTuple(args, "format_string", &the_addr_of_c_variables)) {
        return NULL;
    }
    
    /*
        From here on, if a call to one of your own functions fails, you are expected to call an exception
    API such as PyErr_SetString(PyExc_Type, "Exc_Info") to set the interpreter's error indicator, and
    then return NULL.
    */
    ...
}


// Second
// This is the `method table`
static PyMethodDef exmethods[] = {

    /*
        {NULL, NULL, 0, NULL} is the sentinel that marks the end of the PyMethodDef array. The Python
    interpreter stops walking the array when it reaches it.
     */

     /*
        According to the official documentation:
            When using only METH_VARARGS, the function should expect the Python-level parameters to be passed in as a tuple
            acceptable for parsing via PyArg_ParseTuple()
     */
    /* The flags are METH_VARARGS, or METH_VARARGS | METH_KEYWORDS for a function that also takes keyword arguments */
    {"MethodName", exfunc, METH_VARARGS, "Documents here"},
    {NULL, NULL, 0, NULL}
};

// Third
static struct PyModuleDef exmodule = {
    PyModuleDef_HEAD_INIT,
    "name",     /* m_name */
    "doc",      /* m_doc */
    -1,         /* m_size: -1 means the module keeps its state in global variables */
    exmethods,  /* m_methods: the method table above */
};


// Fourth
/*
    What the Python interpreter actually calls on import is the PyInit_*() function, so it is the entry
point of your C extension module
*/
PyMODINIT_FUNC PyInit_exmodule(void) {
    PyObject *m;
    /*
         A module object is created from its PyModuleDef, which carries the method table. A module can
         also hold classes, exceptions, and so on. To add such an object, create it, for example with
         `PyErr_NewException("spam.error", NULL, NULL);`, which creates a new exception class derived
         from Exception, and attach it to the module with the PyModule_AddObject() API
    */
    m = PyModule_Create(&exmodule);
    if (!m) {
        return NULL;
    }
    return m;
}


/*******************************************************************************************************/

// In Python 2.x, steps three and four differ from Python 3.x


// Third
PyMODINIT_FUNC initname(void) {
    // Do not return a value: PyMODINIT_FUNC expands to void in Python 2.x, whereas it expands to PyObject * in Python 3.x
    // Use Py_InitModule("name", method_table) rather than PyModule_Create(&module_def)
    Py_InitModule("name", method_table);  // method_table is the PyMethodDef array from step two
}


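  5. Important: avoid calling malloc() directly. If you do and it fails, you must call PyErr_NoMemory() yourself and return a failure indicator. Object-creating API functions such as PyLong_FromLong() already set this exception internally, so for them you only need to check the return value
  6. Most Python API functions that return an integer status return a positive value or zero on success and -1 on failure, and functions that return a pointer return NULL on failure. PyArg_ParseTuple() is the exception: it returns nonzero on success and 0 on failure. The official manual puts it this way: with the important exception of
PyArg_ParseTuple() and friends, functions that return an integer status usually
return a positive value or zero for success and -1 for failure
  7. If a function in the extension module has nothing to return, return the Py_None object with its reference count incremented (the Py_RETURN_NONE macro does the same thing)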
Py_INCREF(Py_None);
return Py_None;
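  8. A function that returns NULL is treated by Python as having raised an exception, so set the error indicator before returning NULL
  9. Apart from PyMODINIT_FUNC PyInit_name(), which the Python interpreter calls and which is therefore not static, every other function and variable should be declared static
  10. The name of the PyMODINIT_FUNC PyInit_name() function must start with PyInit_, and the function must return the module object. Only then can Python pick up the reference and add it to sys.modules, the global dictionary of loaded modules
  11. When Python executes import extend_modules, the interpreter calls the C-level PyInit_extend_modules() function, which calls PyModule_Create to create the module and load the functions defined in the method table
  12. Put another way, Python's many built-in modules were written by Guido and the other CPython developers against this same API, so the extension modules we write can loosely be thought of as built-in modules too
  13. The C sources of standard modules such as heapq and functools are in the Modules/ directory of the CPython source tree, in files named like Modules/_heapqmodule.c and Modules/_functoolsmodule.c
  14. PyCallable_Check() tests whether a PyObject * is callable
  15. PyObject_CallObject(my_callback, arglist); takes the callable (a PyObject *) as its first argument and the arguments for the call as its second, also a PyObject * (a tuple). This form has no keyword arguments. PyObject_Call(my_callback, arglist, argkw); accepts both positional and keyword arguments
  16. Creating objects with the Py_BuildValue() function is the approach the official documentation uses
  17. After every Python API call you need to check the return value to see whether an exception occurred, so extension modules written in C tend to be verbose
  18. The arguments are a PyTupleObject. To create an argument list yourself: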
PyObject *arglist;
arglist = Py_BuildValue("(i, l)", 8, 8888888L);
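  19. Use Py_INCREF, Py_DECREF, Py_XINCREF, and Py_XDECREF with care
  20. After an argument list we created ourselves has been passed to a call and the call has finished, use the Py_DECREF macro to release our reference to it. Put plainly, this lets the tuple be freed, since it is no longer needed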
int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict, const char *format, char *kwlist[], ...);
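arg: the tuple of positional arguments
kwdict: the dict of keyword arguments (may be NULL)
format: the format string
kwlist: the NULL-terminated array of parameter names


NOTE!! This function parses positional and keyword arguments together, so every parameter in the format string needs a name in kwlist, and the calling function must be registered with METH_VARARGS | METH_KEYWORDS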
  21. The Py_BuildValue() function (the call is on the left, the resulting Python value on the right)
Py_BuildValue("")                        None
Py_BuildValue("i", 123)                  123
Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
Py_BuildValue("s", "hello")              'hello'
Py_BuildValue("y", "hello")              b'hello'
Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
Py_BuildValue("s#", "hello", 4)          'hell'
Py_BuildValue("y#", "hello", 4)          b'hell'
Py_BuildValue("()")                      ()
Py_BuildValue("(i)", 123)                (123,)
Py_BuildValue("(ii)", 123, 456)          (123, 456)
Py_BuildValue("(i,i)", 123, 456)         (123, 456)
Py_BuildValue("[i,i]", 123, 456)         [123, 456]
Py_BuildValue("{s:i,s:i}",
              "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
Py_BuildValue("((ii)(ii)) (ii)",
              1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
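  22. As we know, Python knows to use a specific Py_Init
  23. A borrowed reference does not increase the object's reference count. Calling Py_INCREF on a borrowed reference turns it into a reference you own
  24. Almost every Python API function that creates a PyObject returns an owned reference, and so do most functions that extract one object from another. The exceptions, such as PyTuple_GetItem(), PyList_GetItem(), PyDict_GetItem(), PyImport_AddModule() and PyDict_GetItemString(), return borrowed references
  25. When you pass an object into a Python API function, the function usually only borrows your reference and does not take ownership of it. PyTuple_SetItem() and PyList_SetItem() are the exceptions: they steal the reference to the item
  26. Supplementing the two points above (24 and 25): a borrowed reference is only good for the lifetime of the current C function. If the object needs to be stored, call Py_INCREF to take your own reference. Likewise, if a borrowed object is to be returned as the function's result, its reference count must be incremented first
  27. Python API functions generally do not check whether the objects passed to them are NULL
  28. Module initialization work is done in PyInit_
  29. If you want to add a new built-in type (this is the part I know best, and a painful lesson it was, because I have written one): define the struct for the type with PyObject_HEAD as its first member, then initialize the type object (it must set .tp_flags = Py_TPFLAGS_DEFAULT,), in PyInit_
  30. The name set in an extension module has the form modulename.subname, for example "custom2.Custom"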
This initializes the Custom type, filling in a number of members to the appropriate default values, including ob_type that we initially set to NULL.
if (PyType_Ready(&CustomType) < 0)
    return NULL;
  

This adds the type to the module dictionary.
PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
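  31. A complete type definition. The markers 1, 2, and 3 show why every type you create has new, init, and del: they must be named in the type's PyTypeObject, and they are not controlled through the PyMethodDef table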
static PyTypeObject CustomType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom2.Custom",
    .tp_doc = "Custom objects",
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, // HAVE_GC is required for tp_traverse and tp_clear
    .tp_new = Custom_new, // __new__  1
    .tp_init = (initproc) Custom_init, // __init__ 2
    .tp_dealloc = (destructor) Custom_dealloc, // destructor, the counterpart of __del__ 3
    .tp_traverse = (traverseproc) Custom_traverse, // called by the cyclic GC
    .tp_clear = (inquiry) Custom_clear, // called by the cyclic GC
    .tp_members = Custom_members, // the PyMemberDef table
    .tp_methods = Custom_methods, // the PyMethodDef table
    .tp_getset = Custom_getsetters, // the PyGetSetDef table; getters and setters are kept separate from tp_methods
};


// The traverse and clear functions used by the cyclic garbage collector
static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
    Py_VISIT(self->first);
    Py_VISIT(self->last);
    return 0;
}

static int
Custom_clear(CustomObject *self)
{
    Py_CLEAR(self->first);
    Py_CLEAR(self->last);
    return 0;
}
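  32. To inherit from a type, the first member of the new type's struct must be the base type's object struct rather than HEAD, because the base struct already contains HEAD. A pointer to a new object can then be cast either to the new type or to its parent type. The init function must also set the type object's tp_base, which is what implements the inheritance at the Python level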

Demo file, following the official documentation

#include <Python.h>

typedef struct {
    PyListObject list;
    int state;
} SubListObject;

static PyObject *
SubList_increment(SubListObject *self, PyObject *unused)
{
    self->state++;
    return PyLong_FromLong(self->state);
}

static PyMethodDef SubList_methods[] = {
    {"increment", (PyCFunction) SubList_increment, METH_NOARGS,
     PyDoc_STR("increment state counter")},
    {NULL},
};

static int
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
{
    if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
        return -1;
    self->state = 0;
    return 0;
}

static PyTypeObject SubListType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "sublist.SubList",
    .tp_doc = "SubList objects",
    .tp_basicsize = sizeof(SubListObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    .tp_init = (initproc) SubList_init,
    .tp_methods = SubList_methods,
};

static PyModuleDef sublistmodule = {
    PyModuleDef_HEAD_INIT,
    .m_name = "sublist",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_sublist(void)
{
    PyObject *m;
    SubListType.tp_base = &PyList_Type;
    if (PyType_Ready(&SubListType) < 0)
        return NULL;

    m = PyModule_Create(&sublistmodule);
    if (m == NULL)
        return NULL;

    Py_INCREF(&SubListType);
    PyModule_AddObject(m, "SubList", (PyObject *) &SubListType);
    return m;
}

Another Demo

#include <Python.h>
#include "structmember.h"

typedef struct {
    PyObject_HEAD
    PyObject *first; /* first name */
    PyObject *last;  /* last name */
    int number;
} CustomObject;

static void
Custom_dealloc(CustomObject *self)
{
    Py_XDECREF(self->first);
    Py_XDECREF(self->last);
    Py_TYPE(self)->tp_free((PyObject *) self);
}

static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    CustomObject *self;
    self = (CustomObject *) type->tp_alloc(type, 0);
    if (self != NULL) {
        self->first = PyUnicode_FromString("");
        if (self->first == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->last = PyUnicode_FromString("");
        if (self->last == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->number = 0;
    }
    return (PyObject *) self;
}

static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"first", "last", "number", NULL};
    PyObject *first = NULL, *last = NULL, *tmp;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                     &first, &last,
                                     &self->number))
        return -1;

    if (first) {
        tmp = self->first;
        Py_INCREF(first);
        self->first = first;
        Py_DECREF(tmp);
    }
    if (last) {
        tmp = self->last;
        Py_INCREF(last);
        self->last = last;
        Py_DECREF(tmp);
    }
    return 0;
}

static PyMemberDef Custom_members[] = {
    {"number", T_INT, offsetof(CustomObject, number), 0,
     "custom number"},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
    Py_INCREF(self->first);
    return self->first;
}

static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
    PyObject *tmp;
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The first attribute value must be a string");
        return -1;
    }
    tmp = self->first;
    Py_INCREF(value);
    self->first = value;
    Py_DECREF(tmp);
    return 0;
}

static PyObject *
Custom_getlast(CustomObject *self, void *closure)
{
    Py_INCREF(self->last);
    return self->last;
}

static int
Custom_setlast(CustomObject *self, PyObject *value, void *closure)
{
    PyObject *tmp;
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The last attribute value must be a string");
        return -1;
    }
    tmp = self->last;
    Py_INCREF(value);
    self->last = value;
    Py_DECREF(tmp);
    return 0;
}

static PyGetSetDef Custom_getsetters[] = {
    {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
     "first name", NULL},
    {"last", (getter) Custom_getlast, (setter) Custom_setlast,
     "last name", NULL},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
    return PyUnicode_FromFormat("%S %S", self->first, self->last);
}

static PyMethodDef Custom_methods[] = {
    {"name", (PyCFunction) Custom_name, METH_NOARGS,
     "Return the name, combining the first and last name"
    },
    {NULL}  /* Sentinel */
};

static PyTypeObject CustomType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom3.Custom",
    .tp_doc = "Custom objects",
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    .tp_new = Custom_new,
    .tp_init = (initproc) Custom_init,
    .tp_dealloc = (destructor) Custom_dealloc,
    .tp_members = Custom_members,
    .tp_methods = Custom_methods,
    .tp_getset = Custom_getsetters,
};

static PyModuleDef custommodule = {
    PyModuleDef_HEAD_INIT,
    .m_name = "custom3",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_custom3(void)
{
    PyObject *m;
    if (PyType_Ready(&CustomType) < 0)
        return NULL;

    m = PyModule_Create(&custommodule);
    if (m == NULL)
        return NULL;

    Py_INCREF(&CustomType);
    PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
    return m;
}