Updated: 2023-02-26 17:38:28
fitpack_basis uses a double loop which iteratively modifies elements in bb and b. I don't see a way to use NumPy to vectorize these loops, since the values of bb and b at each stage of the iteration depend on the values from previous iterations. In situations like this, Cython can sometimes be used to improve the performance of the loops.
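To make the dependency concrete, here is a minimal pure-Python sketch of the same recurrence for a single sample point, ported from the Cython code in this answer. The helper name basis_row is hypothetical, and it copies the whole row into bb for simplicity rather than only the first j entries. Each degree j is built from the values of degree j-1, and within one degree the update at index i reads what the step at i-1 just wrote, which is why the j and i loops cannot be collapsed into one array expression.

```python
import numpy as np

def basis_row(u, c, d=3):
    """Evaluate all c B-spline basis functions at one parameter value u.

    Hypothetical helper for illustration; mirrors the recurrence in
    cython_fitpack_basis for a single sample instead of n samples.
    """
    # Clamped integer knot vector, same construction as in the Cython code.
    kv = np.array([0] * d + list(range(c - d + 1)) + [c - d] * d, dtype=float)
    left = int(np.clip(np.floor(u), 0, c - d - 1))  # knot span containing u
    right = left + d + 1
    b = np.zeros(c)
    b[left] = 1.0  # degree-0 basis: 1 on the span that contains u
    for j in range(1, d + 1):
        bb = b.copy()  # snapshot of the degree j-1 values
        b[left] = 0.0
        for i in range(j):
            f = bb[left + i] / (kv[right + i] - kv[right + i - j])
            # b[left + i] was written by the previous i step (or zeroed above)
            b[left + i] += f * (kv[right + i] - u)
            b[left + i + 1] = f * (u - kv[right + i - j])
    return b

row = basis_row(1.5, c=6, d=3)
print(row, row.sum())
```

The samples themselves are independent of each other, so the loop over k in the full function is the only one that carries no dependency.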
Here is a Cython-ized version of fitpack_basis, which runs as fast as bspline_basis. The main ideas used to boost speed with Cython are to declare the type of every variable, and to rewrite all uses of NumPy fancy indexing as loops using plain integer indexing.
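As a small illustration of the second idea, the sketch below sets one element per row in two ways. It is plain Python so it runs without a build step. The fancy-indexing line is only my guess at how a NumPy version of that statement might look, and the sample values of n, c and left are made up. The loop form is the shape that Cython can compile to direct element access once k and the arrays are typed.

```python
import numpy as np

n, c = 5, 6
# Made-up span indices, one per sample row.
left = np.array([0, 0, 1, 2, 2], dtype=np.int64)

# Fancy-indexing form: one vectorized assignment over all rows.
b_fancy = np.zeros((n, c))
b_fancy[np.arange(n), left] = 1.0

# Loop form: plain integer indexing, one element per iteration.
b_loop = np.zeros((n, c))
for k in range(n):
    b_loop[k, left[k]] = 1.0

print(np.array_equal(b_fancy, b_loop))  # prints True
```

In pure Python the loop form is slower than the fancy-indexing form; the benefit only appears after Cython compiles the typed loop.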
See this page for instructions on how to build the code and run it from Python.
import numpy as np
cimport numpy as np
cimport cython

ctypedef np.float64_t DTYPE_f
ctypedef np.int64_t DTYPE_i

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def cython_fitpack_basis(int c, int n=100, int d=3,
                         double rMinOffset=0, double rMaxOffset=0):
    """ fitpack's spline basis function
    c = number of control points.
    n = number of points on the curve.
    d = curve degree
    """
    cdef Py_ssize_t i, j, k, l
    cdef double f

    # Create knot vector
    # (list(...) keeps the concatenation valid on Python 3, where range is not a list)
    cdef np.ndarray[DTYPE_i, ndim=1] kv = np.array(
        [0]*d + list(range(c-d+1)) + [c-d]*d, dtype=np.int64)

    # Create sample range
    cdef np.ndarray[DTYPE_f, ndim=1] u = np.linspace(
        rMinOffset, rMaxOffset + c - d, n)

    # basis
    cdef np.ndarray[DTYPE_f, ndim=2] b = np.zeros((n,c))

    # basis buffer
    cdef np.ndarray[DTYPE_f, ndim=2] bb = np.zeros((n,c))

    # left knot vector indices
    cdef np.ndarray[DTYPE_i, ndim=1] left = np.clip(np.floor(u), 0, c-d-1).astype(np.int64)

    # right knot vector indices
    cdef np.ndarray[DTYPE_i, ndim=1] right = left+d+1

    # degree 0: each sample lies in exactly one knot span
    for k in range(n):
        b[k, left[k]] = 1.0

    # raise the degree one step at a time
    for j in range(1, d+1):
        for l in range(j):
            for k in range(n):
                bb[k, left[k] + l] = b[k, left[k] + l]
                b[k, left[k]] = 0.0
        for i in range(j):
            for k in range(n):
                f = bb[k, left[k]+i] / (kv[right[k]+i] - kv[right[k]+i-j])
                b[k, left[k]+i] = b[k, left[k]+i] + f * (kv[right[k]+i] - u[k])
                b[k, left[k]+i+1] = f * (u[k] - kv[right[k]+i-j])
    return b
Using this timeit code to benchmark its performance,
import timeit
import numpy as np
import cython_bspline as CB
import numpy_bspline as NB

c = 6
n = 10**5
d = 3

# Check that all three implementations agree before timing them
fb = NB.fitpack_basis(c, n, d)
bb = NB.bspline_basis(c, n, d)
cfb = CB.cython_fitpack_basis(c,n,d)
assert np.allclose(bb, fb)
assert np.allclose(cfb, fb)
# print(NB.fitpack_basis(c,5,d,rMinOffset=-0.1,rMaxOffset=.2))

timing = dict()
timing['NB.fitpack_basis'] = timeit.timeit(
    stmt='NB.fitpack_basis(c, n, d)',
    setup='from __main__ import NB, c, n, d',
    number=10)
timing['NB.bspline_basis'] = timeit.timeit(
    stmt='NB.bspline_basis(c, n, d)',
    setup='from __main__ import NB, c, n, d',
    number=10)
timing['CB.cython_fitpack_basis'] = timeit.timeit(
    stmt='CB.cython_fitpack_basis(c, n, d)',
    setup='from __main__ import CB, c, n, d',
    number=10)

for func_name, t in timing.items():
    print("{:>25}: {:.4f}".format(func_name, t))
It appears that Cython can make the fitpack_basis code run as fast as (and perhaps a bit faster than) bspline_basis. The timings are total seconds for 10 calls:
         NB.bspline_basis: 0.3322
  CB.cython_fitpack_basis: 0.2939
         NB.fitpack_basis: 0.9182
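To put those figures in relative terms, the short sketch below divides the NumPy fitpack_basis time by each of the others. The numbers are copied from the output above; the variable names baseline and speedup are only for illustration.

```python
# Timings copied from the benchmark output (total seconds for 10 calls each).
timing = {
    'NB.fitpack_basis': 0.9182,
    'NB.bspline_basis': 0.3322,
    'CB.cython_fitpack_basis': 0.2939,
}

# Speedup relative to the original NumPy fitpack_basis.
baseline = timing['NB.fitpack_basis']
speedup = {name: baseline / t for name, t in timing.items()}

for name, s in sorted(speedup.items(), key=lambda item: item[1]):
    print("{:>25}: {:.2f}x".format(name, s))
```

On these numbers the Cython version is roughly 3.1 times faster than the NumPy fitpack_basis, and bspline_basis takes about 13% longer than the Cython version. A single run like this does not show whether that last gap is significant.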