2.8. Interfacing with C
Author: Valentin Haenel
This chapter contains an introduction to the many different routes for
making your native code (primarily C/C++) available from Python, a
process commonly referred to as wrapping. The goal of this chapter is to
give you a flavour of what technologies exist and what their respective
merits and shortcomings are, so that you can select the appropriate one
for your specific needs. In any case, once you do start wrapping, you
will almost certainly want to consult the respective documentation for
your selected technique.
2.8.1. Introduction
This chapter covers the following techniques:
Python-C-Api
Ctypes
SWIG
Cython
These four techniques are perhaps the best known, of which Cython is probably the most advanced and the one you should consider using first. The others are also important if you want to understand the wrapping problem from different angles. There are other alternatives out there, but having understood the basics of the ones above, you will be in a position to evaluate the technique of your choice to see if it fits your needs.
The following criteria may be useful when evaluating a technology:
Are additional libraries required?
Is the code autogenerated?
Does it need to be compiled?
Is there good support for interacting with NumPy arrays?
Does it support C++?
Before you set out, you should consider your use case. When interfacing with native code, there are usually two use-cases that come up:
Existing code in C/C++ needs to be leveraged, either because it already exists or because it is faster.
Python code is too slow, so the inner loops are pushed to native code.
Each technology is demonstrated by wrapping the cos function from math.h.
While this is a mostly trivial example, it should serve us well
to demonstrate the basics of the wrapping solution. Since each technique also
includes some form of NumPy support, this is also demonstrated using an
example where the cosine is computed on some kind of array.
Last but not least, a few small warnings:
All of these techniques may crash (segmentation fault) the Python interpreter, which is (usually) due to bugs in the C code.
All the examples have been done on Linux; they should be possible on other operating systems.
You will need a C compiler for most of the examples.
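Every section below ends by calling the wrapped function on the same few inputs and then on a string. The following is a minimal sketch of that check as a reusable helper. The name `check_cos_func` is hypothetical and not part of the original examples, and `math.cos` merely stands in for a wrapped function here.

```python
import math


def check_cos_func(func):
    """Compare a wrapped cosine against math.cos on the chapter's sample inputs."""
    for value in (1.0, 0.0, 3.14159265359):
        assert math.isclose(func(value), math.cos(value), abs_tol=1e-12)
    # A wrapper should reject non-numeric input with an exception, not a crash.
    try:
        func("foo")
    except Exception as exc:
        return type(exc).__name__
    raise AssertionError("expected an exception for non-numeric input")


# math.cos stands in for cos_module.cos_func in this sketch
print(check_cos_func(math.cos))
```

The helper returns the name of the exception type, which makes it easy to compare how each wrapping technique reports bad input.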
2.8.2. Python-C-Api
The Python-C-API is the backbone of the standard Python interpreter (a.k.a. CPython). Using this API it is possible to write Python extension modules in C and C++. Obviously, these extension modules can, by virtue of language compatibility, call any function written in C or C++.
When using the Python-C-API, one usually writes much boilerplate code, first to parse the arguments that were given to a function, and later to construct the return type.
Advantages
Requires no additional libraries
Lots of low-level control
Entirely usable from C++
Disadvantages
May require a substantial amount of effort
Much overhead in the code
Must be compiled
High maintenance cost
No forward compatibility across Python versions as C-API changes
Reference count bugs are easy to create and very hard to track down.
Note
The Python-C-Api example here serves mainly didactic purposes. Many of the other techniques actually depend on this, so it is good to have a high-level understanding of how it works. In 99% of the use-cases you will be better off using an alternative technique.
Note
Since reference counting bugs are easy to create and hard to track down, anyone really needing to use the Python C-API should read the section about objects, types and reference counts from the official python documentation. Additionally, there is a tool by the name of cpychecker which can help discover common errors with reference counting.
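To get a feel for what is being counted, the reference count of an object can be observed from Python itself. This is only an illustrative sketch using `sys.getrefcount` from the standard library; in C the same bookkeeping has to be done by hand with `Py_INCREF` and `Py_DECREF`, which is where the bugs tend to come from.

```python
import sys

value = object()
before = sys.getrefcount(value)

# Storing the object in a list creates one more reference to it ...
holder = [value]
during = sys.getrefcount(value)

# ... and dropping the list releases that reference again.
del holder
after = sys.getrefcount(value)

print(before, during, after)
```

The absolute numbers depend on the interpreter, but the count goes up by one while the list exists and returns to its starting value afterwards.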
2.8.2.1. Example
The following C-extension module makes the cos function from the standard
math library available to Python:
/* Example of wrapping cos function from math.h with the Python-C-API. */
#include <Python.h>
#include <math.h>
/* wrapped cosine function */
static PyObject* cos_func(PyObject* self, PyObject* args)
{
    double value;
    double answer;
    /* parse the input, from python float to c double */
    if (!PyArg_ParseTuple(args, "d", &value))
        return NULL;
    /* if the above function returns false (0), an appropriate Python
     * exception will have been set, and the function simply returns NULL
     */
    /* call cos from libm */
    answer = cos(value);
    /* construct the output from cos, from c double to python float */
    return Py_BuildValue("d", answer);
}
/* define functions in module */
static PyMethodDef CosMethods[] =
{
    {"cos_func", cos_func, METH_VARARGS, "evaluate the cosine"},
    {NULL, NULL, 0, NULL}
};
#if PY_MAJOR_VERSION >= 3
/* module initialization */
/* Python version 3 */
static struct PyModuleDef cModPyDem =
{
    PyModuleDef_HEAD_INIT,
    "cos_module", "Some documentation",
    -1,
    CosMethods
};
PyMODINIT_FUNC
PyInit_cos_module(void)
{
    return PyModule_Create(&cModPyDem);
}
#else
/* module initialization */
/* Python version 2 */
PyMODINIT_FUNC
initcos_module(void)
{
    (void) Py_InitModule("cos_module", CosMethods);
}
#endif
As you can see, there is much boilerplate, both to "massage" the arguments and return types into place and for the module initialisation. Although some of this is amortised as the extension grows, the boilerplate required for each function remains.
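The two conversions that the boilerplate performs, Python float to C double and back, can be tried out from Python without compiling anything. The sketch below is only an illustration and not how extensions are normally written: it calls the C-API functions `PyFloat_AsDouble` and `PyFloat_FromDouble` through `ctypes.pythonapi`, and the names `as_double` and `from_double` are chosen here for readability.

```python
import ctypes

# PyFloat_AsDouble: Python object -> C double (the "parse" direction)
as_double = ctypes.pythonapi.PyFloat_AsDouble
as_double.argtypes = [ctypes.py_object]
as_double.restype = ctypes.c_double

# PyFloat_FromDouble: C double -> new Python float (the "build" direction)
from_double = ctypes.pythonapi.PyFloat_FromDouble
from_double.argtypes = [ctypes.c_double]
from_double.restype = ctypes.py_object

c_value = as_double(1.0)
py_value = from_double(c_value)
print(type(py_value), py_value)
```

In the extension module, `PyArg_ParseTuple` and `Py_BuildValue` play these two roles for whole argument lists and return values.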
The standard python build system, setuptools, supports compiling
C-extensions via a setup.py file:
from setuptools import setup, Extension
# define the extension module
cos_module = Extension("cos_module", sources=["cos_module.c"])
# run the setup
setup(ext_modules=[cos_module])
The setup file is called as follows:
$ cd advanced/interfacing_with_c/python_c_api
$ ls
cos_module.c setup.py
$ python setup.py build_ext --inplace
running build_ext
building 'cos_module' extension
creating build
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/include/python2.7 -c cos_module.c -o build/temp.linux-x86_64-2.7/cos_module.o
gcc -pthread -shared build/temp.linux-x86_64-2.7/cos_module.o -L/home/esc/anaconda/lib -lpython2.7 -o /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/python_c_api/cos_module.so
$ ls
build/ cos_module.c cos_module.so setup.py
build_ext is to build extension modules
--inplace will output the compiled extension module into the current directory
The file cos_module.so contains the compiled extension, which we can now load in the IPython interpreter:
Note
In Python 3, the filename for compiled modules includes metadata on the Python interpreter (see PEP 3149) and is thus longer. The import statement is not affected by this.
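To see which filename suffix your own interpreter uses for compiled modules, the standard library can be queried directly. This is a small sketch; the exact strings printed vary with platform and Python version.

```python
import importlib.machinery
import sysconfig

# Suffix this interpreter gives to freshly built extension modules;
# the value varies by platform and Python version.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
print(ext_suffix)

# All suffixes the import system accepts for extension modules.
print(importlib.machinery.EXTENSION_SUFFIXES)
```

Because the import system tries each accepted suffix in turn, `import cos_module` works regardless of how long the actual filename is.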
In [1]: import cos_module
In [2]: cos_module?
Type: module
String Form:<module 'cos_module' from 'cos_module.so'>
File: /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/python_c_api/cos_module.so
Docstring: <no docstring>
In [3]: dir(cos_module)
Out[3]: ['__doc__', '__file__', '__name__', '__package__', 'cos_func']
In [4]: cos_module.cos_func(1.0)
Out[4]: 0.5403023058681398
In [5]: cos_module.cos_func(0.0)
Out[5]: 1.0
In [6]: cos_module.cos_func(3.14159265359)
Out[6]: -1.0
Now let’s see how robust this is:
In [7]: cos_module.cos_func('foo')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-10-11bee483665d> in <module>()
----> 1 cos_module.cos_func('foo')
TypeError: a float is required
2.8.2.2. NumPy Support
Analogous to the Python-C-API, NumPy, which is itself implemented as a C-extension, comes with the NumPy-C-API. This API can be used to create and manipulate NumPy arrays from C, when writing a custom C-extension. See also: Advanced NumPy.
Note
If you do ever need to use the NumPy C-API, refer to the documentation about Arrays and Iterators.
The following example shows how to pass NumPy arrays as arguments to functions
and how to iterate over NumPy arrays using the (old) NumPy-C-API. It simply
takes an array as argument, applies the cosine function from math.h and
returns a resulting new array.
/* Example of wrapping the cos function from math.h using the NumPy-C-API. */
#include <Python.h>
#include <numpy/arrayobject.h>
#include <math.h>
/* wrapped cosine function */
static PyObject* cos_func_np(PyObject* self, PyObject* args)
{
PyArrayObject *arrays[2]; /* holds input and output array */
PyObject *ret;
NpyIter *iter;
npy_uint32 op_flags[2];
npy_uint32 iterator_flags;
PyArray_Descr *op_dtypes[2];
NpyIter_IterNextFunc *iternext;
/* parse single NumPy array argument */
if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &arrays[0])) {
return NULL;
}
arrays[1] = NULL; /* The result will be allocated by the iterator */
/* Set up and create the iterator */
iterator_flags = (NPY_ITER_ZEROSIZE_OK |
/*
* Enable buffering in case the input is not well behaved
* (non-native byte order or not aligned),
* disabling may speed up some cases when it is known to
* be unnecessary.
*/
NPY_ITER_BUFFERED |
/* Manually handle innermost iteration for speed: */
NPY_ITER_EXTERNAL_LOOP |
NPY_ITER_GROWINNER);
op_flags[0] = (NPY_ITER_READONLY |
/*
* Required that the arrays are well behaved, since the cos
* call below requires this.
*/
NPY_ITER_NBO |
NPY_ITER_ALIGNED);
/* Ask the iterator to allocate an array to write the output to */
op_flags[1] = NPY_ITER_WRITEONLY | NPY_ITER_ALLOCATE;
/*
* Ensure the iteration has the correct type, could be checked
* specifically here.
*/
op_dtypes[0] = PyArray_DescrFromType(NPY_DOUBLE);
op_dtypes[1] = op_dtypes[0];
/* Create the NumPy iterator object: */
iter = NpyIter_MultiNew(2, arrays, iterator_flags,
/* Use input order for output and iteration */
NPY_KEEPORDER,
/* Allow only byte-swapping of input */
NPY_EQUIV_CASTING, op_flags, op_dtypes);
Py_DECREF(op_dtypes[0]); /* The second one is identical. */
if (iter == NULL)
return NULL;
iternext = NpyIter_GetIterNext(iter, NULL);
if (iternext == NULL) {
NpyIter_Deallocate(iter);
return NULL;
}
/* Fetch the output array which was allocated by the iterator: */
ret = (PyObject *)NpyIter_GetOperandArray(iter)[1];
Py_INCREF(ret);
if (NpyIter_GetIterSize(iter) == 0) {
/*
* If there are no elements, the loop cannot be iterated.
* This check is necessary with NPY_ITER_ZEROSIZE_OK.
*/
NpyIter_Deallocate(iter);
return ret;
}
/* The location of the data pointer which the iterator may update */
char **dataptr = NpyIter_GetDataPtrArray(iter);
/* The location of the stride which the iterator may update */
npy_intp *strideptr = NpyIter_GetInnerStrideArray(iter);
/* The location of the inner loop size which the iterator may update */
npy_intp *innersizeptr = NpyIter_GetInnerLoopSizePtr(iter);
/* iterate over the arrays */
do {
npy_intp stride = strideptr[0];
npy_intp count = *innersizeptr;
/* out is always contiguous, so use double */
double *out = (double *)dataptr[1];
char *in = dataptr[0];
/* The output is allocated and guaranteed contiguous (out++ works): */
assert(strideptr[1] == sizeof(double));
/*
* For optimization it can make sense to add a check for
* stride == sizeof(double) to allow the compiler to optimize for that.
*/
while (count--) {
*out = cos(*(double *)in);
out++;
in += stride;
}
} while (iternext(iter));
/* Clean up and return the result */
NpyIter_Deallocate(iter);
return ret;
}
/* define functions in module */
static PyMethodDef CosMethods[] =
{
{"cos_func_np", cos_func_np, METH_VARARGS,
"evaluate the cosine on a NumPy array"},
{NULL, NULL, 0, NULL}
};
#if PY_MAJOR_VERSION >= 3
/* module initialization */
/* Python version 3*/
static struct PyModuleDef cModPyDem = {
PyModuleDef_HEAD_INIT,
"cos_module_np", "Some documentation",
-1,
CosMethods
};
PyMODINIT_FUNC PyInit_cos_module_np(void) {
PyObject *module;
module = PyModule_Create(&cModPyDem);
if(module==NULL) return NULL;
/* IMPORTANT: this must be called */
import_array();
if (PyErr_Occurred()) return NULL;
return module;
}
#else
/* module initialization */
/* Python version 2 */
PyMODINIT_FUNC initcos_module_np(void) {
PyObject *module;
module = Py_InitModule("cos_module_np", CosMethods);
if(module==NULL) return;
/* IMPORTANT: this must be called */
import_array();
return;
}
#endif
To compile this we can use setuptools again. However we need to be sure to
include the NumPy headers by using numpy.get_include().
from setuptools import setup, Extension
import numpy
# define the extension module
cos_module_np = Extension(
"cos_module_np", sources=["cos_module_np.c"], include_dirs=[numpy.get_include()]
)
# run the setup
setup(ext_modules=[cos_module_np])
To convince ourselves that this actually works, we run the following test script:
import cos_module_np
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 2 * np.pi, 0.1)
y = cos_module_np.cos_func_np(x)
plt.plot(x, y)
plt.show()
# Below are more specific tests for less common usage
# ---------------------------------------------------
# The function is OK with `x` not having any elements:
x_empty = np.array([], dtype=np.float64)
y_empty = cos_module_np.cos_func_np(x_empty)
assert np.array_equal(y_empty, np.array([], dtype=np.float64))
# The function can handle arbitrary dimensions and non-contiguous data.
# `x_2d` contains the same values, but has a different shape.
# Note: `x_2d.flags` shows it is not contiguous and `x_2d.ravel() == x`
x_2d = x.repeat(2)[::2].reshape(-1, 3)
y_2d = cos_module_np.cos_func_np(x_2d)
# When reshaped back, the same result is given:
assert np.array_equal(y_2d.ravel(), y)
# The function handles incorrect byte-order fine:
x_not_native_byteorder = x.astype(x.dtype.newbyteorder())
y_not_native_byteorder = cos_module_np.cos_func_np(x_not_native_byteorder)
assert np.array_equal(y_not_native_byteorder, y)
# The function fails if the data type is incorrect:
x_incorrect_dtype = x.astype(np.float32)
try:
cos_module_np.cos_func_np(x_incorrect_dtype)
assert 0, "This cannot be reached."
except TypeError:
# A TypeError will be raised, this can be changed by changing the
# casting rule.
pass
And this should result in the following figure:
2.8.3. Ctypes
Ctypes is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.
Advantages
Part of the Python standard library
Does not need to be compiled
Wrapping code entirely in Python
Disadvantages
Requires code to be wrapped to be available as a shared library (roughly speaking *.dll in Windows, *.so in Linux and *.dylib in Mac OSX.)
No good support for C++
2.8.3.1. Example
As advertised, the wrapper code is in pure Python.
"""Example of wrapping cos function from math.h using ctypes."""
import ctypes
# find and load the library
# OSX or linux
from ctypes.util import find_library
libm_name = find_library("m")
assert libm_name is not None, "Cannot find libm (math) on this system :/ That's bad."
libm = ctypes.cdll.LoadLibrary(libm_name)
# Windows
# libm = ctypes.cdll.msvcrt
# set the argument type
libm.cos.argtypes = [ctypes.c_double]
# set the return type
libm.cos.restype = ctypes.c_double
def cos_func(arg):
    """Wrapper for cos from math.h"""
    return libm.cos(arg)
Finding and loading the library may vary depending on your operating system; check the documentation for details.
This may be somewhat deceptive, since the math library exists in compiled form on the system already. If you were to wrap an in-house library, you would have to compile it first, which may or may not require some additional effort.
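One way to keep the operating-system differences in a single place is a small loader function. The following is a sketch under stated assumptions: `load_math_library` is a hypothetical helper, the Windows branch relies on the `msvcrt` runtime exporting `cos`, and the fallback to `ctypes.CDLL(None)` assumes that the math library is already loaded into the running process.

```python
import ctypes
import sys
from ctypes.util import find_library


def load_math_library():
    """Return a ctypes handle to a library that exports cos()."""
    if sys.platform == "win32":
        # On Windows the C runtime (msvcrt) provides the math functions.
        return ctypes.cdll.msvcrt
    name = find_library("m")
    if name is None:
        # Assumption of this sketch: fall back to the symbols already loaded
        # into the running process, which usually include the math library.
        return ctypes.CDLL(None)
    return ctypes.CDLL(name)


libm = load_math_library()
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double
print(libm.cos(0.0))
```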
We may now use this, as before:
In [8]: import cos_module
In [9]: cos_module?
Type: module
String Form:<module 'cos_module' from 'cos_module.py'>
File: /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/ctypes/cos_module.py
Docstring: <no docstring>
In [10]: dir(cos_module)
Out[10]:
['__builtins__',
'__doc__',
'__file__',
'__name__',
'__package__',
'cos_func',
'ctypes',
'find_library',
'libm']
In [11]: cos_module.cos_func(1.0)
Out[11]: 0.5403023058681398
In [12]: cos_module.cos_func(0.0)
Out[12]: 1.0
In [13]: cos_module.cos_func(3.14159265359)
Out[13]: -1.0
As with the previous example, this code is somewhat robust, although the error message is not quite as helpful, since it does not tell us what the type should be.
In [14]: cos_module.cos_func('foo')
---------------------------------------------------------------------------
ArgumentError Traceback (most recent call last)
<ipython-input-7-11bee483665d> in <module>()
----> 1 cos_module.cos_func('foo')
/home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/ctypes/cos_module.py in cos_func(arg)
12 def cos_func(arg):
13 ''' Wrapper for cos from math.h '''
---> 14 return libm.cos(arg)
ArgumentError: argument 1: <type 'exceptions.TypeError'>: wrong type
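If a more helpful message is wanted, the wrapper can catch `ctypes.ArgumentError` and re-raise it with the expected type spelled out. This is a minimal sketch, assuming Linux or macOS as in the example above; the wording of the message is an arbitrary choice made here.

```python
import ctypes
from ctypes.util import find_library

# find_library may return None; CDLL(None) then falls back to the symbols
# of the running process (POSIX only, an assumption of this sketch).
libm = ctypes.CDLL(find_library("m"))
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double


def cos_func(arg):
    """Wrapper for cos that reports the expected type on bad input."""
    try:
        return libm.cos(arg)
    except ctypes.ArgumentError as exc:
        raise TypeError(
            f"cos_func() expects a float, got {type(arg).__name__}"
        ) from exc


print(cos_func(0.0))
```

Calling `cos_func('foo')` now raises a `TypeError` that names both the expected and the received type, while the original `ArgumentError` stays attached as the cause.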
2.8.3.2. NumPy Support
NumPy contains some support for interfacing with ctypes. In particular there is support for exporting certain attributes of a NumPy array as ctypes data-types and there are functions to convert from C arrays to NumPy arrays and back.
For more information, consult the corresponding section in the NumPy Cookbook and the API documentation for numpy.ndarray.ctypes and numpy.ctypeslib.
For the following example, let’s consider a C function in a library that takes an input and an output array, computes the cosine of the input array and stores the result in the output array.
The library consists of the following header file (although this is not strictly needed for this example, we list it for completeness):
void cos_doubles(double * in_array, double * out_array, int size);
The function implementation resides in the following C source file:
#include <math.h>
/* Compute the cosine of each element in in_array, storing the result in
* out_array. */
void cos_doubles(double * in_array, double * out_array, int size){
int i;
for(i=0;i<size;i++){
out_array[i] = cos(in_array[i]);
}
}
And since the library is pure C, we can’t use setuptools to compile it, but
must use a combination of make and gcc:
.PHONY : clean
libcos_doubles.so : cos_doubles.o
	gcc -shared -Wl,-soname,libcos_doubles.so -o libcos_doubles.so cos_doubles.o
cos_doubles.o : cos_doubles.c
	gcc -c -fPIC cos_doubles.c -o cos_doubles.o
clean :
	-rm -vf libcos_doubles.so cos_doubles.o cos_doubles.pyc
We can then compile this (on Linux) into the shared library libcos_doubles.so:
$ ls
cos_doubles.c cos_doubles.h cos_doubles.py makefile test_cos_doubles.py
$ make
gcc -c -fPIC cos_doubles.c -o cos_doubles.o
gcc -shared -Wl,-soname,libcos_doubles.so -o libcos_doubles.so cos_doubles.o
$ ls
cos_doubles.c cos_doubles.o libcos_doubles.so* test_cos_doubles.py
cos_doubles.h cos_doubles.py makefile
Now we can proceed to wrap this library via ctypes with direct support for (certain kinds of) NumPy arrays:
"""Example of wrapping a C library function that accepts a C double array as
input using the numpy.ctypeslib."""
import numpy as np
import numpy.ctypeslib as npct
from ctypes import c_int
# input type for the cos_doubles function
# must be a double array, with single dimension that is contiguous
array_1d_double = npct.ndpointer(dtype=np.double, ndim=1, flags="CONTIGUOUS")
# load the library, using NumPy mechanisms
libcd = npct.load_library("libcos_doubles", ".")
# setup the return types and argument types
libcd.cos_doubles.restype = None
libcd.cos_doubles.argtypes = [array_1d_double, array_1d_double, c_int]
def cos_doubles_func(in_array, out_array):
    return libcd.cos_doubles(in_array, out_array, len(in_array))
Note the inherent limitation of contiguous single dimensional NumPy arrays, since the C function requires this kind of buffer.
Also note that the output array must be preallocated, for example with numpy.zeros(), and the function will write into its buffer.
Although the original signature of the cos_doubles function is ARRAY, ARRAY, int, the final cos_doubles_func takes only two NumPy arrays as arguments.
And, as before, we convince ourselves that it worked:
import numpy as np
import matplotlib.pyplot as plt
import cos_doubles
x = np.arange(0, 2 * np.pi, 0.1)
y = np.empty_like(x)
cos_doubles.cos_doubles_func(x, y)
plt.plot(x, y)
plt.show()
2.8.4. SWIG
SWIG, the Simplified Wrapper Interface Generator, is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages, including Python. The important thing with SWIG is that it can autogenerate the wrapper code for you. While this is an advantage in terms of development time, it can also be a burden. The generated files tend to be quite large and may not be too human readable, and the multiple levels of indirection which are a result of the wrapping process may be a bit tricky to understand.
Note
The autogenerated C code uses the Python-C-Api.
Advantages
Can automatically wrap entire libraries given the headers
Works nicely with C++
Disadvantages
Autogenerates enormous files
Hard to debug if something goes wrong
Steep learning curve
2.8.4.1. Example
Let’s imagine that our cos function lives in a cos_module which has
been written in C and consists of the source file cos_module.c:
#include <math.h>
double cos_func(double arg){
return cos(arg);
}
and the header file cos_module.h:
double cos_func(double arg);
And our goal is to expose cos_func to Python. To achieve this with
SWIG, we must write an interface file which contains the instructions for SWIG.
/* Example of wrapping cos function from math.h using SWIG. */
%module cos_module
%{
/* the resulting C file should be built as a python extension */
#define SWIG_FILE_WITH_INIT
/* Includes the header in the wrapper code */
#include "cos_module.h"
%}
/* Parse the header file to generate wrappers */
%include "cos_module.h"
As you can see, not too much code is needed here. For this simple example it is enough to simply include the header file in the interface file to expose the function to Python. However, SWIG does allow for more fine grained inclusion/exclusion of functions found in header files; check the documentation for details.
Generating the compiled wrappers is a two stage process:
Run the swig executable on the interface file to generate the files cos_module_wrap.c, which is the source file for the autogenerated Python C-extension, and cos_module.py, which is the autogenerated pure python module.
Compile the cos_module_wrap.c into the _cos_module.so. Luckily, setuptools knows how to handle SWIG interface files, so that our setup.py is simply:
from setuptools import setup, Extension
setup(ext_modules=[Extension("_cos_module", sources=["cos_module.c", "cos_module.i"])])
$ cd advanced/interfacing_with_c/swig
$ ls
cos_module.c cos_module.h cos_module.i setup.py
$ python setup.py build_ext --inplace
running build_ext
building '_cos_module' extension
swigging cos_module.i to cos_module_wrap.c
swig -python -o cos_module_wrap.c cos_module.i
creating build
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/include/python2.7 -c cos_module.c -o build/temp.linux-x86_64-2.7/cos_module.o
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/include/python2.7 -c cos_module_wrap.c -o build/temp.linux-x86_64-2.7/cos_module_wrap.o
gcc -pthread -shared build/temp.linux-x86_64-2.7/cos_module.o build/temp.linux-x86_64-2.7/cos_module_wrap.o -L/home/esc/anaconda/lib -lpython2.7 -o /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/swig/_cos_module.so
$ ls
build/ cos_module.c cos_module.h cos_module.i cos_module.py _cos_module.so* cos_module_wrap.c setup.py
We can now load and execute the cos_module as we have done in the previous examples:
In [15]: import cos_module
In [16]: cos_module?
Type: module
String Form:<module 'cos_module' from 'cos_module.py'>
File: /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/swig/cos_module.py
Docstring: <no docstring>
In [17]: dir(cos_module)
Out[17]:
['__builtins__',
'__doc__',
'__file__',
'__name__',
'__package__',
'_cos_module',
'_newclass',
'_object',
'_swig_getattr',
'_swig_property',
'_swig_repr',
'_swig_setattr',
'_swig_setattr_nondynamic',
'cos_func']
In [18]: cos_module.cos_func(1.0)
Out[18]: 0.5403023058681398
In [19]: cos_module.cos_func(0.0)
Out[19]: 1.0
In [20]: cos_module.cos_func(3.14159265359)
Out[20]: -1.0
Again we test for robustness, and we see that we get a better error message
(although, strictly speaking, in Python there is no double type):
In [21]: cos_module.cos_func('foo')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-11bee483665d> in <module>()
----> 1 cos_module.cos_func('foo')
TypeError: in method 'cos_func', argument 1 of type 'double'
2.8.4.2. NumPy Support
NumPy provides support for SWIG with the numpy.i file. This interface file
defines various so-called typemaps which support conversion between NumPy
arrays and C-Arrays. In the following example we will take a quick look at
how such typemaps work in practice.
We have the same cos_doubles function as in the ctypes example:
void cos_doubles(double * in_array, double * out_array, int size);
#include <math.h>
/* Compute the cosine of each element in in_array, storing the result in
* out_array. */
void cos_doubles(double * in_array, double * out_array, int size){
int i;
for(i=0;i<size;i++){
out_array[i] = cos(in_array[i]);
}
}
This is wrapped as cos_doubles_func using the following SWIG interface file:
/* Example of wrapping a C function that takes a C double array as input using
* NumPy typemaps for SWIG. */
%module cos_doubles
%{
/* the resulting C file should be built as a python extension */
#define SWIG_FILE_WITH_INIT
/* Includes the header in the wrapper code */
#include "cos_doubles.h"
%}
/* include the NumPy typemaps */
%include "numpy.i"
/* need this for correct module initialization */
%init %{
import_array();
%}
/* typemaps for the two arrays, the second will be modified in-place */
%apply (double* IN_ARRAY1, int DIM1) {(double * in_array, int size_in)}
%apply (double* INPLACE_ARRAY1, int DIM1) {(double * out_array, int size_out)}
/* Wrapper for cos_doubles that massages the types */
%inline %{
/* takes as input two NumPy arrays */
void cos_doubles_func(double * in_array, int size_in, double * out_array, int size_out) {
/* calls the original function, providing only the size of the first */
cos_doubles(in_array, out_array, size_in);
}
%}
To use the NumPy typemaps, we need to include the numpy.i file.
Observe the call to import_array() which we encountered already in the NumPy-C-API example.
Since the typemaps only support the signature ARRAY, SIZE, we need to wrap cos_doubles as cos_doubles_func, which takes two arrays including sizes as input.
As opposed to the simple SWIG example, we don’t %include the cos_doubles.h header for SWIG to parse. There is nothing there that we wish to expose to Python since we expose the functionality through cos_doubles_func.
And, as before, we can use setuptools to wrap this:
from setuptools import setup, Extension
import numpy
setup(
ext_modules=[
Extension(
"_cos_doubles",
sources=["cos_doubles.c", "cos_doubles.i"],
include_dirs=[numpy.get_include()],
)
]
)
As previously, we need to use include_dirs to specify the location of the NumPy headers.
$ ls
cos_doubles.c cos_doubles.h cos_doubles.i numpy.i setup.py test_cos_doubles.py
$ python setup.py build_ext -i
running build_ext
building '_cos_doubles' extension
swigging cos_doubles.i to cos_doubles_wrap.c
swig -python -o cos_doubles_wrap.c cos_doubles.i
cos_doubles.i:24: Warning(490): Fragment 'NumPy_Backward_Compatibility' not found.
cos_doubles.i:24: Warning(490): Fragment 'NumPy_Backward_Compatibility' not found.
cos_doubles.i:24: Warning(490): Fragment 'NumPy_Backward_Compatibility' not found.
creating build
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include -I/home/esc/anaconda/include/python2.7 -c cos_doubles.c -o build/temp.linux-x86_64-2.7/cos_doubles.o
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include -I/home/esc/anaconda/include/python2.7 -c cos_doubles_wrap.c -o build/temp.linux-x86_64-2.7/cos_doubles_wrap.o
In file included from /home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1722,
from /home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from /home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:15,
from cos_doubles_wrap.c:2706:
/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION"
gcc -pthread -shared build/temp.linux-x86_64-2.7/cos_doubles.o build/temp.linux-x86_64-2.7/cos_doubles_wrap.o -L/home/esc/anaconda/lib -lpython2.7 -o /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/swig_numpy/_cos_doubles.so
$ ls
build/ cos_doubles.h cos_doubles.py cos_doubles_wrap.c setup.py
cos_doubles.c cos_doubles.i _cos_doubles.so* numpy.i test_cos_doubles.py
And, as before, we convince ourselves that it worked:
import numpy as np
import matplotlib.pyplot as plt
import cos_doubles
x = np.arange(0, 2 * np.pi, 0.1)
y = np.empty_like(x)
cos_doubles.cos_doubles_func(x, y)
plt.plot(x, y)
plt.show()
2.8.5. Cython
Cython is both a Python-like language for writing C-extensions and an advanced compiler for this language. The Cython language is a superset of Python, which comes with additional constructs that allow you to call C functions and annotate variables and class attributes with C types. In this sense one could also call it a Python with types.
In addition to the basic use case of wrapping native code, Cython supports an additional use-case, namely incremental optimization. Basically, one starts out with a pure-Python script and incrementally adds Cython types to the bottleneck code to optimize only those code paths that really matter.
In this sense it is quite similar to SWIG, since the code can be autogenerated, but in a sense it is also quite similar to ctypes, since the wrapping code can (almost) be written in Python.
While other solutions that autogenerate code can be quite difficult to debug (for example SWIG), Cython comes with an extension to the GNU debugger that helps debug Python, Cython and C code.
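The incremental workflow described above usually starts with a measurement, so that there is a baseline to compare against once types have been added. The following is a small sketch in plain Python using `timeit`; the function `cos_all` and the input size are made up for illustration.

```python
import math
import timeit


def cos_all(values):
    """Pure-Python inner loop: a typical candidate for later typing."""
    result = []
    for value in values:
        result.append(math.cos(value))
    return result


values = [0.1 * i for i in range(1000)]

# Time the candidate before touching it, so later gains can be quantified.
seconds = timeit.timeit(lambda: cos_all(values), number=100)
print(f"100 calls took {seconds:.4f} s")
```

Only functions that dominate such measurements are worth moving into a .pyx file; the rest of the script can stay as ordinary Python.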
Note
The autogenerated C code uses the Python-C-Api.
Advantages
Python like language for writing C-extensions
Autogenerated code
Supports incremental optimization
Includes a GNU debugger extension
Support for C++ (Since version 0.13)
Disadvantages
Must be compiled
Requires an additional library (but only at build time; this problem can be overcome by shipping the generated C files)
2.8.5.1. Example
The main Cython code for our cos_module is contained in the file
cos_module.pyx:
""" Example of wrapping cos function from math.h using Cython. """
cdef extern from "math.h":
    double cos(double arg)
def cos_func(arg):
    return cos(arg)
Note the additional keywords such as cdef and extern. Also the cos_func
is then pure Python.
Again we can use the standard setuptools module, but this time we need some
additional pieces from Cython.Build:
from setuptools import setup, Extension
from Cython.Build import cythonize
extensions = [Extension("cos_module", sources=["cos_module.pyx"])]
setup(ext_modules=cythonize(extensions))
Compiling this:
$ cd advanced/interfacing_with_c/cython
$ ls
cos_module.pyx setup.py
$ python setup.py build_ext --inplace
running build_ext
cythoning cos_module.pyx to cos_module.c
building 'cos_module' extension
creating build
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/include/python2.7 -c cos_module.c -o build/temp.linux-x86_64-2.7/cos_module.o
gcc -pthread -shared build/temp.linux-x86_64-2.7/cos_module.o -L/home/esc/anaconda/lib -lpython2.7 -o /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/cython/cos_module.so
$ ls
build/ cos_module.c cos_module.pyx cos_module.so* setup.py
And running it:
In [22]: import cos_module
In [23]: cos_module?
Type: module
String Form:<module 'cos_module' from 'cos_module.so'>
File: /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/cython/cos_module.so
Docstring: <no docstring>
In [24]: dir(cos_module)
Out[24]:
['__builtins__',
'__doc__',
'__file__',
'__name__',
'__package__',
'__test__',
'cos_func']
In [25]: cos_module.cos_func(1.0)
Out[25]: 0.5403023058681398
In [26]: cos_module.cos_func(0.0)
Out[26]: 1.0
In [27]: cos_module.cos_func(3.14159265359)
Out[27]: -1.0
And, testing a little for robustness, we can see that we get good error messages:
In [28]: cos_module.cos_func('foo')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-11bee483665d> in <module>()
----> 1 cos_module.cos_func('foo')
/home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/cython/cos_module.so in cos_module.cos_func (cos_module.c:506)()
TypeError: a float is required
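The manual checks above can also be scripted. The following is a minimal sketch, assuming a hypothetical helper named `check_cos_wrapper`; `math.cos` is used as a stand-in so that the snippet runs without the compiled module.

```python
import math


def check_cos_wrapper(cos_func, tol=1e-12):
    """Compare a wrapped cosine against math.cos and probe its error handling."""
    for x in (0.0, 1.0, math.pi, -2.5, 1e6):
        assert abs(cos_func(x) - math.cos(x)) <= tol, x
    try:
        cos_func("foo")
    except TypeError:
        return True   # non-numeric input was rejected
    return False      # the wrapper silently accepted a string


# Stand-in so the sketch runs as-is; after building, replace it with
# `from cos_module import cos_func`.
cos_func = math.cos

print(check_cos_wrapper(cos_func))
```

The same helper can be reused for the wrappers produced by the other techniques, since it only relies on the function being callable with a single argument.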
Additionally, it is worth noting that Cython ships with complete declarations for the C math library, which simplifies the code above to:
""" Simpler example of wrapping cos function from math.h using Cython. """
from libc.math cimport cos
def cos_func(arg):
    return cos(arg)
In this case the cimport statement is used to import the cos function.
2.8.5.2. NumPy Support¶
Cython has support for NumPy via the numpy.pxd file, which allows you to add the NumPy array type to your Cython code. That is, just as you can specify that variable i is of type int, you can specify that variable a is of type numpy.ndarray with a given dtype. Certain optimizations, such as turning off bounds checking, are also supported. Look at the corresponding section in the Cython documentation. In case you want to pass NumPy arrays as C arrays to your Cython-wrapped C functions, there is a section about this in the Cython documentation.
In the following example, we will show how to wrap the familiar cos_doubles function using Cython.
void cos_doubles(double * in_array, double * out_array, int size);
#include <math.h>
/*  Compute the cosine of each element in in_array, storing the result in
 *  out_array. */
void cos_doubles(double * in_array, double * out_array, int size){
    int i;
    for(i=0;i<size;i++){
        out_array[i] = cos(in_array[i]);
    }
}
This is wrapped as cos_doubles_func using the following Cython code:
""" Example of wrapping a C function that takes C double arrays as input using
    the NumPy declarations from Cython """
# cimport the Cython declarations for NumPy
cimport numpy as np
# if you want to use the NumPy-C-API from Cython
# (not strictly necessary for this example, but good practice)
np.import_array()
# cdefine the signature of our c function
cdef extern from "cos_doubles.h":
    void cos_doubles (double * in_array, double * out_array, int size)
# create the wrapper code, with NumPy type annotations
def cos_doubles_func(np.ndarray[double, ndim=1, mode="c"] in_array not None,
                     np.ndarray[double, ndim=1, mode="c"] out_array not None):
    cos_doubles(<double*> np.PyArray_DATA(in_array),
                <double*> np.PyArray_DATA(out_array),
                in_array.shape[0])
This can be compiled using setuptools:
from setuptools import setup, Extension
from Cython.Build import cythonize
import numpy
extensions = [
    Extension(
        "cos_doubles",
        sources=["_cos_doubles.pyx", "cos_doubles.c"],
        include_dirs=[numpy.get_include()],
    )
]
setup(ext_modules=cythonize(extensions))
As with the previous compiled NumPy examples, we need the include_dirs option.
$ ls
cos_doubles.c cos_doubles.h _cos_doubles.pyx setup.py test_cos_doubles.py
$ python setup.py build_ext -i
running build_ext
cythoning _cos_doubles.pyx to _cos_doubles.c
building 'cos_doubles' extension
creating build
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include -I/home/esc/anaconda/include/python2.7 -c _cos_doubles.c -o build/temp.linux-x86_64-2.7/_cos_doubles.o
In file included from /home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1722,
from /home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from /home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:15,
from _cos_doubles.c:253:
/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION"
/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/__ufunc_api.h:236: warning: ‘_import_umath’ defined but not used
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/esc/anaconda/lib/python2.7/site-packages/numpy/core/include -I/home/esc/anaconda/include/python2.7 -c cos_doubles.c -o build/temp.linux-x86_64-2.7/cos_doubles.o
gcc -pthread -shared build/temp.linux-x86_64-2.7/_cos_doubles.o build/temp.linux-x86_64-2.7/cos_doubles.o -L/home/esc/anaconda/lib -lpython2.7 -o /home/esc/git-working/scientific-python-lectures/advanced/interfacing_with_c/cython_numpy/cos_doubles.so
$ ls
build/ _cos_doubles.c cos_doubles.c cos_doubles.h _cos_doubles.pyx cos_doubles.so* setup.py test_cos_doubles.py
And, as before, we convince ourselves that it worked:
import numpy as np
import matplotlib.pyplot as plt
import cos_doubles
x = np.arange(0, 2 * np.pi, 0.1)
y = np.empty_like(x)
cos_doubles.cos_doubles_func(x, y)
plt.plot(x, y)
plt.show()
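A plot is a convenient visual check, but the result can also be compared numerically against NumPy's own cosine. The following is a minimal sketch; the pure-NumPy `cos_doubles_func` defined here is only a stand-in with the same calling convention, an assumption made so that the snippet runs without the compiled extension.

```python
import numpy as np


def cos_doubles_func(in_array, out_array):
    """Pure-NumPy stand-in that fills out_array in place, like the wrapper."""
    np.cos(in_array, out=out_array)


x = np.arange(0, 2 * np.pi, 0.1)
y = np.empty_like(x)
cos_doubles_func(x, y)

# Element-wise comparison against NumPy's own cosine
max_abs_error = np.max(np.abs(y - np.cos(x)))
assert np.allclose(y, np.cos(x))
print(max_abs_error)
```

To test the real extension, replace the stand-in with `from cos_doubles import cos_doubles_func`.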
2.8.6. Summary¶
In this section four different techniques for interfacing with native code have been presented. The table below roughly summarizes some of the aspects of the techniques.
Technique | Part of CPython | Compiled | Autogenerated | NumPy Support |
---|---|---|---|---|
Python-C-API | True | True | False | True |
Ctypes | True | False | False | True |
Swig | False | True | True | True |
Cython | False | True | True | True |
Of the four presented techniques, Cython is the most modern and advanced. In particular, the ability to optimize code incrementally by adding types to your Python code is unique among them.
2.8.7. Further Reading and References¶
Gaël Varoquaux’s blog post about avoiding data copies provides some insight on how to handle memory management cleverly. If you ever run into issues with large datasets, this is a reference to come back to for some inspiration.
2.8.8. Exercises¶
Since this is a brand new section, the exercises are considered more as pointers as to what to look at next, so pick the ones that you find more interesting. If you have good ideas for exercises, please let us know!
Download the source code for each example and compile and run them on your machine.
Make trivial changes to each example and convince yourself that this works. (E.g. change cos for sin.)
Most of the examples, especially the ones involving NumPy, may still be fragile and respond badly to input errors. Look for ways to crash the examples, figure out what the problem is and devise a potential solution. Here are some ideas:
Numerical overflow.
Input and output arrays that have different lengths.
Multidimensional arrays.
Empty arrays.
Arrays with non-double types.
Use the %timeit IPython magic to measure the execution time of the various solutions.
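Outside of IPython, the standard library `timeit` module gives comparable measurements. The following is a minimal sketch, assuming a hypothetical helper `time_call` and a `candidates` dictionary; only `math.cos` is timed here, and the wrapped functions would have to be added once they are built.

```python
import math
import timeit


def time_call(func, arg, repeat=5, number=10_000):
    """Return the best per-call time in seconds over several repeats."""
    timer = timeit.Timer(lambda: func(arg))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number


# math.cos serves as the baseline; add the wrapped functions once built,
# e.g. candidates["cython"] = cos_module.cos_func
candidates = {"math.cos": math.cos}

for name, func in candidates.items():
    print(f"{name}: {time_call(func, 1.0) * 1e9:.1f} ns per call")
```

Taking the minimum over several repeats reduces the influence of other processes running on the machine. Note that the lambda adds a small constant overhead to every measurement.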
2.8.8.1. Python-C-API¶
Modify the NumPy example such that the function takes two input arguments, where the second is the preallocated output array, making it similar to the other NumPy examples.
Modify the example such that the function only takes a single input array and modifies this in place.
Try to fix the example to use the new NumPy iterator protocol. If you manage to obtain a working solution, please submit a pull request on GitHub.
You may have noticed that the NumPy-C-API example is the only NumPy example that does not wrap cos_doubles but instead applies the cos function directly to the elements of the NumPy array. Does this have any advantages over the other techniques?
Can you wrap cos_doubles using only the NumPy-C-API? You may need to ensure that the arrays have the correct type, are one-dimensional and contiguous in memory.
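The conditions named in the last exercise (correct type, one-dimensional, contiguous) can be prototyped in Python before writing the equivalent checks in C. This is a minimal sketch under the assumption that the C function expects `double` data in C order; the helper names `check_double_vector` and `check_pair` are hypothetical, and the corresponding NumPy-C-API calls are left to the exercise.

```python
import numpy as np


def check_double_vector(arr, name="array"):
    """Raise if arr is not a 1-D, C-contiguous float64 NumPy array."""
    if not isinstance(arr, np.ndarray):
        raise TypeError(f"{name} must be a numpy.ndarray")
    if arr.dtype != np.float64:
        raise TypeError(f"{name} must have dtype float64, got {arr.dtype}")
    if arr.ndim != 1:
        raise ValueError(f"{name} must be one-dimensional, got ndim={arr.ndim}")
    if not arr.flags["C_CONTIGUOUS"]:
        raise ValueError(f"{name} must be contiguous in memory")


def check_pair(in_array, out_array):
    """Validate both arrays and require that they have equal lengths."""
    check_double_vector(in_array, "in_array")
    check_double_vector(out_array, "out_array")
    if in_array.shape[0] != out_array.shape[0]:
        raise ValueError("in_array and out_array must have the same length")
```

Calling `check_pair` at the top of a Python-level wrapper turns several of the crash scenarios listed in the general exercises into ordinary exceptions.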
2.8.8.2. Ctypes¶
Modify the NumPy example such that cos_doubles_func handles the preallocation for you, thus making it more like the NumPy-C-API example.
2.8.8.3. SWIG¶
Look at the code that SWIG autogenerates. How much of it do you understand?
Modify the NumPy example such that cos_doubles_func handles the preallocation for you, thus making it more like the NumPy-C-API example.
Modify the cos_doubles C function so that it returns an allocated array. Can you wrap this using SWIG typemaps? If not, why not? Is there a workaround for this specific situation? (Hint: you know the size of the output array, so it may be possible to construct a NumPy array from the returned double *.)
2.8.8.4. Cython¶
Look at the code that Cython autogenerates. Take a closer look at some of the comments that Cython inserts. What do you see?
Look at the section Working with NumPy from the Cython documentation to learn how to incrementally optimize a pure Python script that uses NumPy.
Modify the NumPy example such that cos_doubles_func handles the preallocation for you, thus making it more like the NumPy-C-API example.