Analyzing C/C++ matrices in the gdb debugger with Python and Numpy

M.Mo

16 Oct 2013

Using the gdb debugger's Python API to analyze and visualize C/C++ arrays in a debugging session.

Download source - 3.89 KB

Introduction

When debugging code written in Matlab or Python, one can stop at a breakpoint, manipulate local vector and matrix variables, and plot the results. In this article, I will show how to use the Python API of the gdb debugger to plot and manipulate C/C++ arrays and vectors while debugging. For example, suppose we want to plot the eigenvalues of a two-dimensional array mat during a debugging session:

C++

float mat[10][10] = ....;

Then using the accompanying code, we can create a numpy array in python and plot its eigenvalues:

(gdb) py
> import gdb_numpy
> import numpy as np
> import matplotlib.pyplot as plt
> mat = gdb_numpy.to_array("mat") #Creates a numpy array that corresponds to the variable mat.
> print(mat.shape) #Prints (10, 10) once the block is run.
> y = np.linalg.eigvalsh(mat) #eigvalsh assumes mat is symmetric.
> plt.plot(y)
> plt.show() #This is needed to show the figure, see notes below.
> end

This gives a plot of the eigenvalues.

[Figure: Plot of eigenvalues using Python.]

Notes on using matplotlib in gdb

Before proceeding, it is worth pointing out that when using either matplotlib.pyplot or matplotlib.pylab inside gdb, the show function has to be called to display a figure. Moreover, gdb will not respond to any command until the figure is closed.

Using the code

The accompanying code depends on the Python package numpy and can create numpy arrays from C/C++ pointers, arrays and STL vectors, as well as their nested types, out of the box. For information about how to install and use numpy, please visit the numpy website. In this article, we will assume some basic knowledge of the numpy package. (Matlab users may be interested in the "NumPy for MATLAB users" guide in the numpy documentation.)

To install the accompanying code, run the setup.py script in its folder with the argument install. (Type the following in a Linux shell or a Windows command prompt.)

python setup.py install

When using the code, import the module gdb_numpy in the gdb console:

(gdb) py import gdb_numpy

To create a numpy array from a C/C++ pointer/array/vector type, pass its name as a string to the function to_array in the gdb_numpy module:

(gdb) py vec = gdb_numpy.to_array("vec") #vec is now a numpy array.

If vec is an STL vector or a built-in array, this will create a numpy array of the appropriate shape. However, if vec is a pointer, then the user must supply a second argument indicating its dimensions. For example, if we have:

C++

float** mat = ...;

Then the dimensions must be supplied as a tuple.

(gdb) py
> mat = gdb_numpy.to_array("mat", (10,10))
> mat = gdb_numpy.to_array("mat") #error: sizes are not provided.
> mat = gdb_numpy.to_array("mat", (10,)) #error: Not all sizes are provided.
> end

Note that even if there is only one dimension, it still has to be passed as a tuple, which in Python requires a trailing comma:

C++

float* vec = ...;

(gdb) py vec = gdb_numpy.to_array("vec", (10,))
(gdb) py vec = gdb_numpy.to_array("vec", 10) #error: Dimensions must be passed as tuple.
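The distinction is easy to miss because, in Python, parentheses alone do not create a tuple; the trailing comma does. The sketch below is plain Python and runs outside gdb. The helper normalize_shape is hypothetical and not part of gdb_numpy; it only illustrates how a caller could guard against passing a bare integer.

```python
def normalize_shape(shape):
    """Return shape as a tuple of ints, rejecting a bare integer.

    Hypothetical helper for illustration; it is not part of gdb_numpy.
    """
    if isinstance(shape, int):
        raise TypeError("dimensions must be passed as a tuple, e.g. (10,)")
    return tuple(int(n) for n in shape)


print(type((10)).__name__)        # int: parentheses alone do not make a tuple
print(type((10,)).__name__)       # tuple: the trailing comma does
print(normalize_shape((10,)))     # (10,)
print(normalize_shape([10, 10]))  # (10, 10)
```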

The method also supports nested types, e.g.:

C++

std::vector<std::vector<double> > mat = ...;

(gdb) py mat = gdb_numpy.to_array("mat") #mat is a 2D numpy array

Background

We will now take a very brief look at the parts of the gdb Python API (available since gdb 7) that are used in our code. Full details of the Python API can be found in the gdb documentation. Within the gdb console, the Python interpreter can be accessed with the command python (or py), followed by a Python statement:

(gdb) py print(1 + 2)
3

If no argument is given to the python command, gdb enters multi-line mode, and the statements are run once end is typed:

(gdb) py
> x = 1 + 2
> print(x)
> end
3

Variables in the C/C++ program that we are debugging can be accessed in python using the parse_and_eval method of the gdb module, which is imported automatically when the python interpreter is accessed through gdb.

(gdb) py my_var = gdb.parse_and_eval("my_var")

The parse_and_eval method returns an instance of the gdb.Value type, which contains information about the C/C++ variable. For example, the C/C++ type can be accessed through the type member:

(gdb) py
> my_array = gdb.parse_and_eval("my_array")
> print(my_array.type)
> end
double[10]

Class members can be accessed through the index operator:

(gdb) py
> my_class = gdb.parse_and_eval("my_class")
> my_data = my_class['data'] #Gives my_class.data
> end

If the variable is of pointer type, then the index operator can be used to dereference it:

(gdb) py print(my_data[10])

This covers what we need to extend the accompanying code.
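To make the idea concrete, here is a minimal sketch of how repeated indexing can turn an array-like value into a nested list that numpy.array would accept. It is not the implementation used by the accompanying module. FakeValue is a hypothetical stand-in for gdb.Value (it mimics only indexing and float conversion) so that the example runs outside gdb, and value_to_nested_list is likewise a name made up for illustration.

```python
class FakeValue(object):
    """Stand-in for gdb.Value so the sketch runs outside gdb.

    It mimics only indexing (val[i]) and conversion with float().
    """

    def __init__(self, payload):
        self._payload = payload

    def __getitem__(self, index):
        return FakeValue(self._payload[index])

    def __float__(self):
        return float(self._payload)


def value_to_nested_list(val, shape):
    """Index val along each dimension in shape and collect floats."""
    if not shape:
        return float(val)
    return [value_to_nested_list(val[i], shape[1:]) for i in range(shape[0])]


mat = FakeValue([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
rows = value_to_nested_list(mat, (2, 3))
print(rows)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Inside gdb, the same function could be applied to the result of gdb.parse_and_eval, since gdb.Value supports both indexing and conversion with float().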

Extending the code

The module can be extended to accommodate custom container types. This involves deriving from the class DeRefBase in the module deref. Suppose we have a user-defined matrix type and we want to extend the module to work with it:

C++

template <typename T>
class MyMatrix
{
public:
    ....
    //Index operator.
    T& operator()(int i, int j){ return data[i*columns+j]; }
    const T& operator()(int i, int j) const { return data[i*columns+j]; }
private:
    //Underlying data
    T* data;
    //Number of rows
    int rows;
    //Number of columns
    int columns;
};

The type stores its underlying data in the member data so that if M is an instance of MyMatrix, then M(i,j) is given by *(M.data+i*M.columns+j).
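The row-major offset can be checked with a few lines of plain Python. This is only an illustration of the arithmetic; flat_index and the sample data are made up for the example.

```python
def flat_index(i, j, columns):
    """Offset of element (i, j) in a row-major buffer with `columns` columns."""
    return i * columns + j


rows, columns = 2, 3
data = [10, 11, 12, 20, 21, 22]  # the two rows laid out one after another

print(flat_index(1, 2, columns))        # 5
print(data[flat_index(1, 2, columns)])  # 22
```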

First we need to override the deref method in DeRefBase, which is used for dereferencing the container. This is done by dereferencing the member data of the MyMatrix instance.

#Converts a MyMatrix instance named Mat to a gdb.Value instance in python.
(gdb) py Mat = gdb.parse_and_eval("Mat")
#Gets a gdb.Value instance that corresponds to Mat.data. (Even though data is a private member)
(gdb) py
> data = Mat['data']
> columns = int(Mat['columns']) #Gets the number of columns and casts it to an integer
> print(data[i*columns+j]) #Gives Mat(i,j), for Python integers i and j
> end

The deref function is then:

def deref(self, val, indices):
    data = val['data']
    columns = int(val['columns'])
    return data[indices[0] * columns + indices[1]]

So for example, the following dereferences our matrix:

#derefMyMat is an instance of the appropriate DeRef class
(gdb) py print(derefMyMat.deref(Mat,(i,j))) #Gives Mat(i,j)

Note that as with gdb_numpy.to_array, the method expects a tuple or list.

Next we need to update and initialize some members of the class. The member bounds stores the dimensions of the matrix.

#Constructor expected to take 3 variables:
#Mat: gdb.Value instance that represents the matrix
#shape_ind: An integer for internal bookkeeping purpose.
#shape: A tuple or list, for internal use.
def __init__(self, Mat, shape_ind, shape):
    ...
    self.bounds=[Mat['rows'], Mat['columns']]
    ...

Here the dimensions of the matrix are obtained from the matrix instance.

If on the other hand, the dimensions are provided by the user, such as in the case of pointers, then the _get_range_from_shape method should be used to extract the dimensions from the argument shape.

self._get_range_from_shape(2) #'2' here is the number of dimensions to extract.

This will correctly initialize the members shape_ind and bounds.
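The bookkeeping can be pictured roughly as follows. This is a hypothetical illustration, not the module's code: take_dims is an invented name, and the real _get_range_from_shape may behave differently. It assumes that shape is the tuple supplied by the user and that shape_ind marks the first size not yet consumed by an outer container.

```python
def take_dims(shape, shape_ind, count):
    """Return (bounds, next_shape_ind) for `count` dimensions of `shape`.

    Hypothetical illustration only; the real _get_range_from_shape may differ.
    """
    if shape is None or shape_ind + count > len(shape):
        raise ValueError("not all sizes are provided")
    bounds = list(shape[shape_ind:shape_ind + count])
    return bounds, shape_ind + count


# A two-dimensional container consumes two user-supplied sizes, starting at 0.
bounds, next_ind = take_dims((10, 20), 0, 2)
print(bounds)    # [10, 20]
print(next_ind)  # 2
```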

The other class member that needs updating is val. This should be a gdb.Value instance that corresponds to an object after dereferencing.

self.val = self.deref(Mat,(0,0))

As the value of self.val will not be used, it does not matter which indices we pass to the deref method, as long as they are valid. For example, we can also use:

self.val = self.deref(Mat, (self.bounds[0]-1, 
             self.bounds[1]-1)) #Works as long as the indices are valid.

Finally, we need to provide a regular expression to identify our class. This should be something that matches the type name of our class, which can be accessed through the type member of the corresponding gdb.Value instance.

(gdb) py my_mat = gdb.parse_and_eval("my_mat")
(gdb) py print(my_mat.type)
MyMatrix<double>

So in our case, the pattern can be ^MyMatrix.

import re #Needed for the pattern below

class DeRefMyMatrix(DeRefBase):
    pattern = re.compile('^MyMatrix')
    ....
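The anchored pattern can be tried out on a few type names in plain Python. The type-name strings below are assumptions about what gdb would print for template instantiations; only the pattern itself comes from the article.

```python
import re

pattern = re.compile('^MyMatrix')

# Assumed examples of type names as gdb might print them.
type_names = [
    "MyMatrix<double>",
    "MyMatrix<std::vector<double> >",
    "std::vector<MyMatrix<double> >",
]

for name in type_names:
    # match() tests only the start of the string, i.e. the outermost container.
    print("%s -> %s" % (name, bool(pattern.match(name))))
```

The first two names match and the third does not, because the ^ anchor ties the pattern to the outermost type.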

Summarizing, the python class that we need to write is:

import re #Needed for the pattern below

#Placed in the deref module, where DeRefBase is defined.
class DeRefMyMatrix(DeRefBase):

    pattern = re.compile('^MyMatrix')

    def __init__(self, Mat, shape_ind, shape):
        super(DeRefMyMatrix, self).__init__(Mat, shape_ind, shape)
        self.val = self.deref(Mat, [0,0]) #Updates to a dereferenced type
        self.bounds=[Mat['rows'], Mat['columns']] #The dimensions of the matrix

    def deref(self, val, indices):
        data = val['data']
        columns = int(val['columns'])
        return data[indices[0] * columns + indices[1]]

To use this class in the gdb_numpy module, we need to register it by adding it to the _container_list variable in the module.

_container_list = [... ,deref.DeRefMyMatrix]

The gdb_numpy.to_array method can now be used with our MyMatrix class. It will also automatically support nested types, e.g.:

C++

MyMatrix<MyMatrix<double> > tensor4D = ...;
std::vector<MyMatrix<double> > tensor3D = ...;
MyMatrix<std::vector<double> > anotherTensor3D = ...;

will all work with the gdb_numpy.to_array method.

History

Initial submission: 13/10/13.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)