Introduction
It's been a long time I was busy coding scientific computation toolsets for academic purposes. For research matters and fast coding, C# and F# made a great choice of implementation language. They are powerful and really fast and back-boned by the great .NET Framework. Parallelization is neat and clean, and there are a bunch of free components for data visualization. But a while ago, the problem arose. Customer was willing to order a scientific analysis software doing tons and tons of vector computations in a single iteration, and performance was a significant requirement. So I inclined to the elder of performance, C++, and its recently born child, C++11 standard. Everything was going well. As I expected, the most challenging matter was memory allocation, since beside memory leaks, sequences of new
s and delete
s can cause a big descent. Memory re-use and pre-allocation came handy and decreased running time significantly. But there was something missing and I knew it: do not compute what's not needed and compute once and only once.
Background
Throughout this article, due to simplicity, I'm using an elementary example, vector
class, but the reader must be aware of more complex problems. My first implementation of vector
class was something like this:
class vector {
private:
float _x, _y;
public:
vector(float x, float x) : _x(x), _y(y) { }
float get_x() const { return _x; }
float get_y() const { return _y; }
float get_length() const { return sqrtf(_x * _x + _y * _y); }
vector get_unit() const { return *this / get_length(); }
};
This is a simple implementation, but as can be seen, there's a problem: get_length
and get_unit
methods may be called multiple times for a single vector object during computations, and so, their values are computed over and over again. Since vector
class is immutable, the easy solution is pre-computing values. So, I changed the vector
class as follows:
class vector {
private:
float _x, _y;
float _length;
vector _unit;
void precompute() {
_length = sqrtf(_x * _x + _y * _y);
_unit = *this / _length;
}
public:
vector(float x, float y) : _x(x), _y(y) {
precompute();
}
float get_x() const { return _x; }
float get_y() const { return _y; }
float get_length() const { return _length; }
vector& get_unit() const { return _unit; }
};
Now it works and it's much better. But there goes a question: is it necessary to compute all of them at once, and specially, at the class construction process? However, how can we automate pre-computing process and make the code more cleaner?
At this point, I was thinking of lazy evaluation in C# language and Lazy<T>
class. But there's no such feature in C++ and not even in Standard Template Library. I found some close implementation of lazy evaluation in boost library, but I chose to not use such a big library for a small feature, and then, I decided to implement a small one myself.
Lazy Evaluation
I needed an equivalent for .NET Lazy
class, which I was used to, and I needed it to be generic. As I thought, it was not that hard and I quickly coded lazy<T>
class:
template<typename T>
class lazy {
public:
lazy() : m_initiator(default_initiator), m_initialized(false) { }
lazy(std::function<T()> initiator) : m_initiator(initiator), m_initialized(false) { }
T& getValue() {
if (!m_initialized) {
m_value = m_initiator();
m_initialized = true;
}
return m_value;
}
operator T() {
return getValue();
}
T& operator* () {
return getValue();
}
lazy<T>& operator= (const lazy<T>& other) {
m_initiator = other.m_initiator;
m_initialized = false;
return *this;
}
private:
static T default_initiator() {
throw std::runtime_error("No lazy evaluator given.");
}
std::function<T()> m_initiator;
T m_value;
bool m_initialized;
};
This way, it let me define a lazily evaluated variable using a lambda expression as its evaluator:
auto pi = lazy<float>([]() { return 3.141592; });
auto area = *pi * r * r;
auto perimeter = *pi * 2 * r;
The value of pi
is not computed since area
is computed. And perimeter
uses its previously computed value and it's not computed again. Finally, I simply rewrote the vector
class using lazy
class.
class vector {
private:
float _x, _y;
lazy<float> _length;
lazy<vector> _unit;
void initialize() {
_length = lazy<float>([this]() { return sqrtf(_x * _x + _y * _y); });
_unit = lazy<vector>([this]() { return *this / _length; });
}
public:
vector(float x, float y) : _x(x), _y(y) {
initialize();
}
float get_x() const { return _x; }
float get_y() const { return _y; }
float get_length() { return *_length; }
vector& get_unit() { return *_unit; }
};
Note that you cannot use const
keyword for get_length
and get_unit
methods, cause their values are changed during first access. It was very clean and beautiful and was something I was after. I was really proud of myself like a hero, but suddenly my hooray didn't last long. When I tried to compile it by Visual Studio build tool, the compiler gave me C2079
error: 'lazy<T>
' uses undefined class 'vector
'.
I was really confused. From my C# background, anything was done right, and the error itself did not give me enough information to solve it.
Solution
It took me some time of struggling with the code and I was really upset. I, even, decided to fall back and use the trivial plan through the entire project. I took the last chance and hopefully found the source of the issue by a hint given in a post in stackoverflow.com site. Can you see it? Well, it's not that trivial. It has something to do with stack allocation of variables and dependencies.
I really prefer static (stack) allocation on dynamic allocation in C++, and I assure you, new
and delete
are evils. So, I prefer to define member variables and class instances as automatic variables, not pointers. Now, let's see what happens. C++ compilers allocate an automatic class instance and its members as a whole on stack, and thus, they need some information about class size in bytes at compile time. As of vector
class, let's compute its size. vector
class has four members: two of type float
, one of type lazy<float>
, and one of type lazy<vector>
. The size of two float
members are well-known. For lazy<float>
type, we further need to compute a generic lazy<T>
class with T
defined as to be float
. That's not a problem, too. Now, we have lazy<vector>
class left. Here's where the problem goes. The size of lazy<vector>
depends on vector
, and vice versa, the size of vector
depends on lazy<vector>
. So we have a double dependency here which simply goes to an infinite recursion. It's now easy to understand why languages like C# and Java do not face such a problem, because class instances in managed languages are always reference types and dynamically allocated.
When the problem is well-defined, the solution is easy. What Microsoft offers is to simply use pointers and dynamic allocation, and that means more new
s and delete
s which I was not after. So, I chose the hard way, and needed to break one of the dependency lines by breaking vector
class. What follows is the result:
class __vector {
public:
__vector(float x, float y) : m_x(x), m_y(y) { initialize(); }
float get_x() const { return m_x; }
float get_y() const { return m_y; }
float get_length() { return *m_length; }
private:
void initialize() {
m_length = lazy<float>([this]() { return sqrtf(m_x * m_x + m_y * m_y); });
}
float m_x, m_y;
lazy<float> m_length;
};
class vector : public __vector {
public:
vector(float x, float y) : __vector(x, y) { initialize(); }
vector get_unit() { return vector(*m_unit); }
private:
void initialize() {
m_unit = lazy<__vector>([this]() { return *this / get_length(); });
}
lazy<__vector> m_unit;
};
Now, both lazy<vector>
and vector
classes depend on __vector
class to compute their size. This code compiles and works successfully as expected.
Well, it's not as clean as I want, but, it's simple enough to consider, specially when you get used to it. By the way, do not blame C++ and remember: no pain, no gain!
Thread Safety
The above implementation of lazy<T>
class is good a for single-threaded application. Due to overcome thread-safety issues, some minor changes have to be made. Here, I've used the STL mutex
class for controlling access to private members.
lazy<T>& operator= (const lazy<T>& other) {
m_lock.lock();
m_initiator = other.m_initiator;
m_initialized = false;
m_lock.unlock();
return *this;
}
T& get_value() {
if (!m_initialized) {
m_lock.lock();
if (!m_initialized) {
m_value = m_initiator();
m_initialized = true;
}
m_lock.unlock();
}
return m_value;
}
Points of Interest
Always listen to your heart. Never do anything in a way that you don't like or feel that it's wrong. Hear professionals' voice, but don't be afraid of overtaking cliches. Using open source components is the right way to do things, but one cannot learn walking before falling many times. These are the key points that make you a valuable programmer after becoming a good one.
I hope that this article is useful for you, but, always think of bigger problems. Feel free to share your opinions. Critics are most welcome.
History
- First version written on November 14, 2013
- Bug fixed thanks to Stefan_Lang on November 19, 2013
- Thread Safety section added on November 19, 2013
- Some typos and bugs fixed on November 20, 2013