Introduction
Memoization
- In computing, memoization
is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again © Wikipedia (http://en.wikipedia.org/wiki/Memoization).
Memoization
is an essential part of Dynamic Programming
. The idea is simple - once you computed something, store the result, and next time, when you will be asked to compute the same, just use the result. Here we need to clarify two basic conditions: Computation result is absolutely deterministic, and depends solely on input parameters.
Pseudo Code
Usually, simple idea may be realized by the simple design and simple implementation. In case of memorization, that's also truth. In Wikipedia (http://en.wikipedia.org/wiki/Memoization), we even may find a pseudo code that implements automatic memoization.
function memoized-call (F is a function object parameter)
if F has no attached array values then
allocate an associative array called values;
attach values to F;
end if;
if F.values[arguments] is empty then
F.values[arguments] = F(arguments);
end if;
return F.values[arguments];
end function
But what about implementation in C++? Things are much more complex here, because in a real life, we are operating with real types. Associative array
looks very simple when you write pseudo code. However, when you write C++ code, you have to know the types. Type of the Key
and type of the Value
. In our case, Key
is a set of all parameters that are used as an input in computation, and Value
is a computation result.
So, how will we build the infrastructure for Memoization
? The answer is almost immediate - we will use Templates. The straightforward way is just to provide list of the types, that will consist of Type of the computation result (Value
in our associative array
) and one, or more Types of input parameters (Key
in our associative array
). Another "nice to have" feature is single API for memoized and not memoized version. So far, so good, but API for our infrastructure, will require to provide all these types, that in my opinion, will be a little bit annoying and non user friendly. Moreover, we will need to provide, as an additional parameter, function that will implement the original computation logic (to support single API feature).
May we do better? We will try :) And we will succeed!
API and Example
Let's assume that we have some template class
Memoize
that should be used like in the following example.
Here we have struct MagicCircle
with some method Func
, that gets as parameters Coordinates
and String
, and after very long and complex mathematical calculations return Radius
and Color
. Don't try to find any logic in the math calculations, that's not the point here. We just want to demonstrate, using that example, with a non trivial input types (std::pair
and std::string
) and non trivial return type (some struct
), how easy is memoization
infrastructure usage.
#pragma once
#include <cstdint>
#include <utility>
#include <string>
#include <limits>
#include <random>
#include "Memoize.h"
enum class Color { RED, GREEN, BLUE };
using Coordinates = std::pair < uint32_t, uint32_t >;
struct MagicColorResult
{
uint32_t radius;
Color color;
};
struct MagicCircle : public Memoize< MagicCircle >
{
static MagicColorResult Func(Coordinates coord, std::string str)
{
MagicColorResult res;
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<uint32_t>
dis(std::numeric_limits<uint32_t>::lowest(),
std::numeric_limits<uint32_t>::max()
);
res.radius = dis(gen);
res.color = Color::GREEN;
return res;
}
};
So, let's see what we have here:
- In definition of struct
MagicCircle
, we inherit from class Memoize
, when as a template parameter we provide MagicCircle
itself. - We implement
static
method Func
, that gets some(many) parameters, and returns computation result of some type.
Those are all requirements for Memoization
Infrastructure usage. To inherit from class Memoize
(and provide template parameter), and to implement static
method Func
.
That's it.
We don't care what and how many parameters Func
will get as an input, and we don't care what the output type is.
Now, let's look at the usage:
MagicColorResult res1 = Memoize<MagicCircle>::Func(Coordinates{7,2}, std::string(" Original Dummy "));
MagicColorResult res2 = Memoize<MagicCircle>::Func(Coordinates{7,2}, std::string(" Another Dummy "));
MagicColorResult res3 = Memoize<MagicCircle>::Func(Coordinates{7,2}, std::string(" Original Dummy "));
For calling MagicCircle
computation, using Memoization
infrastrucutre, we just need to call not the MagicCircle::Func( ... )
method itself, but the Memoize<MagicCircle>::Func( ...
). And the magic will happened: For the two first calls, with the "first time coming" parameters actual calculation will be executed. But for the third call, with parameters identical to the first call, no computation will be performed, and memoized result will be returned.
Implementation
That's the time to move to the actual implementation of Memoize
.
#pragma once
#include <tuple>
#include <map>
template < typename T > struct Memoize
{
template <typename... Args> static auto Func(Args&&... args)
-> decltype(T::Func(std::forward<Args>(args)...))
{
using INPUT = std::tuple<Args...>;
using RESULT = decltype(T::Func(std::forward<Args>(args)...));
static std::map< INPUT, RESULT > repository;
auto key = std::make_tuple(std::forward<Args>(args)...);
auto it = repository.find(key);
if (it != repository.end())
{
return it->second;
}
else
{
auto result = T::Func(std::forward<Args>(args)...);
repository[key] = result;
return result;
}
}
};
As we saw, Memoize
gets, as a template parameter, class with a single requirement - this class must have static
method Func
.
Internally, Memoize
define his own static
method Func
, with input parameters type(s), and output parameter type, that are identical to the original Func
, from the template parameter class. To achieve that, we use Variadic Template
(C++11) for input parameters, and decltype
(C++11) for output type. No type is passed explicitly. Compiler does this job for us - all types( input and output ) for Memoize::Func
are deducted by compiler from derived class Func
definition.
The implementation itself is pretty similar to the pseudo code. We store results of calculation, indexed by combination of input parameters, in associative array (std::map
in our case). On Func
call, we look if we already calculated exactly the same parameters, by looking in associative array indexed by combination of input parameters. If we found the result, we just return that, w/o actual calculation. If we didn't - we call the original Func
, which processes the actual computation, and stores the result.
The only detail that may be interested here is indexing itself. For indexing, we use std::tuple
, based on input types. Just to remind, order operation on tuple
is defined as lexicographic order on tuple
types. So it is important that every particular type from input parameters will have comparison operator.
Recursive Example
As I wrote, Memoization
is an essential part of Dynamic Programming
. Here is an example how to use Memoize
infrastructure for Dynamic Programming
Factorial
computation.
#pragma once
#include <cstdint>
#include "Memoize.h"
class Factorial : public Memoize<Factorial>
{
public:
static uint32_t Func(int i)
{
if (i == 0)
{
return 1;
}
return i * Memoize::Func(i - 1);
}
};
Please note that for recursive call, we should call Memoize::Func
implementation of Func
method, and
Func implementation from Factorial
class itself.
And external usage is similar to the previous example:
Memoize<Factorial>::Func(10);
Memoize<Factorial>::Func(11);
For the second call, only one calculation, 10*11, will be executed.
Remarks
As we see, some advanced C++11 features have been used in this implementation. Please be aware that not all compilers have a full C++11 support. This code has been implemented and tested using MSVC 2014 CTP compiler.