Introduction
I’ve recently been busy with a machine captcha recognition module. It’s not that difficult
to achieve, but the process takes time (mainly CPU time) to find the parameters,
coefficients and weights that give the best results.
I chose the simplest approach, without genetic algorithms or neural networks, and used the usual for
statement, which tends to be pretty exhaustive.
In the end I achieved approximately 75% accuracy, which is more than adequate for my requirements.
The search for the optimum involves a pretty ugly nest of for loops, which is not the most elegant code:
for (int lowMaskLevel = 0; lowMaskLevel <= 90; lowMaskLevel++)
{
    for (int topMaskLevel = lowMaskLevel; topMaskLevel <= 255; topMaskLevel++)
    {
        for (int positiveFinalKoef = 0; positiveFinalKoef <= 100; positiveFinalKoef++)
        {
            for (int negativeFinalKoef = 0; negativeFinalKoef <= 100; negativeFinalKoef++)
            {
                for (int finePositiveKoef = 0; finePositiveKoef <= 30; finePositiveKoef++)
                {
                    for (int fineNegativeKoef = 0; fineNegativeKoef <= 20; fineNegativeKoef++)
                    {
                        for (int hightLightKoef = 1; hightLightKoef <= 3; hightLightKoef++)
                        {
                            var value = TestParemeters(lowMaskLevel,
                                                       topMaskLevel,
                                                       positiveFinalKoef,
                                                       negativeFinalKoef,
                                                       finePositiveKoef,
                                                       fineNegativeKoef,
                                                       hightLightKoef);
                            if (value >= bestValue)
                            {
                                // Remember the best score and the parameters that produced it.
                                bestValue = value;
                                bestLowMaskLevel = lowMaskLevel;
                                bestTopMaskLevel = topMaskLevel;
                                bestPositiveFinalKoef = positiveFinalKoef;
                                bestNegativeFinalKoef = negativeFinalKoef;
                                bestFinePositiveKoef = finePositiveKoef;
                                bestFineNegativeKoef = fineNegativeKoef;
                                bestHightLightKoef = hightLightKoef;
                            }
                        }
                    }
                }
            }
        }
    }
}
Terrible, isn't it?
The fact that the number of parameters, rules and limits could change made this approach
difficult to maintain. I abandoned it in favour of a more manageable solution: a helper
class that handles the iteration and keeps things clear and maintainable.
Example
Now I can define iterator parameters inline, just like regular method arguments. This looks
neat and is much easier to support and change.
int bestValue = 0;
object[] bestArguments = null;
Iterator.Execute(() => TestParemeters(
        Iterator.Range(0, 90),
        Iterator.Range(args => (int)args[0], args => 255),
        Iterator.Range(0, 100, 10),
        Iterator.Range(0, 100),
        Iterator.Range(0, 30),
        Iterator.Range(0, 20),
        Iterator.Range(0, 3)
    ), invocationResult =>
    {
        if (invocationResult.Value >= bestValue)
        {
            bestValue = invocationResult.Value;
            bestArguments = invocationResult.Arguments;
        }
    });
Using the code
To use it, add a reference to the WestWayConsulting.Tools.Iterator.dll assembly.
Note! This assembly references Immutable Collections, so it was compiled against .NET Framework 4.5.
The tool has a facade class, Iterator, to define and execute iterations.
To execute an iteration, call the static Execute method. It has two overloads: one for methods that return a value and one for action methods. The Execute method accepts the following arguments:
- expression: Expression that defines the method call. It can be a static method, an instance method or a type constructor. If the method returns a value, the Execute method returns a list of execution results;
- callbackAction (optional): Method delegate that is executed after each method execution. It accepts an instance of the InvocationResult class, which holds the result of the method execution, the array of arguments and the exception;
- throwExceptionOnError (optional, true by default): Defines whether an exception raised during method invocation is thrown or ignored.
When you define your method in the iterator, you can still use any type of argument as you usually do and mix them with iterator definitions. So, the following statement will work:
Iterator.Execute(() => MyMethod(
    1,
    DateTime.Now,
    Iterator.Range(0, 100, 10),
    Iterator.RangeParallel(0, 100),
    new WebClient()
));
Note! An iterator must be a top-level argument of the main method call; you cannot nest it inside another expression. The following statement will throw an InvalidOperationException:
Iterator.Execute(() => MyMethod(
    new DateTime(2013, 1, Iterator.Range(0, 100, 10))
));
In the current version the following iterators are available:
- Range(from, to, step): Generates a sequence from the from value to the to value with the specified step. Has overloads for integer and double sequences. Equivalent to:
for (int value = from; value <= to; value += step)
{
}
- RangeParallel(from, to, step): The same as Range, but execution runs in parallel. Equivalent to:
Parallel.ForEach(CreateIterator(from, to, step), (value) =>
{
});
- Random(minValue, maxValue, count): Generates a sequence of count random numbers between minValue and maxValue. Has overloads for integer and double sequences. Equivalent to:
var random = new Random();
for (int i = 0; i < count; i++)
{
    var value = random.Next(minValue, maxValue);
}
- RandomParallel(minValue, maxValue, count): The same as Random, but execution runs in parallel. Equivalent to:
var random = new Random();
Parallel.For(0, count, (i) =>
{
    // Illustration only: a shared System.Random instance is not thread safe.
    var value = random.Next(minValue, maxValue);
});
- Enumerator(enumerable): Iterates over the values of the enumerable. Equivalent to:
foreach (var value in enumerable)
{
}
- EnumeratorParallel(enumerable): The same as Enumerator, but execution runs in parallel. Equivalent to:
Parallel.ForEach(enumerable, (value) =>
{
});
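The RangeParallel equivalent above calls CreateIterator(from, to, step) without showing it. A minimal sketch of such a helper might look like the following. The class name RangeSketch and the argument check are assumptions made for illustration; this is not the library's actual code.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSketch
{
    // Hypothetical helper: yields from, from + step, ... while the value is <= to.
    public static IEnumerable<int> CreateIterator(int from, int to, int step = 1)
    {
        // Validate eagerly, before the lazy iterator is created.
        if (step <= 0)
            throw new ArgumentOutOfRangeException(nameof(step), "step must be positive");
        return Iterate(from, to, step);
    }

    private static IEnumerable<int> Iterate(int from, int to, int step)
    {
        // Same bounds as the for loop shown for Range: the upper limit is inclusive.
        for (int value = from; value <= to; value += step)
            yield return value;
    }
}
```

With this sketch, RangeSketch.CreateIterator(0, 100, 10) yields the eleven values 0, 10, ..., 100, and the sequence can be passed to foreach or Parallel.ForEach.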
Each definition has overloads with dynamic parameters. By using them you can define
the iteration parameters using complex rules.
A function that defines a dynamic parameter accepts an array with the values of the preceding arguments.
Iterator.Execute(() => MyMethod(
    1,
    DateTime.Now,
    Iterator.Range(args => (int)args[0], args => 255),
    Iterator.RandomParallel(args => (int)args[2], args => DateTime.Now.Second + 255),
    new WebClient()
));
How does it work
The Execute method parses the top-level parameters of the lambda expression and generates a
linked chain of argument handlers. Each handler knows the next handler to execute.
Iterator.Execute(() => TestParemeters(      executor {}
    Iterator.Range(0, 90),                  |_iterator {arg[0]}
    Iterator.Range(args => (int)args[0],    |_iterator {arg[0], arg[1]}
                   args => 255),            |
    1,                                      |_constant {arg[0], arg[1], arg[2]}
    DateTime.Now,                           |_method as argument {arg[0], arg[1],
                                            |                     arg[2], arg[3]}
);                                          |_method executor
Each handler gets its arguments from the previous step, generates and adds new arguments to
the list and passes it to the next handler. If an argument is a method, the handler
adds a method delegate to the argument list, and the real value is obtained just before the
main method is executed. If an argument is an iterator, the next handler is called
for each value from the generated enumeration. The argument list is immutable, so passing it
between handlers is thread safe. At the end of the chain, the method executor invokes the main method
delegate with the list of parameters and calls the callback method if one is specified.
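The handler chain described above can be illustrated with a heavily simplified sketch. It is not the library's implementation: the names ChainSketch, Handler, Constant, Range and Run are hypothetical, expression parsing and method-as-argument handlers are left out, and ImmutableList from System.Collections.Immutable is assumed to be available.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

public static class ChainSketch
{
    // A handler receives the arguments built so far plus the rest of the chain.
    public delegate void Handler(ImmutableList<object> args, Action<ImmutableList<object>> next);

    // Constant handler: appends one value and continues.
    public static Handler Constant(object value)
    {
        return (args, next) => next(args.Add(value));
    }

    // Iterator handler: calls the next handler once per generated value.
    // The bounds are functions of the preceding arguments (dynamic parameters).
    public static Handler Range(Func<ImmutableList<object>, int> from,
                                Func<ImmutableList<object>, int> to)
    {
        return (args, next) =>
        {
            int upper = to(args);
            for (int value = from(args); value <= upper; value++)
                next(args.Add(value)); // Add returns a new list; args itself is unchanged
        };
    }

    // Links the handlers back to front so each one knows its successor,
    // with the executor at the end of the chain.
    public static void Run(IReadOnlyList<Handler> handlers, Action<ImmutableList<object>> executor)
    {
        Action<ImmutableList<object>> step = executor;
        for (int i = handlers.Count - 1; i >= 0; i--)
        {
            Handler handler = handlers[i];
            Action<ImmutableList<object>> next = step;
            step = args => handler(args, next);
        }
        step(ImmutableList<object>.Empty);
    }
}
```

In this sketch, chaining Range(a => 0, a => 2), Range(a => (int)a[0], a => 2) and Constant("x") invokes the executor six times, once for each pair where the second value is not smaller than the first.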
Please note that the callback method is not synchronized. If you use parallel iterators, you have to
implement locking and synchronization yourself.
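As a rough illustration of that locking, the sketch below uses a plain Parallel.ForEach rather than the Iterator class, so that it stays self-contained. The Score function, the class name BestResultSketch and the FindBest method are stand-ins invented for this example; the same lock pattern would apply inside a callback passed to Execute.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class BestResultSketch
{
    // Stand-in scoring function (an assumption for illustration only).
    private static int Score(int parameter)
    {
        return 100 - Math.Abs(parameter - 42);
    }

    // Searches the inclusive range [from, to] in parallel for the best score.
    public static Tuple<int, int> FindBest(int from, int to)
    {
        object gate = new object();
        int bestValue = int.MinValue;
        int bestParameter = from;

        Parallel.ForEach(Enumerable.Range(from, to - from + 1), parameter =>
        {
            int value = Score(parameter); // the expensive work stays outside the lock
            lock (gate)                   // only the shared update is serialized
            {
                if (value > bestValue)
                {
                    bestValue = value;
                    bestParameter = parameter;
                }
            }
        });

        return Tuple.Create(bestParameter, bestValue);
    }
}
```

Keeping the scoring call outside the lock means the threads only wait on each other for the brief compare-and-update step.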