Click here to Skip to main content
16,022,236 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I would want to parallelize this on the top loop, but I don't think that there's an analytic formula for c so that c can be computed instead of iterated. Can anyone solve it?

C#
<pre>int nPSEtriangles = PSE_PTS.Length * (PSE_PTS.Length - 1) * (PSE_PTS.Length - 2) / 6;
PSEtriangles = new JPMath.Triangle[nPSEtriangles];
int c = 0;
for (int i = 0; i < PSE_PTS.Length - 2; i++)
	for (int j = i + 1; j < PSE_PTS.Length - 1; j++)
		for (int k = j + 1; k < PSE_PTS.Length; k++)
		{
			PSEtriangles[c] = new JPMath.Triangle(PSE_PTS[i], PSE_PTS[j], PSE_PTS[k]);
			c++;
		}


What I have tried:

staring at it for very long times, concluding always that c cannot be computed in this scenario, only iterated
Posted
Updated 22-Aug-24 18:54pm
v2

FIRST APPROACH

With an extra array cstart[] and assuming n=PSE_PTS.Length you could calculate the behavior of c in advance:

int c=0;
for (int i = 0; i < n - 2; i++) {
    cstart[i] = c;
	for (int j = i + 1; j < n - 1; j++)
		for (int k = j + 1; k < n; k++)
		{
			c++;
		}
}
and then use cstart[] to parallelize your outer loop:
for (int i = 0; i < n - 2; i++) {
    c = cstart[i];
    ...
}


SECOND APPROACH

But then you could notice cstart actually holds a predictable sequence of values from a polynomial of the third degree. I won't reproduce all the math involved here, however you can forgo the extra array and simply write:

for (int i = 0; i < n - 2; i++) {
    int y=n-i;
    c = nPSEtriangles - y*(y-1)*(y-2)/6;
    ...
}
assuming int nPSEtriangles = n*(n-1)*(n-2)/6 as you already have.
:)
 
Share this answer
 
v3
Comments
CommentFree 24-Aug-24 14:33pm    
These seem brilliant. The first solution assumes that the triangle constructor is a significant expenditure, and so a first-run-through doing something trivial in serial followed by parallelization for something complex is still better than serial for the complex task. That would be quite correct in this case as the triangle constructor does a bunch of math stuff.

The second approach seems like it would work straight away, and then you don't need the initial serial loop, which is my preference. Seems so trivial now! I'll run a few simulations of your 2nd solution in my head with small nPSEtriangles, and see if the numbers comes out right...at a glance it seems like it's correct?! Thanks!
Unless the JPMath.Triangle constructor takes a significant amount of time, then it's likely that parallelisation will slow down processing rather than accelerate it. As I said yesterday to a different user (or a sock puppet account) Improve the speed of for loops[^] parallelisation is not a magic bullet and it needs to be carefully thought about, or it can slow you down rather than speed you up.

When you parallelise code, you add threads - and there is a processing cost associated with each thread you create because each thread needs it's own stack space and "control block" so the system knows what state it is in, what priority it has, who owns it, and so forth. There is also a shutdown cost, though that is smaller - releasing memory and other resources back to the system. Those costs aren't trivial and if the setup / shutdown time exceeds the processing time each thread uses then your app will run slower.
Additionally, thread execution is restricted by the hardware it runs in: if you have 4 cores then the whole system can only execute 4 threads simultaneously - all the others are "frozen" waiting for core time. And there is a cost associated with freezing and unfreezing a thread as well: nothing is for free! If you have 8 cores, then it's 8 threads and so on.
So if you add many threads by parallelising indiscriminately, you very quickly run into the law of diminishing returns: your app slows down because you tried to speed it up!

I'd start by first measuring the current speed really carefully over a range of data sizes so you have a benchmark to work from, then looking at how you could parallelize the outer loop only, so that each iteration of i is on a separate thread but j and k remain very much the same: you will need to pass each thread a different value of c instead of using a simple increment in the innermost loop. That might speed things up, it might not. I suspect that if your triangle constructor is trivial you aren't going to get a major speed increase in any way.

Parallelism isn't a magic bullet - I know I said that above but it bears repeating - you need to think about it very carefully and it will change and complicate your code so it'll be a lot harder to test, debug, and maintain.
 
Share this answer
 
Comments
CommentFree 24-Aug-24 14:19pm    
Oh it's a very large number of triangles to create, and the triangles constructor is not quite trivial, as it does do some computations to characterize each triangle. This section takes a few seconds to compute, and that could be reduced to sub-second on target machines with 32+ threads, I'm sure.
CommentFree 24-Aug-24 14:26pm    
I do math-computation intensive work, and I parallelize everything, and it always massively boosts performance.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900