|
You could use lock() around the places you absolutely don't want to execute concurrently. This would remove all the uncertainty and make your code more reliable.
|
|
|
|
|
True, but I was aiming for a lock-free solution, as locks are surprisingly expensive.
|
|
|
|
|
It's surprisingly complex for such a short fragment, because of the possible interactions between threads and the shuffling of values among four variables. Maybe it could be rewritten in a simpler way.
A useful step in analyzing code like this is identifying invariant conditions that are guaranteed to always be true at some point in the code. But it looks like the variable shuffling makes this impossible.
|
|
|
|
|
Yeah, I've temporarily gone with a lock, which solves the problem for now. Then I'm going to try Luc's suggestion of using a couple of queues. I already have a lock-free queue that's working elsewhere, which I should be able to use since there would only be one producer for each queue (a limitation for lock-free queues it would seem).
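For context, the single-producer/single-consumer restriction exists because each index of the ring is written by exactly one thread. A minimal sketch of the idea (illustrative Python, not the poster's actual queue; a real lock-free version in C# or C++ would need atomic stores and memory barriers when publishing the indices):

```python
class SpscRingBuffer:
    """Single-producer/single-consumer ring buffer sketch.
    Only the producer writes `head`; only the consumer writes `tail`.
    That ownership split is what makes it work without a lock."""

    def __init__(self, capacity):
        self.buf = [None] * (capacity + 1)  # one slot kept empty to tell full from empty
        self.head = 0  # next write position (producer-owned)
        self.tail = 0  # next read position (consumer-owned)

    def try_enqueue(self, item):
        nxt = (self.head + 1) % len(self.buf)
        if nxt == self.tail:          # queue full
            return False
        self.buf[self.head] = item
        self.head = nxt               # publish only after the slot is written
        return True

    def try_dequeue(self):
        if self.tail == self.head:    # queue empty
            return None
        item = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        return item
```

With two producers, both could read the same `head` and write the same slot, which is why multi-producer designs need compare-and-swap loops.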
|
|
|
|
|
SK Genius wrote: there would only be one producer ... a limitation for lock-free queues it would seem
a limitation for lock-free whatevers I would say.
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
|
|
|
|
|
I am sure you can find a solution for adding more producers if you are good at lock-free algorithms. This page suggests that it is not impossible:
http://www.1024cores.net/home/lock-free-algorithms/queues
|
|
|
|
|
When you have a producer, a consumer, and a number (at least 2) of equivalent buffers, I find it easiest to have two queues: one holding "empty" buffers (emptied by the consumer, to be filled by the producer), one holding "full" buffers (filled by the producer, to be processed by the consumer). That approach is simple to understand; it works well, it doesn't create new objects once initialized, and it doesn't really need locking, except that the Dequeue operation should block until a buffer has been returned.
In pseudo-code:
emptyQ=CreateQueue();
fullQ=CreateQueue();
for(N times) emptyQ.Queue(new Buffer());
forever {
buf=emptyQ.Dequeue();
...fill buf
fullQ.Queue(buf);
}
forever {
buf=fullQ.Dequeue();
... process buf
emptyQ.Queue(buf);
}
|
|
|
|
|
Good call, I might be able to fit that in - with the change that the consumer won't release a buffer until it has a new one to render. It would be holding 2 buffers at times, so the minimum number of required buffers becomes 3.
|
|
|
|
|
What is the best (by that I mean the fastest) way to merge some smaller files of sorted data into a larger file, and then later merge the larger files into a single sorted file?
Some particulars:
Input: text files, each 6 GB:
- short records: 500,000,000 10-digit numbers, in three orderings - sequential ascending, sequential descending, and random selection order;
- long records: null strings up to 65533-byte strings in random order, with many duplicate strings;
- long unique records: as above, but where the last 10 digits of each line are replaced with one of the random short strings.
A total of 5 different files for 5 different tests.
Memory is allocated with VirtualAlloc (3 GB enabled in boot.ini and in the Link line and 4 GB if memory in the system), memory blocks of 2 GB, 1 GB, and 47 MB, plus 2 more small blocks are allocated.
The first step does the initial sort. The 2GB buffer is used as an input buffer, the 1 GB buffer is used for building mod 16 aligned and sized strings and for the nodes for a RedBlackTree, the 47 MB buffer is used to hold pointers to the strings for the sort tree and for the records moved after the sort. For the short records, this results in about 211 sorted blocks written to a file. This step is complete for now.
The next step combines the 47 MB sorted blocks into a big sort block (2 GB), using the 1 GB buffer for tree nodes; 28 sort blocks are combined into each big block (limited by the available node space - 16 bytes for each record in the tree), and 8 big sort blocks are written to a file. This step is complete for now, but could be changed if there is a better merge solution.
The final step needs to merge these big blocks into a single output file, but has to be done in a fragmented mode because there is not enough memory to hold the files. The 2 GB buffer is split into pieces (8 in this case for the 8 big sort blocks). The header from each sort block and the initial pointers from each sort block and the initial associated records from each sort block are read into the 8 pieces of the 2 GB buffer (with an initial header pointing to each big sort block).
Essentially you need to sort the big blocks into ascending sequence by the first record in each (how to do this efficiently?), then select and output all the records in the first big block that are lower than the first record in the second big block (how to do this efficiently?), then unlink the first buffer and re-link it among the other buffers in ascending sequence (how to do this efficiently?). Each block buffer is refilled as it is exhausted, and a block is deleted once all of its records have been selected and written.
Unfortunately, 6 GB is not necessarily the maximum file size, so the 8 big blocks could become thousands, and linked lists are not efficient at that scale. Arrays of thousands of pointers have insertion-time problems. Trees are better at insertions than arrays, but have timing problems as well (RedBlackTree recursive rotations, especially with deletions).
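For what it's worth, the three "how to do this efficiently?" steps are conventionally handled with a min-heap (priority queue) keyed on each block's current record: the initial ordering, picking the lowest record, and re-linking a refilled block are each O(log k), even with thousands of blocks. A sketch with small in-memory lists standing in for the big sort blocks:

```python
import heapq

def merge_blocks(blocks):
    """K-way merge of pre-sorted blocks via a min-heap.
    Re-ordering after a block advances is a single heap push/pop
    (O(log k)) instead of re-sorting or re-linking a list."""
    heap = []
    for i, block in enumerate(blocks):
        if block:
            # (record, block index, position); the index breaks ties
            heapq.heappush(heap, (block[0], i, 0))
    out = []
    while heap:
        record, i, pos = heapq.heappop(heap)
        out.append(record)
        pos += 1
        if pos < len(blocks[i]):              # "refill" from the same block
            heapq.heappush(heap, (blocks[i][pos], i, pos))
        # else: block exhausted; it simply drops out of the heap
    return out
```

In the real program the lists would be the buffered heads of the big sort blocks, with a disk read replacing the `pos` advance when a buffer runs dry.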
Which way is best (overall) for merging, and why?
Dave.
|
|
|
|
|
Sorry Dave, that does not make much sense to me. You will be wasting lots of time reading, decoding, sorting, and writing data, none of which is very useful.
I would:
- avoid such an amount of data if at all possible;
- avoid large files; I prefer several smaller files;
- avoid text files; binary files are smaller and get processed much faster;
- probably use a database.
|
|
|
|
|
Luc,
I'm sorry to give you a 1 for your answer, but I disagree with all of your points. I got into this because I was sorting such a list using Windows sort.exe and was disappointed with the performance. I had a file that I wanted to sort and it was a text file. I do not believe that a general purpose database can out-perform a tailored sort process, and you would have all of the database overhead in addition to the data itself to write and move around. The following are the timings from my testing thus far:
The current time is: 10:44:34.51
\restart18\debug>if exist c:\longdups.txt sort.exe /REC 65535 c:\longdups.txt /O c:\winlong.txt
The current time is: 11:07:44.81
The current time is: 10:44:34.51
----------
0:23:10.30 = 1,390,300 msec
The current time is: 10:24:20.18
\restart19\debug>if exist c:\longdups.txt ..\sortit.exe c:\longdups.txt c:\mylong.txt
The current time is: 10:30:14.76
The current time is: 10:24:20.18
----------
00:05:54.58 = 354,580 msec, (354,580*100)/1,390,300 = 25.50%
--------------------------------------------------------------------------------
The current time is: 17:20:45.57
\restart18\debug>if exist c:\textseq1.txt sort.exe c:\textseq1.txt /O c:\winshort.txt
The current time is: 18:10:40.32
The current time is: 17:20:45.57
-----------
00:49:54.75 = 2,994,750 msec.
The current time is: 11:10:25.21
\restart19\debug>if exist c:\textseq1.txt ..\sortit.exe c:\textseq1.txt c:\myshort.txt
The current time is: 11:19:27.01
The current time is: 11:10:25.21
-----------
00:09:01.80 = 541,800 msec, (541,800*100)/2,994,750 = 18.09%
--------------------------------------------------------------------------------
The current time is: 13:35:56.95
\restart18\debug>if exist c:\textdseq.txt sort.exe c:\textdseq.txt /O c:\wdshort.txt
The current time is: 14:22:32.76
The current time is: 13:35:56.95
-----------
00:46:35.81 = 2,795,810 msec.
The current time is: 12:13:07.90
\restart19\debug>if exist c:\textdseq.txt ..\sortit.exe c:\textdseq.txt c:\mdshort.txt
The current time is: 12:26:48.50
The current time is: 12:13:07.90
-----------
00:13:40.60 = 820,600 msec, (820,600*100)/2,795,810 = 29.35%
--------------------------------------------------------------------------------
The current time is: 19:23:06.04
\restart18\debug>if exist c:\textuseq.txt sort.exe c:\textuseq.txt /O c:\wushort.txt
The current time is: 20:37:44.10
The current time is: 19:23:06.04
-----------
01:14:38.06 = 4,478,060 msec
The current time is: 14:30:05.59
\restart19\debug>if exist c:\textuseq.txt ..\sortit.exe c:\textuseq.txt c:\mushort.txt
The current time is: 15:09:04.34
The current time is: 14:30:05.59
-----------
00:38:58.75 = 2,338,750 msec, (2,338,750*100)/4,478,060 = 52.23%
-----------
(not complete - need merge)
--------------------------------------------------------------------------------
The current time is: 8:34:48.34
\restart15\debug>if exist c:\longuniq.txt sort.exe c:\longuniq.txt /O c:\wulong.txt
The current time is: 9:00:27.12
The current time is: 8:34:48.34
----------
0:25:38.78 = 1,538,780 msec.
The current time is: 20:37:07.09
\restart19\debug>if exist c:\longuniq.txt ..\sortit.exe c:\longuniq.txt c:\mulong.txt
The current time is: 20:54:34.73
The current time is: 20:37:07.09
-----------
00:17:27.62 = 1,047,620 msec, (1,047,620*100)/1,538,780 = 68.08%
(not complete - need merge)
Still looking for any other hints about ways to do this efficiently.
Dave.
|
|
|
|
|
You fail to see that the moment your data don't fully fit into memory, the file system will become your main bottleneck. A general purpose database will be faster than a general purpose file system by several orders of magnitude!
You say a program tailored to your problem would be faster than a database, but the tailoring would need to include the file system as well!
|
|
|
|
|
Stefan,
I just read the CodeProject "Daily News" for today. Interesting article "What would Feynman do?". It seems that you are suggesting a Feynman solution. I am not talking about a 6-core 16 GB server with a farm of hundreds of drives, I am talking about my simple PC with 4 GB of memory and 2 hard drives running Windows XP. I want a utility string sort solution.
You say "several orders of magnitude better!" I do not think this is very accurate. I'll tell you what I will do. I will supply you with an executable and a small template file. Executing it will create all of my test files (5 files, 6 GB each, the same files that I am using). You can then write your version of a single program to sort these files (case sensitive), execute Windows sort.exe for each file to create a baseline time on your machine, then execute your program for each file and report all of your timing statistics here. You have an advantage in that you already have my current times, so you can refine your algorithm until you get better times. Note that it is not the time for any test or the overall time of all of the tests that matters; it is the percentage of improvement over the baseline sort.exe times that is important. This evens out timing differences between different machines. I just do not believe several orders of magnitude.
Interested in trying? Note: I currently do not have a complete solution, I still need the final merge step for the last two files that is what I was trying to get suggestions about in this thread. I will complete the program in some way and post the final results for the final 2 files.
Dave.
|
|
|
|
|
Sorry, 'several orders of magnitude' was probably overstating. I've been thinking of one particular case where a database solution outperformed a file-based solution by a factor of ~15-20. But when I think about it, part of the acceleration may have been due to the algorithms being used, so 'one order of magnitude' might be closer to the real thing.
I've also thought of another case with a 120 GB database of geographical data that turned out to be so fast that my algorithms to dig some geometrical properties out of a given (and comparatively small) result set were much more time-critical than the two-dimensional search to find the data in a particular geographical region. On a side note, that was about 15 years ago, so I doubt the database server was all that much faster than your current PC.
In both cases I was just the 'user' of the data though - I am not a database specialist. I just wanted to point out that your opinion of databases might be premature, not that I could write down a solution in 5 minutes.
Anyway, if you want me to try you'd have to wait for a month or two, as the single hard disk in my 5-year-old PC has barely enough room to fit one of those 6 GB files; forget about 5.
|
|
|
|
|
Stefan,
I'd be happy to let you try, misery loves company! If you send me an Email I'll respond with an attached zip of the create.exe file and the template file (includes a READ.ME and some testing batch files).
To ease the file size problems, create the test files one at a time, then write the file to an external USB drive, test with sort.exe and with your sort program writing the output files and any intermediate files to the external drive and compare the results, then save the timing results, then delete the test file and the sorted files, then create the next test file. That should keep the files down to 18 GB.
As long as the input and output files are on the same drives for both tests, the relative times should be a good indicator of the relative performance.
To create the files one at a time, create null files for each of the named test files (in the create.exe directory) and create.exe will skip all creations, then delete the file you want to test. Create.exe will create that file (since it is missing), test the file two ways, save the timing results, then replace the test file with a null file and delete the next file to test.
OBTW, create.exe requires SSE.
Dave.
|
|
|
|
|
Skip lists[^] provide a good alternative to R/B trees when space is a major concern (here is a C# implementation on CodeProject[^]). In general, the best approach to multi-way merges is to use a tree-based or skip list-based priority queue. However, if recursive rotations of R/B trees came up as a performance concern, your system is way past the stage of fine-tuning where one-size-fits-all suggestions may still apply.
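To make the suggestion concrete, here is a minimal skip list sketch (insert, search, and in-order walk only; a production version, like the linked C# implementation, would also support deletion and tune the level distribution):

```python
import random

class _Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    """Minimal skip list: expected O(log n) insert and search,
    using randomized levels instead of red-black rebalancing."""
    MAX_LEVEL = 16

    def __init__(self):
        self.head = _Node(None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        # Promote to the next level with probability 1/2.
        lvl = 1
        while random.random() < 0.5 and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        update = [self.head] * self.MAX_LEVEL  # rightmost node < key, per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = _Node(key, lvl)
        for i in range(lvl):               # splice in at each of its levels
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def items(self):
        # Level 0 links every node, so walking it yields sorted order.
        out, node = [], self.head.forward[0]
        while node:
            out.append(node.key)
            node = node.forward[0]
        return out
```

Note there are no rotations anywhere: an insert touches only the pointers on its own levels, which is why deletions are also cheap compared to an R/B tree.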
|
|
|
|
|
dasblinkenlight,
Thank you for the suggestion. I will look up the "skip lists" implementation.
Dave.
|
|
|
|
|
Just out of curiosity, why do you need to sort such large amounts of data? Is somebody planning on reading these files?
Illogical thoughts make me ill
|
|
|
|
|
musefan,
Yes, someone will read that data. I had originally created a program to randomize a list. I wanted to verify that I had selected each entry in the original list once and only once. A sort would re-order the randomized list; then a simple file compare (fc.exe) of the original file and the sorted file could verify the correctness of the randomization.
Sort.exe was very slow (compared to the randomizing), so I went down that rabbit hole to develop a faster sort, and got lost in a maze of twisty little tunnels..... (what an Adventure!)
Another method would have been to bitmap the randomized records and see that they did not overlap, but this would work only for numeric strings, and the randomize program worked with any string data. To bitmap that would require a string sort anyway to equate a string with an entry number.
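The bitmap check itself is cheap when it applies; a sketch for records that are (or can be mapped to) the numbers 0..N-1 (`verify_permutation` is a hypothetical name, not from the original program):

```python
def verify_permutation(records, n):
    """Verify that `records` is a permutation of 0..n-1 using a
    bitmap: one bit per possible value, no sort required."""
    seen = bytearray((n + 7) // 8)   # n bits, rounded up to whole bytes
    count = 0
    for r in records:
        if not 0 <= r < n:           # value outside the expected range
            return False
        byte, bit = divmod(r, 8)
        mask = 1 << bit
        if seen[byte] & mask:        # duplicate selection
            return False
        seen[byte] |= mask
        count += 1
    return count == n                # also catches missing entries
```

As noted, this only works when the records are numeric (or can be equated to an entry number), which for arbitrary strings would require a sort anyway.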
Dave.
|
|
|
|
|
Member 4194593 wrote: Which way is best (overall) for merging, and why?
What Luc said, because he's right. You were competing with Sort.exe, which isn't intended to work with large files.
Now, a database doesn't have any problems with sorting, because it needn't load the entire data set into memory. If you can manipulate the data, then you can add a "key" to each item that needs to be sorted. Next, you could build an index.
Fetch item from disk by key, compare, update index. Building an index on a database can be a relatively time-consuming process, but I imagine that it's faster than forcing Windows to swap a lot of virtual memory, pushing all other processes to the background.
You could also partition your blocks; have the process that "generates" the first block only filter out the lines that start with an "A", the second with a "B".
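That partitioning idea, sketched: bucket lines by their leading character, sort each bucket independently (possibly in separate processes), and concatenate the buckets in key order; the bucket keys alone guarantee the global ordering.

```python
from collections import defaultdict

def partitioned_sort(lines):
    """Partition lines by their first character, sort each
    partition independently, then concatenate in bucket order.
    Every line in the 'A' bucket sorts before every line in
    the 'B' bucket, so no cross-bucket merge is needed."""
    buckets = defaultdict(list)
    for line in lines:
        buckets[line[:1]].append(line)   # '' bucket collects empty lines
    out = []
    for key in sorted(buckets):          # bucket order = global order
        out.extend(sorted(buckets[key]))
    return out
```

The same split works for the disk-based case: each bucket becomes its own intermediate file, small enough to sort in memory.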
I are Troll
|
|
|
|
|
Eddy,
OK. You suggest using a database. What database do you suggest? If I had this database, would I need to program it or configure it in some way? Is it as easy to use as executing sort.exe? Don't get me wrong, I've been programming since '65, I am familiar with databases and their overhead. I still do not believe that any database solution would be as fast, as cheap, or easy to use as something like sort.exe.
My offer to you is the same as to Stefan63 above. Send me an email so I can reply with a zip of an executable (or, if you prefer, the source and a makefile to create the executable). This executable will create the 5 large files (6 GB each) that I was testing with. Write your own sort to sort these files and sort the same files with sort.exe and compare the percentage faster your sort is than sort.exe for each file, and for the overall total testing (comparing this way evens the playing field for CPU, memory, and disk speed differences between your system and mine).
Dave.
|
|
|
|
|
Member 4194593 wrote: You suggest use a database.
No, my suggestion would involve Hadoop and a lot of PS2's
Member 4194593 wrote: What database do you suggest?
SQL Server if you can afford it, although it would be an expensive solution just for sorting. SQL Express would do too, I guess. You can only pack 4 GB into a database in the free version, but there's no problem in adding several databases to the server.
Member 4194593 wrote: If I had this database, would I need to program it or configure it in some way?
Ideally, it'd use a bulk-import. You could also have it generate the index, but it'd be more fun to build such a thing by hand. The advantage would be that you can easily partition your data, even over multiple computers.
Member 4194593 wrote: Is it as easy to use as executing sort.exe?
From the users' point of view, this would be easier as their data simply already appears to be sorted by applying the index.
Member 4194593 wrote: Don't get me wrong, I've been programming since '65, I am familiar with databases and their overhead. I still do not believe that any database solution would be as fast, as cheap, or easy to use as something like sort.exe.
There are free databases that are scalable and well-documented. Yes, I get your point - but most people won't have a dataset that's this large, so the use-case shouldn't be built around your average user
Member 4194593 wrote: Write your own sort to sort these files and sort the same files with sort.exe and compare the percentage faster your sort is than sort.exe for each file, and for the overall total testing (comparing this way evens the playing field for CPU, memory, and disk speed differences between your system and mine).
My gratitude for the challenge, but I don't have the time nor the hardware. Have you considered writing a CodeProject article on the subject?
|
|
|
|
|
Eddy,
Thank you for the detailed reply. I think I'll stick to what I know. About writing an article - I never considered it. No one here on CodeProject would like my code, or what I was trying to do, or the way I was doing it; they wouldn't like it on so many different levels that I wouldn't even try to enumerate them.
Dave.
|
|
|
|
|
Member 4194593 wrote: No one here on the CodeProject would like my code
That's part of the job; when someone comes along and sings "try this", then we'll cut it up and look at it from all sides to see whether or not it's better than our old habits. Now, I might not like the use-case, but that doesn't mean that it's not an interesting topic to a lot of people (otherwise you'd have 0 replies)
..and sometimes, it's more interesting how you got a result - keep in mind that most people have problems with pointers
|
|
|
|
|
Eddy,
1) My program is a console app.
2) It is totally MASM (except API calls for I/O).
3) It honors the Intel ABI, but in addition it saves and restores all registers around calls and saves flags around all API calls.
4) It uses mainly register calling sequences, not calling arguments (I think that is correct - from my ancient Fortran AutoCoder days, "parameters" are what you expect, "arguments" are what you get).
5) It returns no values, only zero or non-zero flags - any returned values are saved in fixed locations.
6) It uses no call stack frame - ebp is used as a general purpose register.
7) It has tons of commentary to explain What I did so I can pick up the program 20 years from now and understand Why I did what I did.
8) It uses many levels of conditional assembly to do validations and/or timing displays, not checking option flags and branching around unused code - the unused code is not even assembled.
9) There are conditional code segments in the RedBlackTree implementation to validate correctness and to display the tree structure (a short 31 element tree build).
10) The conditional code snippets save and restore any used registers to prevent any side effects.
11) It does not use anonymous symbols, instead, it uses lots of meaningful long labels everywhere, including places where a label is not needed but separates code groups.
12) Need I say more?
Nobody will like it here.
Dave.
|
|
|
|
|