The thing is, even if I received the full monthly payment before the due date this month, the policy will be current for this month; but if I missed last month's payment, it is still in arrears overall, unless I include a brought-forward amount. I think it's more accurate and practical to sum all payments made on the policy, instead of implementing an extra brought-forward field and having to keep it up to date at the end of each month. That would in fact require a monthly run, and this system is only intended for MIS and on-demand reports, not monthly ones.
I don't understand your qualification [for the policy period]. Surely there should not be any payments for a policy outside the policy period?
How does the business define being in arrears?
My guess is that you should use the expected sum to date compared with the actual sum to date.
So I think you are right - it is that simple.
Regards
David R
---------------------------------------------------------------
"Every program eventually becomes rococo, and then rubble." - Alan Perlis
The only valid measurement of code quality: WTFs/minute.
I don't have any definition from the business, and I haven't been able to elicit one yet, so for the first, prototype release I'm going with arrears == expected > actual.
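Roughly, in code (a sketch only; the premium and payment types are invented for illustration):
#include <numeric>
#include <vector>

// Illustrative types only - the real schema would supply these.
struct Payment { double amount; };

// Prototype arrears check: expected premiums to date vs. the sum of
// all payments actually received on the policy.
bool IsInArrears(double monthlyPremium, int monthsElapsed,
                 const std::vector<Payment>& payments)
{
    double expected = monthlyPremium * monthsElapsed;
    double actual = std::accumulate(payments.begin(), payments.end(), 0.0,
        [](double sum, const Payment& p) { return sum + p.amount; });
    return expected > actual;   // arrears == expected > actual
}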
Brady Kelly wrote: I don't understand your qualification [for the policy period]. Surely there should not be any payments for a policy outside the policy period?
Here is my understanding, here in the US. This is personal experience; I have not worked in the insurance industry.
I have car insurance, which is my policy. I happened to buy it in December, so my policy period runs from Dec '10 to Dec '11 for the current year. So, when they calculate my payments to figure out the status, they only need to look at the current policy period; last year's policy period is irrelevant. Of course, by the end of last year's policy period I must have been current, otherwise the insurance company would not have renewed the policy for this year.
Brady Kelly wrote: The thing is, even if I received the full monthly payment before the due date this month, the policy will be current for this month; but if I missed last month's payment, it is still in arrears overall, unless I include a brought-forward amount.
I agree. In fact, if you go back and see my original post, what I said was that this method will give you a quick way of checking the status of the policy, but you won't be able to tell why it is so.
It looks like what I'm working on is life, not auto, so it doesn't have a yearly renewal but a payout date; it's actually assurance. The 'spec' is in Afrikaans, which I read and speak well, but it doesn't translate that well into a business spec for me, and I'm not in direct contact with the user. I'm working through an intermediary who has not been very available this week.
Ah, the plot thickens!
My first answer assumed the whole scenario was as simple as explained. But in most businesses, and in accounting generally, an aging report breaks amounts into groups - Current, 30 days, 60 days, 90 days, and >90 days in arrears. This is very likely what your client wants to see. Of course, it's a lot more work, but it gives a much better report.
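The bucketing itself is simple; a sketch (the 30/60/90-day boundaries are the conventional ones, not anything from your spec):
#include <string>

// Classify an overdue amount by its age in days, using the
// conventional aging-report buckets.
std::string AgingBucket(int daysOverdue)
{
    if (daysOverdue <= 0)  return "Current";
    if (daysOverdue <= 30) return "30 days";
    if (daysOverdue <= 60) return "60 days";
    if (daysOverdue <= 90) return "90 days";
    return ">90 days";
}
The real work is computing daysOverdue for each unpaid amount, which is where the extra effort comes in.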
Will Rogers never met me.
Roger Wright wrote: Current, 30 days, 60 days, 90 days, and >90 days
No, it's not the actual finance software; it's just a sort of basic 'dashboard' for the broker. They probably use an accounting package for the real aging; the brokers probably use a pen. I'll leave room for more detail though - take this baby Enterprise. Mwaahaaaahaaa.
Brady Kelly wrote: just a sort of basic 'dashboard'
Oh, then I'll stick with my original answer - comparing the sums is a good way to do it.
Will Rogers never met me.
I'm a little tired of staring at these few lines and not seeing the problem. Take this code:
volatile int WorkingBuff = 0;   // buffer the producer writes into
volatile int NextBuff = 1;      // most recently completed buffer
volatile int RenderBuff = 1;    // buffer the renderer reads from

void stuff()
{
    NextBuff = WorkingBuff;            // publish the buffer just finished
    int usedId = 0;
    do
    {
        usedId = RenderBuff;           // snapshot the renderer's buffer
        for (int i = 0; i < 3; i++)
        {
            if (i != NextBuff && i != usedId)
                WorkingBuff = i;       // pick the buffer neither side is using
        }
    } while (RenderBuff != usedId);    // retry if the renderer swapped meanwhile
}

void render()
{
    RenderBuff = NextBuff;             // switch to the latest completed buffer
}
Could anybody point out the situation where WorkingBuff and RenderBuff have the same value? I'm just not seeing it.
1. NextBuff = WorkingBuff; in Thread A.
2. Thread A is blocked and Thread B starts.
3. RenderBuff = NextBuff; in Thread B.
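With the initial values in the posted code, that sequence gives:
// Initial state: WorkingBuff = 0, NextBuff = 1, RenderBuff = 1
// 1. Thread A: NextBuff = WorkingBuff;   // NextBuff = 0
// 2. Thread A is blocked before its do-while loop runs.
// 3. Thread B: RenderBuff = NextBuff;    // RenderBuff = 0
// Result: WorkingBuff == RenderBuff == 0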
Sorry, I should have elaborated a bit: I meant after the do-while loop has run.
It seems I have a problem where I'm trying to render with data I'm still in the middle of updating.
You could use lock() around the places you absolutely don't want to execute concurrently. This would remove all the uncertainty and make your code more reliable.
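For example (a minimal sketch; std::mutex stands in for whatever lock primitive you have, and the three buffer indices are the ones from the code above):
#include <mutex>

std::mutex bufferMutex;   // guards WorkingBuff, NextBuff and RenderBuff

void stuff_locked()
{
    std::lock_guard<std::mutex> guard(bufferMutex);
    // ... the buffer-picking body of stuff() goes here ...
}

void render_locked()
{
    std::lock_guard<std::mutex> guard(bufferMutex);
    RenderBuff = NextBuff;
}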
True, but I was aiming for a lock-free solution, as locks are surprisingly expensive.
It's surprisingly complex for such a short fragment, because of the possible interactions between threads and the shuffling of values among four variables. Maybe it could be rewritten in a simpler way.
A useful step in analyzing code like this is identifying invariant conditions that are guaranteed to always be true at certain points in the code. But it looks like the variable shuffling makes this impossible.
Yeah, I've temporarily gone with a lock, which solves the problem for now. Then I'm going to try Luc's suggestion of using a couple of queues. I already have a lock-free queue working elsewhere which I should be able to use, since there would only be one producer for each queue (a limitation for lock-free queues it would seem).
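For reference, the single-producer/single-consumer restriction is exactly what makes the classic lock-free ring-buffer queue safe; a minimal sketch with C++11 atomics (fixed capacity; this shows the general technique, not my existing queue):
#include <atomic>
#include <cstddef>

// Minimal single-producer/single-consumer ring buffer.
// Correct only with exactly one producer thread and one consumer thread.
template <typename T, size_t Capacity>
class SpscQueue
{
    T buffer[Capacity];
    std::atomic<size_t> head{0};   // advanced by the consumer
    std::atomic<size_t> tail{0};   // advanced by the producer

public:
    bool TryEnqueue(const T& item)   // producer side only
    {
        size_t t = tail.load(std::memory_order_relaxed);
        size_t next = (t + 1) % Capacity;
        if (next == head.load(std::memory_order_acquire))
            return false;            // queue full
        buffer[t] = item;
        tail.store(next, std::memory_order_release);
        return true;
    }

    bool TryDequeue(T& item)         // consumer side only
    {
        size_t h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire))
            return false;            // queue empty
        item = buffer[h];
        head.store((h + 1) % Capacity, std::memory_order_release);
        return true;
    }
};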
SK Genius wrote: there would only be one producer ... a limitation for lock-free queues it would seem
a limitation for lock-free whatevers I would say.
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
I am sure you can find a solution for adding more producers, if you are good at lock-free algorithms. This page suggests that it is not impossible:
http://www.1024cores.net/home/lock-free-algorithms/queues
When you have a producer, a consumer, and a number (at least 2) of equivalent buffers, I find it easiest to have two queues: one holding "empty" buffers (emptied by the consumer, to be filled by the producer), and one holding "full" buffers (filled by the producer, to be processed by the consumer). That approach is simple to understand; it works well, it doesn't create new objects once initialized, and it doesn't really need locking, except that the Dequeue operation should wait for a buffer to be returned.
In pseudo-code:
emptyQ = CreateQueue();
fullQ  = CreateQueue();
for (N times) emptyQ.Queue(new Buffer());

// producer
forever {
    buf = emptyQ.Dequeue();
    ... fill buf
    fullQ.Queue(buf);
}

// consumer
forever {
    buf = fullQ.Dequeue();
    ... process buf
    emptyQ.Queue(buf);
}
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
Good call, I might be able to fit that in, with the change that the consumer won't release a buffer until it has a new one to render. It would be holding 2 buffers at times, so the minimum number of required buffers becomes 3.
What is the best (by that I mean the fastest) way to merge some smaller files of sorted data into a larger file, and then later merge the larger files into a single sorted file?
Some particulars:
Input: text files of 6 GB each. The short records are 500,000,000 ten-digit numbers in three orderings: sequential ascending, sequential descending, and random. The long records range from null strings to 65533-byte strings in random order, first with many duplicate strings, and then as unique strings where the last 10 digits of each line are replaced with one of the random short strings. That makes a total of 5 different files for 5 different tests.
Memory is allocated with VirtualAlloc (3 GB enabled in boot.ini and on the linker command line, with 4 GB of memory in the system); memory blocks of 2 GB, 1 GB, and 47 MB, plus 2 more small blocks, are allocated.
The first step does the initial sort. The 2 GB buffer is used as an input buffer, the 1 GB buffer is used for building strings aligned and sized to multiples of 16 bytes and for the nodes of a RedBlackTree, and the 47 MB buffer is used to hold the pointers to the strings for the sort tree and for the records moved after the sort. For the short records, this results in about 211 sorted blocks written to a file. This step is complete for now.
The next step combines the 47 MB sorted blocks into big sort blocks (2 GB each), using the 1 GB buffer for tree nodes; 28 sort blocks are combined into each big block (limited by the node space available - 16 bytes for each record in the tree), and 8 big sort blocks are written to a file. This step is complete for now, but could be changed if there is a better merge solution.
The final step needs to merge these big blocks into a single output file, but it has to be done in a fragmented mode because there is not enough memory to hold the files. The 2 GB buffer is split into pieces (8 in this case, for the 8 big sort blocks). The header, the initial pointers, and the initial associated records from each sort block are read into the 8 pieces of the 2 GB buffer (with an initial header pointing to each big sort block).
Essentially you need to sort the big blocks into ascending sequence by the first record in each big block (how to do this efficiently?), then select and output all the records in the first big block that are lower than the first record in the second big block (how to do this efficiently?), then unlink the first buffer and re-link it with the other buffers in ascending sequence (how to do this efficiently?), refilling each block buffer as it is exhausted and deleting a block once all of its records have been selected and written.
Unfortunately, 6 GB is not necessarily the maximum file size, so the 8 big blocks could become thousands, and linked lists are then not efficient. Arrays of thousands of pointers have insertion-time problems. Trees are better at insertions than arrays, but have timing problems as well (recursive RedBlackTree rotations, especially with deletions).
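One structure that seems to fit all three "how to do this efficiently" questions is a binary min-heap (priority queue) keyed on each block's current record: finding the lowest block, and re-inserting a block after it advances, are both O(log k) with no rebalancing rotations, even when k grows to thousands. A rough sketch, with the block reader left abstract:
#include <queue>
#include <string>
#include <vector>

// Illustrative only: SortedBlock stands in for whatever streams records
// sequentially out of one big sort block, refilling its buffer as needed.
struct SortedBlock
{
    std::string current;      // smallest unread record in this block
    bool Advance();           // defined elsewhere: load next record, false at end
};

// Orders heap entries so the block with the smallest current record wins.
struct BlockGreater
{
    bool operator()(const SortedBlock* a, const SortedBlock* b) const
    {
        return a->current > b->current;
    }
};

void MergeBlocks(std::vector<SortedBlock*>& blocks /*, output writer */)
{
    std::priority_queue<SortedBlock*, std::vector<SortedBlock*>, BlockGreater>
        heap(blocks.begin(), blocks.end());

    while (!heap.empty())
    {
        SortedBlock* top = heap.top();
        heap.pop();
        // Write(top->current);          // emit the smallest record overall
        if (top->Advance())              // O(log k) reinsert, k = live blocks
            heap.push(top);
        // An exhausted block simply drops out; no unlink/re-link needed.
    }
}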
Which way is best (overall) for merging, and why?
Dave.
Sorry Dave, that does not make much sense to me. You will be wasting lots of time reading, decoding, sorting, and writing data, and none of that is very useful.
I would:
- avoid such an amount of data if at all possible;
- avoid large files; I prefer several smaller files;
- avoid text files; binary files are smaller and get processed much faster;
- probably use a database.
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
Luc,
I'm sorry to give you a 1 for your answer, but I disagree with all of your points. I got into this because I was sorting such a list using Windows sort.exe and was disappointed with the performance; I had a file that I wanted to sort, and it was a text file. I do not believe that a general-purpose database can out-perform a tailored sort process, and you would have all of the database overhead, in addition to the data itself, to write and move around. The following are the timings from my testing thus far:
The current time is: 10:44:34.51
\restart18\debug>if exist c:\longdups.txt sort.exe /REC 65535 c:\longdups.txt /O c:\winlong.txt
The current time is: 11:07:44.81
The current time is: 10:44:34.51
----------
0:23:10.30 = 1,390,300 msec
The current time is: 10:24:20.18
\restart19\debug>if exist c:\longdups.txt ..\sortit.exe c:\longdups.txt c:\mylong.txt
The current time is: 10:30:14.76
The current time is: 10:24:20.18
----------
00:05:54.58 = 354,580 msec, (354,580*100)/1,390,300 = 25.50%
--------------------------------------------------------------------------------
The current time is: 17:20:45.57
\restart18\debug>if exist c:\textseq1.txt sort.exe c:\textseq1.txt /O c:\winshort.txt
The current time is: 18:10:40.32
The current time is: 17:20:45.57
-----------
00:49:54.75 = 2,994,750 msec.
The current time is: 11:10:25.21
\restart19\debug>if exist c:\textseq1.txt ..\sortit.exe c:\textseq1.txt c:\myshort.txt
The current time is: 11:19:27.01
The current time is: 11:10:25.21
-----------
00:09:01.80 = 541,800 msec, (541,800*100)/2,994,750 = 18.09%
--------------------------------------------------------------------------------
The current time is: 13:35:56.95
\restart18\debug>if exist c:\textdseq.txt sort.exe c:\textdseq.txt /O c:\wdshort.txt
The current time is: 14:22:32.76
The current time is: 13:35:56.95
-----------
00:46:35.81 = 2,795,810 msec.
The current time is: 12:13:07.90
\restart19\debug>if exist c:\textdseq.txt ..\sortit.exe c:\textdseq.txt c:\mdshort.txt
The current time is: 12:26:48.50
The current time is: 12:13:07.90
-----------
00:13:40.60 = 820,600 msec, (820,600*100)/2,795,810 = 29.35%
--------------------------------------------------------------------------------
The current time is: 19:23:06.04
\restart18\debug>if exist c:\textuseq.txt sort.exe c:\textuseq.txt /O c:\wushort.txt
The current time is: 20:37:44.10
The current time is: 19:23:06.04
-----------
01:14:38.06 = 4,478,060 msec
The current time is: 14:30:05.59
\restart19\debug>if exist c:\textuseq.txt ..\sortit.exe c:\textuseq.txt c:\mushort.txt
The current time is: 15:09:04.34
The current time is: 14:30:05.59
-----------
00:38:58.75 = 2,338,750 msec, (2,338,750*100)/4,478,060 = 52.23%
-----------
(not complete - need merge)
--------------------------------------------------------------------------------
The current time is: 8:34:48.34
\restart15\debug>if exist c:\longuniq.txt sort.exe c:\longuniq.txt /O c:\wulong.txt
The current time is: 9:00:27.12
The current time is: 8:34:48.34
----------
0:25:38.78 = 1,538,780 msec.
The current time is: 20:37:07.09
\restart19\debug>if exist c:\longuniq.txt ..\sortit.exe c:\longuniq.txt c:\mulong.txt
The current time is: 20:54:34.73
The current time is: 20:37:07.09
-----------
00:17:27.62 = 1,047,620 msec, (1,047,620*100)/1,538,780 = 68.08%
(not complete - need merge)
Still looking for any other hints about ways to do this efficiently.
Dave.
You fail to see that the moment your data no longer fit fully into memory, the file system will become your main bottleneck. A general-purpose database will be faster than a general-purpose file system by several orders of magnitude!
You say a program tailored to your problem would be faster than a database, but the tailoring would need to include the file system as well!
Stefan,
I just read the CodeProject "Daily News" for today - interesting article, "What would Feynman do?". It seems that you are suggesting a Feynman solution. I am not talking about a 6-core, 16 GB server with a farm of hundreds of drives; I am talking about my simple PC with 4 GB of memory and 2 hard drives running Windows XP. I want a utility string sort solution.
You say "faster by several orders of magnitude". I do not think this is very accurate. I'll tell you what I will do: I will supply you with an executable and a small template file. Executing it will create all of my test files (5 files, 6 GB each, the same files that I am using). You can then write your version of a single program to sort these files (case sensitive), execute Windows sort.exe for each file to create a baseline time on your machine, then execute your program for each file and report all of your timing statistics here. You have an advantage in that you already have my current times, and you can refine your algorithm until you get better times. Note that it is not the time for any test, or the overall time of all of the tests, that matters; it is the percentage of improvement over the baseline sort.exe times that is important. This is done to even out timing differences between machines. I just do not believe several orders of magnitude.
Interested in trying? Note: I currently do not have a complete solution; I still need the final merge step for the last two files, which is what I was trying to get suggestions about in this thread. I will complete the program in some way and post the final results for those 2 files.
Dave.
Sorry, 'several orders of magnitude' was probably overstating it. I've been thinking of one particular case where a database solution outperformed a file-based solution by a factor of ~15-20. But when I think about it, part of the acceleration may have been due to the algorithms being used, so 'one order of magnitude' might be closer to the real thing.
I've also thought of another case, a 120 GB database of geographical data that turned out to be so fast that my algorithms for digging some geometrical properties out of a given (and comparatively small) result set were much more time-critical than the two-dimensional search that found the data in a particular geographical region. As a side note, that was about 15 years ago, so I doubt that database server was all that much faster than your current PC.
In both cases I was just the 'user' of the data, though; I am not a database specialist. I just wanted to point out that your opinion of databases might be premature, not that I could write down a solution in 5 minutes.
Anyway, if you want me to try, you'd have to wait for a month or two, as the single hard disk in my 5-year-old PC has barely enough room to fit one of those 6 GB files - forget about 5.