|
vSoares wrote: What's the best algorithm to do this?
well, what is your definition of best? best matches? best execution time? best ...?
there are plenlenty of algorithms out them, did you try exact string match[^] or this[^]
|
|
|
|
|
Yes, I tried but i was expecting someone who had some experience with this kind of algorithms to point me in the right direction.
I agree, "best" is very broad. I need a fast search algorithm. I can optimize the source if needed.
It's not an exact string match. It's all strings beggining with a given pattern, something like google suggestions.
vSoares
|
|
|
|
|
Tries and related data structures come to mind, as well as the bitap algorithm
Just general suggestions..
|
|
|
|
|
vSoares wrote: I need a fast search algorithm. I can optimize the source if needed.
Sort the file then use a binary search?
Regards
David R
---------------------------------------------------------------
"Every program eventually becomes rococo, and then rubble." - Alan Perlis
|
|
|
|
|
Load them into a database and use SQL.
|
|
|
|
|
I think I would give it a try with regex.
If you really want to program something nice, then maybe suffix trees are good.
Cheers
You have the thought that modern physics just relay on assumptions, that somehow depends on a smile of a cat, which isn’t there.( Albert Einstein)
|
|
|
|
|
Hi...
This is graphics related.I need to draw a map .I have the lat long values of it .I was able to map these to screen coordinates and then draw the map.The problem i need help with is how do i do the reverse.For the initial case its simple,just rearrangement of window to view port mapping formula .i.e.,
u = (x-xmin)* sx+ umin
v = (y-ymin)* sy + vmin
But once the zoom and pan operations are applied on the map, I dont know how to get the latitude longitude values from the screen coordinate point.
If some one knows how to solve this ,Please help me
Thanks in advance
|
|
|
|
|
You must use the zoom and pan factors in the equation to get map coordinates from screen coordinates: pan will be an x-y offset, while zoom will be a multiplication/division constant.
Note that you might introduce a conversion error if zoom is a floating point number, so you might consider using an integer. The same of course goes for pan factors.
Hope this can help.
2+2=5 for very large amounts of 2
(always loved that one hehe!)
|
|
|
|
|
Hi,
u = (x-xmin)* sx + umin
v = (y-ymin)* sy + vmin
are formulas to translate and scale a coordinate system (x,y) to a new system (u,v)
The inverse operation has exactly the same structure: swap every u and x, and every y and v, and you become the inverse formulas.
x = (u-umin)* su + xmin
y = (v-vmin)* sb + ymin
All you need now is the new constants
Substituting one set into the other should lead to two identity relations, hence:
su=1/sx and sv=1/sy
Luc Pattyn [Forum Guidelines] [My Articles]
The quality and detail of your question reflects on the effectiveness of the help you are likely to get.
Show formatted code inside PRE tags, and give clear symptoms when describing a problem.
|
|
|
|
|
How would I calculate the Z Scores for a list of data - for example if I run a selection on the database then tabulate the values from a particular variable (Marital Status, for example) I get the following columns:
A Description - Widowed, Married, Divorced etc.
Universe - The number of people in the database with that value
Selection - The number of people in the selection with that value
Universe % - the percentage of the total people on the database who have that value
Selection % - the percentage of the total people on the selection who have that value
I also have the number of people on the database and the number of people in the selection
Is it possible to work out a Z Score for each line in this list given the above information and how would I do it in C#?
Thanks
|
|
|
|
|
If think this[^] is what you're looking for.
2+2=5 for very large amounts of 2
(always loved that one hehe!)
|
|
|
|
|
Hello,
I have an app that searches for a directory that has the format "LastName FirstName". However, I want to account for misspellings in the name that is entered. For instance, the directory is named "Stein" but the user enters "Stien". Of course, more un-common misspellings are also possible. I'm looking into using NetSpell[^] and defining a custom dictionary based on the directory names. However, since new directories are always being added with different names, I was wondering what the best way is to automatically keep this dictionary up-to-date. I can certainly auto-update when the program is started, but I'd like a way to keep it updated without have to restart the program. Any ideas?
Thanks,
Dybs
|
|
|
|
|
Hi,
you could have it work like this:
1.
on app start, scan the directory for names; for each name, generate all reasonable variations and add them to a list (variations could be: dropping a letter, swapping two consecutive letters, ...)
2.
whatever the user enters, compare against the list of exact and almost-exact names, and warn the user if there is either no match at all or ambiguity, i.e. he entered something that could lead to different hits.
3.
install a FileSystemWatcher to detect creation of new file objects, treat them the same way as under 1.
Luc Pattyn [Forum Guidelines] [My Articles]
The quality and detail of your question reflects on the effectiveness of the help you are likely to get.
Show formatted code inside PRE tags, and give clear symptoms when describing a problem.
|
|
|
|
|
Thanks for the suggestions. I completely forgot about the FileSystemWatcher class Never used it before, but it's exactly what I need. What I was originally thinking was on app start, build a dictionary in NetSpell of all the directory names I'd be searching. If a user enters a directory that doesn't exist, it would run that name through NetSpell and search for any directories matching its suggestions (rather than generating all the variations myself - there are several hundred directories, and that could take some time). But the SoundEx algorithm Henry suggested below sounds like it might be useful as well. Thanks again!
Dybs
|
|
|
|
|
you're welcome.
Luc Pattyn [Forum Guidelines] [My Articles]
The quality and detail of your question reflects on the effectiveness of the help you are likely to get.
Show formatted code inside PRE tags, and give clear symptoms when describing a problem.
|
|
|
|
|
As part of 1. from Lucs' solution a Soundex algorithm might prove useful. There is A SoundEx implementation in .NET[^] on CP that might help.
Henry Minute
Do not read medical books! You could die of a misprint. - Mark Twain
Girl: (staring) "Why do you need an icy cucumber?"
“I want to report a fraud. The government is lying to us all.”
|
|
|
|
|
Beyond Henry's suggestion - you might also want to look at Double Metaphone[^], which provides a richer set of results than Soundex.
"WPF has many lovers. It's a veritable porn star!" - Josh Smith As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.
My blog | My articles | MoXAML PowerToys | Onyx
|
|
|
|
|
Hi can someone explain to me what a regular language is? And what is pumping lema?
|
|
|
|
|
Google knows all that kind of stuff.
Luc Pattyn [Forum Guidelines] [My Articles]
The quality and detail of your question reflects on the effectiveness of the help you are likely to get.
Show formatted code inside PRE tags, and give clear symptoms when describing a problem.
|
|
|
|
|
A regular language is the set of strings that can be generated by a regular expression, consisting of strings and the Kleene star (*), (which means zero or more occurrences of substrings). It's also the set of strings that can be generated by a finite-state machine, or recognized by a finite-state acceptor.
It's the simplest of four classes of languages. An example of a non-regular language is the set of strings with equal numbers of 0's and 1's. This can't be recognized by a finite-state acceptor (with n states) since a string with more than n letters would overflow the acceptor's finite capacity.
|
|
|
|
|
Hello,
Pulling my hair out trying to solve this problem. I have a list of data describing maximum allowable speeds along a section of track. I need to sort the data to form a speed 'profile' for that section of track. I'll describe by example:
The generated list may look like this upon retrieval from the database:
<snip>
LOMILE, HIMILE, SPEED
0, 30, 25
2, 10, 15
4, 6, 10
10, 20, 20
<end snip="">
What this describes is a section of track 30 miles long with a normal operating speed of 30 MPH (row 1). However, the operating speed is reduced to 15 MPH from mile 2 to mile 10 (row 2), that section is further reduced to 10 MPH from mile 4 to mile 6 (row 3). The operating speed is reduced to 20 MPH from mile 10 to mile 20 (row 4).
What I've been trying to do is come up with an algorithm that would sort that list into something that looks like this:
<snip>
LOMILE, HIMILE, SPEED
0, 2, 25
2, 4, 15
4, 6, 10
6, 10, 15
10, 20, 20
20, 30, 25
<end snip="">
I'm completely stumped trying to get to the solution. Please help!
It doesn't matter what code you implement in, pseudo code would be ideal, but, however you want to present the algorithm is fine with me.
If you are curious, I'm developing this in Access with VBA, however, in the future I'll probably move it to MySQL with PHP. Therefore, whatever code you want to use to illustrate the algorithm is fine.
Thanks in advance!
Lyle.
|
|
|
|
|
If I understand you correctly, the data means something like this:
|--------------------------| 25
|------------| 15
|----| 10
|-------| 20
And the lowest value on a vertical line would be the maximum allowed speed at that point.
Then you want to somehow split the ranges and drop the parts that are not the minimum.
Cut:
|--|--|----|------|-------|---| 25
|--|----|------| 15
|----| 10
|-------| 20
Drop:
|--| |---| 25
|--| |------| 15
|----| 10
|-------| 20
There is of course an efficient way to do this, but I'm having trouble defining it exactly.
What I can define exactly, is a far less elegant way..
int[] speed = new int[first.high - first.low];
for each section s
{
for (int i = s.low; i < s.high; i++)
speed[i] = min(speed[i], s.speed);
}
You can extract the result from this array in O(size-of-longest-section) time.
Being O(|sections| * size-of-longest-section), this algorithm sucks. But hey, it works.
I'm going to think about this some more, I hope I can accurately define splitting the sections.. Removing the not-minimum cuts after that, is nearly trivial.
|
|
|
|
|
Hi,
here is a KISS approach that might be suboptimal but it is simple:
0. get all the different relevant points (LOMILE and HIMILE) in a list
1. sort the list (probably combine 0 and 1 in "add-and-keep-sorted")
2. start at the first listed position
3. check all the rules, apply the lowest speed, and drive to the next listed point
4. iterate 3 until done.
The advantage is we don't need to find overlaps, intersections, ... in the set of rules. A precondition is the entire trajectory needs to be described in the rules, i.e. there should not be part of the route without speed limitation at all.
PS: sorry Harold for not using your beautiful diagrams...
modified on Thursday, May 7, 2009 7:05 PM
|
|
|
|
|
Hi,
a more elaborate approach:
1. assume a "run" structure holding:
- a pointer to the next run (so we can make chains, i.e. linked lists);
- a LO coordinate;
- a HI coordinate;
- a speed limit.
2. the solution will be built as a linked list of such runs.
3. note that all the rules could be stored as a linked list of runs too.
4. start with one run, copied from the first rule, the one that describes the entire trajectory and the overall speed limit
5. take the next rule, and locate its LO in the current chain of runs: if its LO falls halfway inside a run, replace that run by two runs with the original speed (this probably means adapt the existing run, allocate a new run, give it correct values, and insert it in the linked list)
6. now locate the rule's HI in the chain: if its HI falls halfway some run, split it in two runs.
7. now the current rule exactly covers one or more runs in the current chain, for each of those runs if the rule's speed limit is less than the run's speed, reduce the run's speed.
8. repeat 5-6-7 for all rules
9. now the chain of runs describes the trajectory with the correct speed limits.
remark: if the rules are sorted by LOMILE step 6 can take advantage of that.
|
|
|
|
|
Not certain how efficient this would be, but a tree structure may work for this as well. Bear with me, this might seem more complicated than it really is, trees aren't the easiest to explain in plain text Assume your initial list is sorted the start of each speed segment as you have described, with the first entry covering the overall speed of the track:
TheConfusedGuy wrote: LOMILE, HIMILE, SPEED
0, 30, 25
2, 10, 15
4, 6, 10
*7,9,5
10, 20, 20
*Added for example clarification
I'm thinking the tree structure would be like so:
0,30,25
/ \
2,10,15 10,20,20
/ \
4,6,10 7,9,5
Insert procedure as follows:
1. Root node = (0,30,25)
2. New = (2,10,15), compare = (0,30,25)
3. 2 b/w 0 and 30, compare = child of compare (null/no children)
4. compare = null, add New as child of compare.
5. New = (4,6,10), compare = (0,30,25)
6. 4 b/w 0 and 30, compare = child of compare (2,10,15)
7. 4 b/w 2 and 10, compare = child of compare (null, no children)
8. compare = null, add New as child of compare.
9. New = (7,9,5), compare = (0,30,25)
10. 7 b/w 0 and 30, compare = child of compare (2,10,15)
11. 7 b/w 2 and 10, compare = child of compare (4, 6, 10)
12. 7 past 6, compare = sibling of compare (null, no more siblings)
13. compare = null, add New as sibling of compare
... and so on.
I know the insert procedure seems complicated (trees tend to be that way), but I think it would be relatively efficient. To find the speed limit for a given track position, simply compare the position to the LOWMILE value of each node when you traverse your tree. Your lookup wouldn't be O(1), but no worse O(log m(n)) (I think), where m is the max number of subsections to a given stretch, and n is the total number of defined subsections. So in the example here, it would be O(log 2(3)) Your 2 worst cases would be:
|----|----|----|
|----|
|----|
|----|
or
|----|----|----|
|----|----|
|----|
I hope this all makes sense. This is all totally off the top of me head, so I could be mistaken on how efficient this might be. I'm curious what ya'll's opinion is on this method
Dybs
modified on Friday, May 8, 2009 12:14 AM
|
|
|
|
|