|
aspdotnetdev wrote: You do know there are overloads for IndexOf ...
Yes, I do.
However, some people feel tempted by this style:
int pos1=s.IndexOf(char1);
int pos2=s.IndexOf(char2);
int pos3=s.IndexOf(char3);
... and then a lot of decisions based on the values of pos1, pos2 and pos3.
And that could be very bad on long strings, since each IndexOf call may scan the whole string.
|
|
|
|
|
Luc Pattyn wrote: And that could be very bad on long strings.
And it may also fail sometimes, such as if you are looking for a 2 then a 4 in the string "6424". That string should pass, but would fail given your sample code. IndexOf (using the start index) and a for loop is probably the best way to go, as Piebald suggested.
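For illustration, here is a minimal sketch of the "IndexOf with a start index, in a loop" idea mentioned above. The class and method names are made up for this example and are not taken from Piebald's post; the point is only that each search resumes just past the previous hit, so "6424" is accepted when looking for 6, 2, 4 in order.

```csharp
using System;

static class OrderedSearch
{
    // Returns true if the characters of `wanted` occur in `s` in that order,
    // not necessarily adjacent. Hypothetical helper, written for this example.
    public static bool ContainsInOrder(string s, string wanted)
    {
        int start = 0;
        foreach (char c in wanted)
        {
            // Search only the part of the string after the previous match.
            int pos = s.IndexOf(c, start);
            if (pos < 0) return false;
            start = pos + 1;
        }
        return true;
    }
}
```

With this sketch, OrderedSearch.ContainsInOrder("6424", "624") should be true, while "6442" should be rejected because no 4 follows the 2.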
|
|
|
|
|
aspdotnetdev wrote: IndexOf (using the start index) and a for loop is probably the best way to go, as Piebald suggested.
Actually, I like the regular expression approach more.
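A rough sketch of what the regular expression approach could look like, assuming the pattern the thread settles on (6[0-9]*2[0-9]*4) and assuming the match may start anywhere in the string. The class name and the choice of RegexOptions.Compiled are my own, not something from the earlier posts.

```csharp
using System.Text.RegularExpressions;

static class RegexSearch
{
    // A 6, later a 2, later a 4, with any digits in between.
    // Built once and reused, since the same pattern is applied to many strings.
    static readonly Regex Pattern =
        new Regex("6[0-9]*2[0-9]*4", RegexOptions.Compiled);

    public static bool Matches(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

Whether this beats the IndexOf loop on very long strings would have to be measured; it is mainly shorter to write and easier to change.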
|
|
|
|
|
The specs started and ended as: match 6[0-9]*2[0-9]*4.
Somewhere along the thread I got confused; my BTW was wrong, and 6424 has to be accepted.
So, yes PIEBALD's approach is a correct one and possibly the best.
It might be improved by looking for the last subitem using LastIndexOf (taking great care about the parameters that beast requires!).
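One possible reading of that idea, as a sketch only: find the first 6 and the first 2 after it from the front, then ask whether the last 4 in the string lies beyond that 2. The digits 6, 2 and 4 are hard-coded from the spec above, and the class and method names are hypothetical. Using the single-argument LastIndexOf overload avoids the startIndex/count parameters altogether.

```csharp
static class LastItemSearch
{
    // True if s contains a 6, then a 2, then a 4, in that order.
    public static bool Contains624(string s)
    {
        int p6 = s.IndexOf('6');
        if (p6 < 0) return false;

        int p2 = s.IndexOf('2', p6 + 1);
        if (p2 < 0) return false;

        // LastIndexOf(char) scans backward from the end of the string;
        // any 4 after the 2 means the last 4 is also after the 2.
        return s.LastIndexOf('4') > p2;
    }
}
```

Whether the backward scan helps depends entirely on where the last 4 tends to sit in the data.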
|
|
|
|
|
Luc Pattyn wrote: It might be improved by looking for the last subitem using LastIndexOf
Not if the string is "62420000000000000000000000000000000000000000000000000".
|
|
|
|
|
It gets to be that on occasion.
|
|
|
|
|
Hence "might", it depends on the probability distribution of the patterns.
|
|
|
|
|
|
Fortunately I said "possibly", the page couldn't stand your enthusiasm if I hadn't.
|
|
|
|
|
Luc Pattyn wrote: cSearch=search[iSearch++];
cSearch=search[++iSearch];
Minor, but powerful, bug. Only an issue when searching for consecutive numbers that are the same. 624 in 6246 works, but 622 in 6226 fails.
|
|
|
|
|
Either
foreach(string s in stringCollection) {
    int i2=s.IndexOf('2');
    int i4=s.IndexOf('4');
    if (i2>=0 && i2<i4) output("match: "+s);
}
or
foreach(string s in stringCollection) {
    // IndexOfAny takes a char[], not a list of char arguments
    int i24=s.IndexOfAny(new char[] {'2','4'});
    if (i24>=0 && s[i24]=='2') output("match: "+s);
}
Note: this is assuming there is at most one number in each string.
|
|
|
|
|
Nope. Number strings are much, much longer than the example I showed you. Also, I shrunk my search criteria down to two numbers. It's actually three with the fourth being the first number 6.
|
|
|
|
|
bool seen2=false;
foreach(char c in s) {
    if (c=='2') seen2=true;
    else if (c=='4') return seen2;
}
return false;
correction (other interpretation of specs):
bool seen2=false;
foreach(char c in s) {
    if (c=='2') seen2=true;
    else if (c=='4' && seen2) return true;
}
return false;
|
|
|
|
|
That's basically it except I need one more else if for my third character (not the 6).
|
|
|
|
|
So it was in the right forum after all, it was just a misleading subject line!
|
|
|
|
|
I guess so. It's all good.
|
|
|
|
|
It might have made for a good Friday Programming Quiz.
|
|
|
|
|
I would also like to point out that the repeated IndexOf ( c , i ) technique might be simplest when implementing a general method that searches for a variable number of values as with the following signature:
bool ContainsInOrder ( string StringToSearch , char[] DigitsToFindInOrder )
Edit: After additional thought, a more general method might be...
bool ContainsInOrder<T> ( IEnumerable<T> ValuesToSearch , IEnumerable<T> ValuesToFindInOrder )
And add a where T : IComparable
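A sketch of how the generic signature could be filled in, with one deliberate substitution: instead of the IComparable constraint suggested above, it compares with EqualityComparer<T>.Default, since only equality is needed. The parameter names are lower-cased and the class name is invented for this example.

```csharp
using System.Collections.Generic;

static class SequenceSearch
{
    // True if every value of valuesToFindInOrder appears in valuesToSearch,
    // in the same order, with any number of other values in between.
    public static bool ContainsInOrder<T>(
        IEnumerable<T> valuesToSearch,
        IEnumerable<T> valuesToFindInOrder)
    {
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> wanted = valuesToFindInOrder.GetEnumerator())
        {
            // Nothing to find: trivially contained.
            if (!wanted.MoveNext()) return true;

            foreach (T item in valuesToSearch)
            {
                // On a match, advance to the next wanted value;
                // running out of wanted values means success.
                if (comparer.Equals(item, wanted.Current) && !wanted.MoveNext())
                    return true;
            }
            return false;
        }
    }
}
```

Because string implements IEnumerable<char>, a call such as SequenceSearch.ContainsInOrder<char>("6226", "622") works without any conversion, and both sequences are walked only once.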
modified on Saturday, September 11, 2010 10:49 PM
|
|
|
|
|
I have just recompiled a class library originally written in .NET 1.1 into .NET 3.5. It uses a lot of unsafe code for pointers and also lots of bit shifting (but nothing greatly complicated). The code in 3.5 is about 20% slower than the code in 1.1.
|
|
|
|
|
I have never experienced anything like that, nor read about it.
Are you sure your two copies are comparable, e.g. both built for release?
If the difference is there, I'm curious to see some of the relevant source.
|
|
|
|
|
The .NET 1.1 version is a class library and runs in our live system. It's a database analysis suite using an inverted index database. In the live system a query on 20 million rows returns a count of 4 million in 0.23 seconds. Recompiling the class library and just calling it from a console app runs the same query on the same database in 0.38 seconds.
|
|
|
|
|
What you could try is:
- locate the/a hotspot, the/a bit of code that is dominating the execution time;
- compare the 1.1 and 3.5 versions of the IL code for that hotspot.
If the code is different, there may be ways to massage your source into better behavior; if the IL code is basically identical, there may be nothing you can do about it. Chances are the difference is in some middleware, e.g. ADO.NET.
|
|
|
|
|
There is no middleware - the database system is all mine and is hit using a BinaryReader.
I will check out the IL
|
|
|
|
|
The debug version of the .Net 1.1 assembly is as fast as the release version of the .Net 3.5 assembly. I don't have VS2003 installed anymore so I can't profile the 1.1 version. Oh well - the speed of this has been on a downward spiral for years - assembler -> C++ -> .Net 1.1 -> .Net 3.5
|
|
|
|
|
One more thought: to reduce the search area, you could slim your code down to the file I/O, i.e. skip all data processing, and see what gives. (That would require a live VS2003 of course).
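To make that concrete, a minimal sketch of an I/O-only timing run: it just pulls the file through a BinaryReader in chunks and does no processing at all. The class name, method name and 64 KB buffer size are assumptions for the example, not anything from the library under discussion. Stopwatch only exists from .NET 2.0 on, so a 1.1 build would need another timer, such as Environment.TickCount.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class IoTiming
{
    // Reads the whole file through a BinaryReader in fixed-size chunks.
    // Returns the number of bytes read; the elapsed time comes back in `elapsed`.
    public static long TimeRawRead(string path, out TimeSpan elapsed)
    {
        byte[] buffer = new byte[64 * 1024]; // arbitrary chunk size
        long total = 0;

        Stopwatch watch = Stopwatch.StartNew();
        using (BinaryReader reader = new BinaryReader(File.OpenRead(path)))
        {
            int n;
            while ((n = reader.Read(buffer, 0, buffer.Length)) > 0)
                total += n;
        }
        watch.Stop();

        elapsed = watch.Elapsed;
        return total;
    }
}
```

Running this against the same database file under both runtimes, several times so that disk caching evens out, would show whether the raw reads account for the gap or whether it sits in the processing code.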
|
|
|
|