|
No, but I'm deciding between your first method and PIEBALDconsult's use of IndexOf above. Your second method is more dynamic, which I may or may not need. Either way, it helped me relearn what I'd forgotten.
Thanks!
|
|
|
|
|
IndexOf is fine for short strings; it is also fine for long strings if you know the character is present. It is a waste if you need several of them and most of those return nothing.
BTW: there also is a LastIndexOf which actually scans from right to left. So looking for 2...4 could be achieved like this:
int i2=s.IndexOf('2');
if (i2>=0 && i2<s.LastIndexOf('4')) return true;
|
|
|
|
|
Thanks, I saw that. Actually, the string I'm searching for will only appear roughly one third of the time. Why would it be a problem if the character is not present? It should basically scan the entire string just like your method does. However, I'm inclined to believe that Microsoft must have optimized the IndexOf and LastIndexOf to work just as well as your first search method. It's your second method that I'm still deciding on.
|
|
|
|
|
An IndexOf-based approach that makes several calls to such methods has the disadvantage of scanning the string several times; a state-machine-based approach typically scans just once.
I'm not sure what you mean by my first and second approach; the one with cSearch is a bit neater than the nested while loops, as it takes less code and scales better when more than a few items need to be present in a specific order. As I said, the speed difference would be minimal, the only overhead it adds is advancing one position in "search" each time the state-machine's state changes (as opposed to entering another nested while loop). When in doubt, just give it a try.
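For anyone following along, here is a minimal sketch of the single-pass idea, assuming the goal is to check that the characters of a search string occur in s in that order, with anything allowed in between. The class and method names are made up for illustration, and this is not necessarily identical to the cSearch code mentioned here.

```csharp
using System;

public static class OrderedSearch
{
    // Returns true if every character of 'search' occurs in 's' in the given
    // order, with any characters allowed in between. Scans 's' at most once.
    public static bool ContainsInOrder(string s, string search)
    {
        if (search.Length == 0) return true;    // nothing to look for
        int iSearch = 0;                         // state: index of the next character wanted
        foreach (char c in s)
        {
            if (c == search[iSearch])
            {
                iSearch++;                       // advance the state machine
                if (iSearch == search.Length) return true;
            }
        }
        return false;
    }
}
```

Because the state advances by exactly one position per matched character, repeated characters in the search string are handled too: OrderedSearch.ContainsInOrder("6226", "622") is true, while OrderedSearch.ContainsInOrder("6426", "622") is false.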
|
|
|
|
|
Luc Pattyn wrote: an IndexOf-based approach, when using several calls to such methods, has the disadvantage of scanning the string several times;
PIEBALDconsult's example shows the IndexOf(Char, Int32) or LastIndexOf(Char, Int32) searching from the last known position. That's identical to what you're doing, but I would think Microsoft has it optimized.
Luc Pattyn wrote: I'm not sure what you mean by my first and second approach;
Your cSearch (while) was the first one. That one is somewhat self-explanatory, but the second one allows me to create more dynamic searches in case the values that I'm searching for change. It also means I am not fixed at N amount of while loops.
Luc Pattyn wrote: When in doubt, just give it a try.
Yeah, I intend to. I just don't have Visual Studio here, and I have no admin rights to install the Express version. We need a Portable Visual Studio, Microsoft.
|
|
|
|
|
Bassam Abdul-Baki wrote: but I would think Microsoft has it optimized
I went to look, but found out that it's implemented as a call to "somewhere" (0xFFFFFFFFFF47A100, any clues?)
So I still know nothing. It could be optimized... but knowing MS, it is likely to be a highly generic "good for all CPUs" routine and therefore rather slow. That's what they usually do.
|
|
|
|
|
Luc Pattyn wrote: IndexOf is fine for short strings; it is also fine for long strings if you know the character is present. It is a waste if you need several of them and most of those return nothing.
Huh? You do know there are overloads for IndexOf that take the starting position to scan from, right? Scan for first character, then scan for the second character starting from the position that follows the first, then scan for the third character starting from the position that follows the second character, and so on. I figure that would be very clear code that would be very performant as well.
int index = str.IndexOf(char3, str.IndexOf(char2, str.IndexOf(char1) + 1) + 1);
|
|
|
|
|
I like that. Short and succinct. However, Luc's second method is still more dynamic. I guess we could always put this one in a loop as well.
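A rough sketch of what the loop version could look like, assuming the characters to find are passed as a string. The names are hypothetical. One thing the loop makes easy is stopping as soon as a character is missing; in the nested one-liner form, an inner -1 becomes 0 after the + 1, so the next scan restarts from the beginning of the string.

```csharp
using System;

public static class ChainedIndexOf
{
    // Returns the index in 's' of the last character of 'search' when all
    // characters of 'search' are found in order, or -1 otherwise.
    // An empty 'search' also yields -1 in this sketch.
    public static int IndexOfInOrder(string s, string search)
    {
        int pos = -1;                        // index of the previous match
        foreach (char c in search)
        {
            pos = s.IndexOf(c, pos + 1);     // resume right after the previous match
            if (pos < 0) return -1;          // stop here: -1 + 1 would restart at 0
        }
        return pos;
    }
}
```

With this, ChainedIndexOf.IndexOfInOrder("6424", "624") gives 3, and ChainedIndexOf.IndexOfInOrder("24", "624") gives -1 because the 6 is never found.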
|
|
|
|
|
aspdotnetdev wrote: You do know there are overloads for IndexOf ...
yes I do.
However some people feel tempted by this style:
int pos1=s.IndexOf(char1);
int pos2=s.IndexOf(char2);
int pos3=s.IndexOf(char3);
.. and then a lot of decisions based on the value of pos1/2/3.
And that could be very bad on long strings.
|
|
|
|
|
Luc Pattyn wrote: And that could be very bad on long strings.
And it may also fail sometimes, such as if you are looking for a 2 then a 4 in the string "6424". That string should pass, but would fail given your sample code. IndexOf (using the start index) and a for loop is probably the best way to go, as Piebald suggested.
|
|
|
|
|
aspdotnetdev wrote: IndexOf (using the start index) and a for loop is probably the best way to go, as Piebald suggested.
Actually, I like the regular expression approach more.
|
|
|
|
|
The specs started and ended as: matching 6[0-9]*2[0-9]*4
Somewhere along the thread I got confused; my BTW was wrong, 6424 has to be accepted.
So, yes PIEBALD's approach is a correct one and possibly the best.
It might be improved by looking for the last subitem using LastIndexOf (taking great care about the parameters that beast requires!).
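For comparison, the regular expression route with that same pattern might look like the sketch below. The class and method names are made up, and leaving the pattern unanchored (so the 6 may appear anywhere in the string) is an assumption about the specs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // Pattern quoted in the thread: a 6, then a 2, then a 4, with any digits in between.
    // Unanchored, so the match may begin anywhere in the string (an assumption).
    private static readonly Regex Pattern = new Regex("6[0-9]*2[0-9]*4");

    public static bool Matches(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

PatternCheck.Matches("6424") is true, while PatternCheck.Matches("6422") is false because no 4 follows the 2.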
|
|
|
|
|
Luc Pattyn wrote: It might be improved by looking for the last subitem using LastIndexOf
Not if the string is "62420000000000000000000000000000000000000000000000000".
|
|
|
|
|
It gets to be that on occasion.
|
|
|
|
|
Hence "might", it depends on the probability distribution of the patterns.
|
|
|
|
|
|
Fortunately I said "possibly", the page couldn't stand your enthusiasm if I hadn't.
|
|
|
|
|
Luc Pattyn wrote: cSearch=search[iSearch++];
cSearch=search[++iSearch];
Minor, but powerful, bug. Only an issue when searching for consecutive numbers that are the same. 624 in 6246 works, but 622 in 6226 fails.
|
|
|
|
|
Either
foreach(string s in stringCollection) {
int i2=s.IndexOf('2');
int i4=s.IndexOf('4');
if (i2>=0 && i2<i4) output("match: "+s);
}
or
foreach(string s in stringCollection) {
int i24=s.IndexOfAny(new char[]{'2','4'});
if (i24>=0 && s[i24]=='2') output("match: "+s);
}
Note: this is assuming there is at most one number in each string.
|
|
|
|
|
Nope. Number strings are much, much longer than the example I showed you. Also, I shrunk my search criteria down to two numbers. It's actually three with the fourth being the first number 6.
|
|
|
|
|
bool seen2=false;
foreach(char c in s) {
if (c=='2') seen2=true;
else if (c=='4') return seen2;
}
return false;
Correction (other interpretation of the specs):
bool seen2=false;
foreach(char c in s) {
if (c=='2') seen2=true;
else if (c=='4' && seen2) return true;
}
return false;
|
|
|
|
|
That's basically it except I need one more else if for my third character (not the 6).
|
|
|
|
|
So it was in the right forum after all, it was just a misleading subject line!
|
|
|
|
|
I guess so. It's all good.
|
|
|
|
|
It might have made for a good Friday Programming Quiz.
|
|
|
|