Re: Regular Expressions - C# Discussion Boards

Re: Regular Expressions

Luc Pattyn

10-Sep-10 6:40

aspdotnetdev wrote:
You do know there are overloads for IndexOf ...

Yes, I do.

However some people feel tempted by this style:

int pos1=s.IndexOf(char1);
int pos2=s.IndexOf(char2);
int pos3=s.IndexOf(char3);
... and then a lot of decisions based on the values of pos1, pos2 and pos3.

And that could be very bad on long strings.

:)

Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles] Nil Volentibus Arduum

Please use <PRE> tags for code snippets, they preserve indentation, and improve readability.

Re: Regular Expressions

AspDotNetDev

10-Sep-10 7:19

Luc Pattyn wrote:
And that could be very bad on long strings.

And it may also fail sometimes, for example if you are looking for a 2 followed by a 4 in the string "6424". That string should pass, but it fails with your sample code because the first 4 comes before the first 2. IndexOf (using the start index) and a for loop is probably the best way to go, as Piebald suggested.
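To make the start-index idea concrete, here is a minimal sketch. The class name `OrderedSearch`, the method name and the `params char[]` parameter are assumptions for illustration, not code posted in this thread.

```csharp
using System;

public static class OrderedSearch
{
    // Returns true if every character of `wanted` occurs in `s` in the given
    // order (not necessarily adjacent). Each search resumes just past the
    // previous hit, so the string is scanned at most once overall.
    public static bool ContainsInOrder(string s, params char[] wanted)
    {
        int start = 0;
        for (int i = 0; i < wanted.Length; i++)
        {
            int pos = s.IndexOf(wanted[i], start);
            if (pos < 0) return false;   // not found after the previous hit
            start = pos + 1;             // continue after the character just found
        }
        return true;
    }
}
```

With "6424" and the characters '2' and '4', the '2' is found at index 2 and the search for '4' then starts at index 3, so the string passes. Independent IndexOf calls would report the '4' at index 1, before the '2'.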

[Forum Guidelines]

Re: Regular Expressions

AspDotNetDev

10-Sep-10 7:21

aspdotnetdev wrote:
IndexOf (using the start index) and a for loop is probably the best way to go, as Piebald suggested.

Actually, I like the regular expression approach more. :)

Re: Regular Expressions [modified]

Luc Pattyn

10-Sep-10 7:31

The specs started and ended as: matching 6[0-9]*2[0-9]*4
Somewhere along the thread I got confused; my BTW was wrong, 6424 has to be accepted.
So, yes PIEBALD's approach is a correct one and possibly the best.
It might be improved by looking for the last subitem using LastIndexOf (taking great care about the parameters that beast requires!).
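For comparison, a rough sketch of both ideas side by side: the regular expression exactly as quoted, and a variant that looks for the final '4' from the end. The class and method names are made up here, and the second method assumes the string contains only digits (otherwise it is looser than the pattern). It uses the plain LastIndexOf(char) overload; the overloads that take a start index search backward from that index, which is where the care is needed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SixTwoFour
{
    // The pattern quoted above; Regex.IsMatch looks for it anywhere in the input.
    private static readonly Regex Pattern = new Regex("6[0-9]*2[0-9]*4");

    public static bool MatchesRegex(string s)
    {
        return Pattern.IsMatch(s);
    }

    // Finds '6' and then '2' scanning forward, but locates the final '4'
    // from the end of the string. Assumes s holds digits only.
    public static bool MatchesWithLastIndexOf(string s)
    {
        int i6 = s.IndexOf('6');
        if (i6 < 0) return false;
        int i2 = s.IndexOf('2', i6 + 1);
        if (i2 < 0) return false;
        return s.LastIndexOf('4') > i2;   // the last '4' must come after the '2'
    }
}
```

Whether the backward search pays off depends on where the last item tends to sit: it helps when the '4' is near the end of a long string and hurts when it is near the front.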

:)

modified on Friday, September 10, 2010 1:37 PM

Re: Regular Expressions

AspDotNetDev

10-Sep-10 8:14

Luc Pattyn wrote:
It might be improved by looking for the last subitem using LastIndexOf

Not if the string is "62420000000000000000000000000000000000000000000000000". :)

Re: Regular Expressions

Bassam Abdul-Baki

10-Sep-10 11:58

It gets to be that on occasion. :)

Re: Regular Expressions

Luc Pattyn

10-Sep-10 17:02

Hence "might", it depends on the probability distribution of the patterns.

:)

Re: Regular Expressions

PIEBALDconsult

11-Sep-10 15:59

Luc Pattyn wrote:
So, yes PIEBALD's approach is a correct one and possibly the best

Badger | [badger,badger,badger,badger...]

Re: Regular Expressions

Luc Pattyn

11-Sep-10 16:11

Fortunately I said "possibly", the page couldn't stand your enthusiasm if I hadn't.

:)

Re: Regular Expressions

Bassam Abdul-Baki

10-Sep-10 16:48

Luc Pattyn wrote:
cSearch=search[iSearch++];

cSearch=search[++iSearch];

Minor, but powerful, bug. Only an issue when searching for consecutive numbers that are the same. 624 in 6246 works, but 622 in 6226 fails.

Re: Regular Expressions

Luc Pattyn

10-Sep-10 4:20

Either

foreach(string s in stringCollection) {
    int i2=s.IndexOf('2');
    int i4=s.IndexOf('4');
    if (i2>=0 && i2<i4) output("match: "+s);
}

or

foreach(string s in stringCollection) {
    int i24=s.IndexOfAny(new char[]{'2','4'});   // IndexOfAny takes a char[]
    if (i24>=0 && s[i24]=='2') output("match: "+s);
}

Note: this assumes there is at most one number in each string.

:)

Re: Regular Expressions

Bassam Abdul-Baki

10-Sep-10 4:30

Nope. The number strings are much, much longer than the example I showed you. Also, I shrank my search criteria down to two numbers for the example. It's actually three, with a fourth being the first number, 6. :)

Re: Regular Expressions

Luc Pattyn

10-Sep-10 4:41

bool seen2=false;
foreach(char c in s) {
    if (c=='2') seen2=true;
    else if (c=='4') return seen2;
}
return false;

Correction (other interpretation of the specs):

bool seen2=false;
foreach(char c in s) {
    if (c=='2') seen2=true;
    else if (c=='4' && seen2) return true;
}
return false;

Re: Regular Expressions

Bassam Abdul-Baki

10-Sep-10 4:50

That's basically it except I need one more else if for my third character (not the 6).
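Rather than adding one else-if per character, the same single pass can track an index into the list of wanted characters. A minimal sketch, with hypothetical names (`SinglePass`, `wanted`) that do not come from the thread:

```csharp
using System;

public static class SinglePass
{
    // Walks `s` once, advancing through `wanted` each time the next
    // expected character is seen. Returns true once all are matched.
    public static bool ContainsInOrder(string s, string wanted)
    {
        if (wanted.Length == 0) return true;
        int next = 0;   // index of the next character still to be found
        foreach (char c in s)
        {
            if (c == wanted[next])
            {
                next++;
                if (next == wanted.Length) return true;
            }
        }
        return false;
    }
}
```

Because the index advances exactly once per matched character, repeated characters in the search list (such as "622") need no special handling.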

Re: Regular Expressions

Luc Pattyn

10-Sep-10 4:54

So it was in the right forum after all; it was just a misleading subject line! :laugh:

Re: Regular Expressions

Bassam Abdul-Baki

10-Sep-10 5:02

I guess so. It's all good. :)

Re: Regular Expressions

PIEBALDconsult

10-Sep-10 14:52

It might have made for a good Friday Programming Quiz. :cool:

Re: Regular Expressions [modified]

PIEBALDconsult

11-Sep-10 4:30

I would also like to point out that the repeated IndexOf ( c , i ) technique might be simplest when implementing a general method that searches for a variable number of values as with the following signature:

bool ContainsInOrder ( string StringToSearch , char[] DigitsToFindInOrder )

Edit: After additional thought, a more general method might be...

bool ContainsInOrder<T> ( IEnumerable<T> ValuesToSearch , IEnumerable<T> ValuesToFindInOrder )

And add a where T : IComparable
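One way the generic signature might be filled in, as a sketch only: the class name `SequenceSearch` and the body are assumptions, and the code assumes neither sequence contains null elements. It walks both sequences once and compares with IComparable.CompareTo, following the constraint suggested above.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceSearch
{
    // Returns true if the items of `valuesToFindInOrder` occur in
    // `valuesToSearch` in the same relative order (gaps allowed).
    public static bool ContainsInOrder<T>(
        IEnumerable<T> valuesToSearch,
        IEnumerable<T> valuesToFindInOrder) where T : IComparable
    {
        using (IEnumerator<T> wanted = valuesToFindInOrder.GetEnumerator())
        {
            if (!wanted.MoveNext()) return true;   // nothing to find
            foreach (T item in valuesToSearch)
            {
                if (item.CompareTo(wanted.Current) == 0)
                {
                    if (!wanted.MoveNext()) return true;   // all items found
                }
            }
            return false;   // ran out of input before finding everything
        }
    }
}
```

Since string implements IEnumerable<char>, a call such as `SequenceSearch.ContainsInOrder<char>("6424", "624")` works without any conversion, and the same method handles int arrays or lists.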

modified on Saturday, September 11, 2010 10:49 PM

is C# in .Net 3.5 slower than C# in .Net 1.1

RugbyLeague

10-Sep-10 2:20

I have just recompiled a class library, originally written for .NET 1.1, under .NET 3.5. It uses a lot of unsafe code for pointers and a lot of bit shifting (but nothing greatly complicated). The code in 3.5 is about 20% slower than the code in 1.1.

Re: is C# in .Net 3.5 slower than C# in .Net 1.1

Luc Pattyn

10-Sep-10 3:03

I have never experienced anything like that, nor read about it.
Are you sure your two copies are comparable, e.g. both built for release?
If the difference is there, I'm curious to see some of the relevant source.

:)

Re: is C# in .Net 3.5 slower than C# in .Net 1.1

RugbyLeague

10-Sep-10 3:11

The .NET 1.1 version is a class library that runs in our live system; it's a database analysis suite using an inverted index database. In the live system, a query on 20 million rows returns a count of 4 million in 0.23 seconds. Recompiling the class library and just calling it from a console app runs the same query on the same database in 0.38 seconds.

Re: is C# in .Net 3.5 slower than C# in .Net 1.1

Luc Pattyn

10-Sep-10 3:17

What you could try is:
- locate the/a hotspot, the/a bit of code that is dominating the execution time;
- compare the 1.1 and 3.5 versions of the IL code for that hotspot.

If the code is different, there may be ways to massage your source into better behavior; if the IL code is basically identical, there may be nothing you can do about it. Chances are the difference is in some middleware, e.g. ADO.NET.
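To locate a hotspot before comparing IL, a crude timing helper may be enough. The sketch below is an assumption-laden illustration (the names `Work` and `Timing.AverageMilliseconds` are invented here); it sticks to a named delegate and DateTime.UtcNow so that the same source can be built against both framework versions, at the cost of coarse timer resolution.

```csharp
using System;

// A parameterless work item; a named delegate avoids any dependency
// on types introduced after .NET 1.1.
public delegate void Work();

public static class Timing
{
    // Runs the work item `iterations` times and returns the average
    // wall-clock milliseconds per run.
    public static double AverageMilliseconds(Work work, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");
        work();   // warm-up run so JIT compilation is not measured
        DateTime start = DateTime.UtcNow;
        for (int i = 0; i < iterations; i++) work();
        TimeSpan elapsed = DateTime.UtcNow - start;
        return elapsed.TotalMilliseconds / iterations;
    }
}
```

Wrapping each suspect stage (file reads, bit shifting, pointer loops) in its own Work delegate and running it many times should show which stage accounts for the gap between the two builds.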

:)

Re: is C# in .Net 3.5 slower than C# in .Net 1.1

RugbyLeague

10-Sep-10 3:21

There is no middleware - the database system is all mine and is hit using a BinaryReader.

I will check out the IL

Re: is C# in .Net 3.5 slower than C# in .Net 1.1

RugbyLeague

10-Sep-10 5:48

The debug version of the .Net 1.1 assembly is as fast as the release version of the .Net 3.5 assembly. I don't have VS2003 installed anymore so I can't profile the 1.1 version. Oh well - the speed of this has been on a downward spiral for years - assembler -> C++ -> .Net 1.1 -> .Net 3.5

Re: is C# in .Net 3.5 slower than C# in .Net 1.1

Luc Pattyn

10-Sep-10 5:53

One more thought: to reduce the search area, you could slim your code down to the file I/O, i.e. skip all data processing, and see what gives. (That would require a live VS2003 of course).

:)