|
My main thing is that I don't know what the Compiled Option actually does. I have an idea what it might or should do, but then it doesn't seem to make much sense not to include it every time.
The "Compiled" option can't instruct the compiler to generate IL for the expression at compile time; it has to happen at runtime.
I can definitely understand that "compiling" can help when a RegEx is used many times, but does it slow down a RegEx which is used only once or a few times? Is there some number of calls where performance matches (for a particular expression)?
I also found that (apparently) the Regex class caches the expressions, and I suspect that it must "compile" them as well; otherwise, why bother?
I just read this:
https://stackoverflow.com/questions/513412/how-does-regexoptions-compiled-work
Which I don't consider canon, though it may be correct.
So Compiled makes it IL at some cost -- great, perfect.
But non-Compiled must do something (at lower cost) as well which can be cached -- "turns the expression into a state machine graph" or similar?
Might it then be kinda:
0) Take Expression in as text
1) Create state machine graph of the Expression and cache it
2) If requested, compile the state machine graph into IL -- and maybe cache that as well?
There is also the question of performance between the static Regex methods and an instance of Regex. Using an instance should at least eliminate the cost of a cache look-up.
Part of why I ask is that I have written a number of methods which use Regular Expressions, and some of these get used by CLR Functions in SQL Server. I therefore want them to run as efficiently as they can as they may be called several million times during a query.
Also, in your response, I suspect that when you say "runtime" you may mean "on each call", not just "on the first call".
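The instance-versus-static question above can be sketched roughly as follows. This is only an illustration: `DigitMatcher` and its pattern are hypothetical names, and the comments describe the commonly documented behaviour (construction cost paid once per instance, static calls going through the internal cache sized by `Regex.CacheSize`) rather than anything measured here.

```csharp
using System.Text.RegularExpressions;

public static class DigitMatcher
{
    // Hypothetical pattern, for illustration only.
    private const string Pattern = @"^\d{3}-\d{4}$";

    // Constructed once when the class is first used. With RegexOptions.Compiled
    // the extra IL-generation cost is paid here, not on every IsMatch call.
    private static readonly Regex Compiled =
        new Regex(Pattern, RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Same pattern without Compiled: cheaper to construct, interpreted when matching.
    private static readonly Regex Interpreted =
        new Regex(Pattern, RegexOptions.CultureInvariant);

    public static bool ViaCompiledInstance(string input) => Compiled.IsMatch(input);

    public static bool ViaInterpretedInstance(string input) => Interpreted.IsMatch(input);

    // Static call: the pattern is looked up in (or added to) the internal cache,
    // whose capacity is controlled by Regex.CacheSize.
    public static bool ViaStaticCall(string input) =>
        Regex.IsMatch(input, Pattern, RegexOptions.CultureInvariant);
}
```

All three methods return the same answers; they differ only in where the construction and look-up costs land, which is what would need measuring for a function called millions of times per query.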
|
|
|
|
|
PIEBALDconsult wrote: I also found that (apparently) the RegEx class caches the expressions and I suspect that it must "compile" them as well, otherwise why bother?
I haven't read anything about the caching, but what I can tell you is that algorithmically it is very little work to convert a regular expression into an NFA state machine. It's hardly worth caching, unless they actually mean caching the state machine, but I think that's already done when you call Parse() and get an instance back. It doesn't actually reconstitute it from the string each time.
Converting to a DFA *does* take time, but Microsoft's engine doesn't use DFA.
A DFA is much faster than an NFA in the general case, but it doesn't backtrack. That is why my library beats Microsoft's by about 3x in ideal cases, particularly when the input is shorter than, say, several pages of text. Microsoft's engine has things like a Boyer-Moore optimization that can scan long strings faster.
If you really want the fastest possible performance and you know the expressions ahead of time (that is, they are relatively static and don't change often), then make them compiled. Otherwise, leave them uncompiled.
If you'd like something really fast (and you know the expressions ahead of time) I can hook you up with my little engine. It's pretty small. It doesn't support anchors or backtracking, but does full UTF-32 Unicode. (You'll need to use the ToUtf32 function to convert UTF-16 to UTF-32.)
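To make the "DFA doesn't backtrack" point concrete, here is a minimal hand-built sketch. It is not the engine described above; `TinyDfa` and its tables are hypothetical, written out by hand for the single pattern `[0-9]+(\.[0-9]+)?`. Each input character costs exactly one table lookup, and the matcher never revisits a character.

```csharp
public static class TinyDfa
{
    // Transition table for [0-9]+(\.[0-9]+)? (hypothetical example).
    // Rows are states, columns are input classes; -1 means "no transition".
    private static readonly int[,] Transitions =
    {
        //               digit  dot  other
        /* 0: start  */ {  1,   -1,   -1 },
        /* 1: int    */ {  1,    2,   -1 },
        /* 2: dot    */ {  3,   -1,   -1 },
        /* 3: frac   */ {  3,   -1,   -1 },
    };

    // States in which the whole input is a valid match.
    private static readonly bool[] Accepting = { false, true, false, true };

    // Map a character to its column in the transition table.
    private static int Classify(char c)
    {
        if (c >= '0' && c <= '9') return 0;
        return c == '.' ? 1 : 2;
    }

    // Whole-string match: one table lookup per character, no backtracking.
    public static bool IsMatch(string input)
    {
        int state = 0;
        foreach (char c in input)
        {
            state = Transitions[state, Classify(c)];
            if (state < 0) return false; // dead state: reject immediately
        }
        return Accepting[state];
    }
}
```

The cost of this approach is up front: building the table for a real pattern takes work, which is why it suits expressions that are known ahead of time.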
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
honey the codewitch wrote: I can tell you is that algorithmically it is very little work to convert a regular expression into an NFA state machine
Does your regular expression language support everything that the C# one does?
|
|
|
|
|
Nope, because if it did it would be much slower.
It supports non-backtracking expressions, sans anchors.
As such it's maybe 250x faster than Microsoft's in the best case. (see my perf numbers)
I'm also working on a code generation feature for it, since Regex in .NET 7 has one.
It already compiles, but honestly there's little reason to compile it. You can turn an expression into an array of ints instead, and it matches at least as fast in most cases.
|
|
|
|
|
Regarding my earlier response suggesting my engine: I re-ran the benchmark to compare MS uncompiled and compiled with my engine.
Edit: Changed my timing to remove the time for Console.Write/Console.WriteLine
Microsoft Regex "Lexer": [■■■■■■■■■■] 100% Done in 1556ms
Microsoft Regex Compiled "Lexer": [■■■■■■■■■■] 100% Done in 1186ms
Expanded NFA Lexer: [■■■■■■■■■■] 100% Done in 109ms
Compacted NFA Lexer: [■■■■■■■■■■] 100% Done in 100ms
Unoptimized DFA Lexer: [■■■■■■■■■■] 100% Done in 111ms
Optimized DFA Lexer: [■■■■■■■■■■] 100% Done in 6ms
Table based DFA Lexer: [■■■■■■■■■■] 100% Done in 4ms
Compiled DFA Lexer: [■■■■■■■■■■] 100% Done in 5ms
Take that how you will.
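For anyone who wants to repeat the interpreted-versus-compiled part of this comparison on their own expressions, a minimal timing sketch might look like the following. `RegexTiming` is a hypothetical helper, not the harness that produced the numbers above; it keeps console output outside the timed region and does one warm-up call so that first-call costs are not counted.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTiming
{
    // Times `iterations` calls to IsMatch on an already-constructed Regex.
    public static TimeSpan Time(Regex regex, string input, int iterations)
    {
        regex.IsMatch(input); // warm-up: keep JIT and first-call cost out of the measurement

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.IsMatch(input);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling `Time` once with `new Regex(pattern)` and once with `new Regex(pattern, RegexOptions.Compiled)` gives the two figures to compare. Timing the constructors separately shows the one-off cost that `Compiled` adds.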
modified 6-Jan-24 20:43pm.
|
|
|
|
|
I don't doubt your code and abilities. But I can't just use anything I want, my employer allows only packages they certify for use.
Additionally, I'm unsure yours could be used from within a CLR Function.
Aside from that, you say that there are a few limitations of yours, and they might be critical to me -- you mentioned anchors, and some of my more complex expressions use the \G anchor.
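As a small illustration of why `\G` matters, here is a sketch using a hypothetical `key=value;` format (`ContiguousPairs` and the pattern are made up for this example, not taken from my real expressions). `\G` forces each match to begin exactly where the previous one ended, so scanning stops at the first piece of input that doesn't fit.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContiguousPairs
{
    // \G anchors each match at the position where the previous match ended
    // (or at the start of the input for the first match).
    private static readonly Regex Pair =
        new Regex(@"\G(\w+)=(\w+);", RegexOptions.Compiled);

    // Returns the leading run of well-formed "key=value;" pairs as "key=value" strings.
    public static List<string> Parse(string input)
    {
        var result = new List<string>();
        foreach (Match m in Pair.Matches(input))
        {
            result.Add(m.Groups[1].Value + "=" + m.Groups[2].Value);
        }
        return result;
    }
}
```

For the input `a=1;b=2;oops c=3;` this yields only `a=1` and `b=2`. Without `\G`, the engine would skip over `oops ` and also report `c=3`.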
|
|
|
|
|
Every time I link Stack Overflow from this day forward:
"Which I don't consider canon, though it may be correct."
|
|
|
|
|
|
|
You can force C# to work like C++ with this code. I also remembered that a workaround for a bug in C# involved some Emit coding, which makes it pretty useful to know.
|
|
|
|
|
I was diving through the native asm code for this trying to track down a weird bug.
I was calling a function off the wrong object (the object didn't have that function) and it created a stack overflow. Whoops.
But I did notice the emitted native code was pretty similar to the IL.
|
|
|
|
|
Epic is giving away the game "Marvel's Guardians of the Galaxy" for free this week.
The usual price is 59.99€ in Germany on both Epic and Steam. It has over 25k reviews on Steam, and they are very positive.
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
|
|
|
|
Marvel-ous!
|
|
|
|
|
If only I could remember my Epic Games password...
|
|
|
|
|
If they could only offer an option to recover passwords... oh, wait!
M.D.V.
|
|
|
|
|
I'm upgrading my phone from Herself's Samsung Galaxy M32* to a new Samsung Galaxy A54 5G and initially it looked like it was all done very easily - everything transferred to the new phone smoothly and quickly.
Then I noticed that I was missing a load of WhatsApp messages, and try as I might I couldn't get them back on the new phone - though they appear on the old phone, and on my desktop and my Surface ...
Finally worked it out: Google does a weekly backup to Google Drive and when you open WhatsApp on the new phone it uses the last backup, rather than the latest message set. Simple solution: switch back to old phone, uninstall WhatsApp on new phone, do a manual backup on old phone, reinstall WhatsApp on new phone, relink Desktop and Surface to new phone. Bingo! Half a damn hour that took to work out...
* Because the second SIM slot on my Huawei P30 wasn't working, and I wanted her SIM and mine in the same phone for a few months so if anyone called her I could break the news ... but I hate the fingerprint reader on the M32 (love the reader on the P30) and I need a "full 5G/4G" phone because 3G is being turned off this month in the UK. And I have to switch carrier because the Vodafone 4G & 5G coverage here is non-existent but EE's works indoors. Just - I'm still using WiFi Calls because the signal is better.
Just to make life entertaining, I'm also dumping Sky landline, broadband, and Sky Q in favour of EE/BT Digital Voice, Broadband, and EE TV Box Pro as it's half the price.
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
Moving the files from one phone to the next using the PC used to work too. I did it the last time my wife changed her phone (3-4 years ago?)
M.D.V.
|
|
|
|
|
I had to do the backup from WhatsApp on my old phone (Android v8), as it was not included in Google backup. Might be that things have changed in newer Android versions.
|
|
|
|
|
This is Samsung Android 13 to Samsung Android 13 ... you'd think they would get at least the basics right by now ... it's not as if they don't want you to upgrade the hardware as often as possible!
|
|
|
|
|
I have been trying to upgrade my EE package but their website is total crap. Select the option to view packages and it just goes round in circles. It also says it cannot log me in, even though I'm already logged in.
|
|
|
|
|
I went into the shop and got a staffer to do it all for me. Got my name wrong (there is no "K" in "Paul") but saved me loads of hassle. Mind you, the Vodafone site is generally worse - to the point where I changed my login to "IHateThisCrap" and my password to an obscenity.
|
|
|
|
|
Given the product(s) that they want us to buy, you would think it's in their interest to have a decent website.
|
|
|
|
|
I feel your pain. Mrs. Wife's phone died this week, as the home/power button would neither home nor power. Mine has been slowly dying with an ever-shortening operating life on a charge.
I replaced both our phones today. Fortunately the phone dude recovered contacts and calendar data for both. It took me a couple of hours to get hers laid out the way she had the original. Mine was a little harder. Our original phones were 3-4 generations old, so things have changed.
As a moderately cranky old person, changes annoy me.
Software Zen: delete this;
|
|
|
|
|
Do you remember the times when techie guys used to make fun of wives not being able to program the VCR? God, where are those simple times? Even writing "VCR" makes me feel incredibly old!
Mircea
|
|
|
|
|
My first VCR did not do automatic channel scanning, even from cable TV. I had to manually set each channel, which took over an hour. To top it all off, the channel settings would be lost if the power went off for more than a few seconds.
|
|
|
|