|
Excellent article. Unfortunate he turned off comments; a blog without comments isn't a blog.
But a great article nonetheless. He argues convincingly that because System.String is so easy to use, people end up building new strings all the time, allocating memory and increasing pressure on the garbage collector, and, contrary to popular belief, this is not free.
I'm reminded of this recent article: Why I Program in Erlang[^]. The author makes the case that Erlang's non-obvious implementation of strings is the right one:
Or take string concatenation. If you pop open the implementation of string concatenation in Perl, Ruby, or JavaScript, you are certain to find an if statement, a realloc, and a memcpy. That is, when you concatenate two strings, the first string is grown to make room for the second, and then the second is copied into the first. This approach has worked for decades and is the “obvious” thing to do. Erlang's approach is non-obvious, and, I believe, correct. In the usual case, Erlang does not use a contiguous chunk of memory to represent a sequence of bytes. Instead, it uses something called an “I/O list”: a nested list of non-contiguous chunks of memory. The result is that concatenating two strings (I/O lists) takes O(1) time in Erlang, compared to O(N) time in other languages. This is why template rendering in Ruby, Python, etc. is slow, but very fast in Erlang.
Duffy's .NET strings article got me thinking about more efficient ways to represent a string. Obviously we have StringBuilder. Would passing around lazy IEnumerable<char> be any different?
Or some amplified type from there?
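To make the lazy IEnumerable<char> idea concrete, here is a minimal sketch. The LazyChars class and its method names are made up for illustration; only the LINQ calls are real framework APIs. Concatenation is deferred, so no characters are copied until something enumerates the sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
static class LazyChars
{
    // Deferred concatenation: SelectMany builds an iterator and copies nothing
    // until the result is enumerated.
    public static IEnumerable<char> Concat(params IEnumerable<char>[] parts)
    {
        return parts.SelectMany(part => part);
    }

    // Materializes the lazy sequence into a real System.String, paying for
    // one allocation at the point where a string is actually needed.
    public static string Materialize(IEnumerable<char> chars)
    {
        return new string(chars.ToArray());
    }
}
```

The likely catch is that enumerating one char at a time through an interface is much slower per character than a memcpy, so this probably only pays off when most intermediate results are never materialized.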
Interesting thought exercise. But in my day-to-day work, I'm happy just using System.String.
|
|
|
|
|
Judah Himango wrote: a nested list of non-contiguous chunks of memory
Judah Himango wrote: The result is that concatenating two strings (I/O lists) takes O(1) time in Erlang, compared O(N) time in other languages
Sounds like a linked list of strings (or tree), which is how I implemented StringBuilder+ (see my articles if you are curious). That's not good for some things though, such as doing a substring operation (you'd have to scan through the linked list to find the one that contains the section of the string you are trying to substring, which turns an O(1) operation into an O(N) operation). There are some optimizations that can be made, but using a linked list isn't the ultimate solution.
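The trade-off can be sketched roughly as follows. ChunkString is a hypothetical type written for this post; it is not Erlang's I/O list and not the StringBuilder+ implementation. Concatenation allocates one node and copies nothing, but indexing has to walk the tree, which degrades to O(number of chunks) when the tree is list-shaped.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration type: a binary tree of string chunks.
sealed class ChunkString
{
    private readonly ChunkString left;
    private readonly ChunkString right;
    private readonly string leaf;   // non-null only for leaf nodes

    public int Length { get; }

    public ChunkString(string text)
    {
        leaf = text ?? throw new ArgumentNullException(nameof(text));
        Length = text.Length;
    }

    private ChunkString(ChunkString l, ChunkString r)
    {
        left = l;
        right = r;
        Length = l.Length + r.Length;
    }

    // O(1): one small node is allocated, no characters are copied.
    public static ChunkString Concat(ChunkString a, ChunkString b)
    {
        return new ChunkString(a, b);
    }

    // Lookup walks from the root to the leaf holding the index. On an
    // unbalanced tree this costs O(number of chunks) instead of O(1).
    public char this[int index]
    {
        get
        {
            if (index < 0 || index >= Length)
                throw new ArgumentOutOfRangeException(nameof(index));
            ChunkString node = this;
            while (node.leaf == null)
            {
                if (index < node.left.Length)
                {
                    node = node.left;
                }
                else
                {
                    index -= node.left.Length;
                    node = node.right;
                }
            }
            return node.leaf[index];
        }
    }

    // Flattens the chunks left to right, using an explicit stack so deep
    // trees do not overflow the call stack.
    public override string ToString()
    {
        var sb = new StringBuilder(Length);
        var pending = new Stack<ChunkString>();
        pending.Push(this);
        while (pending.Count > 0)
        {
            ChunkString node = pending.Pop();
            if (node.leaf != null)
            {
                sb.Append(node.leaf);
            }
            else
            {
                pending.Push(node.right);
                pending.Push(node.left);
            }
        }
        return sb.ToString();
    }
}
```

Keeping the tree balanced would bring lookups down to O(log N), at the cost of extra work on concatenation.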
|
|
|
|
|
AspDotNetDev wrote: you'd have to scan through the linked list to find the one that contains the section of the string you are trying to substring, which turns an O(1) operation into an O(N) operation
That's why rope[^] exists. But I don't like that one too much either - the nodes are too small compared to the size of a cache line. I tried to make it more B-tree-like, but it didn't quite fit somehow. Someone else could probably figure that out.
|
|
|
|
|
A good read - thanks, Dave!
/ravi
|
|
|
|
|
Beware the code that doesn't work!
string str = ...;
int lastIndex = 0;
int commaIndex;
while ((commaIndex = str.IndexOf(',', commaIndex)) != -1) {
Process(substr, lastIndex, commaIndex);
lastIndex = commaIndex + 1;
}
- The string passed to the Process method should be str, not substr;
- By passing commaIndex to the IndexOf function, this code would enter an infinite loop, if it would compile. (It actually generates a "use of unassigned local variable" compiler error.)
- If you change it to lastIndex, it will miss the last item in the string.
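For reference, a version with all three problems addressed might look like the sketch below. The CommaScanner class and the process callback are stand-ins invented here for the article's Process method, whose real signature isn't shown; I'm assuming it takes the string plus a start index and an exclusive end index.

```csharp
using System;

// Hypothetical wrapper; "process" stands in for the article's Process method.
static class CommaScanner
{
    public static void ForEachItem(string str, Action<string, int, int> process)
    {
        int lastIndex = 0;
        int commaIndex;
        // Search from lastIndex, which is always assigned and always moves
        // past the previous comma, so the loop terminates.
        while ((commaIndex = str.IndexOf(',', lastIndex)) != -1)
        {
            process(str, lastIndex, commaIndex);
            lastIndex = commaIndex + 1;
        }
        // The text after the final comma (possibly empty) is the last item.
        process(str, lastIndex, str.Length);
    }
}
```

Note that no substrings are allocated by the scan itself; whether any get allocated is up to the callback.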
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
The code may be bugged, but the discussion was sound.
|
|
|
|
|
Yes, I agree the discussion is sound; I was just pointing out that it's harder to write correct code without using the built-in functions.
I've just finished throwing together a struct to cover the common substring operations, and I've reached 500 LoC (excluding comments).
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Notices will be sent out to developers of up to 100 mobile apps that are not compliant with California privacy law, starting with those who have the most popular apps available on mobile platforms, the office of the state's attorney general Kamala D. Harris said Tuesday. [ITworld]
|
|
|
|
|
That will have a real impact on the small developer: a $2500 penalty each time the app is downloaded. So much for making a profit.
|
|
|
|
|
Intel researchers are working on a 48-core processor for smartphones and tablets, but it could be five to 10 years before it hits the market. "If we're going to have this technology in five to 10 years, we could finally do things that take way too much processing power today," said Patrick Moorhead, an analyst with Moor Insights and Strategy. "This could really open up our concept of what is a computer... The phone would be smart enough to not just be a computer but it could be my computer." Meanwhile, fashion designers are researching pants with fan-cooled phone pockets.
|
|
|
|
|
To effectively use multiple cores, you need to write the software for them. The current generation of languages such as C++, C#, and Java really are not designed to handle all these cores. Maybe the functional languages can do better. Of course, I guess the compiler could do the job, but so far that has proven difficult.
|
|
|
|
|
Clifford Nelson wrote: To effectively use multiple cores, you need to write the software. That means the current generation of languages such as C++, C#, Java really are not designed to handle all these cores.
Are you referring to Processor Affinity?[^]
dev
|
|
|
|
|
Actually no, I was referring to being able to program so that parallelism becomes like memory management: you basically do not have to worry about it. But thanks for the link.
|
|
|
|
|
ok... here's one thought i have for our app
(a) Assign processor affinity to a calc job (exe, kick-started by another exe/calc controller) based on user
(b) Assign processor affinity to a calc job (exe, kick-started by another exe/calc controller) based on calculation type
I don't think .NET framework or Windows can make that decision for you...
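A rough sketch of what a calc controller could do, assuming the job is a separate exe. AffinityHelper, MaskFor and StartPinned are hypothetical names; Process.Start and Process.ProcessorAffinity are the real framework members. Which cores to pick per user or per calculation type is the policy part and is left out.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for a calc controller.
static class AffinityHelper
{
    // Builds an affinity bitmask: bit N set means "may run on core N".
    public static long MaskFor(params int[] cores)
    {
        long mask = 0;
        foreach (int core in cores)
        {
            if (core < 0 || core > 62)
                throw new ArgumentOutOfRangeException(nameof(cores));
            mask |= 1L << core;
        }
        return mask;
    }

    // Starts the job and restricts it to the given cores. The affinity is
    // applied after the process has started, so the job may run briefly on
    // other cores first. Assumes a 64-bit controller process and an OS that
    // supports process affinity (for example Windows).
    public static Process StartPinned(string exePath, params int[] cores)
    {
        Process job = Process.Start(exePath);
        job.ProcessorAffinity = (IntPtr)MaskFor(cores);
        return job;
    }
}
```

For example, StartPinned(path, 0, 1) would confine the job to the first two cores.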
dev
|
|
|
|
|
Clifford Nelson wrote: the current generation of languages such as C++, C#, Java really are not designed to handle all these cores
somebody banned the use of threads? :S
|
|
|
|
|
No, but threads suck. They're only actually good at handling completely independent parallel workloads that run for a relatively long time - and there are almost none of those, not in the average app anyway.
|
|
|
|
|
I agree, and it is mostly because writing scalable code that runs on separate CPU threads is not a trivial task, unless you use frameworks like OpenCL(C++), TPL(.NET), and Scala(JVM) in combination with the languages mentioned in the OP's post xD
Then there are other cool bits like C++ AMP and CUDAfy.NET for writing C++/C# code that scales and runs on GPU threads.
|
|
|
|
|
I like "Thread sucks" part but can't agree with your comment on that there's no need for threads.
We do real-time apps where screens can't freeze up, so many times we send a job (calc, data fetch, etc.) to a background thread pool so the UI remains responsive. Once the thread job is done, the data comes back to the UI thread, where it's bound to the UI synchronously.
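That pattern can be sketched with Task.Run and async/await. BackgroundCalc, SlowSum and LoadAsync are placeholder names, and the loop stands in for a real calculation or data fetch.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical example; SlowSum stands in for a long calc or data fetch.
static class BackgroundCalc
{
    public static long SlowSum(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }

    // Task.Run queues the work on the thread pool, so the caller is not
    // blocked while it runs. When awaited from a WPF or WinForms UI thread,
    // the code after the await resumes on that UI thread by default, which
    // is where the result would be bound to the UI.
    public static async Task<string> LoadAsync(int n)
    {
        long result = await Task.Run(() => SlowSum(n));
        return "Sum = " + result;
    }
}
```

In an event handler this would be used as something like label.Text = await BackgroundCalc.LoadAsync(1000000); where label is whatever control displays the result.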
dev
|
|
|
|
|
devvvy wrote: but can't agree with your comment on that there's no need for threads.
Well if you put it like that, I don't agree with myself either
Like you say, they're nice for asynchronous programming too. This was a thread about 48 cores and how they might be used though - threads suck in the context of keeping all those cores fed.
|
|
|
|
|
That is not a language feature, that is a framework feature. You could say that async is a language feature, so maybe in that case. What you want is a language in which you state the program so that its parallelism can be determined and used. If the language were truly parallel-friendly, you would not even have to think about threads or background work, or anything like that. That you do is the problem.
|
|
|
|
|
Thread is an OS feature, not a framework feature. System.Threading.Thread just happens to use kernel32::CreateThread to implement the required functionality.
Using the same logic, we shouldn't use C# to add 1 plus 1 because Int32 is declared and implemented in CLR and not a C# feature xD
|
|
|
|
|
What do you mean by C# not being designed to handle all the cores? As of .NET 4, parallel programming in C# became relatively straightforward. Link[^].
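As a small illustration of what .NET 4 offers, here is the same sum of squares written with PLINQ and with Parallel.For. ParallelDemo and its method names are made up for this example; the framework calls are the standard ones. The runtime decides how to spread the work over the available cores.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class.
static class ParallelDemo
{
    // PLINQ: AsParallel() lets the runtime partition the range across cores.
    public static long SumOfSquaresPlinq(int n)
    {
        return Enumerable.Range(1, n).AsParallel().Sum(x => (long)x * x);
    }

    // Parallel.For with a per-thread subtotal, merged once per worker so the
    // shared total is touched only a handful of times.
    public static long SumOfSquaresParallelFor(int n)
    {
        long total = 0;
        Parallel.For(1, n + 1,
            () => 0L,                                      // per-thread initial value
            (i, loopState, local) => local + (long)i * i,  // accumulate locally
            local => Interlocked.Add(ref total, local));   // merge thread-safely
        return total;
    }
}
```

The programmer still has to decide what is safe to run in parallel; the library only takes care of scheduling and partitioning.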
|
|
|
|
|
Even TPL CTP1 worked like a charm in 3.5 ^^
|
|
|
|
|
C# has some tools to help from the TPL, but it does not automatically handle multiple cores. It requires some programmer intervention, and it is not C# that does this, but the framework.
|
|
|
|
|
Indeed it does, but if you use TPL and the other features of the framework then you have multi core capability. Your post seemed to suggest that you couldn't do this right now, which is why I posted.
|
|
|
|