|
Problem with templates is that they mostly depends on the compiler's abilities, and that their syntax isn't widely taught or easy to understand (a bit like imperative vs functional languages). The Boost libraries (templates) are good yet not widely used due to these problems... Note also that STL doesn't deals quite well with Unicode/Utf so I guess until you find a good template string library, hand-made ones might still have a future...
Kochise
In Code we trust !
|
|
|
|
|
But not asking anybody to play 1:5
Kochise
In Cod we trust !
|
|
|
|
|
Kochise wrote:
Now the interrest I find in this class is not to chain operations, what I agree with most of you is an incredible stupidity. But 'soso_pub' never stated to chain thousands of operations.
As an assembly coder first, the development orientation should be 'vertical', one instruction per line. So that in case of breakpoint, you don't have to look which instruction in the line caused the break...
Yep, this seems to be the major drawback, but for example I prefer something like str.TrimLeft().TrimRight() instead of writing a line for each trim (I'm talking from the perspective of CString here who - if my memory serves me right - doesn't have a Trim operation that does left and right trim).
Anyway, you can chain or you can not chain. Programmer's choice.
Kochise wrote:
SOooo, thanks alot dear 'soso_pub', I'm saving your component in my 'Essential' folder
You're welcomed!
soso
|
|
|
|
|
CStr t = "this is a test";
CStr f;
f.Format(_T("%s-%s"),t);
Crashes. I believe due to the various casting opertors that t is not ending up a pointer to a const char* and crashes the internal formatting routine.
It works if you cast the var arg to a LPCTSTR
CStr t = "this is a test";
CStr f;
f.Format(_T("%s-%s"),(LPCTSTR)t);
The CStr class should be able to handle this internally somehow I think.
|
|
|
|
|
Anonymous wrote:
CStr t = "this is a test";
CStr f;
f.Format(_T("%s-%s"),t);
Perhaps it is due to an error in your code: you want two strings à la %s to be formatted, but supply only one parameter t .
Best Regards,
Christian Richardt
Those who know don't talk.
Those who talk don't know.
|
|
|
|
|
First, as someone pointed in another reply you want two strings to be formatted but you supply only one parameter.
Second, Format uses a variable-arguments list and I don't think it's possible to find argument types at runtime (this is the reason you have to supply %s, %d, etc in format string). So instead of passing a LPCTSTR as parameter you are passing a CStr and this is why it crashes, besides supplying only one parameter.
I don't know right now a solution to use raw CStr as arguments, probabily introducing a new format specifier like %q. But in case of %s automatic conversion from CStr to the internal LPCTSTR will still not work.
Regarding casting operators with the current implementation if you try to use a CStr in the right part of << you will get an ambiguosity error. It is fixed on my computer -plus other small bugs/enhancements-, I'll have to update it sometimes soon.
soso
|
|
|
|
|
The extra format specifier was a typo and not really related to the crash at all since that part gets ignored by the internal formatting routine.
Anyway, I was looking at other string implementations and they suffer from the same problem which means that MFC must have been doing some hacky workarounds to make the code (that I really got used to using ) work.
So, I contend it is not really a bug afterall and the correct "fix" is for the client (me) to cast the objects to a const char* pointer before calling the Format() function which as I stated in my first post, does work as expected.
Thanks for the quick followups.
|
|
|
|
|
The reason CString works is because its first (and only) data member is a pointer to its char array. All its internal data is stored offset to this pointer, and does some iffy negative pointer offsetting to get to it.
So when you pass a CString to a formatting routine, it gets a verbatim copy of the CStrings member info, which happens to be a string pointer, just what it was looking for.
There is a temptation to make the first member of CStr a pointer to its own string data, but then the following member data would be interpreted as the following formatting arguments, I think. So that wouldnt work.
|
|
|
|
|
Why not use a good name instead of CStr?
|
|
|
|
|
Have a suggestion for a good class name???
William
Fortes in fide et opere!
|
|
|
|
|
hero3blade wrote:
Why not use a good name instead of CStr?
Class names are like girls: the best ones are already taken
|
|
|
|
|
.:fl0yd:. wrote:
Class names are like girls: the best ones are already taken
I have not got a pretty girl, so could not thought out of a good name.
Give me one, God.
|
|
|
|
|
Regarding your wanting to use
str.TrimLeft().TrimRight().Replace( " ", "/" );
instead of
str.TrimLeft();<br />
str.TrimRight();<br />
str.Replace( " ", "/" );
I personally see no difference in the two, other than a few semicolons. The code is not any easier to use, does not result in any fewer mistakes, and does not produce a "better" result. This (by itself) should not be a reason to create a new class.
I do, however, like your idea of assigning numeric values. While I'm prefectly comfortable using CString::Format(), and will continue to do so, direct assignment is a plus.
|
|
|
|
|
in my opinion, chaning operation makes the code more readable.
readability implies fewer mistakes, easier use, all these might produce a "better" result overall.
---------------
Horatiu CRISTEA
|
|
|
|
|
In my opinion, while a lot of you guys out there pride yourself on writing 100 operations in one ine of code, good luck to the next guy (or you in 2 years) who has to figure out what the heck it is supposed to do.
|
|
|
|
|
I have about 9 years professional C++ experience, and I have to agree chaining operations like this is NOT easier to understand. It is certainly not easier to step through or set break points while debugging.
|
|
|
|
|
In practice, having to type MakeLower or MakeUpper on a separate line is an absolute pain when you work with strings a lot. One supposes that these are not functions you would generally wish to step into.
My solution is to have a class called CStringTools which just acts as a container for a collection of handy static string functions, including versions of MakeLower and MakeUpper which return the modified string. I use these functions *all the time*.
|
|
|
|
|
i'm not talking about 100 operations in 1 line of code.
this is about few obvious operations that can be chained like objStr.LTrim().RTrim().Replace(..) its common practice in other languages like java or C# and in alot of scripting languages.
---------------
Horatiu CRISTEA
|
|
|
|
|
Cleary you missed the tougue in cheek way that I meant it. To me, it might as well be 100 lines of code. It seems that every programmer has very different "dialect" when it comes to stringing operations together. What's obvious to you is NOT often obvious to another. That results in wasted time.
Therefore, in large projects, with shared coding responsibilites, please use "linear" coding. My 25 years of experience coding, and of hiring new coders tells me this is the best approach. So called 'elagant' coding means nothing to me if I am late on a project.
|
|
|
|
|
What folks are trying to point out here is that when you see:
objStr.LTrim();
objStr.RTrim();
objStr.Replace(..);
There is absolutely now doubt which step is done first when an inexperienced programmer reads the code. Further you don't have to read the entire line looking for where you put that blasted Replace(...) statement you're sure needs a little tweak, and a grep might not work because you might have 50 Replace calls in the same file. Sure, those of us who are more experienced aren't going to be slowed down "much" by chained code, but who needs to be slowed down at all? Chained statements are clearer. It's not just our oppinion. Books discussing maintainability of code have actual numbers based on real world projects. Chained code results in more mistakes, whether they be order precedence mistakes, not quite grasping everything that's done in a single line, etc.
These same books talk about limiting the use of the ternary operator (that's the : result = (bool) ? val1 : val2; type of line)due to similar issues. The reader of the code has to take a moment and THINK about whether val1 or val2 is used on a true condition. Further they have to sit back and verify if the data types realy make sense for both pieces, and if val1 and val2 contain there own complex processes, well, things get hairy fast. The same holds true for chaining opperations. If one or two take strings (such as the Replace call) you quickly find folks who will write:
objStr.LTrim().RTrim().Replace(objStr2.LTrim().RTrim().Replace(...));
and much worse. Giving the power means it will be used, and usually by your least experienced programmer which means more maintanance bugs for them and those that follow them. Those who argue against this are talking generally from personal bad experiences with chained code and often from knowledge we've gained in our reading.
Take our advice, don't do it any more than necessary.
|
|
|
|
|
It would be nice to have more operators, like +,+=,<,<=,>,>=.
|
|
|
|
|
Also I like and it is necessary
Does work?
CStr CStr::operator+( CStr &other )<br />
{<br />
CStr sTemp(this->m_pBuf);<br />
return sTemp.Append(other);<br />
}<br />
<br />
CStr CStr::operator+=( CStr &other )<br />
{<br />
return this->Append(other);<br />
}<br />
<br />
...<br />
|
|
|
|
|
You've said so already, STL-lovers should quit reading. Although I do agree with you on the fact that std::string isn't all that great in practice, especially when it comes to type-conversions using stringstreams, I believe a better approach would have been to wrap up an STL-string in your own class.
Comparing your implementation with std::string, we can probably agree that both aren't guaranteed to be thread-safe. There are, however, some features in std::string that your class is missing. Most important, when it comes to performance your implementation will be far behind both std::string and MFC's CString. It doesn't incorporate reference counting for assigning strings to different objects without modifying them. Even worse, it doesn't have special handling for short strings, i.e. a union of either an m_pBuf or a m_pShortBuf[WHATEVER_SEEMS_TO_FIT]. operator new[] isn't really all that spectactular when dealing with small-memory-allocations, so this would be an obvious optimization.
Apart from that, from a short glimpse at the source, there are some more or less severe bugs in your code:
* [Str.cpp:line 25]: by default, operator new throws an exception when it runs out of memory. Granted, MSVC defines a different [non-standard] behaviour, returning a NULL-pointer, unless you turn off MSVC-specific language extensions. At any rate, the correct code would be m_pBuf = new( std::nothrow ) TCHAR[ m_size ];
* [Str.cpp: line 38]: applying operator delete on an array results in undefined behaviour. I'm sure one or the other will start arguing, that it merely misses to invoke the [trivial] d'tors of those stored TCHAR's. The truth however is, that any compiler vendor can choose to produce any code they like to. Moreover, the behaviour can change from one revision to the next without further notice or even the need to document the behaviour. If you want to keep standard-conforming replace that with the correct code, i.e. delete [] m_pBuf;
* [Str.cpp: line 51]: _Alloc returns a BOOLean value. Since you are just ignoring it your code will crash at line 53 if it runs out of memory.
* Illegal symbol names: _Alloc, _Free, _Init, _PaddedSize, _CheckSize, _StrCopy, and __STR__EX_H. According to the ISO standard of c++ all symbol names containing a double underscore or starting with an underscore immediately followed by a capital letter are reserved for use by the implementation. You may call it nitpicking, but library code has to follow just about any rule there is -- or it won't make it to be a library ever.
* _CheckSize is misleading: The name of the method has no indication that it is actually performing a reallocation. Why not go with the STL-naming and call it reserve instead?
* missing exception specification: I know, few ever care about the extra cost of doing this. But hey, if this is supposed to be a library you should put in the extra effort and put as much documentation into the code as possible. At least mark the d'tor and member _Free with throw() to indicate that no exception is raised there.
* missing assertions: A library should use as many as necessary to enforce the correct usage of the class. Your implementation doesn't appear to have any. Does that really mean, a client cannot, under no circumstances use the class in an undesired way?
* #define's: are BAD! They pollute global namespace and especially INT_SIZE and the like aren't distinct enough to minimize the probability that someone using your code hasn't #defined them before or will #define them after your code. Why not make them const objects instead? If nothing else it's at least a bit more type-safe.
There are probably issues with the inner workings of the code, which I didn't have time to check. The class itself may work in your environment. Some of it is pure luck though. Especially the first 3 issues noted above, which I would grade as beginners' mistakes, would make me feel somewhat uneasy, if I was forced to use this class.
I hope this isn't putting the code in too bad a light. As it was mentioned before though, there are some rather strict requirements that have to be met when dealing with library code, which you may get away with in a typical application.
Regards,
.f
|
|
|
|
|
.:fl0yd:. wrote:
Comparing your implementation with std::string, we can probably agree that both aren't guaranteed to be thread-safe. There are, however, some features in std::string that your class is missing. Most important, when it comes to performance your implementation will be far behind both std::string and MFC's CString. It doesn't incorporate reference counting for assigning strings to different objects without modifying them. Even worse, it doesn't have special handling for short strings, i.e. a union of either an m_pBuf or a m_pShortBuf[WHATEVER_SEEMS_TO_FIT]. operator new[] isn't really all that spectactular when dealing with small-memory-allocations, so this would be an obvious optimization.
std::string is not specified to be ref-counted, and most implementations are moving away from that now, as the semantics are difficult to make correct and still obtain any performance benefit. COW needs to be replaced by move semantics (http://www.cuj.com/documents/s=8246/cujcexp2102alexandr/).
Also, the standard says nothing about employing any special handling for small strings. That would be a QoI issue, at best.
William E. Kempf
|
|
|
|
|
Yip, those are valid comments. When I was typing I edited the original post several times and both, ref-counting as well as special handling for small strings do not apply to std::string according to the ISO standard. They are present in may implementations, though, and MFC CString's do still have those features, although Alexandrescu raises some valid concerns about their usability.
Thanks for setting this straight.
|
|
|
|
|