I am using the strtok function to break a string into its parts.

I am using the following code:

C++
const char seps[6]   = "\n\t";
char *token;
token = strtok( a, seps );



C++
char *arr[5]={NULL};
arr[i]=token; //Inside the loop i assign each token of a line into arr[i]
string buf= arr[3]; // Description token is in the arr[3]. buf string is initialized by arr[3]'s value and is later used for printing.


However, when I split a line into tokens, extra characters are appended to the description token in some rows.
For example,
Dear parents is shown as Dear parentsuè

Out of 4 lines, the extra characters are appended to the description tokens of, say, the last 2 lines. This does not happen with any of the other tokens.

Can you suggest something I may be overlooking or missing?




C++
while( getline( myTfile, s1 ) )
{
	char * a = new char[s1.size() + 1];
	std::copy(s1.begin(), s1.end(), a);
	const char toks[6]   = "\n\t";
	char *token;
	token=NULL;
	token = strtok(a,toks);
	char *arr[5]={NULL}; int i=0;
	while( token != NULL )
	{
		arr[i]=token;
		i++;
		token++;
		token = strtok(NULL,toks);/* Get next token: */
	}
	string buf= arr[3];         // Description
}


The input is something like:

+ 123123123 243.20 textstring1
- 123454355 123.10 textstring2


Now for textstring2, or some other random string in the input line, extra characters are appended in random rows.
For e.g.
Dear parents is shown as Dear parentsuè
Updated 26-Sep-11 22:55pm
Comments
[no name] 27-Sep-11 4:19am    
Without seeing your code and the exact data you start with, it is impossible to guess what you may be doing wrong.
typedefcoder 27-Sep-11 4:51am    
I have updated the question...

A few things to consider:

  1. std::copy() will not null terminate your character array, so you immediately have the potential of extra characters at the end of your array. You should replace this with a line of the form strcpy(a, s1.c_str());
  2. Your token characters do not include a space, however, this may be deliberate.
  3. You should really use const char* toks = "\n\t" for your splitter characters, so that they refer to the read-only string literal and you do not have to hard-code an array size.
  4. The expression token=NULL; is redundant as it is immediately followed by a strtok() call.
  5. Similarly you have the expression token++ in your while loop, which serves no purpose.

The extra characters you are seeing are the result of point 1 above. There are probably other things to consider depending on how you are trying to process your data.
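
The points above can be put together in a minimal sketch. The helper name field_at, its parameters, and the five-field limit are hypothetical and only for illustration; it also assumes the fields are separated by tabs, since the delimiters in the question are "\n\t" and the description itself contains a space.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the original post): splits `line` on the
// given delimiters with strtok and returns field number `index`, or an
// empty string if the line has fewer fields than that.
std::string field_at(const std::string& line, std::size_t index,
                     const char* delims = "\n\t")
{
    // c_str() is null-terminated, so strcpy copies the terminator as well.
    char* buffer = new char[line.size() + 1];
    std::strcpy(buffer, line.c_str());

    const std::size_t max_fields = 5;   // same capacity as arr[5] in the question
    char* fields[max_fields] = { NULL };
    std::size_t count = 0;

    // Stop at max_fields so we never write past the end of the array.
    for (char* token = std::strtok(buffer, delims);
         token != NULL && count < max_fields;
         token = std::strtok(NULL, delims))
    {
        fields[count++] = token;
    }

    // Copy the field out before the buffer is released; never build a
    // std::string from a NULL pointer.
    std::string result;
    if (index < count && fields[index] != NULL)
        result = fields[index];

    delete[] buffer;   // the loop in the question never frees its buffer
    return result;
}
```

Inside the getline loop, string buf = field_at(s1, 3); would then give the description, or an empty string for a short line.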
 
 
You may also think about using STL types, which can make strings much easier to handle. What you are facing are typical errors when working with C char pointers/arrays.

For inspiration, a function that gets the tokens of a string and stores them in a vector may look as follows:


C++
#include <string>
#include <vector>

//! Tokenize the given string str with the given delimiters. If no delimiter is given, a single space is used.
void Tokenize(const std::string& str, std::vector<std::string>& tokens, const std::string& delimiters = " ")
{
	tokens.clear();
	// Skip delimiters at the beginning.
	std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
	// Find the first delimiter after the start of the token.
	std::string::size_type pos = str.find_first_of(delimiters, lastPos);

	while (std::string::npos != pos || std::string::npos != lastPos)
	{
		// Found a token, add it to the vector.
		tokens.push_back(str.substr(lastPos, pos - lastPos));
		// Skip delimiters to reach the start of the next token. Note the "not_of".
		lastPos = str.find_first_not_of(delimiters, pos);
		// Find the delimiter that ends the next token.
		pos = str.find_first_of(delimiters, lastPos);
	}
}
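
Applied to the input from the question, this could be used roughly as follows. This is only a sketch: DescriptionOf is a hypothetical helper, the fields are assumed to be tab-separated, and Tokenize is repeated so that the snippet compiles on its own.

```cpp
#include <string>
#include <vector>

// Same algorithm as Tokenize above, repeated so this snippet is self-contained.
void Tokenize(const std::string& str, std::vector<std::string>& tokens,
              const std::string& delimiters = " ")
{
    tokens.clear();
    std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    std::string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (std::string::npos != pos || std::string::npos != lastPos)
    {
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        lastPos = str.find_first_not_of(delimiters, pos);
        pos = str.find_first_of(delimiters, lastPos);
    }
}

// Hypothetical helper: returns the fourth tab-separated field of a line
// (the description in the question's data), or an empty string if the
// line has fewer than four fields.
std::string DescriptionOf(const std::string& line)
{
    std::vector<std::string> tokens;
    Tokenize(line, tokens, "\t");
    return tokens.size() > 3 ? tokens[3] : std::string();
}
```

Because each token is copied into its own std::string, there is no manual buffer and no terminator to forget.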
 
Comments
[no name] 27-Sep-11 5:36am    
My 5 for an excellent answer, but I get the feeling that OP is still learning the STL.
typedefcoder 27-Sep-11 5:39am    
Richard, the use of the STL has been restricted for this code. I really wish I could have used it; my work would have been faster. I could not even use things like "locale" in my program's development.

Thanks, Legor, for the update. Richard, I will let you know in a while whether your solution worked.
[no name] 27-Sep-11 5:49am    
It worked in my test, although you need to check that arr[3] is not NULL before trying to assign it to a string.
typedefcoder 27-Sep-11 5:58am    
Yeah, it's gone. Thanks a ton, Richard. You are a life saver. Apart from CP, do you maintain a blog or a series of articles on C++? If so, I would love to read it.
[no name] 27-Sep-11 6:04am    
I wish! Sorry but I'm not a great writer so I stick to trying to answer tech problems here on CodeProject.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


