C / C++ / MFC

Re: How can read a unicode text file as character by character?

21-Mar-11 3:31

Hope you do know that unicode is two bytes and ascii is one byte...

Emilio Garavaglia 22-Mar-11 22:43

UNICODE is actually a set of code points whose cardinality requires 21 bits.
When encoded as a sequence of 1-byte units it is called UTF-8, and when encoded as a sequence of 2-byte units it is called UTF-16.
In UTF-8 a code point takes from 1 to 4 bytes (and stays identical for code points between 0 and 127, aka ASCII).
In UTF-16 a code point takes 2 or 4 bytes (two for most Latin, Cyrillic and Greek characters, as well as many simplified Chinese ones).

"UNICODE == 2 bytes" is a misconception that originated when Windows introduced its Unicode APIs using 16 bits since, at that time, the Unicode specs were not so wide.
Actually, reading 2 bytes does not necessarily mean "reading a character".
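
To make the byte counts above concrete, here is a minimal sketch (not part of the original post) that reports how many bytes a single code point occupies in each encoding. The function names utf8_length and utf16_length are hypothetical, chosen only for illustration.

```cpp
#include <cstdint>

// Bytes needed to encode one code point in UTF-8.
// Returns 0 for values that are not valid Unicode scalar values.
int utf8_length(std::uint32_t cp) {
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // surrogate range: not characters
    if (cp <= 0x7F)     return 1;                // ASCII, encoded unchanged
    if (cp <= 0x7FF)    return 2;
    if (cp <= 0xFFFF)   return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;                                    // beyond the Unicode range
}

// Bytes needed to encode one code point in UTF-16.
// Returns 0 for values that are not valid Unicode scalar values.
int utf16_length(std::uint32_t cp) {
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // surrogate range: not characters
    if (cp <= 0xFFFF)   return 2;                // one 16-bit unit
    if (cp <= 0x10FFFF) return 4;                // surrogate pair, two 16-bit units
    return 0;
}
```

For instance, the euro sign U+20AC takes 3 bytes in UTF-8 but 2 in UTF-16, while U+1F600 takes 4 bytes in both.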

2 bugs found.
> recompile ...
65534 bugs found.
:doh:

Albert Holguin 23-Mar-11 4:07

unfortunately, i think it depends on what standard of C/C++ and what OS. I'm pretty sure windows defines unicode as 16bits...

from their website:
"Unicode: A fixed-width, 16-bit worldwide character encoding that was developed and is maintained and promoted by the Unicode Consortium, a nonprofit computer industry organization."
http://msdn.microsoft.com/en-us/library/cc194793.aspx

Emilio Garavaglia 24-Mar-11 8:37

I'm sorry for you and for Microsoft, but the one and only authority entitled to say what Unicode was, is and will be is www.unicode.org

The page you linked is a real embarrassment for Microsoft. A technical document like that should not be published without stating, on the page itself, the date it was written (hey ... they talk about their new amazing Windows NT 3.5 ...), and that fault alone should disqualify M$ from any authority in the field.

Albert Holguin 24-Mar-11 8:41

so angry!

...similar articles found in the MS VS2010 area of MSDN... i don't do much in unicode so haven't needed to worry about it...

Emilio Garavaglia 22-Mar-11 22:46

This is actually a misconception ...
see here.

Albert Holguin 23-Mar-11 4:21

Here's another reference to Microsoft:
http://msdn.microsoft.com/en-us/library/2dax2h36.aspx

Emilio Garavaglia 24-Mar-11 9:57

Nope.

1) Unicode is not a Microsoft product. What UNICODE is is defined by www.unicode.org

2) Microsoft encodes UNICODE in 16-bit units. That is a technique well defined by the UNICODE standard itself, known as UTF-16. Essentially, every code point up to 0xFFFF and outside the range 0xD800-0xDFFF is encoded as itself.
Every code point greater than 0xFFFF has 0x10000 subtracted from it; the resulting 20-bit value is split into two 10-bit chunks, which are OR-ed with 0xD800 and 0xDC00 respectively.
The range 0xD800-0xDFFF is called the "UNICODE surrogate" range and does not contain valid code points.

So a single Unicode character with a code greater than 0xFFFF needs two wchar_t in sequence to be represented when wchar_t is 16 bits wide, as on Windows (this happens, for example, with the rarer CJK - Chinese, Japanese, Korean - ideographs).
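
The surrogate arithmetic can be sketched as follows. This is an illustration only: to_utf16 and from_surrogates are hypothetical helper names, and the code works on plain 16-bit integers rather than wchar_t so that it does not depend on the platform's wchar_t width.

```cpp
#include <cstdint>
#include <vector>

// Encodes one code point as UTF-16 code units.
// Returns an empty vector for values that are not valid scalar values.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp) {
    std::vector<std::uint16_t> units;
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
        return units;  // surrogate range or beyond Unicode: nothing to encode
    }
    if (cp <= 0xFFFF) {
        units.push_back(static_cast<std::uint16_t>(cp));  // encoded as itself
    } else {
        const std::uint32_t v = cp - 0x10000;  // 20-bit value
        units.push_back(static_cast<std::uint16_t>(0xD800 | (v >> 10)));    // high surrogate
        units.push_back(static_cast<std::uint16_t>(0xDC00 | (v & 0x3FF)));  // low surrogate
    }
    return units;
}

// Rebuilds the code point from a high/low surrogate pair.
// The caller is expected to pass a valid pair (high in 0xD800-0xDBFF,
// low in 0xDC00-0xDFFF); no validation is done here.
std::uint32_t from_surrogates(std::uint16_t high, std::uint16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}
```

With these helpers, U+1F600 becomes the two units 0xD83D 0xDE00, and from_surrogates(0xD83D, 0xDE00) gives back 0x1F600.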

Albert Holguin 24-Mar-11 10:02

i certainly believe your point about unicode consortium being the authority... no argument there! :)

Richard MacCutchan 20-Mar-11 23:18

Use fgetc()/fgetwc() or wcin. It's exactly the same process as reading non-Unicode.

I must get a clever new signature for 2011.

Le@rner 21-Mar-11 0:51

I'm using this now, but it's not successful.

FILE *stream;
char buffer[2];
int  kk, ch;

// Open the file to read from:
fopen_s( &stream, OpenFile, "r" );
if( stream == NULL )
{
    return;
    //exit( 0 );
}

// Read the first character and place it in "buffer":
ch = fgetc( stream );
for( kk = 0; ( kk < 1 ) && ( feof( stream ) == 0 ); kk++ )
{
    buffer[kk] = (char)ch;
    ch = fgetc( stream );
}

// Add a terminating null to end the string
buffer[kk] = '\0';
printf( "%s\n", buffer );
fclose( stream );

Here it's unable to open the file and stream is always NULL; that's why it returns.

Richard MacCutchan 21-Mar-11 2:33

You are ignoring the return code from fopen_s so it's impossible to diagnose your problem. Use something like the following and look up the error code that you receive.

errno_t errNum = fopen_s( &stream, OpenFile, "r" );
if (errNum != 0)
{
    // add some code here to display the error value
    // or set a breakpoint and inspect its contents
}

I would suggest you look at the documentation here for further guidance.

Incidentally, the rest of your code does not seem to be set up to process Unicode data, which was the subject of your original question.
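
For readers without fopen_s (it is not available in every C library), the same "check and report the error" idea can be sketched with std::fopen and errno. This is a stand-in, not the code from the post: open_for_reading is a hypothetical helper, and it relies on the C library setting errno when fopen fails, which mainstream implementations do.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Tries to open `path` for binary reading.
// On success: sets `stream` and returns an empty string.
// On failure: leaves `stream` null and returns a readable error message.
std::string open_for_reading(const char* path, std::FILE*& stream) {
    errno = 0;
    stream = std::fopen(path, "rb");  // binary mode: no newline translation
    if (stream != nullptr) {
        return std::string();
    }
    const int saved = errno;          // copy before any other call can change it
    return std::string(path) + ": " + std::strerror(saved);
}
```

Printing the returned message (or inspecting it in a debugger) shows why the open failed, for example a wrong path or missing permissions, instead of silently returning.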

malaugh 1-Apr-11 5:03

If you have the program set to Unicode, you need to use _wfopen_s to open the file, and the filename (OpenFile) needs to be specified as wchar_t, something like

wchar_t Myfile[] = L"my_file.ext";

Then you should be able to use fgetwc to get the characters using ch = fgetwc( stream ); you should declare ch as wint_t (the type fgetwc returns) so that it can be compared against WEOF.
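
As a portable alternative to the Windows-specific calls above, here is a rough sketch that reads a UTF-16 little-endian byte stream one code point at a time, joining surrogate pairs along the way. It is an illustration under stated assumptions, not the method from the post: Utf16LeReader and decode_utf16le are hypothetical names, the input is assumed to be UTF-16LE, and malformed surrogates are replaced with U+FFFD.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads code points one at a time from a UTF-16LE byte stream.
class Utf16LeReader {
public:
    explicit Utf16LeReader(std::istream& in) : in_(in) {}

    // Stores the next code point in `cp`; returns false at end of input.
    bool next(std::uint32_t& cp) {
        std::uint16_t first;
        if (!take(first)) return false;
        if (first >= 0xD800 && first <= 0xDBFF) {          // high surrogate
            std::uint16_t second;
            if (take(second)) {
                if (second >= 0xDC00 && second <= 0xDFFF) {  // valid pair
                    cp = 0x10000u + ((first - 0xD800u) << 10) + (second - 0xDC00u);
                    return true;
                }
                pending_ = second;       // not a low surrogate: keep it for the next call
                has_pending_ = true;
            }
            cp = 0xFFFD;                 // unpaired high surrogate
            return true;
        }
        if (first >= 0xDC00 && first <= 0xDFFF) {          // stray low surrogate
            cp = 0xFFFD;
            return true;
        }
        cp = first;                      // ordinary 16-bit code point
        return true;
    }

private:
    // Fetches one 16-bit unit (low byte first); a trailing odd byte is dropped.
    bool take(std::uint16_t& unit) {
        if (has_pending_) { unit = pending_; has_pending_ = false; return true; }
        char b[2];
        if (!in_.read(b, 2)) return false;
        unit = static_cast<std::uint16_t>(
            static_cast<unsigned char>(b[0]) |
            (static_cast<unsigned char>(b[1]) << 8));
        return true;
    }

    std::istream& in_;
    std::uint16_t pending_ = 0;
    bool has_pending_ = false;
};

// Convenience wrapper: decodes a whole in-memory UTF-16LE byte string.
std::vector<std::uint32_t> decode_utf16le(const std::string& bytes) {
    std::istringstream in(bytes);
    Utf16LeReader reader(in);
    std::vector<std::uint32_t> out;
    std::uint32_t cp;
    while (reader.next(cp)) out.push_back(cp);
    return out;
}
```

For a real file, the reader can wrap a std::ifstream opened with std::ios::binary. A byte order mark, if present, comes back as the first code point (U+FEFF) and can be skipped by the caller.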

See if dll exist

marca292 20-Mar-11 22:04

Hi,

How can I check whether a DLL exists in C++ before I try to load it with dllHandle = LoadLibraryW(m_sFileName)? Sometimes it looks like the LoadLibraryW call hangs the calling thread if the DLL doesn't exist on the disk.

Regards
Olof

వేంకటనారాయణ(venkatmakam) 20-Mar-11 22:32

I think you can use the API PathFileExists() to check whether a file exists on disk.
http://msdn.microsoft.com/en-us/library/bb773584%28v=vs.85%29.aspx
But according to MSDN, the LoadLibrary function returns a failure (NULL) if it cannot find the specified DLL.
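
A minimal sketch of such a pre-check, assuming a C++17 compiler: it uses std::filesystem as a portable stand-in for PathFileExists(), and regular_file_exists is a hypothetical helper name. The file can still disappear between the check and the load, so the result of LoadLibraryW should be tested for NULL in any case.

```cpp
#include <filesystem>
#include <system_error>

// Returns true only if `path` names an existing regular file.
// Uses the non-throwing overload: any error is reported as "false".
bool regular_file_exists(const std::filesystem::path& path) {
    std::error_code ec;
    return std::filesystem::is_regular_file(path, ec);
}
```

A caller would typically test regular_file_exists(m_sFileName) first and only then call LoadLibraryW, treating a NULL handle as a failure either way.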

marca292 21-Mar-11 0:48

Thanks, I will try this.

CPallini 20-Mar-11 22:35

If you a priori know where the DLL should be then your strategy might be good. On the other hand if you need to follow the same procedure the OS does for searching the DLL then you'll meet the same hanging conditions, I suppose.
:)

Richard MacCutchan 20-Mar-11 23:15

It's probably more important to investigate why the program hangs. LoadLibrary() should merely search through the standard DLL search locations (the application directory, the system directories and the paths in the PATH variable) to find the named DLL. If it hangs during that process then there is something wrong with your system.

marca292 21-Mar-11 0:52

Hi, the problem occurs when the user does a sleep/resume test with the computer. We try to load the DLL with LoadLibrary, and a few milliseconds after the LoadLibrary call the user puts the system to sleep. Maybe that's why I get the problem?

Richard MacCutchan 21-Mar-11 2:35

This is a joke, right? You really expect your program to continue running when the OS enters sleep mode?

marca292 21-Mar-11 4:49