Managed C++/CLI

Mark Salsbery 30-Sep-07 7:12

:-D

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

George L. Jackson 30-Sep-07 2:02

Sorry, you are on the wrong web site. You might want to visit: http://www.asp.net/downloads/starter-kits/.

"We make a living by what we get, we make a life by what we give." --Winston Churchill

Questions about reading binary files [modified]

GentooGuy 28-Sep-07 13:47

Hi
I've got a jpg file which I'd like to read.
When using od in UNIX, I can get a hex version of its contents:

$ od -x file.jpg  | head
0000000      d8ff    e1ff    8711    7845    6669    0000    4949    002a
0000020      0008    0000    000b    010e    0002    0015    0000    0092
0000040      0000    010f    0002    0018    0000    00b2    0000    0110
0000060      0002    0005    0000    00d2    0000    0112    0003    0001
0000100      0000    0001    0000    011a    0005    0001    0000    00e2
0000120      0000    011b    0005    0001    0000    00ea    0000    0128

When I use this simple program, most of the bytes I read have strange values:

#include "stdafx.h"
#include <iostream>
#include <fstream>

using namespace std;
ifstream::pos_type size;
char * memblock;


int main ()
{
  ifstream file ("y:\\EXIF\\sanyo-vpcg250.jpg", ios::in|ios::binary);
  if (file.is_open())
  {
    size = file.tellg();
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();

    cout << "the complete file content is in memory";

    char x=memblock[0];
    delete[] memblock;
  }
  else cout << "Unable to open file";
  return 0;
}

The x value becomes 0xfd in the debugger (in VS.NET on Windows XP).
What's going wrong? And how can I get an output like the one using od?

-- modified at 19:58 Friday 28th September, 2007

Whoops, this should be moved to: http://www.codeproject.com/script/comments/forums.asp?forumid=1647

Re: Questions about reading binary files

Mark Salsbery 28-Sep-07 15:16

The first byte you should see is 0xFF.

Try changing the memblock type to BYTE (unsigned char) since you're
working in binary.

After the read() call, look at memblock in the debugger. Should be FF D8 FF E1...

Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

Mark Salsbery 28-Sep-07 15:22

Also

tellg() is probably returning 0.

To get the file length, you need to seek to the end before calling tellg...

...
file.seekg(0, ios_base::end);
size = file.tellg();
...

Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 29-Sep-07 7:25

That is indeed the solution.
thanks a lot :)

unsigned chars can't be used with read(), I guess (after having looked at the function's signature).
So that wasn't quite an option, I'm afraid.

Re: Questions about reading binary files

Mark Salsbery 29-Sep-07 7:39

GentooGuy wrote:
unsigned chars can't be used with read(), I guess (after having looked at the function's signature)

Yes, but in binary mode, you really aren't dealing with char so a cast can be appropriate.

It depends on the data in the file....if it was really all char data then you probably wouldn't be
using binary mode.

Whatever works for you - in the end you're reading bytes and you'll need to cast them to
something else eventually :)

Cheers,
Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 29-Sep-07 8:05

Okay thanks for the advice.
Currently, I'm having another small (I hope) problem:

#include "stdafx.h"
#include <iostream>
#include <fstream>

using namespace std;
ifstream::pos_type size;
char * memblock;

void swapByteOrder();
int readFile(char *filename);
void printFile(int nr); // just for diag purposes only

int main ()
{
  if(!readFile("y:\\EXIF\\sanyo-vpcg250.jpg"))
  {
    cout <<"Some error occurred while opening the file"<< endl;
    return 1;
  }
  printFile(10);
  cout << endl;
  swapByteOrder();
  printFile(10);
  return 0;
}


int readFile(char *filename)
{
  ifstream file (filename, ios::in|ios::binary);
  if (file.is_open())
  {
    file.seekg(0, ios_base::end);
    size = file.tellg();
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();

    cout << "the complete file content is in memory\n";

    cout << "Size : "<< size  << endl;
    for(int i=0;i<100;i++)
    {
      char x = memblock[i];
      cout << hex << (int)memblock[i]<<endl;
    }

    delete[] memblock;
  }
  else cout << "Unable to open file";
  return -1;
}


void swapByteOrder()
{
  long max = size;
  char temp;
  for(int i=0 ;i<max-2; i+=2)
  {
    temp=memblock[i];
    memblock[i]=memblock[i+1];
    memblock[i+1]=temp;
  }
}


void printFile(int nr)
{
  for(int i=0;i<nr ;i++)
  {
    cout << hex << memblock[i] << endl;
  }

}

The swapByteOrder function gets an access violation when reaching i==3992. This is strange, because it should be able to run to 62096 (the length of the file, as indicated by size).

What's going wrong here?

Re: Questions about reading binary files

Mark Salsbery 29-Sep-07 8:22

I'm surprised it gets that far, since you delete memblock in readFile() :)
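
One way to sidestep the use-after-delete problem, sketched here under the assumption that a std::vector is acceptable in place of the global char pointer, is to let the caller own the buffer. `loadFile` is an illustrative name, not something from the posted program.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Loads a whole file into `out`. The vector owns the bytes, so they stay
// valid for as long as the caller keeps the vector; there is no delete[].
// Returns false if the file could not be opened.
bool loadFile(const std::string& path, std::vector<unsigned char>& out)
{
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file.is_open())
        return false;
    // Copy every byte from the stream buffer into the vector.
    out.assign(std::istreambuf_iterator<char>(file),
               std::istreambuf_iterator<char>());
    return true;
}
```

The bytes are released automatically when the vector goes out of scope, so any later function that inspects them only needs a reference to the vector.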

I'm curious....why are you reading bytes from a jpeg file as ints
cout << hex << (int)memblock[i]<<endl;
and why would you be swapping byte order? Are you trying to make the jpeg unreadable?

Actually, this whole loop doesn't make sense

for(int i=0;i<100;i++)
{
    char x = memblock[i];
    cout << hex << (int)memblock[i]<<endl;
}

You're indexing the array by bytes but casting to int (4 bytes)???

Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 29-Sep-07 8:52

Well I got confused too. I'm a Java developer (BSc in CS) but I'm getting quite stuck on this one.
Manual GC isn't exactly my cup of tea ;)

Well, I want to retrieve some EXIF information from the file, and when not swapping the bytes (hey, I do NOT write the array to disk) I get the same output as when using od (UNIX tool for displaying files).

So I thought, I had a byte-order related problem.
od output:

0000000      d8ff    e1ff    8711    7845    6669    0000    4949    002a

When running my own app, I found a ff first, then the d8, an ff, the e1, the 11, the 87. etc...

That's my reason to swap these bytes.

Re: Questions about reading binary files

Luc Pattyn 29-Sep-07 9:15

Hi,

A typical JPEG hex dump starts like this:
000000 FF D8 FF E0 00 10 4A 46 49 46 00 01 02 01 00 87
000010 00 87 00 00 FF ED 08 9E 50 68 6F 74 6F 73 68 6F

i.e. the very first byte is FF.

If you interpret that as a number of 16-bit words (as your od command seems to do)
then you would get D8FF E0FF etc. but that does not mean this is how you should look at it.

In fact JPEG coding is byte oriented, each FF XX pair of bytes marks the start of something
and may be preceded by an arbitrary number of FF bytes.

I suggest you:
- start by reading the JPEG standard, you can find it on the web;
- look at JPEG files with an unbiased tool, one that shows bytes, not larger integers.
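
To make the byte-oriented layout concrete, here is a small sketch that walks the marker segments at the start of a JPEG buffer and locates the APP1 (FF E1) segment, which is where EXIF data is stored. It assumes the usual structure of FF D8, followed by segments made of FF, a marker byte, and a two-byte big-endian length that counts itself. The function name `findApp1Payload` is invented for this example, and this is not a full JPEG parser.

```cpp
#include <cstddef>
#include <vector>

// Returns the offset of the first APP1 payload (the byte right after its
// 2-byte length field), or data.size() if none is found before image data.
std::size_t findApp1Payload(const std::vector<unsigned char>& data)
{
    const std::size_t n = data.size();
    if (n < 2 || data[0] != 0xFF || data[1] != 0xD8)  // must start with SOI
        return n;

    std::size_t pos = 2;
    while (pos + 1 < n) {
        if (data[pos] != 0xFF)
            return n;                                  // not at a marker
        while (pos + 1 < n && data[pos + 1] == 0xFF)
            ++pos;                                     // skip extra FF fill bytes
        if (pos + 1 >= n)
            return n;
        const unsigned char marker = data[pos + 1];
        pos += 2;
        if (marker == 0xDA || marker == 0xD9)          // start of scan / end of image
            return n;
        if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7))
            continue;                                  // markers without a length
        if (pos + 2 > n)
            return n;
        // Segment lengths are big-endian and include the two length bytes.
        const std::size_t len =
            (static_cast<std::size_t>(data[pos]) << 8) | data[pos + 1];
        if (len < 2 || pos + len > n)
            return n;
        if (marker == 0xE1)
            return pos + 2;
        pos += len;                                    // jump to the next marker
    }
    return n;
}
```

For the file in this thread, whose first bytes are FF D8 FF E1 11 87, the APP1 segment comes directly after the start-of-image marker, so the payload would begin at offset 6.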

BTW: if you read a JPEG file with Image.FromFile() the Image class will offer access
to a lot of metadata as well (e.g. GetPropertyItem() method)

:)

Luc Pattyn [Forum Guidelines] [My Articles]

this week's tips:
- make Visual display line numbers: Tools/Options/TextEditor/...
- show exceptions with ToString() to see all information
- before you ask a question here, search CodeProject, then Google

Re: Questions about reading binary files

GentooGuy 29-Sep-07 9:20

thanks for the info.
But the Image class is .NET based, and I just want plain C++ without MS-specific stuff.

Re: Questions about reading binary files

Mark Salsbery 29-Sep-07 9:16

It's your binary viewer utility that's swapping the bytes.

If you go through and swap bytes, you won't have a JPEG anymore.

If you want to see the actual bytes in order, change your byte viewer loop to

for(int i=0;i<100;i++)
{
    cout << hex << (int)(unsigned char)memblock[i] << endl;
}
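
The reason the extra cast matters, shown as a small sketch: where plain char is signed (as it typically is with x86 compilers), converting a byte such as 0xFF straight to int gives -1, which prints as ffffffff. Going through unsigned char first keeps the value in the range 0 to 255. The helper name `hexByte` is hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats one byte as two lowercase hex digits. Converting to unsigned char
// first prevents sign extension of values >= 0x80 where char is signed.
std::string hexByte(char c)
{
    std::ostringstream out;
    out << std::hex << std::setw(2) << std::setfill('0')
        << static_cast<int>(static_cast<unsigned char>(c));
    return out.str();
}
```

Padding to two digits also makes the output line up with what a byte-wise hex viewer shows, so 03 is not printed as a lone 3.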

And for your non-GC related issue - you don't want to use your array after you delete it :)

Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 29-Sep-07 9:20

okay that's strange.

thank you for the help, I really appreciate it :)

Re: Questions about reading binary files

GentooGuy 29-Sep-07 10:27

it works fine now :)

thanks a lot!

Re: Questions about reading binary files

GentooGuy 29-Sep-07 10:52

I'm sorry, but I still think they're swapped.
My reason for this is the fact that I'm looking up 0x9003, which is a tag in a JPEG file indicating the date of the picture.
When using 'od', I see this:

$ od -x file.jpg  | grep 9003
0000520      0004    0000    3230    3030    9003    0002    0014    0000

This is the only occurrence of '9003' in a file which does contain the information (so this must be the instance I'm looking for).
But, when running my program, and printing some lines, I get this output:

4
0
0
0
30
32
30
30
3
90
2
0
14
0

This is produced by the loop you've proposed. When comparing both outputs, I see this one has the bytes swapped when compared to od and (!) the exif standard. So I guess, od isn't wrong. Or am I indeed wrong?

Re: Questions about reading binary files [modified*2]

Mark Salsbery 29-Sep-07 11:08

I think we're confusing two different issues here...

First, your "od" is lumping 2 byte pairs and is assuming little-endian
byte order so, as you can see from the sample listings you've posted,
each pair of bytes appears swapped in the od-generated listing.
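
A tiny sketch of what the 16-bit view does to the display, assuming od runs on a little-endian machine: two consecutive file bytes are combined with the first byte as the low-order half, so the hex digits appear in the opposite order. `littleEndianWord` is an illustrative name.

```cpp
#include <cstdint>

// Combines two consecutive file bytes the way a little-endian machine reads
// a 16-bit word: the first byte in the file becomes the low-order byte.
std::uint16_t littleEndianWord(unsigned char first, unsigned char second)
{
    return static_cast<std::uint16_t>(first | (second << 8));
}
```

The file bytes FF D8 therefore show up as d8ff, and the bytes 03 90 show up as 9003. Running od with byte-sized output, for example `od -t x1 file.jpg`, prints the bytes one at a time in file order and avoids the grouping.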

Second, you have to parse your file properly, depending on the byte order.
exif has some kind of tag to indicate whether multi-byte integers are stored
in "motorola" or "intel" order. This doesn't mean you can just go through the
entire file and swap every pair of bytes. This means when you encounter
multibyte-integer data in the file, you may need to swap bytes to work with
the data on your platform.
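
A hedged sketch of that idea: instead of swapping the buffer, decode each multi-byte field at the moment it is read, using the byte order declared by the file. The helpers `readU16` and `readU32` are hypothetical names; building the value from individual bytes means the code behaves the same regardless of the host machine's own byte order.

```cpp
#include <cstdint>

// Decodes a 16-bit unsigned integer from two file bytes.
// `littleEndian` is true for "intel" order, false for "motorola" order.
std::uint16_t readU16(const unsigned char* p, bool littleEndian)
{
    return littleEndian
        ? static_cast<std::uint16_t>(p[0] | (p[1] << 8))
        : static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Decodes a 32-bit unsigned integer from four file bytes in the same way.
std::uint32_t readU32(const unsigned char* p, bool littleEndian)
{
    const std::uint32_t b0 = p[0], b1 = p[1], b2 = p[2], b3 = p[3];
    return littleEndian
        ? (b0 | (b1 << 8) | (b2 << 16) | (b3 << 24))
        : ((b0 << 24) | (b1 << 16) | (b2 << 8) | b3);
}
```

Applied to the bytes printed earlier in the thread (3, 90), readU16 with littleEndian set to true yields 0x9003, while the raw buffer stays untouched.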

You need to parse the file bytes following its type and format. I don't have the jpeg/jfif
format memorized but it's well documented all over the internet :)

Again, the only swapping going on here is by your "od" utility. In your code you simply
have the bytes in the same order they occur in the file.

Mark

*edit* LOL I really meant "sample", not "ample"....ample sounded snotty LMAO

Last modified: 17mins after originally posted --

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files [modified*2]

GentooGuy 30-Sep-07 2:40

okay that sounds possible.

But when having a look at http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/EXIF.html
I can see the tags about the creation date are 0x9003 and 0x9004 (still have to decide which one to take).

When having a look at the jpg itself (which has a date and time of 01-01-1998) I see the strings about the date in a proper sequence.... but there's no 9003. Just a 0390 before it.

I've looked it up in the exif documentation, and I haven't found anything about inverting the bytes on such markers. What am I missing?

Re: Questions about reading binary files

Mark Salsbery 30-Sep-07 6:31

GentooGuy wrote:
but there's no 9003. Just a 0390 before it.

Where are you seeing that? In your od results? If so, then that IS 9003
because od is swapping every pair of bytes.

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 30-Sep-07 6:37

Nope, in the VS binary editor.

Re: Questions about reading binary files

Mark Salsbery 30-Sep-07 6:47

I don't know what to tell you....

There are only three possibilities here:

1) The file was written incorrectly (not following specs)
2) There's a tag somewhere that indicates the byte order and you need to use it
3) You're interpreting the binary bytes incorrectly.

Which is it? :)

Mark

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 30-Sep-07 6:55

option 2.

I read something about it yesterday; I'm currently trying to find the document that described it.

Re: Questions about reading binary files

Mark Salsbery 30-Sep-07 7:10

JPEG is big-endian.

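EXIF follows TIFF specs which can be big or little endian.
This is determined by the first two bytes of the TIFF header (in a JPEG, that header
sits inside the APP1 segment, right after the "Exif" identifier and its two zero bytes),
not by the first two bytes of the JPEG file itself:
"II" (0x49 0x49) for little endian, "MM" (0x4D 0x4D) for big endian.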

For a file with "MM" byte order, tags (and all other multi-byte fields
of tags) will need to be swapped on Intel machines.

For the tag 0x9003, I would expect the following storage in the file:

Big endian: 0x90 0x03
Little endian: 0x03 0x90
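
As a sketch of how the byte-order check could look in code: the function below inspects the 8-byte TIFF header that, in an EXIF JPEG, follows the "Exif" identifier and its two zero bytes inside the APP1 segment. The struct `TiffHeader` and the function `parseTiffHeader` are names made up for this example; the layout assumed is the byte-order mark, the 16-bit value 42, and a 32-bit offset to the first IFD.

```cpp
#include <cstddef>
#include <cstdint>

struct TiffHeader {
    bool valid;                    // false if the bytes are not a TIFF header
    bool littleEndian;             // true for "II", false for "MM"
    std::uint32_t firstIfdOffset;  // relative to the start of the TIFF header
};

// Parses the byte-order mark, the magic number 42 and the first IFD offset.
TiffHeader parseTiffHeader(const unsigned char* p, std::size_t size)
{
    TiffHeader h = { false, false, 0 };
    if (size < 8)
        return h;
    if (p[0] == 'I' && p[1] == 'I')
        h.littleEndian = true;
    else if (p[0] == 'M' && p[1] == 'M')
        h.littleEndian = false;
    else
        return h;

    // Every field after the mark is decoded in the order the mark declares.
    const unsigned magic = h.littleEndian ? (p[2] | (p[3] << 8))
                                          : ((p[2] << 8) | p[3]);
    if (magic != 42)
        return h;

    const std::uint32_t b0 = p[4], b1 = p[5], b2 = p[6], b3 = p[7];
    h.firstIfdOffset = h.littleEndian
        ? (b0 | (b1 << 8) | (b2 << 16) | (b3 << 24))
        : ((b0 << 24) | (b1 << 16) | (b2 << 8) | b3);
    h.valid = true;
    return h;
}
```

The od listing at the top of the thread contains 4949 002a 0008 0000, that is, the bytes 49 49 2A 00 08 00 00 00, so that file appears to be little endian with its first IFD at offset 8. That matches the tag 0x9003 being stored as 03 90.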

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]

Re: Questions about reading binary files

GentooGuy 30-Sep-07 14:51

thanks, I've got it working :)

Re: Questions about reading binary files

Mark Salsbery 29-Sep-07 11:11

BTW are you using Visual Studio? If so, open the jpeg file in the binary editor window.
It won't swap any bytes in the display.

File menu -> Open/File...
Select the file
Click the little drop arrow on the "Open" button and choose "Open with..."
Choose binary editor

Mark Salsbery
Microsoft MVP - Visual C++
Java | [Coffee]
