I don't see anything wrong with your example, except that it assumes a bitmap with 4-byte (32-bit) pixels. If the pixels are smaller, the loop writes past the end of the pixel buffer, which would explain the access violation. Checking bmBitsPixel will tell you for certain whether this is the cause.
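A minimal sketch of such a check follows. It is written against a plain stand-in struct so that it compiles without <windows.h>; BitmapInfo, canUseDwordPointer and bytesPerPixel are hypothetical names that only mirror the relevant BITMAP members and are not part of the Win32 API.

```cpp
#include <cassert>

// Hypothetical stand-in for the BITMAP members used in this answer.
struct BitmapInfo {
    int width;       // mirrors bmWidth
    int height;      // mirrors bmHeight
    int widthBytes;  // mirrors bmWidthBytes
    int bitsPixel;   // mirrors bmBitsPixel
};

// True only if every pixel occupies exactly 4 bytes, i.e. the pixel data
// can safely be walked with a 32-bit (DWORD-sized) pointer.
bool canUseDwordPointer(const BitmapInfo& bmp) {
    return bmp.bitsPixel == 32;
}

// Bytes per pixel for whole-byte formats (8, 16, 24, 32 bits per pixel).
// Returns 0 for packed formats such as 1 or 4 bits per pixel.
int bytesPerPixel(const BitmapInfo& bmp) {
    assert(bmp.bitsPixel > 0);
    return (bmp.bitsPixel % 8 == 0) ? bmp.bitsPixel / 8 : 0;
}
```

With the real structure, the same test would simply compare Bmp.bmBitsPixel against 32 before entering the DWORD-based loop.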
Edited: OK, now that we know you have 3 bytes per pixel, the loop needs to be rearranged a little. First, the pointer to the current pixel has to be a BYTE pointer instead of a DWORD pointer. Then we step across each scan line in 3-byte increments:
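    for (int row = 0; row < Bmp.bmHeight; ++row)
    {
        // Start of this scan line; bmWidthBytes includes any padding bytes.
        BYTE* p = ((BYTE*) Bmp.bmBits) + row * Bmp.bmWidthBytes;
        for (int col = 0; col < Bmp.bmWidth; ++col)
        {
            // 24-bit pixels are stored in blue, green, red order.
            *p++ = 255;  // blue
            *p++ = 127;  // green
            *p++ = 0;    // red
        }
    }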
As you can see, this also avoids calculating the address of every single pixel. Note that the pixel pointer must be recalculated for every row, because each row may end with a few slack bytes that pad it to the next alignment boundary (the BITMAP documentation requires bmWidthBytes to be even, and DIB scan lines are padded to a multiple of 4 bytes). That is why the BITMAP structure has the member bmWidthBytes: it tells us exactly how long a row is.
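To make the slack bytes concrete, here is a small portable sketch, assuming rows are padded to a multiple of 4 bytes as in a DIB. strideBytes and fill24 are hypothetical helper names, and the stride is passed in explicitly where the real code would use Bmp.bmWidthBytes.

```cpp
#include <cassert>
#include <cstdint>

// Length of one scan line in bytes, assuming each row is padded up to a
// multiple of 4 bytes (the DIB convention).
int strideBytes(int width, int bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Fill a 24-bit pixel buffer row by row. The pointer is recomputed at the
// start of every row, so the padding bytes at the end of a row are skipped
// and never written.
void fill24(std::uint8_t* bits, int width, int height, int stride,
            std::uint8_t blue, std::uint8_t green, std::uint8_t red) {
    assert(stride >= width * 3);  // a row must hold all of its pixels
    for (int row = 0; row < height; ++row) {
        std::uint8_t* p = bits + row * stride;
        for (int col = 0; col < width; ++col) {
            *p++ = blue;
            *p++ = green;
            *p++ = red;
        }
    }
}
```

For a 24-bit image that is 3 pixels wide, strideBytes(3, 24) gives 12: 9 bytes of pixel data followed by 3 slack bytes. Stepping a single pointer straight through the buffer without resetting it per row would write pixel values into those slack bytes and drift out of alignment with the following rows.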