
VCMAME - Multiple Arcade Machine Emulator for Visual C++

3 Sep 2003
This article will explain some of the software engineering issues inside the MAME source tree, and provide an overview of how the emulation actually works.


Introduction

On February 5th 1997, Nicola Salmoria released the very first version of MAME, the Multiple Arcade Machine Emulator. It included source code and allowed users to play 5 different arcade games, including Pacman and Pengo, on their DOS PC. These were not recreations but the original arcade code running inside a virtual machine. Since then, MAME has attracted a core group of developers who contribute to the project in their own time. It now emulates over 2300 unique games and over 4000 variations, and it has been ported to a very wide variety of platforms, from modern PCs to PDAs to Unix workstations, and even to closed systems such as the Sony PlayStation 2, Sega Dreamcast and Microsoft Xbox. In a twist that brings everything full circle, some people have built arcade cabinets and use MAME to power them rather than original arcade hardware.

This article will explain some of the software engineering issues inside the MAME source tree, and provide an overview of how the emulation actually works. Some of the more interesting programming techniques are discussed. The article should hopefully be a good starting point for anyone considering developing MAME. It also features the MAME variant VCMAME, which is a set of project files and extensions that allow MAME to be developed under the Microsoft Visual C series of compilers.

What is MAME?

MAME emulates the hardware found in classic arcade machines, allowing old arcade software to run on modern computers. The main reason for this is to document the way the hardware works, for both purely educational reasons and to aid in actually repairing classic hardware. Most classic arcade software is held under copyright and it is up to the user of MAME to purchase a legal license to use it, or source the material from an original source such as the original arcade board. A handful of original ROM images have been made public domain and are available for download on the MAME website. MAME itself is free to copy and distribute.

Original Technos Double Dragon PCB

PCB running on arcade monitor.

Double Dragon running under MAME with the integrated debugger active. As well as being able to single step through the assembly program and view memory, the debugger shows the CPU flags, the IRQ state, number of cycles until next IRQ and the current renderer scanline & horizontal beam positions.

MAME Architecture

From a software engineering point of view, MAME can be split into three ‘layers’ of code. The top layer deals with the specifics of emulating certain games. The middle layer contains common functions & modules and ‘glue’ code. The bottom layer handles actually presenting the emulation to the user.

This style of layout is a cornerstone for portability between platforms. The top and middle layers contain pure C code (no Windows function calls, no DirectX calls); everything is self-contained code that could compile on any C compiler on any platform. All platform specific code is kept in the bottom layer: this is where your main or WinMain is, and this is where DirectX or Linux kernel calls would be used. In MAME terms, this is the OSD layer: operating system dependent. This layer makes porting MAME simple. If you want to port MAME to a new platform, perhaps a PDA, you just plug in a new set of OSD calls that meet the interface required by the middle layer and you are there. No changes to the middle layer or game layer are required.

A lot of you will probably already be familiar with this style, and a lot of you, especially games programmers, will be aware of how, over time, platform dependent code can ‘creep’ into the layers above it, destroying modularity and creating unwanted coupling between layers. Usually, this is caused by a programmer wanting a quick fix to a problem rather than finding the clean solution. How many times have you seen a sly #ifdef WIN32 sneaked into a supposed multi-platform project? MAME is currently handling this potential problem well, with absolutely no platform specific defines in the 1300 or so files that make up the game layer. There are currently a handful of platform/compiler specific defines in the middle core layer.
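To make the layering concrete, here is a minimal sketch of a portable core talking to a pluggable backend. Everything in it is hypothetical: the struct osd_interface type, the core_run_frames function and the "null" backend are invented for illustration and are not MAME's real OSD calls. The function-pointer table is just one possible way to express "a set of OSD calls that meet the interface required by the middle layer".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical OSD interface: the only surface the core may touch. */
struct osd_interface {
    int  (*init)(void);
    void (*present_frame)(const unsigned char *pixels, int width, int height);
    void (*shutdown)(void);
};

/* Portable core code: no platform headers and no #ifdef WIN32. */
static int core_run_frames(const struct osd_interface *osd, int frames)
{
    static unsigned char framebuffer[4 * 4];
    int i, presented = 0;

    assert(osd != NULL);
    if (osd->init() != 0)
        return -1;
    for (i = 0; i < frames; i++) {
        framebuffer[0] = (unsigned char)i;   /* stand-in for emulation work */
        osd->present_frame(framebuffer, 4, 4);
        presented++;
    }
    osd->shutdown();
    return presented;
}

/* One possible backend: a "null" port that only counts frames. */
static int null_frames_seen = 0;

static int null_init(void) { null_frames_seen = 0; return 0; }

static void null_present(const unsigned char *pixels, int width, int height)
{
    (void)pixels; (void)width; (void)height;
    null_frames_seen++;
}

static void null_shutdown(void) { }

static const struct osd_interface null_osd = {
    null_init, null_present, null_shutdown
};
```

Calling core_run_frames(&null_osd, 3) drives the null backend for three frames. Porting to another platform would mean supplying a different struct osd_interface, with core_run_frames left untouched.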

Let’s have a look at the actual layout of the source files.

VC7 solution explorer
Src/
    Core cross-platform files.

Src/cpu, Src/sound
    Core CPU & sound chip emulations. These are cross-platform and shared by different machine emulations.

Src/drivers, Src/vidhrdw, Src/machine, Src/sndhrdw
    These are the individual ‘machine’ emulations, in other words, the ‘game drivers’. The process of adding a new game to MAME generally involves creating a new set of machine emulation files. For reasons of readability, the components of the driver are split between up to 4 files, each with the same filename in a different directory. The ‘driver’ file contains C code and data related to the definition of the hardware (ROMs, memory maps, CPUs, graphics layouts). The ‘vidhrdw’ file encapsulates the functions needed to emulate the video hardware of the machine. In a similar way, the ‘sndhrdw’ file encapsulates functions related to sound hardware. The ‘machine’ file includes anything else left over, e.g., NVRAM support, or a real-time clock.

Source

The first thing to realise about the MAME source is that it is a C project – not C++. Although emulation, and especially emulation of multiple machines and processors, seems perfect for object-oriented design, C++ was not a massively widespread language in 1996 when initial MAME development began. C++ also had the stigma of being ‘slower’ than C which was an issue when even the simplest emulation would push a 486-DX or Pentium-66 to its limits. Nowadays, performance is not an issue for emulation of most machines, and MAME policy is to always ensure accurate emulation ahead of performance, but there are no plans to convert the source tree to C++. Instead, during a number of source refactors over the years, heavy macro use has been introduced, some of which is designed to mimic some object-oriented features.

Let’s look at how a MAME ‘driver’ is put together; the most important part is the emulation of the ‘machine’.

Machine Definition & Construction

A ‘machine’ is defined as follows. It has a name, at least one processor capable of running a binary program, information on the memory map of that processor and the interrupts to it, information on the video output of that machine (resolution, frames per second, etc.) and information on the sound output of that machine (mono, stereo, etc.). From src/drivers/m90.c:

static MACHINE_DRIVER_START( bombrman )
 
    /* basic machine hardware */
    MDRV_CPU_ADD(V30,32000000/4)         /* NEC V30 CPU @ 8 MHz */
    MDRV_CPU_MEMORY(readmem,writemem)    /* Pointers to memory maps */
    MDRV_CPU_PORTS(readport,writeport)   /* Pointers to I/O port maps */
    MDRV_CPU_VBLANK_INT(m90_interrupt,1) /* Vertical blank callback function */
 
    MDRV_CPU_ADD(Z80, 3579545)    /* Zilog Z80 @ 3.579545 MHz */
    MDRV_CPU_FLAGS(CPU_AUDIO_CPU) /* CPU handles audio only */    
    MDRV_CPU_MEMORY(sound_readmem,sound_writemem)
    MDRV_CPU_PORTS(sound_readport,sound_writeport)
    MDRV_CPU_VBLANK_INT(nmi_line_pulse,128)
 
    MDRV_FRAMES_PER_SECOND(60)       /* Video display runs at 60fps */
    MDRV_VBLANK_DURATION(DEFAULT_60HZ_VBLANK_DURATION)
    MDRV_MACHINE_INIT(m72_sound)     /* Pointer to init function */
 
    /* video hardware */
    MDRV_VIDEO_ATTRIBUTES(VIDEO_TYPE_RASTER) /* Raster monitor */
    MDRV_SCREEN_SIZE(64*8, 64*8)     /* Video output size in pixels */
    MDRV_VISIBLE_AREA(10*8, 50*8-1, 17*8, 47*8-1) /* Visible viewport */
    MDRV_GFXDECODE(gfxdecodeinfo)    /* Pointer to gfx decode information */
    MDRV_PALETTE_LENGTH(512)         /* Video palette length */
 
    MDRV_VIDEO_START(m90)            /* Pointer to video init function */
    MDRV_VIDEO_UPDATE(m90)           /* Pointer to main renderer function */
 
    /* sound hardware */
    MDRV_SOUND_ADD(YM2151, ym2151_interface) /* Yamaha 2151 soundchip */
    MDRV_SOUND_ADD(DAC, dac_interface)    /* DAC used for sound samples */
MACHINE_DRIVER_END

When the macros are untangled, this boils down to a static function that ‘constructs’ a machine struct that can be handled by the emulation subsystems.

#define MACHINE_DRIVER_START(game)                                \
    void construct_##game(struct InternalMachineDriver *machine)  \
    {                                                             \
        struct MachineCPU *cpu = NULL;                            \
        (void)cpu;

#define MDRV_CPU_ADD_TAG(tag, type, clock)                        \
    cpu = machine_add_cpu(machine, (tag), CPU_##type, (clock));

#define MDRV_CPU_ADD(type, clock)                                 \
    MDRV_CPU_ADD_TAG(NULL, type, clock)

#define MDRV_CPU_MEMORY(readmem, writemem)                        \
    if (cpu)                                                      \
    {                                                             \
        cpu->memory_read = (readmem);                             \
        cpu->memory_write = (writemem);                           \
    }

This method has some interesting side effects - machines can share resources such as memory maps and renderer functions. Machines can also inherit from one another by layering constructors – suppose I wanted to specialise the above machine with a variant that added stereo sound.

static MACHINE_DRIVER_START( bombrman_stereo )
    MDRV_IMPORT_FROM(bombrman)
    MDRV_SOUND_ATTRIBUTES(SOUND_SUPPORTS_STEREO)
MACHINE_DRIVER_END

The macro initially constructs the base machine, but then any further parameters in the machine definition override the original specification:

#define MDRV_IMPORT_FROM(game) \
    construct_##game(machine);
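The constructor-and-import pattern can be tried out in isolation. The sketch below is a stripped-down imitation, not MAME code: struct demo_machine, the DEMO_* macros and build_demo_stereo are all made-up names, and the three fields are an assumed stand-in for the much larger real machine struct. It shows a derived machine first running the base constructor and then overriding one setting.

```c
#include <assert.h>
#include <string.h>

/* Simplified, hypothetical stand-in for a machine description. */
struct demo_machine {
    int cpu_clock;
    int frames_per_second;
    int stereo;
};

/* Each machine definition expands to a constructor function. */
#define DEMO_MACHINE_START(name)                                \
    static void construct_##name(struct demo_machine *machine)  \
    {
#define DEMO_MACHINE_END }

#define DEMO_CPU_CLOCK(hz)     machine->cpu_clock = (hz);
#define DEMO_FPS(n)            machine->frames_per_second = (n);
#define DEMO_STEREO(flag)      machine->stereo = (flag);
#define DEMO_IMPORT_FROM(name) construct_##name(machine);

DEMO_MACHINE_START( base )
    DEMO_CPU_CLOCK(32000000 / 4)
    DEMO_FPS(60)
    DEMO_STEREO(0)
DEMO_MACHINE_END

/* "Inherits" from base, then overrides only the stereo setting. */
DEMO_MACHINE_START( base_stereo )
    DEMO_IMPORT_FROM(base)
    DEMO_STEREO(1)
DEMO_MACHINE_END

static struct demo_machine build_demo_stereo(void)
{
    struct demo_machine m;
    memset(&m, 0, sizeof m);
    construct_base_stereo(&m);
    assert(m.cpu_clock == 8000000);   /* inherited from base */
    return m;
}
```

Because later statements simply overwrite earlier assignments, the order of lines inside a definition matters: anything placed before the import would be clobbered by the base constructor.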

Memory Maps

Let’s have a look at how a memory map is defined:

static MEMORY_WRITE_START( writemem )
            { 0x00000, 0x7ffff, MWA_ROM },
            { 0xa0000, 0xa3fff, MWA_RAM },
            { 0xd0000, 0xdffff, m90_video_w, &m90_video_data },
            { 0xe0000, 0xe03ff, paletteram_xBBBBBGGGGGRRRRR_w, &paletteram },
            { 0xffff0, 0xfffff, MWA_ROM },
MEMORY_END

The idea of a memory map is familiar to most low-level programmers. Here, it is easy to see how the map is broken up into chunks. From the base up to the half-megabyte mark (0x80000 bytes or 512k), the CPU addresses the program ROM, like the BIOS ROM in a PC. From 0xa0000 to 0xa3fff, there is general purpose RAM (16k).

From 0xd0000 to 0xdffff is the video RAM, just as VGA RAM is mapped on a PC. Here, a callback function, m90_video_w, is specified; this function is executed whenever the CPU writes to this area of memory. This is how the renderer (video processor emulation) can keep track of video commands/data issued by the CPU. A pointer to the raw memory block, m90_video_data, is also created here. The video emulation can access the emulated video memory directly through it.

In a similar manner, at 0xe0000 in the memory map, an area of memory is defined as the colour palette area. Here, a function is specified that handles mapping of machine colour information (BBBBBGGGGGRRRRR or B5G5R5 colour format) into the platform independent 24-bit palette system used internally by MAME. Again, a pointer is initialised to the base of this block of memory for use elsewhere.

Finishing the memory map off is another ROM area, this time containing the processor boot vector. As the NEC V30 processor has a 20-bit address bus, you will notice the highest addressable location is 0xfffff. You may also notice there are some address ranges that are not listed. These are literally undefined: the CPU does not address them.
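As a rough illustration of how a table like this can drive emulated writes, the sketch below walks a small map and calls the handler whose range contains the address. It is not how the MAME core is implemented: the types, the demo_* names and the simple linear scan are all assumptions made for the example, and a real emulator would likely precompute a faster lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-bit write handler: offset is relative to the range start. */
typedef void (*demo_write_handler)(uint32_t offset, uint8_t data);

struct demo_write_entry {
    uint32_t start, end;         /* inclusive address range */
    demo_write_handler handler;  /* NULL marks the end of the table */
};

static uint8_t demo_ram[0x4000];
static uint8_t demo_video_ram[0x10000];
static int demo_video_writes = 0;

static void demo_ram_w(uint32_t offset, uint8_t data)
{
    demo_ram[offset] = data;
}

/* The video handler sees every write, so the renderer can react to it. */
static void demo_video_w(uint32_t offset, uint8_t data)
{
    demo_video_ram[offset] = data;
    demo_video_writes++;
}

static const struct demo_write_entry demo_writemap[] = {
    { 0xa0000, 0xa3fff, demo_ram_w   },
    { 0xd0000, 0xdffff, demo_video_w },
    { 0, 0, NULL }
};

/* Returns 1 if a handler claimed the write, 0 if the address is unmapped. */
static int demo_cpu_write(const struct demo_write_entry *map,
                          uint32_t address, uint8_t data)
{
    assert(map != NULL);
    for (; map->handler != NULL; map++) {
        if (address >= map->start && address <= map->end) {
            map->handler(address - map->start, data);
            return 1;
        }
    }
    return 0;
}
```

A write to 0xd0010 lands in demo_video_w with offset 0x10, while a write to an unlisted address such as 0xb0000 is simply reported as unmapped.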

Let’s look at some of the macros that put the memory map together:

#define MEMORY_WRITE16_START(name) \
    MEMPORT_ARRAY_START(Memory_WriteAddress16, name, \
    MEMPORT_DIRECTION_WRITE | MEMPORT_TYPE_MEM | MEMPORT_WIDTH_16)
 
#define MEMORY_END    MEMPORT_ARRAY_END
 
#define MEMPORT_ARRAY_START(t,n,f) const struct t n[] = \
                                                  { { MEMPORT_MARKER, (f) },
#define MEMPORT_ARRAY_END    { MEMPORT_MARKER, 0 } };
 
typedef void (*write16_handler)(UNUSEDARG offs_t offset,
                                UNUSEDARG data16_t data,
                                UNUSEDARG data16_t mem_mask);
typedef write16_handler    mem_write16_handler;

struct Memory_WriteAddress16
{
    offs_t start, end;            /* start, end addresses, inclusive */
    mem_write16_handler handler;  /* handler callback */
    data16_t ** base;             /* receives pointer to memory (optional) */
    size_t * size;                /* receives size of memory in bytes (optional) */
};

Wow, that’s quite a mess of defines & typedefs! It pretty much boils down to a const array of ‘Memory_WriteAddress16’ structs, with a special dummy struct at the head of the array with flags in it, and a dummy struct at the end as a signal to the map parser.

There are two important things to note, though. The first is that, because structs can be partially initialised, the map in the driver can include optional elements. If the map requires a base pointer and size variable initialised, it can specify them. If not, then there is no untidiness in the map definition and it is clean and easy to read (at the expense of complex macros). The second is that there is strict type safety in the callback functions: there is a different callback prototype for each possible type (reading or writing) and for each data bus width. For the 16-bit callback function above, the parameters passed in are the offset within the memory block, the data itself, and a mask parameter (for the case of a 16-bit bus writing only a single byte, for example). The use of these callback functions is again made simple for the driver code with macros, e.g.:

WRITE16_HANDLER( m90_video_w )
{
      /* Code */
}
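To show what the mask parameter is for, here is a small sketch of a 16-bit write handler that honours a byte-wide write. The names are invented, and the mask polarity is an assumption made for this example: bits set in mem_mask are treated as "leave alone" and clear bits as "write". Check the core headers of the MAME version you are working with before relying on either convention.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_VIDEO_WORDS 0x8000

static uint16_t demo_video_data[DEMO_VIDEO_WORDS];

/*
 * Hypothetical 16-bit write handler.
 * Assumed convention: bits SET in mem_mask are preserved,
 * bits CLEAR in mem_mask are replaced by the new data.
 */
static void demo_video_w16(uint32_t offset, uint16_t data, uint16_t mem_mask)
{
    uint16_t old;

    assert(offset < DEMO_VIDEO_WORDS);
    old = demo_video_data[offset];
    demo_video_data[offset] = (uint16_t)((old & mem_mask) | (data & ~mem_mask));
}
```

With this convention, a mask of 0x0000 writes the whole word, 0xff00 updates only the low byte, and 0x00ff updates only the high byte.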

Video & Audio

So where are we at? We have loosely shown how a MAME machine driver is defined in terms of the processors it uses and the memory maps involved. There are, in fact, currently over 1700 different machines defined in the MAME source tree. What else is needed to define a machine? Audio and video.

The sound & video emulation is where the raw data produced by running the CPU virtual machine is mapped into a form suitable for output on the target platform. From an engineering point of view, this actually happens in two stages. First, the game driver assembles a platform independent video frame and sound stream, making use of MAME core engine functions if need be. These are then given to the core, which passes them on to the platform specific backend code. Under Windows, DirectX is used to present the video & sound to the user.

In the machine definition above, we specified the machine ran at 60fps, and we specified a video update callback. This essentially means that the callback is executed 60 times a second – at each instant, it must provide a video frame based on the current state of emulated video memory. The video update routines typically have access to all of the memory and variables defined in the memory map. An update function might be defined as:

VIDEO_UPDATE( m90 )
{
    fillbitmap(bitmap,Machine->pens[0],cliprect); /* Erase bitmap with pen 0 */
    m90_drawsprites(bitmap,cliprect);             /* Draw sprites onto bitmap */
}

Yes, another macro:

#define VIDEO_UPDATE(name)  \
      void video_update_##name(struct mame_bitmap *bitmap, \
      const struct rectangle *cliprect)

Like read & write memory handlers, all video update functions have implicit arguments enforced by the macro - the bitmap to draw onto, and a clip region that must be respected.

Each machine in MAME will generally have its own video update function as very few machines draw their graphics in the same way.
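The relationship between the core and an update callback can be sketched as follows. This is a toy model with invented types and names (struct demo_bitmap, struct demo_rect, demo_run_frames), not MAME's bitmap or rectangle structures. It shows a core loop invoking a driver-supplied callback once per frame, and the callback only touching pixels inside the clip rectangle.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_W 16
#define DEMO_H 8

struct demo_bitmap { uint16_t pix[DEMO_H][DEMO_W]; };
struct demo_rect   { int min_x, max_x, min_y, max_y; };  /* inclusive */

typedef void (*demo_update_fn)(struct demo_bitmap *bitmap,
                               const struct demo_rect *cliprect);

/* Fill only the pixels inside the clip rectangle with one pen value. */
static void demo_fill(struct demo_bitmap *bitmap, uint16_t pen,
                      const struct demo_rect *cliprect)
{
    int x, y;

    assert(cliprect->min_x >= 0 && cliprect->max_x < DEMO_W);
    assert(cliprect->min_y >= 0 && cliprect->max_y < DEMO_H);
    for (y = cliprect->min_y; y <= cliprect->max_y; y++)
        for (x = cliprect->min_x; x <= cliprect->max_x; x++)
            bitmap->pix[y][x] = pen;
}

/* A driver-side update callback: erase the visible area with pen 1. */
static void demo_video_update(struct demo_bitmap *bitmap,
                              const struct demo_rect *cliprect)
{
    demo_fill(bitmap, 1, cliprect);
}

/* Core-side loop: call the machine's update callback once per frame. */
static int demo_run_frames(demo_update_fn update, struct demo_bitmap *bitmap,
                           const struct demo_rect *cliprect, int frames)
{
    int i;

    for (i = 0; i < frames; i++)
        update(bitmap, cliprect);
    return i;
}
```

Running sixty frames with a clip rectangle of (2..5, 1..3) leaves every pixel outside that rectangle untouched, which is the contract the implicit cliprect argument is there to express.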

Machine Destruction

Above, we discussed how macros are used to construct the emulated machines to run. What wasn’t discussed was the destruction of machines. In fact, this is handled quite easily. When switching machines, absolutely everything is destroyed: all allocations used in building a machine and made by its associated driver code go through an auto-destructing allocator. All allocations are recorded internally, and the allocator is told to clean itself on machine switches. As long as driver code allocates through it, memory leaks in driver code are impossible with this method, as the MAME core owns all allocations.
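A minimal sketch of the idea, assuming a simple linked list of blocks, is shown here. The names tracked_malloc and tracked_free_all are hypothetical and this is not the MAME allocator itself. Every allocation is prefixed with a small header so that the owner can later free everything in one sweep.

```c
#include <assert.h>
#include <stdlib.h>

/* Header placed in front of every payload; the union forces alignment. */
union block_header {
    union block_header *next;
    long double align_ld;
    long long   align_ll;
};

static union block_header *tracked_head = NULL;
static int tracked_count = 0;

/* Allocate memory that is owned by the tracker rather than by the caller. */
static void *tracked_malloc(size_t size)
{
    union block_header *b = malloc(sizeof *b + size);

    if (b == NULL)
        return NULL;
    b->next = tracked_head;
    tracked_head = b;
    tracked_count++;
    return b + 1;               /* payload starts just after the header */
}

/* Free every tracked block, e.g. when switching machines. */
static int tracked_free_all(void)
{
    int freed = 0;

    while (tracked_head != NULL) {
        union block_header *next = tracked_head->next;
        free(tracked_head);
        tracked_head = next;
        freed++;
    }
    assert(freed == tracked_count);
    tracked_count = 0;
    return freed;
}
```

Driver code in this model never calls free at all: it only ever requests memory, and the single tracked_free_all call on a machine switch releases whatever was requested.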

VCMAME

The standard MAME source code package from http://www.mame.net/ is currently intended to be compiled with GCC3.0 and as you would expect is built on a series of makefiles that control compilation and build options. VCMAME is a set of Visual C project files that allow MAME to be built using the MS Visual C IDE/compiler. Currently VC6, VC.NET 2002 and VC.NET 2003 are supported.

Build options are split between the VC project settings and a special configuration file /src/vc/vcmame.h. vcmame.h sets up the project wide pre-processor definitions for which CPU & sound cores are to be included in the build, and also controls the compiler warning levels. To avoid extensive changes to the MAME source, this file is automatically included by all other source files, using the VC force include flag (/FI).

There are three different project configurations: Debug, Release and Dev Release. The first two are largely obvious. Debug includes not only debug information and symbols, but also the integrated MAME debugger for debugging emulated targets. Release is built for speed and takes advantage of various compiler optimisations. Dev Release is a hybrid of the two: it is built for speed, so it is optimised and doesn’t include program debug information, but it does include the integrated MAME debugger. This is useful in situations where you are not actually debugging MAME itself but instead trying to debug an emulated program. Quite often, in this situation, you want the emulated target to run at full speed, and this is often not possible in a standard Debug build.

For performance reasons, there is some x86 assembly code within the MAME source tree. Equivalent C source is always provided so MAME can be compiled on non-x86 platforms, but this doesn’t apply to VCMAME, as it is x86 only. There is a big difference in the handling of inline assembly between VC and GCC: GCC uses AT&T assembler syntax, while VC uses Intel style assembler. For small chunks of code, the _MSC_VER definition is used to provide both versions, e.g., from src/windows/osinline.h:

#ifdef _MSC_VER
 
#define vec_mult _vec_mult
INLINE int _vec_mult(int x, int y)
{
    int result;
 
    __asm {
        mov eax, x
        imul y
        mov result, edx
    }
 
    return result;
}
 
#else
 
#define vec_mult _vec_mult
INLINE int _vec_mult(int x, int y)
{
  int result;
  __asm__ (
            "movl  %1    , %0    ; "
            "imull %2            ; "    /* do the multiply */
            "movl  %%edx , %%eax ; "
            :  "=&a" (result)           /* the result has to go in eax */
            :  "mr" (x),                /* x and y can be regs or mem */
               "mr" (y)
            :  "%edx", "%cc"            /* clobbers edx and flags */
 );
  return result;
}
#endif /* _MSC_VER */
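Both assembly versions above multiply two 32-bit values and return the upper 32 bits of the 64-bit product (the contents of edx after imul). For comparison, a portable C routine with the same effect might look like the sketch below. This is an illustration rather than the fallback that ships with MAME. The name vec_mult_c is made up, and the code assumes a 32-bit int and a compiler that performs an arithmetic right shift on negative 64-bit values, which is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable equivalent: high 32 bits of a signed 32x32 multiply. */
static int vec_mult_c(int x, int y)
{
    int64_t product = (int64_t)x * (int64_t)y;

    assert(sizeof(int) == 4);   /* the sketch assumes a 32-bit int */
    return (int)(product >> 32);
}
```

For example, 0x10000 multiplied by 0x10000 is 2^32, so the high half of the product is 1.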

For more than a few lines of assembly, maintaining both styles can be very time consuming, error-prone, and a drag on development. For large assembly chunks, the code is moved out to a separate assembly file (i.e., not inlined), and the free Netwide Assembler (NASM) is used to assemble the source on both GCC & VC platforms. In VCMAME, a custom build step is used for each assembly file.

Custom build step

Performance

In terms of performance, there is currently little to choose between the VC and GCC compilers, at least on Athlon hardware. I performed some tests using the VC7.1 compiler (standard VCMAME release mode optimisation setup) and GCC3.2.2 compiler (standard release build as set up in usual MAME makefiles and also MAME ‘Pentium Pro’ build).

The following command line switches were used to run the emulation unthrottled for a set number of frames (1500) and to time how long it took to execute those frames:

Mame <game> -dd -noafs -nothrottle -ftr 1500 -r 800x600x32 -window 
     -refresh 60 -nowaitvsync -norc -nosleep -effect none -skip_disclaimer 
     -skip_gameinfo -noart -nobezel -nooverlay

Each of the three builds was run three times and the results averaged. The hardware used was a 1GHz Athlon with 512 MB of PC133 RAM and GeForce 3 video.

Game             VCMAME    MAME      MAMEPP
Double Dragon    93 fps    91 fps    92 fps
Twinbee Yahhoo   84 fps    82 fps    84 fps
Landmaker        60 fps    57 fps    61 fps
1943             120 fps   118 fps   119 fps

Conclusions

Hopefully, this article has explained the inner workings of MAME a little. It’s not meant as an exhaustive guide, just a primer. If you would like to try and build MAME using Visual C, first download the source package from http://www.mame.net/ (at the time of writing, the current version is 0.72). Then, download the VCMAME package from http://www.vcmame.net/ and apply it over the base source package. If you follow the instructions provided with VCMAME, you should be building in no time at all.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
