Linux Ethernet Driver using Qemu

1 Apr 2016, GPL3, 11 min read
This article goes into the details of how to write a Linux Device Driver for a pseudo Ethernet device simulated on a Qemu platform.

Introduction

Qemu is a very powerful platform for hardware/software co-design and development. In this article we build an ARM-based system with a pseudo Ethernet device connected to an ARM Cortex-A9 processor over an AXI bus, and run the Linux operating system on this virtualized machine. Both the simulated Ethernet device and the corresponding Linux Ethernet driver are developed from scratch to show how easily a full system can be simulated with Qemu.

Getting the base system up

This article uses Qemu, an open source machine emulator. Since our intention is to showcase the simulation of an Ethernet device and its Linux device driver, we use the widely used "ARM Versatile Express" platform as the base, with Linux kernel 3.2 built for this platform.

Download linaro toolchain for ARM

Since we are using ARM Versatile Express as the base platform, we need to cross compile the Linux sources and the Busybox utility with an ARM gcc compiler. We use the Linaro ARM toolchain for cross compilation. On an Ubuntu machine, the following command installs the toolchain.

sudo apt-get install gcc-arm-linux-gnueabi

Download and build qemu sources

Download the qemu sources from http://wiki.qemu.org/Download. Since we are going to enhance the ARM Vexpress platform by adding our simulated Ethernet Device, we need to use the source package. For this project I have used qemu-2.5.0. Any other version might need porting the Qemu pseudo-ethernet emulation code.

Create a work directory, download the qemu-2.5.0 sources into it and execute the following commands. Here we build Qemu for the ARM architecture. The platform name is not needed at build time; it is given when invoking the Qemu emulator.

mkdir ~/work
cd ~/work
wget http://wiki.qemu-project.org/download/qemu-2.5.0.tar.bz2 
tar -xjvf qemu-2.5.0.tar.bz2
cd qemu-2.5.0/
./configure --target-list=arm-softmmu
make
ls arm-softmmu/qemu-system-arm

Download and build Linux Kernel sources

Download linux kernel sources from http://www.kernel.org and build it for ARM Vexpress board. Make sure that the compressed Linux image zImage is present in arch/arm/boot/ directory.

cd ~/work
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.2.tar.bz2
tar -xjvf linux-3.2.tar.bz2 
export ARCH=arm
export CROSS_COMPILE=arm-linux-gnueabi-
cd linux-3.2/
make vexpress_defconfig
make all
ls arch/arm/boot/zImage

Download and build Busybox sources

We will need a root file system. We can either use a pre-built root file system for this platform or build one ourselves. We will build a skeletal file system using Busybox. Here are the instructions for creating the skeletal root file system. Use the menuconfig step to select "Busybox Settings" ---> "Build Options" ---> "Build BusyBox as a static binary (no shared libs)". After the file system is built, we copy it to the work directory from where we will invoke the Qemu emulator.

Note: With this busybox version you will get a link error. Please unselect the option "Networking Utilities" --->"inetd" ---> "Support RPC Services" from menuconfig.

cd ~/work
wget http://www.busybox.net/downloads/busybox-1.24.1.tar.bz2
tar -xjvf busybox-1.24.1.tar.bz2
cd busybox-1.24.1/
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- defconfig
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- menuconfig
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- 
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- install
cd _install/
mkdir -p proc sys dev etc/init.d
cp ../rcS etc/init.d
chmod +x etc/init.d/rcS 
find . | cpio -o --format=newc > ../rootfs.img
cp ../rootfs.img ~/work

The root file system uses an rcS file whose contents are given below. It mounts proc and sysfs and then runs mdev to populate /dev.

#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
/sbin/mdev -s

Running the base system

Now our system is ready. We will run it with the following command.

./qemu-2.5.0/arm-softmmu/qemu-system-arm  -M vexpress-a9 -m 256M -kernel ./linux-3.2/arch/arm/boot/zImage -initrd rootfs.img -append "root=/dev/ram rdinit=/sbin/init ip=dhcp console=ttyAMA0" -net user,hostfwd=tcp::2222-:22 -net nic,model=lan9118 -nographic

Here we run Qemu in nographic mode. The system will boot and you will get a prompt. Qemu routes the guest serial port to the terminal, so press Ctrl+A followed by x to close the session, much as with minicom.

Sending DHCP requests ., OK
IP-Config: Got DHCP answer from 10.0.2.2, my address is 10.0.2.15
IP-Config: Complete:
     device=eth0, addr=10.0.2.15, mask=255.255.255.0, gw=10.0.2.2,
     host=10.0.2.15, domain=, nis-domain=(none),
     bootserver=10.0.2.2, rootserver=10.0.2.2, rootpath=
Freeing init memory: 172K
input: AT Raw Set 2 keyboard as /devices/mb:kmi0/serio0/input/input0

Please press Enter to activate this console. input: ImExPS/2 Generic Explorer Mouse as /devices/mb:kmi1/serio1/input/input1

/ # ls
bin      etc      lib      proc     sbin     sys
dev      include  linuxrc  root     share    usr
/ # 

This system comes with the default NIC model lan9118. Since the user-mode network backend in Qemu is based on SLIRP, ICMP messages are generally not forwarded, so ping will typically not work. Instead, we can use a simple socket server/client program to test that networking between the Qemu emulator and the host PC works. We can even use Iperf compiled for this platform (only client modes will work from the Qemu device).

Running simple socket application

We will now run a simple tcp_client application. Cross compile the tcp_client.c application and put it in the skeletal root file system.

arm-linux-gnueabi-gcc -g -Wall -static -o tcp_client.out tcp_client.c
cp tcp_client.out ~/work/busybox-1.24.1/_install/bin/
cd ~/work/busybox-1.24.1/_install
find . | cpio -o --format=newc > ../rootfs.img
cp ../rootfs.img ~/work

Open a TCP server example application on your host PC. Start the Qemu, and run the /bin/tcp_client.out. You will see TCP client establishing connection with the TCP server and sending packets.
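
The tcp_client.c source is not listed in this article. As a rough idea of what such a client involves, here is a minimal sketch. The function names and the port number 5000 are assumptions made for illustration; 10.0.2.2 is the address under which the boot log above shows the host side of the user-mode network.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill in an IPv4 socket address; returns 0 on success, -1 on a bad IP string. */
static int make_server_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Connect, send one message, close. Returns bytes sent, or -1 on any error. */
static ssize_t send_once(const char *ip, uint16_t port, const char *msg)
{
    struct sockaddr_in addr;
    ssize_t sent;
    int fd;

    if (make_server_addr(ip, port, &addr) != 0)
        return -1;
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    sent = send(fd, msg, strlen(msg), 0);
    close(fd);
    return sent;
}
```

Inside the guest, a main() could call send_once("10.0.2.2", 5000, "hello from the guest\n") while a TCP server listens on the (hypothetical) port 5000 on the host.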

Pseudo Ethernet Device

We will now model a pseudo Ethernet device. The Ethernet device has an external PHY connected through the MII and MDIO interfaces. Ethernet packets are represented by descriptors, which contain the packet size and the exact memory location where the packet data is present. There are 4 kinds of descriptors, 2 on the transmit side and 2 on the receive side. These descriptors are maintained in their own fixed-size queues. The driver has to allocate memory for all the descriptors and the queues. The hardware learns about these queues when the driver programs their base addresses into the device registers.
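
To make the descriptor idea concrete, here is a sketch of what the four descriptor types might look like. Only the field names (tx_buf_address, buffer_length, send_done, index, rx_buf_address, receive_done, cur_buf_length) are taken from the listings later in this article; the field widths, their order, the raw[] views and the helper desc_slot_addr() are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Layouts below are hypothetical; only the field names come from the listings. */
typedef union {
    struct {
        uint32_t tx_buf_address;  /* physical address of the frame data */
        uint32_t buffer_length;   /* frame length in bytes */
    } u;
    uint32_t raw[2];
} tx_desc_t;

typedef union {
    struct {
        uint32_t index;      /* slot in the Tx Descriptor Queue that completed */
        uint32_t send_done;  /* set to 1 by the device */
    } u;
    uint32_t raw[2];
} tx_done_desc_t;

typedef union {
    struct {
        uint32_t rx_buf_address;  /* physical address of an empty buffer */
    } u;
    uint32_t raw[1];
} rx_empty_desc_t;

typedef union {
    struct {
        uint32_t index;           /* slot in the Rx Empty queue that was filled */
        uint32_t cur_buf_length;  /* number of bytes received */
        uint32_t receive_done;    /* set to 1 by the device */
    } u;
    uint32_t raw[3];
} rx_done_desc_t;

/* Device model and driver must agree on the sizes, so pin them at compile time. */
_Static_assert(sizeof(tx_desc_t) == 8, "tx_desc_t must be 8 bytes");
_Static_assert(sizeof(rx_done_desc_t) == 12, "rx_done_desc_t must be 12 bytes");

/* Address of slot 'index' in a queue that starts at physical address 'base'. */
static uint32_t desc_slot_addr(uint32_t base, uint32_t index, size_t desc_size)
{
    return base + index * (uint32_t)desc_size;
}
```

The same "base + index * sizeof(descriptor)" arithmetic appears in both the device model and the driver, which is why both sides need an identical view of the descriptor size.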

Tx Descriptor Queue

This queue contains the descriptors representing the Ethernet frames that must be transmitted by the hardware. The queue has two pointers, both held in registers of the Ethernet device: the Head pointer, maintained by the device driver, and the Tail pointer, maintained by the Ethernet device. When the device driver wants to transmit a frame, it constructs a Transmit Descriptor, puts it in the slot pointed to by the Head Pointer register and advances the Head Pointer register by one. The transmit side of the Ethernet hardware continuously checks for packets to transmit, indicated by the Head Pointer register not being equal to the Tail Pointer register. It then reads the descriptor pointed to by the Tail Pointer, fetches the packet that the Tx Descriptor refers to, sends it on the wire and advances the Tail Pointer.
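
The head/tail protocol described here can be modelled in a few lines of plain C. This is only a sketch: the ring size, the type and function names, and the choice to keep one slot unused (so that a full ring can be told apart from an empty one) are assumptions, not details taken from the device.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE  8u                 /* hypothetical; must be a power of two */
#define INDEX_MASK (RING_SIZE - 1u)

struct ring {
    uint32_t head;              /* written by the producer only */
    uint32_t tail;              /* written by the consumer only */
    uint32_t slot[RING_SIZE];   /* stands in for the descriptors */
};

/* Head equal to tail means there is nothing to consume. */
static bool ring_empty(const struct ring *r)
{
    return r->head == r->tail;
}

/* One slot stays unused so that "full" and "empty" remain distinguishable. */
static bool ring_full(const struct ring *r)
{
    return ((r->head + 1u) & INDEX_MASK) == r->tail;
}

/* Producer side: store a descriptor at head, then advance head with wrap. */
static bool ring_push(struct ring *r, uint32_t v)
{
    if (ring_full(r))
        return false;
    r->slot[r->head] = v;
    r->head = (r->head + 1u) & INDEX_MASK;
    return true;
}

/* Consumer side: read the descriptor at tail, then advance tail with wrap. */
static bool ring_pop(struct ring *r, uint32_t *v)
{
    if (ring_empty(r))
        return false;
    *v = r->slot[r->tail];
    r->tail = (r->tail + 1u) & INDEX_MASK;
    return true;
}
```

For the Tx Descriptor Queue the driver plays the producer and the device the consumer; for the Tx Done Descriptor Queue the roles are swapped.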

Tx Done Descriptor Queue

This queue contains the descriptors representing the Ethernet frames that have been sent by the hardware. The queue has two pointers, both held in registers of the Ethernet device: the Head pointer, maintained by the Ethernet device, and the Tail pointer, maintained by the device driver. When a packet represented by a Transmit Descriptor in the Tx Descriptor Queue has been successfully sent on the wire, the Ethernet device constructs a Tx Done Descriptor and puts it in the Tx Done Descriptor Queue in the slot pointed to by the Head Pointer. It then advances the Head Pointer and raises a Transmit Done interrupt. The device driver ISR walks the Tx Done Descriptors one by one using the Tail pointer and releases the socket buffers used by these packets.

Rx Empty Descriptor Queue

This queue contains the descriptors representing the empty Ethernet buffers into which the hardware can put received Ethernet data. The queue has two pointers, both held in registers of the Ethernet device. The head pointer is maintained by the device driver and the tail pointer is maintained by the Ethernet device. During initialization of the Ethernet device, the device driver queues up enough empty packet buffers to fill this queue. When the receive side of the Ethernet device sees an Rx frame on the wire, it takes the empty buffer pointed to by the tail pointer of this queue, copies the data into it and advances the tail pointer. The driver periodically checks whether enough empty buffers are present in this queue and replenishes them if the count falls below some threshold.

Rx Done Descriptor Queue

This queue contains the descriptors representing the received Ethernet packets. The queue has two pointers, both held in registers of the Ethernet device: the head pointer, maintained by the hardware, and the tail pointer, maintained by the device driver. Whenever the hardware receives a frame, it takes the empty buffer pointed to by the tail pointer of the Rx Empty Descriptor Queue, fills the buffer with the received data, constructs an Rx Done Descriptor pointing to this received Ethernet packet and puts the descriptor in the Rx Done Descriptor Queue at the place pointed to by the head. It then increments the head position and raises an Rx Done interrupt. The device driver ISR sees the presence of a new received frame. It takes the Ethernet packet pointed to by the tail pointer of the Rx Done Descriptor Queue, sends it up to the TCP/IP stack and increments the tail pointer. It repeats these steps until all the packets are sent up.

------+                     +------------------------+
      |                     |         MAC            |
      |    Tx Desc Queue    |    +--------------+    |
      |     =============   |    |              |    | 
      | ===> XXXXXXXXXX ==> |    |              |    |       +-----------+
      |     =============   |    |              |    |       |           |
      |                     |    |   Transmit   |    |       |           |
      |  Tx Done Desc Queue |    |              |    |       |           |
      |     =============   |    |              |    |       |           |
      | <=== XXXXXXXXXX <== |    |              |    |       |           |
      |     =============   |    +--------------+    |  MDIO |   Phy     |
 C    |                     |                        | <===> |           |
 P    |                     |                        |  MII  |           |
 U    | Rx Empty Desc Queue |    +--------------+    | <===> |           |
      |     =============   |    |              |    |       |           |
      | ===> XXXXXXXXXX ==> |    |              |    |       |           |
      |     =============   |    |              |    |       |           |
      |                     |    |   Receive    |    |       |           |
      |  Rx Done Desc Queue |    |              |    |       |           |
      |     =============   |    |              |    |       |           |
      | <=== XXXXXXXXXX <== |    |              |    |       +-----------+
      |     =============   |    +--------------+    |
      |                     |                        |
------+                     +------------------------+

Pseudo Ethernet Device Model in Qemu

In this section I will briefly describe how the pseudo NIC is modelled in Qemu.

The model starts with the function skel_eth_device_register_types(). This function registers a new device type, described by skel_eth_device_info, with Qemu.

static const TypeInfo skel_eth_device_info = {
    .name          = TYPE_SKEL_ETH_DEV,
    .parent        = TYPE_SYS_BUS_DEVICE,
    .instance_size = sizeof(skel_eth_device_state),
    .class_init    = skel_eth_device_class_init,
};

static void skel_eth_device_register_types(void)
{
    type_register_static(&skel_eth_device_info);
}

In the function vexpress_common_init() of the Qemu Vexpress emulation main file, we call skel_eth_device_init(), which creates this NIC model, maps its register space at the given base address and finally connects it to the required IRQ pin.

void skel_eth_device_init(NICInfo *nd, uint32_t base, qemu_irq irq)
{
    DeviceState *dev;
    SysBusDevice *s;
    qemu_check_nic_model(nd, TYPE_SKEL_ETH_DEV);
    dev = qdev_create(NULL, TYPE_SKEL_ETH_DEV);
    qdev_set_nic_properties(dev, nd);
    qdev_init_nofail(dev);
    s = SYS_BUS_DEVICE(dev);
    sysbus_mmio_map(s, 0, base);
    sysbus_connect_irq(s, 0, irq);
}

The Qemu core emulation layer calls skel_eth_device_init1(), the device-specific init function that we registered in class_init(). skel_eth_device_init1() installs the read/write hook functions that handle CPU reads and writes to the device's register address space.

static const MemoryRegionOps skel_eth_device_mem_ops = {
    .read = skel_eth_device_readl,
    .write = skel_eth_device_writel,
    .endianness = DEVICE_NATIVE_ENDIAN,
};

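static int skel_eth_device_init1(SysBusDevice *sbd)
{
    DeviceState *dev = DEVICE(sbd);
    /* Cast to our state type; the full source may use a dedicated macro for this */
    skel_eth_device_state *s = OBJECT_CHECK(skel_eth_device_state, dev, TYPE_SKEL_ETH_DEV);
    const MemoryRegionOps *mem_ops = &skel_eth_device_mem_ops;

    memory_region_init_io(&s->mmio, OBJECT(dev), mem_ops, s, "skel_eth_device-mmio", 0x100);
    /* Export the region so that sysbus_mmio_map() in skel_eth_device_init() can place it */
    sysbus_init_mmio(sbd, &s->mmio);
    return 0;
}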

The rest of the logic is fairly simple. skel_eth_device_readl() has a big switch statement to take care of register reads.

static uint64_t skel_eth_device_readl(void *opaque, hwaddr offset,
                              unsigned size)
{

    skel_eth_device_state *s = (skel_eth_device_state *)opaque;
    switch (offset) {
        case ETH_CTRL:
            return s->eth_ctrl;
        case MAC_ADDR_HIGH:
            return s->mac_addr_high;
        case MAC_ADDR_LOW:
            return s->mac_addr_low;
        case MGMT_FRM:
            //Read/Write PHY register here
            return s->mac_mii_data ;
...
}

The same is true about skel_eth_device_writel() function.

static void skel_eth_device_writel(void *opaque, hwaddr offset,
                           uint64_t val, unsigned size)
{
    skel_eth_device_state *s = (skel_eth_device_state *)opaque;
    offset &= 0xff;
    switch (offset) {
        case ETH_CTRL:
            if (((s->eth_ctrl & 1) != (val&1)) && ((val&1) == 1))
            {
                skel_eth_device_reset(s);
            }
            s->eth_ctrl=val;
            break;

        case MAC_ADDR_HIGH:
            s->mac_addr_high=val;
            s->conf.macaddr.a[4] = val & 0xff;
            s->conf.macaddr.a[5] = (val >> 8) & 0xff;
            skel_eth_device_mac_changed(s);
            break;

        case MAC_ADDR_LOW:
            s->mac_addr_low=val;
            s->conf.macaddr.a[0] = val & 0xff;
            s->conf.macaddr.a[1] = (val >> 8) & 0xff;
            s->conf.macaddr.a[2] = (val >> 16) & 0xff;
            s->conf.macaddr.a[3] = (val >> 24) & 0xff;
            skel_eth_device_mac_changed(s);
            break;
...
}
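
The MAC address handling above implies a particular byte order: bytes 0 to 3 of the address travel in MAC_ADDR_LOW and bytes 4 and 5 in the low half of MAC_ADDR_HIGH. The sketch below shows the matching packing that a driver would have to perform before writing the two registers. The helper names are hypothetical and are not taken from the article's driver.

```c
#include <stdint.h>

/* Pack a 6-byte MAC address into the two register values the model decodes. */
static void mac_to_regs(const uint8_t mac[6], uint32_t *low, uint32_t *high)
{
    *low  = (uint32_t)mac[0] |
            ((uint32_t)mac[1] << 8) |
            ((uint32_t)mac[2] << 16) |
            ((uint32_t)mac[3] << 24);
    *high = (uint32_t)mac[4] |
            ((uint32_t)mac[5] << 8);
}

/* Inverse operation, mirroring what the write handler does in the model. */
static void regs_to_mac(uint32_t low, uint32_t high, uint8_t mac[6])
{
    mac[0] = low & 0xff;
    mac[1] = (low >> 8) & 0xff;
    mac[2] = (low >> 16) & 0xff;
    mac[3] = (low >> 24) & 0xff;
    mac[4] = high & 0xff;
    mac[5] = (high >> 8) & 0xff;
}
```

For example, the address 52:54:00:12:34:56 becomes 0x12005452 in the low register and 0x5634 in the high register.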

Transmitting a packet

The transmission of a packet is triggered when the device driver writes a new Transmit Descriptor into the Tx Descriptor Queue and updates the head pointer register. The writel function checks if the queue is non-empty and calls the do_tx_packet() function.

static void skel_eth_device_writel(void *opaque, hwaddr offset,
                           uint64_t val, unsigned size)
{
    skel_eth_device_state *s = (skel_eth_device_state *)opaque;
    offset &= 0xff;
    switch (offset) {
        case TX_DESC_FIFO_HEAD:
            s->tx_desc_fifo_head=val;
            if(s->tx_desc_fifo_head != s->tx_desc_fifo_tail)
            {
                do_tx_packet(s);
            }
            break;
...
}

The do_tx_packet() is listed below. It gets the current Tx Descriptor, copies the packet data over to a temporary holding buffer and calls the qemu_send_packet() to send it over the wire. After that it constructs a Tx Done descriptor and puts it in the Tx Done Descriptor queue. It advances the Tx Descriptor Tail pointer and Tx Done Descriptor Head pointer. Finally, it asserts the tx_complete interrupt.      

static void do_tx_packet(skel_eth_device_state *s)
{
    while(s->tx_desc_fifo_head != s->tx_desc_fifo_tail)
    {
        // Get the current tx_desc
        tx_desc = s->tx_desc_base_low + (s->tx_desc_fifo_tail * sizeof(tx_desc_t));
        cpu_physical_memory_read(tx_desc, &tx_desc_local, sizeof(tx_desc_t));
        cpu_physical_memory_read(tx_desc_local.u.tx_buf_address, buf,
                                        tx_desc_local.u.buffer_length);
        qemu_send_packet(qemu_get_queue(s->nic), (const uint8_t *)(buf),
                                        tx_desc_local.u.buffer_length);
        // Build the matching Tx Done descriptor
        memset(&tx_done_desc_local, 0, sizeof(tx_done_desc_t));
        tx_done_desc_local.u.send_done=1;
        tx_done_desc_local.u.index = s->tx_desc_fifo_tail; 

        // Get the pointer to current tx_done_desc
        tx_done_desc = s->tx_done_desc_base_low +
                        (s->tx_done_desc_fifo_head * sizeof(tx_done_desc_t));

        // Copy the updated tx_done_desc to the actual location
        cpu_physical_memory_write(tx_done_desc, &tx_done_desc_local, sizeof(tx_done_desc_t));

        // Advance the tx_desc_fifo_tail & tx_done_desc_fifo_head
        ...
    }
    // Assert Transmit complete interrupt
    s->int_sts1 |= 1;
    skel_eth_device_update(s);
}
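
The function skel_eth_device_update() is not listed in this article. A common way to implement such a function is to derive the level of the interrupt line from the pending status bits and an enable mask. The sketch below models that idea in plain C; the int_en1 mask and all names other than int_sts1 are assumptions. In a real Qemu device model the computed level would be passed to qemu_set_irq().

```c
#include <stdbool.h>
#include <stdint.h>

#define INT_TX_DONE (1u << 0)   /* matches "int_sts1 |= 1" in do_tx_packet() */
#define INT_RX_DONE (1u << 1)   /* matches "int_sts1 |= 2" in the receive path */

struct irq_model {
    uint32_t int_sts1;  /* pending causes, set by the device */
    uint32_t int_en1;   /* hypothetical enable mask, set by the driver */
    bool     line;      /* level of the IRQ line towards the CPU */
};

/* The line is high while at least one enabled cause is pending. */
static void irq_update(struct irq_model *m)
{
    m->line = (m->int_sts1 & m->int_en1) != 0;
}

/* Device side: latch a cause and recompute the line. */
static void irq_raise(struct irq_model *m, uint32_t cause)
{
    m->int_sts1 |= cause;
    irq_update(m);
}

/* Driver side: write-one-to-clear acknowledge, then recompute the line. */
static void irq_clear(struct irq_model *m, uint32_t bits)
{
    m->int_sts1 &= ~bits;
    irq_update(m);
}
```

With this scheme a masked cause stays latched in the status register without interrupting the CPU, which is what lets a driver disable Rx interrupts while it polls.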

Receiving a packet

Receiving a packet takes a slightly different path. While creating the NIC and registering it with the Qemu emulator, we registered a receive function, skel_eth_device_receive(), as part of our NIC model. Whenever the system receives an Ethernet frame, this registered function is called with the received packet as its argument.

static ssize_t skel_eth_device_receive(NetClientState *nc, const uint8_t *buf,
                               size_t size)
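{
    // Recover our device state from the net client (declaration not shown in the original listing)
    skel_eth_device_state *s = qemu_get_nic_opaque(nc);

    // Get the current rx_empty_desc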
    rx_empty_desc = s->rx_empty_desc_base_low +
                        (s->rx_empty_desc_fifo_tail * sizeof(rx_empty_desc_t));
    cpu_physical_memory_read(rx_empty_desc, &rx_empty_desc_local, sizeof(rx_empty_desc_t));

    // Copy the payload to the buffer pointed to by the rx descriptor
    cpu_physical_memory_write(rx_empty_desc_local.u.rx_buf_address, buf, size);

    // Update rx_done_desc descriptor
    memset(&rx_done_desc_local, 0, sizeof(rx_done_desc_t));
    rx_done_desc_local.u.receive_done=1;
    rx_done_desc_local.u.index = s->rx_empty_desc_fifo_tail; 
    rx_done_desc_local.u.cur_buf_length = size;

    // Get the pointer to current rx_done_desc
    rx_done_desc = s->rx_done_desc_base_low +
                (s->rx_done_desc_fifo_head * sizeof(rx_done_desc_t));
    cpu_physical_memory_write(rx_done_desc, &rx_done_desc_local, sizeof(rx_done_desc_t));

    // Advance the rx_done_desc_fifo_head and rx_empty_desc_fifo_tail
    ...

    // Assert the Rx Done interrupt
    s->int_sts1 |= 2; //Rx-Queue Complete interrupt
    skel_eth_device_update(s);
    return size;
}

Ethernet Driver

In this section I will briefly explain how the device driver for the pseudo Ethernet device is written. I have kept the driver bare-bones by implementing only the important functions, so it is by no means complete.

Modifications to the Linux Kernel Makefile to include our driver

The ARM Vexpress platform comes with the smsc911x driver. We will simply remove this driver and put our skeletal Ethernet driver in its place. (The right way is to add an entry for the skeletal Ethernet driver to the kernel configuration and select it. Since our focus here is on explaining the Ethernet device model and the associated driver, we take a shortcut.)

gvim linux-3.2/drivers/net/ethernet/smsc/Makefile

#obj-$(CONFIG_SMSC911X) += smsc911x.o <--commented this line and added the following line
obj-$(CONFIG_SMSC911X) += skel_eth_drv.o

We also change the V2M Core BSP file to use the skeletal device instead of the smsc911x device.

gvim linux-3.2/arch/arm/mach-vexpress/v2m.c

static struct platform_device v2m_eth_device = {
    //.name        = "smsc911x", <-- Commented this line and added the following line
    .name        = "skel_eth_dev",
    .id        = -1,
    .resource    = v2m_eth_resources,
    .num_resources    = ARRAY_SIZE(v2m_eth_resources),
    .dev.platform_data = &v2m_eth_config,
};

Writing Ethernet Device driver

In this section we go into the details of writing the driver for our pseudo Ethernet device.

The driver starts with platform_driver_register(), which is called from our module init function.

static struct platform_driver skel_eth_driver = {
    .probe = skel_eth_drv_probe,
    .remove = skel_eth_drv_remove,
    .driver = {
        .name    = SKEL_ETH_NAME,
        .owner    = THIS_MODULE,
        .pm    = SKEL_ETH_PM_OPS,
        .of_match_table = of_match_ptr(skel_eth_dt_ids),
    },
};

/* Entry point for loading the module */
static int __init skel_eth_init_module(void) {
    return platform_driver_register(&skel_eth_driver);
}

Since we have already registered this device from our BSP file v2m.c, the kernel sees that this driver matches the already registered device and calls the corresponding probe function.

Below is the probe function for the driver. The probe function allocates the driver structure, allocates memory for the various descriptor queues, programs the hardware with the base addresses of these queues, initializes the hardware for sending and receiving Ethernet packets and finally registers its Ethernet device operations with the OS by calling register_netdev().

static const struct net_device_ops skel_eth_netdev_ops = {
    .ndo_open        = skel_eth_open,
    .ndo_stop        = skel_eth_stop,
    .ndo_start_xmit  = skel_eth_hard_start_xmit,
    .ndo_do_ioctl    = skel_eth_do_ioctl,
};

/* Initializing private device structures, only called from probe */
static int skel_eth_init(struct net_device *dev)
{
    struct skel_eth_data *pdata = netdev_priv(dev);
    unsigned int eth_ctrl;
    unsigned int to = 100;
    rx_empty_desc_t *rx_empty_desc;

    int retval;

    // Allocate memory for Tx, TxDone, RxEmpty and RxDone Descriptor queues.
    ...

    // Program the hardware base address for Tx, TxDone, RxEmpty and RxDone Descriptor queues.
    ...

    // Enable ethernet peripheral. Set up speed and mode
    ...

    // Fill the rx ring with RX EB descriptors
    skel_eth_rx_desc_refill(pdata, NUM_RX_DESC);


    ether_setup(dev);
    dev->flags |= IFF_MULTICAST;
    netif_napi_add(dev, &pdata->napi, skel_eth_poll, SKEL_ETH_DEV_NAPI_WEIGHT);
    dev->netdev_ops = &skel_eth_netdev_ops;

    pr_info(" %s DONE!!!\n", __FUNCTION__);
    return 0;
}

static int skel_eth_drv_probe(struct platform_device *pdev)
{
    struct net_device *dev;
    struct skel_eth_data *pdata;
    struct resource *res, *irq_res;
    int res_size, irq_flags;
    int retval;

    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    res_size = resource_size(res);
    request_mem_region(res->start, res_size, SKEL_ETH_NAME);

    irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);

    dev = alloc_etherdev(sizeof(struct skel_eth_data));
    SET_NETDEV_DEV(dev, &pdev->dev);

    pdata = netdev_priv(dev);
    dev->irq = irq_res->start;
    irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
    pdata->ioaddr = ioremap_nocache(res->start, res_size);

    pdata->dev = dev;
    platform_set_drvdata(pdev, dev);

    pdata->ops = &standard_skel_eth_ops;

    retval = skel_eth_init(dev);

    retval = request_irq(dev->irq, skel_eth_irqhandler, irq_flags | IRQF_SHARED, dev->name, dev);

    retval = register_netdev(dev);

    retval = skel_eth_mii_init(pdev, dev);

    spin_lock_irq(&pdata->mac_lock);
    /* Check if mac address has been specified when bringing interface up */
    if (is_valid_ether_addr(dev->dev_addr)) {
        skel_eth_set_hw_mac_address(pdata, dev->dev_addr);
    } else {
        /* eeprom values are invalid, generate random MAC */
        dev_hw_addr_random(dev, dev->dev_addr);
        skel_eth_set_hw_mac_address(pdata, dev->dev_addr);
    }
    spin_unlock_irq(&pdata->mac_lock);
    return 0;
}
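
The probe listing above stores every return value in retval but, for brevity, never checks it. Kernel drivers usually check each step and unwind the earlier ones with goto labels when a later step fails. The plain-C model below sketches that pattern. The step names, the acquire()/release() stubs and the fail_at knob are all hypothetical stand-ins for calls such as ioremap_nocache(), request_irq() and register_netdev().

```c
enum step { STEP_MAP, STEP_IRQ, STEP_REGISTER, STEP_COUNT };

static int fail_at = -1;        /* test knob: which step should fail, -1 for none */
static int live[STEP_COUNT];    /* 1 while the corresponding resource is held */

/* Stand-in for a kernel call that acquires a resource. */
static int acquire(enum step s)
{
    if ((int)s == fail_at)
        return -1;
    live[s] = 1;
    return 0;
}

/* Stand-in for the matching release call. */
static void release(enum step s)
{
    live[s] = 0;
}

/* Each failure jumps to the label that undoes only what was already done. */
static int probe_model(void)
{
    int ret;

    ret = acquire(STEP_MAP);        /* e.g. map the register window */
    if (ret)
        goto out;
    ret = acquire(STEP_IRQ);        /* e.g. request the interrupt */
    if (ret)
        goto out_unmap;
    ret = acquire(STEP_REGISTER);   /* e.g. register the net device */
    if (ret)
        goto out_free_irq;
    return 0;

out_free_irq:
    release(STEP_IRQ);
out_unmap:
    release(STEP_MAP);
out:
    return ret;
}
```

The labels are ordered so that falling through them releases resources in the reverse order of acquisition.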

Transmitting a packet

The network stack calls the registered _start_xmit function whenever it needs to send an Ethernet packet out. Below is the listing of the _start_xmit function. It simply updates the Tx Descriptor pointed to by the Tx Descriptor queue head pointer with the buffer address and size. It then advances the head pointer (by nr_frags+1 in the listing, which is one for an skb without fragments), telling the hardware that a new packet is queued.

static int skel_eth_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
    struct skel_eth_data *pdata = netdev_priv(dev);
    dma_addr_t bus_addr;
    struct netdev_queue *txq;
    unsigned int nr_frags;
    tx_desc_t *tx_desc;
    u32 tx_desc_fifo_head;
    unsigned int len;

    // Read the Tx Descriptor Queue head pointer from the hardware
    tx_desc_fifo_head = skel_eth_reg_read(pdata, TX_DESC_FIFO_HEAD);

    /* total number of fragments in the SKB */
    nr_frags = skb_shinfo(skb)->nr_frags;

    // Get the current tx_desc
    tx_desc = pdata->tx_desc_base + (tx_desc_fifo_head * sizeof(tx_desc_t));

    // Update tx_desc parameters
    len = skb_headlen(skb);
    bus_addr = dma_map_single(&dev->dev, skb->data, len, DMA_TO_DEVICE);

    tx_desc->u.tx_buf_address = cpu_to_le32(bus_addr & 0xFFFFFFFF);
    tx_desc->u.buffer_length = len;

    // Increment the head pointer with nr_frags+1
    tx_desc_fifo_head = (tx_desc_fifo_head + nr_frags+1) & TX_INDEX_MASK;
    skel_eth_reg_write(pdata, TX_DESC_FIFO_HEAD, tx_desc_fifo_head);

    return NETDEV_TX_OK;
}

Receiving a packet

Receiving a packet starts when the hardware has received a complete Ethernet packet and raises an Rx Done interrupt. In the irqhandler function we simply acknowledge the Rx interrupt and also disable it. We then schedule NAPI, which asks the OS to call our NAPI poll function.

static irqreturn_t skel_eth_irqhandler(int irq, void *dev_id)
{
    irqreturn_t serviced = IRQ_NONE;
    struct net_device *dev = dev_id;
    struct skel_eth_data *pdata = netdev_priv(dev);
    u32 intsts;

    intsts = skel_eth_reg_read(pdata, INT_STS1);
    skel_eth_reg_write(pdata, INT_CLR1, intsts);

    if(intsts & (1<<INT_STS1_RX_QUEUE_SHIFT)) {
        if (likely(napi_schedule_prep(&pdata->napi))) {

            /* Disable Rx interrupts */
            ...

            /* Schedule a NAPI poll */
            __napi_schedule(&pdata->napi);
        } 
        serviced = IRQ_HANDLED;
    }
    return serviced;
}

Below is the listing of the NAPI poll function. We go through the Rx Done descriptor queue, extract the required skb information from the corresponding software ring and call netif_receive_skb() to send the skb to the network stack.
Once all available packets have been processed, we tell NAPI to stop calling the poll function (unless the whole budget was used up) and re-enable the Rx interrupt.

static int skel_eth_poll(struct napi_struct *napi, int budget)
{
    struct skel_eth_data *pdata = container_of(napi, struct skel_eth_data, napi);
    struct net_device *dev = pdata->dev;
    int npackets = 0;
    u32 rx_done_desc_fifo_tail = 0;
    u32 rx_done_desc_fifo_head = 0;
    rx_done_desc_t* rx_done_desc;
    dma_addr_t bus_addr;

    rx_done_desc_fifo_tail = skel_eth_reg_read(pdata, RX_DONE_DESC_FIFO_TAIL);
    rx_done_desc_fifo_head = skel_eth_reg_read(pdata, RX_DONE_DESC_FIFO_HEAD);

    // Work till we have processed all the completed Rx packets or budget
    while ((npackets < budget) && (rx_done_desc_fifo_head != rx_done_desc_fifo_tail)) {
        struct sk_buff *skb;

        //Get the current rx_done_desc corresponding to this rx_done_desc_fifo_tail index
        rx_done_desc = pdata->rx_done_desc_base + (rx_done_desc_fifo_tail * sizeof(rx_done_desc_t));

        // Get the skb information from the sw ring
        skb = pdata->sw_rx_desc_info_ring[rx_done_desc->u.index].skb;
        bus_addr = pdata->sw_rx_desc_info_ring[rx_done_desc->u.index].dma_addr;
        pdata->sw_rx_desc_info_ring[rx_done_desc->u.index].skb = NULL;

        dma_unmap_single(&pdata->dev->dev, bus_addr, rx_done_desc->u.cur_buf_length, DMA_FROM_DEVICE);

        skb_put(skb, rx_done_desc->u.cur_buf_length);
        skb->protocol = eth_type_trans(skb, dev);

        // Send this skb to tcp/ip layer
        netif_receive_skb(skb);

        npackets++;

        // increment the rx_done_desc_fifo_tail and write it to h/w
        ...

        // refill the rx_desc with descriptors.
        skel_eth_rx_desc_refill(pdata, 1);
    }

    if(rx_done_desc_fifo_head == rx_done_desc_fifo_tail)
    {
        /* We processed all packets available. Tell NAPI to stop polling & re-enable rx interrupts */
        if(npackets != SKEL_ETH_DEV_NAPI_WEIGHT) {
            napi_complete(napi);
        }
       
        // Re-enable the Rx interrupt
        ...
    }
    return npackets;
}
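
The budget handling in a NAPI poll function follows a simple contract: process at most budget packets, and only when fewer than budget were processed, tell NAPI that polling is finished and re-enable the interrupt. The plain-C model below sketches this contract. The structure and function names are hypothetical, and the completion test is the conventional "work < budget" check rather than the exact condition used in the listing above.

```c
#include <stdbool.h>
#include <stdint.h>

struct poll_model {
    uint32_t head;            /* Rx Done queue head, advanced by the device */
    uint32_t tail;            /* Rx Done queue tail, advanced by the driver */
    uint32_t mask;            /* queue size minus one (size is a power of two) */
    bool     rx_irq_enabled;  /* false while the driver is polling */
    bool     polling;         /* true while NAPI keeps calling poll */
};

/* Returns the number of packets processed, never more than budget. */
static int poll_model_run(struct poll_model *m, int budget)
{
    int work = 0;

    while (work < budget && m->head != m->tail) {
        /* A real driver would hand the skb to netif_receive_skb() here. */
        m->tail = (m->tail + 1u) & m->mask;
        work++;
    }

    if (work < budget) {
        /* Queue drained before the budget ran out: stop polling, unmask Rx. */
        m->polling = false;         /* napi_complete() in a real driver */
        m->rx_irq_enabled = true;
    }
    return work;
}
```

If exactly budget packets were processed, the model stays in polling mode even when the queue happens to be empty; the next poll call then returns 0 and completes.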

Running our new system

Now our new Emulated system is ready. We will run it with the following command.

./qemu-2.5.0/arm-softmmu/qemu-system-arm  -M vexpress-a9 -m 256M -kernel linux-3.2/arch/arm/boot/zImage -initrd rootfs.img -append "root=/dev/ram rdinit=/sbin/init ip=dhcp console=ttyAMA0" -net user,hostfwd=tcp::2222-:22 -net nic,model=skel_eth_dev -nographic

Here we are using the new Ethernet model skel_eth_dev that we developed. We can run the same tcp_client application within our emulated system, with the corresponding tcp_server running on our host PC, and see that our new Ethernet model works.


License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)