Web-Controlling a Win32 App using WebSockets

Michael Chourdakis

30 Apr 2022, CPOL, 3 min read

A quick way to interoperate between Win32 and a browser using WebSockets

This article shows how to remote-control your Win32 application via a web browser.

Download source code

Introduction

Now that web applications are prevalent, it would seem very useful to be able to remotely control your Win32 application via an HTML application.

Background

When I first envisioned my now full-featured video and audio sequencer, Turbo Play, back in 2010, remote control of a Windows app from a mobile device was rare. Now it is a standard feature found in many audio and video applications.

The problem with continuous GET and POST requests to an HTTP server, however, is that the overhead is too high, especially for a real-time application. What I present here is how to use the (poorly documented) Win32 WebSocket API for a faster method of control.

Included in the repository is a simple SOCKET wrapper and my MIME library, capable of building a small, quick web server.

Creating the Web Server

You need two sockets, one for the HTTP request and one for the WebSocket. There are four variables in webinterface.cpp:

C++

// ------- Variables
int Port = 12345;
int WebSocketPort = 12347;
std::string host4 = "";
std::string host6 = "";
// -----------------

There is a PickIP() function that scans all your interfaces (using GetAdaptersAddresses()) and sets host4 and host6 to the first IPs found (to be used later in mDNS discovery), but you can hardcode them as well.

Listening on the first port for the web server is a standard WinSock bind and listen. When a connection arrives, you reply to the browser with an HTML document (1.html in the repository, as an example) that contains the WebSocket connection code:

C++

void WebServerThread(XSOCKET y)
{
    std::vector<char> b(10000);
    std::vector<char> b3;
    for (;;)
    {
        b.clear();
        b.resize(10000);
        int rval = y.receive(b.data(), 10000);
        if (rval == 0 || rval == -1)
            break;

        MIME2::CONTENT c;
        c.Parse(b.data(), 1);

        // Get /, display 1.html
        std::string host;
        bool v6 = 0;
        for (auto& h : c.GetHeaders())
        {
            if (h.Left() == "Host")
            {
                host = h.Right();
                std::vector<char> h2(1000);
                strcpy_s(h2.data(), 1000, host.c_str());
                auto p2 = strstr(h2.data(), "]:");
                if (p2)
                {
                    *p2 = 0;
                    host = h2.data() + 1;
                    v6 = 1;
                    break;
                }
                auto p = strchr(h2.data(), ':');
                if (p)
                {
                    *p = 0;
                    host = h2.data();
                }
                break;
            }
        }
        const char* m1 = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
                         "Connection: Close\r\n\r\n";

        b.clear();
        ExtractResource(GetModuleHandle(0), L"D1", L"DATA", b);
        b.resize(b.size() + 1); 
        char* pb = (char*)b.data();

        char b2[200] = {};

        if (v6)
            sprintf_s(b2,200,"ws://[%s]:%i", host.c_str(), WebSocketPort);
        else
            sprintf_s(b2, 200 , "ws://%s:%i", host.c_str(), WebSocketPort);

        b3.resize(b.size() + 1000);
        strcat_s(b3.data(), b3.size(), m1);
        sprintf_s(b3.data() + strlen(b3.data()), b3.size() - 
                              strlen(b3.data()), pb, b2);
        char* pb2 = (char*)b3.data();

        y.transmit((char*)pb2, (int)strlen(pb2), true);
        y.Close();
    }
}

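We must scan the headers for "Host" to find the host name or address the browser actually used to reach us, then send 1.html, which contains a "%s" placeholder that we fill in with the ws://IP:Port URL. Because the HTML is used as the sprintf_s format string, any other literal % in 1.html has to be written as %%. Remember that an IPv6 address must be enclosed in square brackets [] in a URL.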
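The Host handling described above can also be written with std::string only. The following is a minimal sketch, not the code from the repository: ParseHostHeader, MakeWebSocketUrl and FillTemplate are hypothetical helper names, and FillTemplate replaces the "%s" placeholder directly, which avoids treating the whole HTML document as a format string.

```cpp
#include <string>

// Hypothetical helpers (not part of the article's repository).

// Result of splitting a Host header value.
struct HostInfo
{
    std::string host;  // bare host name or IP address, without port
    bool v6 = false;   // true if the value was a bracketed IPv6 literal
};

// Accepts values such as "192.168.1.5:12345" or "[fe80::1]:12345".
HostInfo ParseHostHeader(const std::string& value)
{
    HostInfo r;
    if (!value.empty() && value[0] == '[')
    {
        // IPv6 literal: the address is everything between '[' and ']'.
        auto end = value.find(']');
        if (end != std::string::npos)
        {
            r.host = value.substr(1, end - 1);
            r.v6 = true;
            return r;
        }
    }
    // IPv4 address or host name: strip an optional ":port" suffix.
    r.host = value.substr(0, value.find(':'));
    return r;
}

// Build the ws:// URL the page should connect to.
std::string MakeWebSocketUrl(const HostInfo& h, int port)
{
    if (h.v6)
        return "ws://[" + h.host + "]:" + std::to_string(port);
    return "ws://" + h.host + ":" + std::to_string(port);
}

// Replace the first "%s" in the HTML template with the URL.
std::string FillTemplate(std::string html, const std::string& url)
{
    auto pos = html.find("%s");
    if (pos != std::string::npos)
        html.replace(pos, 2, url);
    return html;
}
```

With these helpers, FillTemplate(html, MakeWebSocketUrl(ParseHostHeader(hostValue), WebSocketPort)) would produce the page body; hostValue here stands for whatever the header parser returned for "Host".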

The HTML document we pass contains, alongside the standard jQuery and Bulma markup I used here, this simple WebSocket code:

HTML

<script>
     var socket = new WebSocket("%s");
     socket.onopen = function (e) {
         $("#live").html("Connected");
         $("#messagex").show();
     };

     socket.onerror = function (e) {
         $("#messagex").hide();
         $("#live").html("Disconnected");
     }

     socket.onclose = function (e) {
         $("#live").html("Disconnected");
         $("#messagex").hide();
     }

     socket.onmessage = function (event) {
         var e = event.data;
         $("#received").html(e);
     }

     function message() {
         msg = prompt("Please say something...", "Hello");
         if (msg != null)
             socket.send(msg);
     }
 </script>

This creates callbacks for connection, disconnection and errors, as well as for receiving messages, plus a message() function that sends text to the Win32 application.

Creating the WebSocket Server

A WebSocket server is an HTTP server that switches protocols when a WebSocket request arrives. The good thing about the Win32 WebSocket API is that it is transport-independent: you hand it the data you received from the browser, by whatever means, and it returns the data you must send back, without knowing how you will send it (TCP, TLS, etc.).

The class WS in the code contains simple WebSocket functions:

C++

// Create a server side websocket handle. 
HRESULT Init() 
{ 
    return WebSocketCreateServerHandle(NULL, 0, &h);
}

Once we have a handle, our WebSocket server can accept a connection as before:

C++

void WebSocketThread(XSOCKET s)
{
    std::vector<char> r1(10000);
    for (;;)
    {
        int rv = s.receive(r1.data(), 10000);
        if (rv == 0 || rv == -1)
            break;

        std::vector<WEB_SOCKET_HTTP_HEADER> h1;
        MIME2::CONTENT c;
        c.Parse(r1.data(), 1);
        std::string host;
        for (auto& h : c.GetHeaders())
        {
            if (h.IsHTTP())
                continue;
            WEB_SOCKET_HTTP_HEADER j1;
            auto& cleft = h.LeftC();
            j1.pcName = (PCHAR)cleft.c_str();
            j1.ulNameLength = (ULONG)cleft.length();
            auto& cright = h.rights().rawright;
            j1.pcValue = (PCHAR)cright.c_str();
            j1.ulValueLength = (ULONG)cright.length();
            h1.push_back(j1);
        }

        auto& ws2 = Maps[&s];
        if (FAILED(ws2.Init()))
            break;
        std::vector<char> tosend;
        if (FAILED(ws2.PerformHandshake(h1.data(), (ULONG)h1.size(), tosend)))
            break;

PerformHandshake calls WebSocketBeginServerHandshake with all the headers we received from the browser and returns the headers that we must send back to start the WebSocket protocol. The reply must begin with the "HTTP/1.1 101 Switching Protocols\r\n" status line to tell the browser that the switch succeeded.
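To make the shape of that reply concrete, here is a minimal sketch of how the response headers could be serialized into the text that goes back over the socket. It is an illustration only: Header is a stand-in for WEB_SOCKET_HTTP_HEADER so the sketch builds without the Windows SDK, and BuildSwitchingProtocolsReply is a hypothetical name, not a function from the repository.

```cpp
#include <string>
#include <vector>

// Stand-in for WEB_SOCKET_HTTP_HEADER (assumption, for illustration only).
struct Header
{
    std::string name;
    std::string value;
};

// Serialize the status line and the response headers into one reply.
std::string BuildSwitchingProtocolsReply(const std::vector<Header>& headers)
{
    std::string out = "HTTP/1.1 101 Switching Protocols\r\n";
    for (const auto& h : headers)
        out += h.name + ": " + h.value + "\r\n";
    out += "\r\n"; // a blank line ends the header section
    return out;
}
```

In the real code, the header list would come from the handshake call, and the resulting string would be sent with the same transmit() used elsewhere.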

Once that is done, we can now send and receive messages. We loop in order to receive messages:

C++

std::vector<char> msg;
for (;;)
{
    int rv = s.receive(r1.data(), 10000);
    if (rv == 0 || rv == -1)
        break;

    msg.clear();
    auto hr = ws2.ReceiveRequest(r1.data(), rv, msg);
    if (FAILED(hr))
        break;
    if (msg.size() == 0)
        continue;
    msg.resize(msg.size() + 1);

    MessageBoxA(hMainWindow, msg.data(), "Message", MB_SYSTEMMODAL | MB_APPLMODAL);
}

Once we get some bytes, we pass them to ReceiveRequest() and this calls WebSocketReceive, WebSocketGetAction and WebSocketCompleteAction to decode the websocket message and return us a buffer with the actual data sent.
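For readers curious about what the API is decoding, the sketch below parses one complete masked client frame by hand, following the frame layout in RFC 6455. It is not how ReceiveRequest() is implemented and is no replacement for the Win32 calls; DecodeClientFrame is a hypothetical name, and the sketch ignores fragmentation and control frames.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode one complete, masked client-to-server frame.
// Returns false if the buffer is too short or the frame is not masked.
bool DecodeClientFrame(const std::vector<uint8_t>& in, std::string& payload)
{
    if (in.size() < 2)
        return false;
    const bool masked = (in[1] & 0x80) != 0;
    uint64_t len = in[1] & 0x7F;
    size_t pos = 2;
    if (len == 126 || len == 127)
    {
        // 126 means a 16-bit length follows, 127 means a 64-bit length follows.
        const size_t extra = (len == 126) ? 2 : 8;
        if (in.size() < pos + extra)
            return false;
        len = 0;
        for (size_t i = 0; i < extra; i++)
            len = (len << 8) | in[pos + i]; // big-endian
        pos += extra;
    }
    if (!masked || in.size() < pos + 4)
        return false;
    const uint8_t mask[4] = { in[pos], in[pos + 1], in[pos + 2], in[pos + 3] };
    pos += 4;
    if (in.size() - pos < len)
        return false;
    payload.clear();
    for (uint64_t i = 0; i < len; i++)
        payload.push_back(static_cast<char>(in[pos + i] ^ mask[i % 4]));
    return true;
}
```

Browsers always mask the frames they send, which is why the raw bytes received from the socket cannot be read as text directly.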

To send data, we call SendRequest() in a similar fashion.

C++

for (auto& m : Maps)
{
    std::vector<char> out;
    m.second.SendRequest("Hello", 5,out);
    m.first->transmit((char*)out.data(),(int)out.size(),1);
}
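The bytes that end up in the output buffer are a WebSocket frame. As a rough illustration of that wire format, and not of what SendRequest() does internally, the sketch below builds a single unmasked text frame the way RFC 6455 describes for server-to-client messages. EncodeServerTextFrame is a hypothetical name.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Build one unmasked text frame (FIN set, opcode 0x1).
std::vector<uint8_t> EncodeServerTextFrame(const std::string& text)
{
    std::vector<uint8_t> out;
    out.push_back(0x81); // FIN bit + text opcode
    const uint64_t len = text.size();
    if (len < 126)
    {
        out.push_back(static_cast<uint8_t>(len));
    }
    else if (len <= 0xFFFF)
    {
        out.push_back(126); // 16-bit big-endian length follows
        out.push_back(static_cast<uint8_t>(len >> 8));
        out.push_back(static_cast<uint8_t>(len & 0xFF));
    }
    else
    {
        out.push_back(127); // 64-bit big-endian length follows
        for (int shift = 56; shift >= 0; shift -= 8)
            out.push_back(static_cast<uint8_t>((len >> shift) & 0xFF));
    }
    out.insert(out.end(), text.begin(), text.end());
    return out;
}
```

A short message therefore costs only two bytes of framing, which is the overhead advantage over issuing a full HTTP request for every update.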

Note that I keep a map from each connected socket to its WS structure in order to handle multiple connections. Access to that map has to be synchronized between threads.
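One way to do that synchronization is to wrap the map in a small class that takes a mutex around every access. The following is a sketch under assumptions: ConnectionRegistry and Connection are hypothetical names, Connection stands in for the socket and WS pair, and an int id replaces the socket pointer used as the key in the article.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>

// Stand-in for the per-connection state (assumption, for illustration only).
struct Connection
{
    std::string lastSent;
};

class ConnectionRegistry
{
public:
    // Called by a connection thread once the handshake has succeeded.
    void Add(int id)
    {
        std::lock_guard<std::mutex> lock(m_);
        conns_[id]; // default-constructs the entry if it is missing
    }

    // Called by a connection thread just before it exits.
    void Remove(int id)
    {
        std::lock_guard<std::mutex> lock(m_);
        conns_.erase(id);
    }

    // Run fn on every connection while holding the lock.
    // Returns the number of connections visited.
    std::size_t ForEach(const std::function<void(Connection&)>& fn)
    {
        std::lock_guard<std::mutex> lock(m_);
        for (auto& kv : conns_)
            fn(kv.second);
        return conns_.size();
    }

private:
    std::mutex m_;
    std::map<int, Connection> conns_;
};
```

The broadcast loop then becomes a ForEach call. Holding the lock while sending keeps the design simple but blocks Add and Remove during a slow send, so a busier server might copy the list first and send outside the lock.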

Using this technique, I was able to create a small (so far) web control for Turbo Play.

Discovering the Service

Windows 10 and later also include a ZeroConf/mDNS discovery API, so you can publish the service via DNS-SD. The main function is DnsServiceRegister, which publishes our service:

C++

rd = {};
rd.pServiceInstance = &di;
rd.unicastEnabled = 0;
di.pszInstanceName = (LPWSTR)L"app._http._tcp.local";
di.pszHostName = (LPWSTR)L"myservice.local";
InetPtonA(AF_INET6, host6.c_str(), (void*)&i6);
di.ip6Address = &i6;
InetPtonA(AF_INET, host4.c_str(), (void*)&i4);
DWORD dword = i4;

// Hey, this IP4_ADDRESS is different than in_addr
DWORD new_dword = (dword & 0x000000ff) << 24 | (dword & 0x0000ff00) << 8 |
    (dword & 0x00ff0000) >> 8 | (dword & 0xff000000) >> 24;
i4 = new_dword;
di.ip4Address = &i4;
di.wPort = (WORD)Port;

rd.Version = DNS_QUERY_REQUEST_VERSION1;
rd.pRegisterCompletionCallback = [](DWORD Status,
    PVOID pQueryContext,
    PDNS_SERVICE_INSTANCE pInstance)
{
    DNSRegistration* r = (DNSRegistration*)pQueryContext;
    if (pInstance)
        DnsServiceFreeInstance(pInstance);
};
rd.pQueryContext = this;
auto err = DnsServiceRegister(&rd, 0);
if (err != DNS_REQUEST_PENDING)
    MessageBeep(0);

To terminate the registration, we would call DnsServiceDeRegister().
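The byte shuffling in the registration code above is a plain 32-bit byte swap: the address is written in network byte order, and the code reverses it before storing it in ip4Address. A minimal portable sketch of the same operation follows; SwapBytes32 is a hypothetical helper name, and with MSVC the _byteswap_ulong intrinsic performs the same swap.

```cpp
#include <cstdint>

// Reverse the four bytes of a 32-bit value.
constexpr uint32_t SwapBytes32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

For example, 127.0.0.1 stored in network byte order reads as 0x0100007F on a little-endian machine, and the swap turns it into 0x7F000001.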

The Code

The code contains a small executable that creates a web server and a websocket server to be opened by any browser, then allows the app to send a message and the browser to receive it and reply. Have fun with it!

History

1st May, 2022: First release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)