In this article
- Introduction
- A Teeny-Weeny Intro to Clustering and Load Balancing
- My Idea for Dynamic Software Load Balancing
- Architecture and Implementation
- Architecture Outlook
- Assemblies & Types
- Collaboration
- Some Implementation Details
- Load Balancing in Action - Balancing a Web Farm
- Building, Configuring and Deploying the Solution
- Configuration
- Deployment
- Some thoughts about MC++ and C#
- Managed C++ to C# Translation
- C#'s readonly fields vs MC++ non-static const members
- "Bugs suck. Period."
- TODO(s)
- Conclusion
- A (final) word about C#
- Disclaimer
"Success is the ability to go from one failure to another with no loss of
enthusiasm."
Winston Churchill
Introduction
<blog date="2002-12-05">
Yay! I passed 70-320 today and I'm now MCAD.NET. Expect the next article to
cover XML Web Services, Remoting, or Serviced Components:)
</blog>
This article is about Load Balancing. Neither "Unleashed", nor
"Defined" -- "Implemented":) I'm not going to discuss in detail what load balancing
is, its different types, or the variety of load balancing algorithms. I'm not
going to talk about proprietary software like WLBS, MSCS, COM+ Load Balancing or
Application Center either. What I am going to do in this article is present
a custom .NET Dynamic Software Load Balancing solution that I've implemented in
less than a week, and the issues I had to resolve to make it work. Though the
source code is only about 4 KLOC, by the end of this article you'll see that
the solution is good enough to balance the load of the web servers in a web
farm. Enjoy reading...
Everyone can read this article
...but not everybody will understand everything. To read and understand the
article, you're expected to know what load balancing is in general, but even if
you don't, I'll explain it shortly -- so keep reading. And to read the code, you
should have some experience with multithreading and network programming (TCP,
UDP and multicasting) and a basic knowledge of .NET Remoting. Contrary to what
C# developers think, you don't need to know Managed C++ to read the code. When you're
writing managed-only code, C# and MC++ source code look almost the same, with
very few differences, so I have even included a section
for C# developers which explains how to convert (most of) the MC++ code to C#.
A final warning before you focus on the article -- I'm not a professional writer,
I'm just a dev, so don't expect too much from me (this is my 3rd article). If you
feel that you don't understand something, that's probably because I'm not a native
English speaker (I'm Bulgarian), so I haven't been able to express what I was
thinking. If you find a grammatical nonsense, or even a typo, report it
to me as a bug and I'll be more than glad to "fix" it. And thanks for
bearing with this paragraph!
A Teeny-Weeny Intro to Clustering and Load Balancing
For those who don't have a clue what Load Balancing means, I'm about to give a short
explanation of clustering and load balancing. Very short indeed, because I
lack the time to write more about it, and because I don't want to waste the
space of the article with arid text. You're reading an article at
www.CodeProject.com, not at www.ArticleProject.com:) The enlightened may
skip the following paragraph, and I encourage the rest to
read it.
Mission-critical applications must run 24x7, and networks need to be able to
scale performance to handle large volumes of client requests without unwanted
delays. A "server cluster" is a group of independent servers managed as a single
system for higher availability, easier manageability, and greater scalability.
It consists of two or more servers connected by a network, and a cluster
management software, such as WLBS, MSCS or Application Center. The software
provides services such as failure detection, recovery, load balancing, and the
ability to manage the servers as a single system. Load balancing is a technique
that allows the performance of a server-based program, such as a Web server, to
be scaled by distributing its client requests across multiple servers within a
cluster of computers. Load balancing is used to enhance scalability, which
boosts throughput while keeping response times low.
I should warn you that I haven't implemented a complete clustering software,
but only the load balancing part of it, so don't expect anything more than
that. Now that you have an idea what load balancing is, I'm sure you don't yet
know what my idea for its implementation is. So keep reading...
My Idea for Dynamic Software Load Balancing
How do we know that a machine is busy? When we feel that our machine is
getting very slow, we launch the Task Manager and look for a hung instance of
iexplore.exe:) Seriously, we look at the CPU utilization. If it is low, then the
memory must be low, and the disk must be thrashing. If we suspect anything else to be the
reason, we run the System Monitor and add some performance counters to look at.
Well, this works if you're around the machine and if you have one or two machines
to monitor. When you have more machines, you'll have to hire a person, and buy
him 20-dioptre glasses to stare at all the machines' System Monitor consoles and
go crazy in about a week :). But even if you could monitor your machines constantly,
you couldn't distribute their workload manually, could you? Well, you could use some
expensive software to balance their load, but I assure you that you can do it
yourself, and that's what this article is all about. Just as you are able to
"see" the performance counters, you can also collect their values
programmatically. And I think that if we combine some of them in a certain way,
and do some calculations, they can give us a value that can be used to
determine the machine's load. Let's check if that's possible!
Let's monitor the \\Processor\% Processor Time\_Total
and
\\Processor\% User Time\_Total
performance counters. You can
monitor them by launching Task Manager and looking at the CPU utilization in
the "Performance" tab. (The red curve shows the % Processor Time, and
the green one -- the % User Time.) Stop or pause all CPU-intensive applications
(WinAMP, MediaPlayer, etc.) and start monitoring the CPU utilization. You'll
notice that the counter values stay almost constant, right? Now, close Task
Manager, wait about 5 seconds and start it again. You should notice a big peak
in the CPU utilization. In several seconds, the peak vanishes. Now, if we were
reporting performance counter values instantly (as we get each counter sample),
one could think that our machine was extremely busy (almost 100%) at that moment,
right? That's why we're not going to report instant values; instead, we will collect
several samples of the counter's values and report their average. That would
be fair enough, don't you think? No?! Neither do I, I was just checking you:)
What about available memory, I/O, etc.? Because the CPU utilization alone is not enough
for a realistic calculation of the machine's workload, we should monitor more
than one counter at a time, right? And because, let's say, the current number of
ASP.NET sessions is less important than the CPU utilization, we will give each
counter a weight. Now the machine load will be calculated as the sum of the
weighted averages of all monitored performance counters. You should already be
guessing my idea for dynamic software load balancing. However, a picture is worth
a thousand words, and an ASCII one is worth two thousand:) Here is a real sample, and
the machine load calculation algorithm. In the example below, the machine load
is calculated by monitoring 4 performance counters, each configured to collect
its next sample value at equal intervals, and all counters collect the same
number of samples (this would be your usual case):
+-----------+ +-----------+ +-----------+ +-----------+
|% Proc Time| |% User Time| |ASP Req.Ex.| |% Disk Time|
+-----------+ +-----------+ +-----------+ +-----------+
|Weight 0.4| |Weight 0.3| |Weight 0.2| |Weight 0.5|
+-----------+ +-----------+ +-----------+ +-----------+
| 16| | 55| | 11| | 15|
| 22| | 20| | 3| | 7|
| 8| | 32| | 44| | 4|
| 11| | 15| | 16| | 21|
| 18| | 38| | 21| | 3|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
| Sum | 75| | Sum | 160| | Sum | 95| | Sum | 50|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
| Avg | 15| | Avg | 32| | Avg | 19| | Avg | 10|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
| WA | 6.0| | WA | 9.6| | WA | 3.8| | WA | 5.0|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
Legend:
- Sum
- the sum of all counter samples
- Avg
- the average of all counter samples (Sum/Count)
- WA
- the weighted average of all counter samples (Sum/Count * Weight)
- % Proc Time
- (Processor\% Processor Time\_Total), the percentage of elapsed time
that the processor spends executing a non-Idle thread. It is calculated by measuring
the time that the idle thread is active in the sample interval, and subtracting that
time from the interval duration. (Each processor has an idle thread that consumes cycles
when no other threads are ready to run.) This counter is the primary indicator of
processor activity, and displays the average percentage of busy time observed during
the sample interval
- % User Time
- (Processor\% User Time\_Total) is the percentage of elapsed time the
processor spends in user mode. User mode is a restricted processing mode designed
for applications, environment subsystems, and integral subsystems. The alternative,
privileged mode, is designed for operating system components and allows direct access
to hardware and all memory. The operating system switches application threads to
privileged mode to access operating system services. This counter displays the average
busy time as a percentage of the sample time
- ASP Req.Ex.
- (ASP.NET Applications\Requests Executing\__Total__) is the number of
requests currently executing
- % Disk Time
- (Logical Disk\% Disk Time\_Total) is the percentage of elapsed time
that the selected disk drive was busy servicing read or write requests
Sum (% Proc Time) = 16 + 22 + 8 + 11 + 18 = 75
Average (% Proc Time) = 75 / 5 = 15
Weighted Average (% Proc Time) = 15 * 0.4 = 6.0
...
MachineLoad = Sum (WeightedAverage (EachCounter))
MachineLoad = 6.0 + 9.6 + 3.8 + 5.0 = 24.4
Architecture and Implementation
I wondered about half a day how to explain the architecture to you. Not that it
is so complex, but because it would take too much space in the article, and I
wanted to show you some code, not a technical specification or even a DSS. So I
wondered whether to explain the architecture using a "top-to-bottom"
or "bottom-to-top" approach, or should I think out something else?
Finally, as most of you have already guessed, I decided to explain it in my own
mixed way:) First, you should learn of which assemblies is the solution comprised
of, and then you could read about their collaboration, the types they contain
and so on... And even before that, I recommend you to read and understand two
terms, I've used throughout the article (and the source code's comments).
- Machine Load
- the overall workload (utilization) of a machine - in our case,
this is the sum of the weighted averages of all performance counters (monitored
for load balancing); if you've skipped the section
"My Idea for Dynamic Software Load Balancing", you may want to
go back and read it
- Fastest machine
- the machine with the least current load
Architecture Outlook
First, I'd like to apologize about the "diagrams". There are only two
software products I can work with that can draw the diagrams I needed in this article. I can't afford
the first (and my company is not willing to pay for it either:), and the second bedeviled
me so much that I dropped from the article one UML static structure diagram, a UML
deployment diagram and a couple of activity diagrams (and they were nearly complete).
I won't tell you the name of the product, because I like the company that
developed it very much. Just accept my apologies, and the pseudo-ASCII art, which replaced the
original diagrams. Sorry:)
The load balancing software comes in three parts: a server that reports the
load of the machine it is running on; a server that collects such loads, no
matter which machine they come from; and a library which asks the collecting
server which is the least loaded (fastest) machine. The server that reports the
machine's load is called "Machine Load Reporting Server" (MLRS), and
the server that collects machine loads is called "Machine Load Monitoring
Server" (MLMS). The library is named "Load Balancing Library" (LBL).
You can deploy these three parts of the software as you like. For example, you could install
all of them on all machines.
The MLRS server on each machine joins a special multicast group, designated for
the purpose of load balancing, and sends messages containing the
machine's load to the group's multicast IP address. Because all MLMS servers join
the same group at startup, they all receive each machine load, so if you run both
MLRS and MLMS servers on all machines, they will know each other's load. So what?
We have the machine loads, but what do we do with them? Well, all MLMS servers
store the machine loads in a special data structure, which lets them quickly
retrieve the least machine load at any time. So all machines now know which is
the fastest one. Who cares? We haven't really used that information to balance
any load, right? How do we query the MLMS servers which is the fastest machine? The
answer is that each MLMS registers a special singleton object with the .NET
Remoting runtime, so the LBL can create (or get) an instance of that object, and
ask it for the least loaded machine. The problem is that LBL cannot ask
all machines about this simultaneously (yet, but I'm thinking about this issue), so
it should choose one machine (of course, it could be the machine it is running
on) and will hand that load to the client application that needs the information
to perform whatever load balancing activity is suitable. As you will later see,
I've used LBL in a web application to distribute the workload between all web
servers in a web farm. Below is a "diagram" which depicts in general
the collaboration between the servers and the library:
+-----+ ______ +-----+
| A | __/ \__ | B |
+-----+ __/ \__ +-----+
+-->| LMS |<--/ Multicast \-->| LMS |<--+
| | | / \ | | |
| | LRS |-->\__ Group __/ | | |
| | | \__ __/ | | |
|<--| LBL | ^ \______/ | LBL |---+
| +-----+ | +-----+
| | +-----+
| | | C |
| | +-----+
| | | |
| | | |
| +--| LRS |
| Remoting | |
+--------------------| LBL |
+-----+
MLMS, MLRS and LBL Communication
Note: You should see the strange figure between the machines as a cloud,
i.e. it represents a LAN :) And one more thing -- if you don't understand what
multicasting is, don't worry, it is explained later in the
Collaboration section.
Now look at the "diagram" again. Let me remind you that when a machine
joins a multicast group, it receives all messages sent to that group, including
the messages that the machine itself has sent. Machine A receives its own load, and
the load reported by C. Machine B receives the loads of A and C (it does not
report its own load, because there's no MLRS server installed on it). Machine C does
not receive anything, because it has no MLMS server installed. Because
machine C's LBL should connect (via Remoting) to an MLMS server, and C has no
such server installed, it could connect to machine A or B and query the remoted
object for the fastest machine. On the "diagram" above, the LBL of A
and C communicate with the remoted object on machine A, while the LBL of B
communicates with the remoted object on its own machine. As you will later see in
the Configuration section, there are very few
things that are hardcoded in the solution's source code, so don't worry -- you
will be able to tune almost everything.
Assemblies & Types
The solution consists of 8 assemblies, but only three of them are of some
interest to us now: MLMS, MLRS, and LBL, located respectively in two console
applications (MachineLoadMonitoringServer
.exe
and
MachineLoadReportingServer
.exe
) and one dynamic link library
(LoadBalancingLibrary.dll
). Surprisingly, MLMS and MLRS do not
contain any types. However, they use several types to get their job done. You may
wonder why I have designed them that way. Why didn't I just implement both
servers directly in the executables? Well, the answer is quite simple and
reflects both my strengths and weaknesses as a developer. If you have the time
to read about it, go ahead, otherwise click here to skip
the slight detour.
GUI programming is what I hate (though I've written a bunch of GUI apps). For
me, it is mundane work, more suitable for a designer than for a developer. I
love to build complex "things". Server-side applications are my
favorite ones. Multi-threaded, asynchronous programming -- that's the
"stuff" I love. Applications that nobody "sees" except
for a few administrators, who configure and/or control them using some sort of
administration console. If these applications work as expected, the end-user will
almost never know s/he is using them (e.g. in most cases, a user browsing a web
site does not realize that an IIS or Apache server is processing her requests and
is serving the content). Now, I've written several Windows C++ services in the
past, and I've written some .NET Windows services recently, so I could easily
convert MLMS and MLRS to one of these. On the other hand, I love console (CUI)
applications so much, and I like seeing hundreds of tracing messages on the
console, so I left MLMS and MLRS in their CUI form for two reasons. The first
reason is that you can quickly see what's wrong when something goes wrong (and
it will, at least once:), and the second one is that I haven't debugged .NET
Windows services (and because I have debugged C++ Windows services, I can assure
you that it's no "piece of cake"). Nevertheless, one can easily
convert both CUI applications into Windows services in less than half an hour. I
haven't implemented the server classes in the executables, to make it easier for
the guy who would convert them into Windows services. S/he'll need to write just
4 lines of code in the Windows Service class to get the job done:
- declare the server member variable:
LoadXxxServer __gc* server;
- instantiate and start it in the overridden
OnStart
method:
server = new LoadXxxServer ();
server->Start ();
- stop it in the overridden
OnStop
method:
server->Stop ();
Xxx
is either Monitoring
or Reporting
.
I'm sure you now understand why I have implemented the servers' code in
separate classes in separate libraries, and not directly in the executables.
I mentioned above that the solution consists of 8 assemblies, but as you
remember, 2 of them (the CUIs) do not contain any types, and one of them is LBL,
so what are the other 5? MLMS and MLRS use respectively the types contained in
the libraries LoadMonitoringLibrary (LML) and LoadReportingLibrary (LRL). On the
other hand, they and LBL use common types, shared in an assembly named
SharedLibrary (SL). So the assemblies are now MLMS + MLRS + LML + LRL + LBL +
SL = 6. The 7th is a simple (not interesting) CUI application I used to test the
load balancing, so I'll skip it. The last assembly is the web application that
demonstrates the load balancing in action. Below is a list of the four most
important assemblies that contain the types and logic for the implementation of
the load balancing solution.
SharedLibrary
(SL) - contains common and helper types, used by LML,
LRL and/or LBL. A list of the types (explained further) follows:
ServerStatus
- enumeration, used by LML and LRL's
LoadXxxServer classes
WorkerDoneEventHandler
- delegate, ditto
Configurator
- utility class (I'll discuss later), ditto
CounterInfo
- "struct" class, used by LRL and SL
ILoadBalancer
- interface, implemented in LML and used by LBL
IpHelper
- utility class, used by LML and LRL
MachineLoad
- "struct" class (with MarshalByValue
semantics for the needs of the Remoting runtime), used by LML, LRL and LBL
Tracer
- utility class, which most classes in LML and LRL
inherit in order to trace in the console in a consistent manner
NOTE:
CounterInfo
is not exactly what C++ developers call a "struct" class,
because it does a lot of work behind the scenes. Its implementation is non-
trivial and includes topics like timers, synchronization, and performance
counters monitoring; look at the Some Implementation Details
section for more information about it.
LoadMonitoringLibrary
(LML) - contains the LoadMonitoringServer
(LMS) class,
used directly by MLMS, as well as all classes, used internally in the LMS
class. List of LML's types (explained further) follows:
LoadMonitoringServer
- (LMS) class, the MLMS core
MachineLoadsCollection
- a simulation of a priority queue that stores the
machines' loads in a sorted manner, so it could quickly return the least
loaded machine (its implementation is more interesting than its name)
LoadMapping
- "struct" class, used internally by MachineLoadsCollection
CollectorWorker
- utility class, its only (public) method is the worker
thread that accepts and collects machine load reports
ReporterWorker
- utility class, its only (public) method is the worker
thread that accepts LBL requests and reports machine loads
WorkerTcpState
- "struct" class, used internally by the CollectorWorker
WorkerUdpState
- "struct" class, used internally by the ReporterWorker
ServerLoadBalancer
- a special Remoting-enabled (MarshalByRefObject) class,
which is registered for remoting as a Singleton, and activated on the server
side by LBL to service its requests
NOTE: I used the ReporterWorker
to implement the first version of LBL in some
faster, more lame way, but I've dropped it later; now, LMS registers a Singleton
object for the LBL requests; however, LMS is still using (the fully functional)
ReporterWorker
class, so one could build another kind of LBL that connects to an
MLMS and asks for the least loaded machine using a simple TCP socket (I'm sorry
that I've overwritten the old LBL library).
LoadReportingLibrary
(LRL) - contains the LoadReportingServer
(LRS) class, used
directly by MLRS, as well as all classes, used internally in the LRS class.
List of LRL's types (explained further) follows:
LoadReportingServer
- class, the MLRS core
ReportingWorker
- utility class, its only (public) method is the worker
thread that starts the monitoring of the performance counters and periodically
reports to one or more MLMS the local machine's load
LoadBalancingLibrary
(LBL) - contains just one class, ClientLoadBalancer, which
is instantiated by client applications; the class contains only one (public)
method, "surprisingly" named GetLeastMachineLoad, which returns the least
loaded machine:) LBL connects to LML's ServerLoadBalancer singleton object via
singleton object via
Remoting. For more details, read the following section.
Collaboration
In order to understand how the objects "talk" to each other within an assembly
and between assemblies (and on different machines), you should understand some
technical terms. Because they amount to about a page, and maybe most of you do
know what they mean, here's what I'll do: I'll give you a list of the terms, and
if you know them, click here to read about the collaboration,
otherwise, keep reading... The terms are: delegate, worker, TCP, UDP, (IP) Multicasting,
and Remoting.
- Delegate
- a secure, type-safe way to call a method of a class indirectly, using
a "reference" to that method; very similar to and at the same time quite
different from C/C++ function pointers (callbacks);
- Worker
- utility class, usually with just one method, which is started as a
separate thread; the class holds the data (is the state), needed by the thread
to do its job;
- TCP
- a connection-based, stream-oriented delivery protocol with end-to-end
error detection and correction. Connection-based means that a communication
session between hosts is established before exchanging data. A host is any
device on a TCP/IP network identified by a logical IP address. TCP provides
reliable data delivery and ease of use. Specifically, TCP notifies the sender of
packet delivery, guarantees that packets are delivered in the same order in
which they were sent, retransmits lost packets, and ensures that data packets
are not duplicated;
- UDP
- a connectionless, unreliable transport protocol. Connectionless means that
a communication session between hosts is not established before exchanging data.
UDP is often used for one-to-many communications that use broadcast or multicast
IP datagrams. The UDP connectionless datagram delivery service is unreliable
because it does not guarantee data packet delivery and no notification is sent
if a packet is not delivered. Also, UDP does not guarantee that packets are
delivered in the same order in which they were sent. Because delivery of UDP
datagrams is not guaranteed, applications using UDP must supply their own
mechanisms for reliability, if needed. Although UDP appears to have some
limitations, it is useful in certain situations. For example, Winsock IP
multicasting is implemented with UDP datagram type sockets. UDP is very
efficient because of low overhead. Microsoft networking uses UDP for logon,
browsing, and name resolution;
- Multicasting
- technology that allows data to be sent from one host and then
replicated to many others without creating a network traffic nightmare. This
technology was developed as an alternative to broadcasting, which can negatively
impact network bandwidth if used extensively. Multicast data is replicated to a
network only if processes running on workstations in that network are interested
in that data. Not all protocols support the notion of multicasting -- on Win32
platforms, only two protocols are capable of supporting multicast traffic: IP
and ATM;
- IP Multicasting
- IP multicasting relies on a special group of addresses known
as multicast addresses. It is this group address that names a given group. For
example, if five machines all want to communicate with one another via IP
multicast, they all join the same group address. Once they are joined, any data
sent by one machine is replicated to every member of the group, including the
machine that sent the data. A multicast IP address is a class D IP address in
the range 224.0.0.0 through 239.255.255.255
- Remoting
- the process of communication between different operating system
processes, regardless of whether they are on the same computer. The .NET
remoting system is an architecture designed to simplify communication between
objects living in different application domains, whether on the same computer or
not, and between different contexts, whether in the same application domain or
not.
I'll start from the inside out, i.e. I'll first explain how the various classes
communicate with each other within the assemblies, and then I'll explain how the
assemblies collaborate with one another.
In-Assembly collaboration (a thread synchronization how-to:)
When the LoadReportingServer
and LoadMonitoringServer
classes are instantiated,
and their Start
methods are called, they launch respectively one or two
threads to do their job asynchronously (and to be able to respond to "Stop"
commands, of course). Well, if starting a thread is very easy, controlling it is
not that easy. For example, when the servers should stop, they should notify the
threads that they are about to stop, so the threads could finish their job and
exit appropriately. On the other hand, when the servers launch the threads, they
should be notified when the threads are about to enter their thread loops and
have executed their initialization code. In the next couple of paragraphs I'll
explain how I've solved these synchronization issues, and if you know a cooler
way, let me know (with the message board below). In the paragraphs below, I'll
refer to the instances of the LoadReportingServer
and LoadMonitoringServer
classes as "(the) server".
When the Start
method is executed, the LMS object creates a worker class
instance, passing it a reference to itself (this), a reference to a delegate and
some other useful variables that are not interesting for this section. The
server object then creates an AutoResetEvent
object in an unsignalled state. Then
the LMS object starts a new thread, passing for the ThreadStart
delegate the
address of a method in the worker class. (I call a worker class' method, launched
as a thread a worker thread.) After the thread has been started, the server object
blocks, waiting (infinitely) for the event object to be signalled. Now, when the
thread's initialization code completes, it calls back the server via the server-
supplied delegate, passing a boolean parameter showing whether its initialization
code executed successfully or something went wrong. The target method of the
delegate in the server class sets (puts in signalled state) the AutoResetEvent
object and records in a private boolean member the result of the thread
initialization. Setting the event object unblocks the server: it now knows that
the thread's startup code has completed, and also knows the result of the thread's
initialization. If the thread did not manage to initialize successfully, it has
already exited, and the server just stops. If the thread initialized successfully,
it enters its thread loop and waits for the server to inform it when it should
exit the loop (i.e. the server is stopping). One could argue that this
"worker thread-to-main thread" synchronization looks too complicated
and he might be right. If we only needed to know that the thread has finished the
initialization code (and don't care if it initialized successfully) we could
directly pass the worker a reference to the AutoResetEvent
object,
and the thread would then set it to a signalled state, but you saw that we need
to know whether the thread has initialized successfully or not.
Now that was the more complex part. The only issue we have to solve now is how
to stop the thread, i.e. make it exit its thread loop. Well, that's what I call
a piece of cake. If you remember, the server has passed a reference to itself
(this
) to the worker. The server has a Status
property,
which is an enumeration, describing the state of the server (Starting, Started,
Stopping, Stopped). Because the thread has a reference to the server, in its
thread loop it checks (by invoking the Status
property) whether the
server is not about to stop (Status == ServerStatus::Stopping
). If
the server is stopping, so is the thread, i.e. the thread exits silently and
everything's OK. So when the server is requested to stop, it modifies its private
member variable status
to Stopping
and Join
s
the thread (waits for the thread to exit) for a configured interval of time. If
the thread exits in the specified amount of time, the server changes its status
to Stopped
and we're done. However, a thread may time out while
processing a request, so the server then aborts the thread by calling the
thread's Abort
method. I've written the thread loops in
try...catch...finally
blocks and in their catch
clause,
the threads check whether they die a violent death:), i.e. a
ThreadAbortException
was raised by the server. The thread then
executes its cleanup code and exits. (And I thought that was easier to explain:)
So much for how the server classes talk to the worker classes (main thread to
worker threads). The rest of the objects in the assemblies communicate using
references to each other or via delegates. Now comes the part that explains how
the assemblies "talk" to each other, i.e. how the MLRS sends its
machine's load to the MLMS, and how LBL gets the minimum machine load from MLMS.
Cross-machine assembly collaboration
I'll first "talk" about how the MLRS reports the machine load to MLMS. To save some
space in the article (some of your bandwidth, and some typing for me:), I'll
refer to the LoadReportingServer
class as LRS and to the LoadMonitoringServer
class as LMS. Do not confuse them with the server applications, having an "M"
prefix.
LMS starts two worker threads. One for collecting machine loads, and one for
reporting the minimum load to interested clients. The former is named
CollectorWorker
, and the latter -- ReporterWorker
.
I've mentioned somewhere above that the ReporterWorker
is not so
interesting, so I'll talk only about the CollectorWorker
. In the
paragraphs below, I'll call it simply a collector. When the collector thread is
started, it creates a UDP socket, binds it locally and adds it to a multicast
group. That's the collector's initialization code. It then enters a thread loop,
periodically polling the socket for arrived requests. When a request comes, the
collector reads the incoming data, parses it, validates it, and if it is a valid
machine load report, it enters the load in the machine loads
collection of the LMS class. That's pretty much everything you need to know
about how MLMS accepts machine loads from MLRS.
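To make the "parses it, validates it" step concrete, here's a small sketch in standard C++. The article doesn't show the wire format, so the "name;load" text datagram below is purely my assumption, not what MLRS actually sends:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical wire format: "machine-name;load". The collector's job is
// the same either way: parse, validate, and only then admit the report.
struct LoadReport {
    std::string name;
    double load;
};

bool TryParseReport(const std::string& datagram, LoadReport& out) {
    std::string::size_type sep = datagram.find(';');
    if (sep == std::string::npos || sep == 0 || sep + 1 >= datagram.size())
        return false;                       // missing name or load part
    std::string name = datagram.substr(0, sep);
    std::string loadPart = datagram.substr(sep + 1);
    char* end = nullptr;
    double load = std::strtod(loadPart.c_str(), &end);
    if (end == loadPart.c_str() || *end != '\0')
        return false;                       // load is not a clean number
    if (load < 0.0)
        return false;                       // a machine load can't be negative
    out.name = name;
    out.load = load;
    return true;
}
```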
LRS starts one thread for reporting the current machine load. The worker's name
is ReportingWorker, and I'll refer to it as the reporter. The
initialization code of this thread is to start monitoring the performance
counters, create a UDP socket and make it a member of the same multicast group
that MLMS's collector object has joined. In its thread loop, the reporter waits
a predefined amount of time, then gets the current machine load and sends it to
the multicast endpoint. A network device, called a "switch" then
dispatches the load to all machines that have joined the multicast group, i.e.
all MLMS collectors will receive the load, including the MLMS that runs on the
MLRS machine (if a MLMS has been installed and running there).
Here comes the most interesting part -- how LBL queries which is the machine
with the least load (the fastest machine). Well, it is quite simple and requires
only basic knowledge about .NET Remoting. If you don't understand Remoting, but
you do understand DCOM, assume that .NET Remoting compared to DCOM is what C++
is compared to C. You'll be quite close and at the same time quite far from what
Remoting really is, but you'll get the idea. (In fact, I've read several books
on DCOM, and some of them referred to it as "COM Remoting Infrastructure").
When MLMS starts, it registers a class named ServerLoadBalancer
with
the Remoting runtime as a singleton (an object that is instantiated just once,
and further requests for its creation end up getting a reference to the same,
previously instantiated object). When a request to get the fastest machine comes
(GetLeastMachineLoad
method gets called) the singleton asks the
MachineLoadsCollection
object to return its least load, and then
hands it to the client object that made the remoted call.
Below is a story you would like to hear about remoted objects that need to have
parameter-less constructors. If you'd like to skip the story,
click here, otherwise enjoy...
Now that all of you know that an object may be registered for remoting, probably
not many of you know that you do not have easy control over the object's
instantiation. Which means that you don't create an instance of the singleton
object and register it with the Remoting runtime, but rather the Remoting
runtime creates that object when it receives the first request for the object's
creation. Now, all server-activated objects must have a parameter-less
constructor, and the singleton is not an exception. But we want to pass our
ServerLoadBalancer
class a reference to the machine loads collection.
I see only two ways to do that -- the first one is to register the object with
the Remoting runtime, create an instance of it via Remoting and call an
"internal" method Initialize
, passing the machine loads
collection to it. At first that sounded like a good idea and I did it just like
that. Then I launched the client testing application first, and the server after
it. Can you guess what happened? The client managed to create the singleton
first, and it was not initialized -- boom!!! Not what we expected, right? So I
thought a bit about how to find a workaround. Luckily, it occurred to me how to hack
this problem. I decided to make a static member of the
LoadMonitoringServer
class, which would hold the machine loads
collection. At the beginning it would be a null
reference, then
when the server starts, I would set it to the server's machine loads collection.
Now when our "parameter-less constructed" singleton object is
instantiated for the first time by the Remoting runtime, it would get the
machine loads via the LoadMonitoringServer::StaticMachineLoads
member variable and the whole problem has disappeared. I had to only mark the
static member variable as (private public
) so it is visible only
within the assembly. I know my approach is a hack, and if you know a better
pattern that solves my problem, I'll be happy to learn it.
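Here's the hack in miniature as standard C++ (the names only loosely mirror the real classes): the server parks the shared state in a static member before any client can activate the singleton, and the parameter-less constructor picks it up from there:

```cpp
#include <string>
#include <vector>

// Stand-in for the MachineLoadsCollection class.
struct MachineLoads {
    std::vector<double> loads;
};

// Stand-in for LoadMonitoringServer with its static holder.
struct LoadMonitoringServer {
    static MachineLoads* StaticMachineLoads;    // set when the server starts
};
MachineLoads* LoadMonitoringServer::StaticMachineLoads = nullptr;

// The "parameter-less constructed" singleton: it cannot take the
// collection as a constructor argument, so it reads the static member.
class ServerLoadBalancer {
public:
    ServerLoadBalancer()
        : loads(LoadMonitoringServer::StaticMachineLoads) {}

    bool IsInitialized() const { return loads != nullptr; }

private:
    MachineLoads* loads;
};
```

The order is the whole point: as long as the server assigns the static member before registering the type, even a client that activates the singleton first gets an initialized object.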
Here's another interesting issue. How does the client (LBL) compile against the
remoted ServerLoadBalancer
class? Should it have a reference
(#using "...dll") to LML or what? Well, there is a solution to this
problem, and I haven't invented it, though I'd have liked to:) I mentioned before,
that the SharedLibrary
has some shared types, used by LBL, LMS and
LRS. No, it's not what you're thinking! I couldn't have put the
ServerLoadBalancer
class there even if I wanted to, because it
requires the MachineLoadsCollection
class, and the latter is located
in LML. What I consider an elegant solution, (and what I did) is defining an
interface in the SharedLibrary
, which I implemented in the
ServerLoadBalancer
class in LML. LBL tries to create the
ServerLoadBalancer
via Remoting, but it does not explicitly try to
create a ServerLoadBalancer
instance, but an instance, implementing
the ILoadBalancer
interface. That's how it works. LBL
creates/activates the singleton on the LMS side via Remoting and calls its
GetLeastMachineLoad
method to determine the fastest machine.
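The shared-interface trick, boiled down to standard C++ (stub values, and a plain factory standing in for the Remoting activation):

```cpp
#include <memory>
#include <string>

struct MachineLoad {
    std::string name;
    double load;
};

// Lives in the "SharedLibrary": the only type the client compiles against.
struct ILoadBalancer {
    virtual ~ILoadBalancer() = default;
    virtual MachineLoad GetLeastMachineLoad() = 0;
};

// Lives in "LML": the client never names this class.
class ServerLoadBalancer : public ILoadBalancer {
public:
    MachineLoad GetLeastMachineLoad() override {
        return MachineLoad{"FASTEST01", 12.5};  // stub value for the sketch
    }
};

// Stand-in for activating the remote singleton through the interface.
std::unique_ptr<ILoadBalancer> ActivateBalancer() {
    return std::make_unique<ServerLoadBalancer>();
}
```

The client holds only an ILoadBalancer pointer, which is exactly how LBL stays decoupled from LML.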
Some Implementation Details
Below is a list of helper classes that are cool, reusable, or worth mentioning.
I'll try to explain their cool sides, but you should definitely peek at the
source code to see them:)
Configurator
I like the .NET configuration classes very much, and I hate reinventing the
wheel, but this class is a specific configuration class for this solution and is
cooler than the .NET configuration classes in at least one respect. What makes the class
cooler is that it can notify certain objects when the configuration changes,
i.e. when the underlying configuration file has been modified with some text editor.
So I've built my own configurator class, which uses the FileSystemWatcher class
to sniff for writes in the configuration file, and when the file changes, the
configurator object re-loads the file, and raises an event to all subscribers
that need to know about the change. These subscribers are only two, and they are
the Load Monitoring and Reporting servers. When they receive the event, they
restart themselves, so they can reflect the latest changes immediately.
CounterInfo
I used to call this class a "struct" one. I wasn't fair to it :), as it is one
of the most important classes in the solution. It wraps a PerformanceCounter
object in it, retrieves some sample values, and stores them in a cyclic queue.
What is a cyclic queue? Well, I guess there's no such animal :) but as I have
"invented" it, let me explain what it is. It is a simple queue that allows only a
finite number of elements. When the queue "overflows", it pops the oldest
element and pushes the new element into the queue. Here's an example
of storing the numbers from 1 to 7 in a 5-element cyclic queue:
Pass Queue Running Total (Sum)
---- ----- -------------------
[] = 0
1 [1] = 0 + 1 = 1
2 [2 1] = 1 + 2 = 3
3 [3 2 1] = 3 + 3 = 6
4 [4 3 2 1] = 6 + 4 = 10
5 [5 4 3 2 1] = 10 + 5 = 15
6 [6 5 4 3 2] = 15 - 1 + 6 = 20
7 [7 6 5 4 3] = 20 - 2 + 7 = 25
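The table above translates into very little code. Here's a sketch of such a cyclic queue in standard C++ (not the actual CounterInfo code), keeping a running sum so the average never needs to re-scan the samples:

```cpp
#include <cstddef>
#include <deque>

// A bounded queue of counter samples with a running sum, as in the table:
// when it overflows, the oldest sample leaves the sum and the new one enters.
class CyclicQueue {
public:
    explicit CyclicQueue(std::size_t capacity) : capacity(capacity) {}

    void Push(double sample) {
        if (samples.size() == capacity) {       // "overflow": pop the oldest
            sum -= samples.front();
            samples.pop_front();
        }
        samples.push_back(sample);
        sum += sample;
    }

    // The article's rule: no machine reports until its queues have filled once.
    bool Full() const { return samples.size() == capacity; }
    double Sum() const { return sum; }
    double Average() const { return Full() ? sum / capacity : 0.0; }

private:
    std::size_t capacity;
    std::deque<double> samples;
    double sum = 0.0;
};
```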
Why do I need the cyclic queue? To have a limited state of each monitored
performance counter, of course. If pass 5 was the state of the counter 3 seconds
ago, then its average was 15/5 = 3, and if now we are at pass 7, the counter's
average is 25/5 = 5. Sounds realistic, doesn't it? So we use the cyclic queue
to store the transitory counter samples and know the average for the past N
samples, measured over the past M seconds. You can see how easily the running
sum is calculated. Now the only thing a counter should do to tell its average is
to divide the running sum by the number of sample values it has collected. You know
that the machine load is the sum of the weighted averages of all monitored performance
counters for the given machine. But you might ask, what happens in the following
situation:
We have two machines: A and B. Both are measuring just one counter, their CPU
utilization. A Machine Load Monitoring Server is running on a third machine C,
and a load balancing client is on a fourth machine, D. A and B's Load Reporting
Servers have just started. Their CounterInfo classes have recorded respectively
50 and 100 (because the administrator on machine B has just launched IE:). A and
B are configured to report each second, but they should report the weighted
averages of 5 sample values. 1 second elapses, but A and B have each collected
only 1 sample value. Now D asks C which is the least loaded machine. Which one
should be reported? A or B? The answer is simple: neither. No machine is allowed to
report its load unless it has collected the necessary number of sample values
for all performance counters. That means that until A and B have filled up
their cyclic queues for the very first time, they block and don't return their
weighted average to the caller (the LRS's reporter worker).
MachineLoadsCollection
This class is probably more tricky than interesting. Generally, it is used to
store the loads that one or more LRSes report to the LMS. That's the class's dumb
side. One cool side of the class is that it stores the loads in 3 different data
structures to simulate one, that is missing in the .NET BCL - a priority queue
that can store multiple elements with the same key, or in STL terms, something
like
std::priority_queue <std::vector <X *>, ... >
I know that C++ die-hards know it by heart, but for the rest of the audience:
std::priority_queue
is a template container adaptor class
that provides a restriction of functionality limiting access to the top element
of some underlying container type, which is always the largest or of the highest
priority. New elements can be added to the priority_queue
and the
top element of the priority_queue
can be inspected or removed. I took
the definition from MSDN, but I'd like to correct it a little bit: you should
read "which is always the largest or of the highest priority" as
"which is always what the less
functor returns as largest or of
the highest priority". At the beginning, I thought to use the
priority_queue
template class, and put there
"gcroot"
-ed references, but then I thought that it would
be more confusing and difficult than helping me, and you, the reader. Do you know
what the "gcroot"
template does? No? Nevermind then:) In
.NET BCL classes, we have something which is very similar to a priority queue -- that's the
SortedList
class in System::Collections
. Because it
can store any Object
-based instances, we could put ArrayList
references in it to simulate a priority queue that stores multiple elements with
the same key. There's also a Hashtable
to help us solve certain
problems, but we'll get to it in a minute. Meanwhile, keep reading to understand
why I need these data structures in the first place.
Machine loads do not enter the machine loads collection by name; instead, they are
added to the loads collection with the key being the machine's load. That's why
before each machine reports its load, it converts the latter to an unsigned long
and then transmits it over the wire to LMS. This helps restrict the number of
distinct stored loads, e.g. if machine A has a load of 20.1 and machine B has a
load of 20.2, then the collection considers the loads equal. When LMS "gets" the
load, it adds it in the SortedList
, i.e. if we have three
machines -- "A", "B", and "C" with loads 40, 20
and 30, then the SortedList
looks like:
[B:20][C:30][A:40]
If anyone asks for the fastest machine, we always return the 1st positional
element in the sorted list, (because it is sorted in ascending order).
Well, I'd like it to be so simple, but it isn't. What happens when a 4th
machine, "D", reports a load of 20? You should have guessed by now why I need to
store an ArrayList for each load, so here it is in action -- it stores the loads
of machines B and D:
[D:20]
[B:20][C:30][A:40]
Now, if anyone asks for the fastest machine, we will return the first element of
the ArrayList that is stored in the first element of the SortedList, right? It
is machine "B".
But then what happens when machine "B" reports another load, equal to 40? Shall
we leave the first reported load? Of course not! Otherwise, we will return "B"
as the fastest machine, whereas "D" would be the one with the least load. So we
should remove machine "B"'s older load from the first ArrayList
and insert its
new load, wherever is appropriate. Here's the data structure then:
[D:20][C:30][A:40]
Now how did you find machine "B"'s older load in order to remove it?
Eh? I guess with your eyes. Here's where we need that Hashtable
I mentioned
above. It is a mapping between a machine's older load and the list it resides in
currently. So when we add a machine load, we first check whether the machine has
reported a load before, and if it did, we find the ArrayList
, where
the old load was placed, remove it from the list, and add the new load to a new
list, right? Wrong. We have one more thing to do, but first let me show you the
problem, and you'll guess what else I've done to make the collection work as expected.
Imagine that machine "D" reports a new load -- 45. Now you'll say that
the data now looks like the one below:
[C:30][A:40][D:45]
You wish it looked like this! That's because I made a mistake when I was
trying to visualize the first loads. Actually, the previous loads collection
looked like this:
^
|
M
A
C . . . .
H . . . .
I . . . .
N . . . .
E . . B .
S D C A .
LOAD 20 30 40 . . . -->
So you now will agree, that the collection actually looks like this:
^
|
M
A
C . . . .
H . . . .
I . . . .
N . . . .
E . . B .
S . C A D
LOAD 20 30 40 45 . . -->
Yes, the first list is empty, and when a request to find the least loaded
machine comes and you try to pop up the first element of the
ArrayList
for load 20 (which is the least load), you'll get
IndexOutOfRangeException
, as I got it a couple of times before I
debugged to understand what was happening. So when we remove an old load from
an ArrayList
, we should check whether it has been orphaned (is now empty),
and if this is the case, we should remove the ArrayList
from the
SortedList
as well.
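Before the MC++ code, here's the same bookkeeping sketched in standard C++: a sorted map of load -> machines plays the role of the SortedList of ArrayLists, and a second map plays the Hashtable that finds a machine's older load (simplified: no locking, no Grim Reaper):

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

class MachineLoads {
public:
    void Add(const std::string& name, unsigned long load) {
        auto old = current.find(name);
        if (old != current.end()) {             // remove the stale load first
            std::vector<std::string>& list = byLoad[old->second];
            for (auto it = list.begin(); it != list.end(); ++it)
                if (*it == name) { list.erase(it); break; }
            if (list.empty())                   // don't leave an orphaned list!
                byLoad.erase(old->second);
        }
        byLoad[load].push_back(name);
        current[name] = load;
    }

    // Least-loaded machine: front of the first (smallest-key) list.
    std::string Fastest() const {
        if (byLoad.empty()) return "";
        return byLoad.begin()->second.front();
    }

private:
    std::map<unsigned long, std::vector<std::string>> byLoad;
    std::unordered_map<std::string, unsigned long> current;
};
```

Running the article's own A/B/C/D scenario through this sketch reproduces every step, including the orphaned-list case when D leaves load 20.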
Here's the code for the Add
method:
void MachineLoadsCollection::Add (MachineLoad __gc* machineLoad)
{
DEBUG_ASSERT (0 != machineLoad);
if (0 == machineLoad)
return;
String __gc* name = machineLoad->Name;
double load = machineLoad->Load;
Object __gc* boxedLoad = __box (load);
rwLock->AcquireWriterLock (Timeout::Infinite);
ArrayList __gc* loadList = 0;
if (!loads->ContainsKey (boxedLoad))
{
loadList = new ArrayList ();
loads->Add (boxedLoad, loadList);
}
else
{
loadList = static_cast<ArrayList __gc*> (loads->get_Item (boxedLoad));
}
if (!mappings->ContainsKey (name))
{
loadList->Add (machineLoad);
mappings->Add (name, new LoadMapping (machineLoad, loadList));
}
else
{
LoadMapping __gc* mappedLoad =
static_cast<LoadMapping __gc*> (mappings->get_Item (name));
MachineLoad __gc* oldLoad = mappedLoad->Load;
ArrayList __gc* oldList = mappedLoad->LoadList;
mappings->Remove (name);
int index = oldList->IndexOf (oldLoad);
oldList->RemoveAt (index);
loadList->Add (machineLoad);
mappings->Add (name, new LoadMapping (machineLoad, loadList));
if (oldList->Count == 0)
loads->Remove (__box (oldLoad->Load));
}
rwLock->ReleaseWriterLock ();
}
Now, for the curious, here's the get_MinimumLoad
property's code:
MachineLoad __gc* MachineLoadsCollection::get_MinimumLoad ()
{
MachineLoad __gc* load = 0;
rwLock->AcquireReaderLock (Timeout::Infinite);
if (loads->Count > 0)
{
ArrayList __gc* minLoadedMachines =
static_cast<ArrayList __gc*> (loads->GetByIndex (0));
load = static_cast<MachineLoad __gc*> (minLoadedMachines->get_Item (0));
}
rwLock->ReleaseReaderLock ();
return (load);
}
Well, that's pretty much it about how the MachineLoadsCollection class works in
order to store the machine loads and return the least loaded machine. Now we
will see what else is cool about this class. I called it the Grim Reaper, and
that's what it is -- a method, named GrimReaper
(GR), that runs
asynchronously (using a Timer
class) and kills dead machines!:) Seriously,
GR knows the interval at which each machine, once it has reported a load, should
report it again. If a machine fails to report its load in a timely manner, it is
removed from the MachineLoadsCollection container. In this way, we guarantee
that a machine that is now dead (or disconnected from the network) will not
be returned as the fastest machine, at least not before it reports again (it is
brought back into the load balancing then). However, in only about 30 lines of
code, I managed to make two mistakes in the GR code. The first one was very
lame -- I was trying to remove an element from a hash table while I was iterating
over its elements, but the second was a real bitch! However, I found it quite
quickly, because I love console applications:) I was outputting a star (*) when
GR was executing, and a caret (^) when it was killing a machine. I then observed
that even if the (only) machine was regularly reporting its load, at some point,
GR was killing it! I was staring at the console for at least 3 minutes. The GR
code was simple, and I thought that there's no chance to make a mistake there. I
was wrong. It occurred to me that I wasn't considering the fact that the GR code
takes some time to execute. It was running fast enough, but it was taking some
interval of time. Well, during that time, GR was locking the machine loads
collection. And while the collection was locked, the collector worker was blocked,
waiting for the collection to be unlocked so it could enter the newly received
load there. So when the collection was finally unlocked at the end of the GR
code, the collector entered the machine's load. You can guess what happens when
the GR is configured to run in shorter intervals and the machines report in
longer intervals. GR locks, and locks, and locks, while the collector blocks, and
blocks, and blocks, until a machine's report is delayed by the GR itself. However,
because GR is oblivious to the outer world, it thinks that the machine is dead, so
it removes the machine from the load balancing until the next time it reports a
brand new load. My solution for this issue? I have it in my head, but I'll
implement it in the next version of the article, because I really ran out of
time. (I couldn't post the article for November's contest, because I
couldn't finish this text in time. It seems that writing text in plain English is
more difficult than writing Managed C++ code, and I don't want to miss December's
contest too:)
If anyone is interested, here is Grim Reaper's code:
void MachineLoadsCollection::GrimReaper (Object __gc* state)
{
MachineLoadsCollection __gc* mlc = static_cast<MachineLoadsCollection __gc*> (state);
mlc->grimReaper->Change (Timeout::Infinite, Timeout::Infinite);
if (!mlc->keepGrimReaperAlive)
return;
ReaderWriterLock __gc* rwLock = mlc->rwLock;
SortedList __gc* loads = mlc->loads;
Hashtable __gc* mappings = mlc->mappings;
int reportTimeout = mlc->reportTimeout;
rwLock->AcquireWriterLock (Timeout::Infinite);
StringCollection __gc* deadMachines = new StringCollection ();
DateTime dtNow = DateTime::Now;
IDictionaryEnumerator __gc* dic = mappings->GetEnumerator ();
while (dic->MoveNext ())
{
LoadMapping __gc* map = static_cast<LoadMapping __gc*> (dic->Value);
TimeSpan tsDifference = dtNow.Subtract (map->LastReport);
double difference = tsDifference.TotalMilliseconds;
if (difference > (double) reportTimeout)
{
String __gc* name = map->Load->Name;
MachineLoad __gc* oldLoad = map->Load;
ArrayList __gc* oldList = map->LoadList;
deadMachines->Add (name);
int index = oldList->IndexOf (oldLoad);
oldList->RemoveAt (index);
if (oldList->Count == 0)
loads->Remove (__box (oldLoad->Load));
}
}
for (int i=0; i<deadMachines->Count; i++)
mappings->Remove (deadMachines->get_Item (i));
deadMachines->Clear ();
rwLock->ReleaseWriterLock ();
mlc->grimReaper->Change (reportTimeout, reportTimeout);
}
Load Balancing in Action - Balancing a Web Farm
I've built a super simple .NET Web application (in C#) that uses LBL to perform
load balancing in a web farm. Though the application is very small, it is
interesting and deserves some space in this article, so here we go. First, I've
written a class that wraps the load balancing class ClientLoadBalancer
from LBL, named it Helper
, and implemented it as a singleton so the
Global
class of the web application and the web page classes could
see one instance of it. Then I used it in the Session_OnStart
method
of the Global
class to redirect every new session's first HTTP
request to the most available machine. Furthermore, in the sample web page, I've
used it again to dynamically build URLs for further processing, replacing the
local host again with the fastest machine. Now one may argue (and he might be
right) that a user can spend a lot of time reading that page, so when he
eventually clicks on the "faster" link, the previously faster machine
might not be the fastest one at that time. Just don't forget that hitting another
machine's web application will cause its Session_OnStart to fire
again, so either way the user will be redirected to the fastest machine. Now, if
you don't get what I'm talking about, that's because I haven't shown any code
yet. So here it is:
protected void Session_Start (object sender, EventArgs e)
{
string fastestMachineName = Helper.Instance.GetFastestMachineName ();
string thisMachineName = Environment.MachineName;
if (String.Compare (thisMachineName, fastestMachineName, false) != 0)
{
string fasterUrl = Helper.Instance.ReplaceHostInUrl (
Request.Url.ToString (),
fastestMachineName);
Response.Redirect (fasterUrl);
}
}
And here's the code in the sample web page:
private void OnPageLoad (object sender, EventArgs e)
{
string fastestMachineName = Helper.Instance.GetFastestMachineName ();
link.Text = String.Format (
"Next request will be processed by machine '{0}'",
fastestMachineName);
link.NavigateUrl = Helper.Instance.ReplaceHostInUrl (
Request.Url.ToString (),
fastestMachineName);
}
If you think that I hardcoded the settings in the Helper
class,
you are wrong. First, I hate hardcoded or magic values in my code (though you
may see some in an article like this). Second, I was testing the solution on
my colleagues' computers, so writing several lines of code in advance helped
me avoid the otherwise inevitable recompilations. I just deployed the
web application there. Here's the trivial C# code of the Helper
class (note that I have hardcoded the key names in the Web.config file ;-)
class Helper
{
private Helper ()
{
loadBalancer = null;
try
{
NameValueCollection settings = ConfigurationSettings.AppSettings;
string machine = Environment.MachineName;
int port = 14000;
RemotingProtocol protocol = RemotingProtocol.TCP;
string machineName = settings ["LoadBalancingMachine"];
if (machineName != null)
machine = machineName;
string machinePort = settings ["LoadBalancingPort"];
if (machinePort != null)
{
try
{
port = int.Parse (machinePort);
}
catch (FormatException)
{
}
}
string machineProto = settings ["LoadBalancingProtocol"];
if (machineProto != null)
{
try
{
protocol = (RemotingProtocol) Enum.Parse (
typeof (RemotingProtocol),
machineProto,
true);
}
catch (ArgumentException)
{
}
}
loadBalancer = new ClientLoadBalancer (
machine,
protocol,
port);
}
catch (Exception e)
{
if (e is OutOfMemoryException || e is ExecutionEngineException)
throw;
}
}
public string GetFastestMachineName ()
{
string fastestMachineName = Environment.MachineName;
if (loadBalancer != null)
{
MachineLoad load = loadBalancer.GetLeastMachineLoad ();
if (load != null)
fastestMachineName = load.Name;
}
return (fastestMachineName);
}
public string ReplaceHostInUrl (string url, string newHost)
{
Uri uri = new Uri (url);
bool hasUserInfo = uri.UserInfo.Length > 0;
string credentials = hasUserInfo ? uri.UserInfo : "";
string newUrl = String.Format (
"{0}{1}{2}{3}:{4}{5}",
uri.Scheme,
Uri.SchemeDelimiter,
credentials,
newHost,
uri.Port,
uri.PathAndQuery);
return (newUrl);
}
public static Helper Instance
{
get { return (instance); }
}
private ClientLoadBalancer loadBalancer;
private static Helper instance = new Helper ();
}
If you wonder what the servers look like when running, and what a great
look and feel I've designed for the web application, here's a screenshot to
disappoint you:)
Building, Configuring and Deploying the Solution
There's a little trick you need to do, in order to load the solution file.
Open your IIS administration console (Start/Run... type inetmgr
) and
create a new virtual directory LoadBalancingWebTest
. When you're asked
about the folder, choose X:\Path\To\SolutionFolder\LoadBalancingWebTest
.
You can now open the solution file (SoftwareLoadBalancing.sln
) with no
problems. Load it in Visual Studio .NET, build the SharedLibrary
project
first, as the others depend on it, then build LML and LRS, and then the whole solution.
Note that the setup projects won't build automatically so you should select and build
them manually.
Note:
When you compile the solution, you will get 15 warnings. All of them state:
warning C4935: assembly access specifier modified from 'xxx', where
xxx could be private or public. I don't know
how to make the compiler stop complaining about this. There are no other warnings
at level 4. Sorry if these embarrass you.
That's it if you have VS.NET. If you don't, you can compile only the web application,
as it is written in C# and can be compiled with the free C# compiler that comes
with the .NET Framework. Otherwise, buy a copy of VS.NET, and become a CodeProject (and
Microsoft) supporter :) BTW, I just realized that I should write my next
articles in C#, so the "poor" guys like me can have some fun too. I'm sorry
guys! I promise to use C# in most of the next articles I attempt to write.
Configuration
If you look at the Common.h
header, located in the SharedFiles
folder in the solution, you'll notice that I've copied and pasted the meaning of
all configuration keys from that file. However, because I know you won't look at
it until you've liked the article (and it's high time to do so, as it is coming to
its end:), here's the explanation of the XML configuration file, and various
macros in Common.h
.
What's so common in Common.h?
This header file is used by almost all projects in the solution. It has several
(helpful) macros I'm about to discuss, so if you're in the mood to read about them,
go ahead. Otherwise, click here to read only about the
XML configuration file.
First, I'm going to discuss the .NET member access modifiers. There are 5 of
them, though you may use only four of them, unless you are writing IL code.
Existing languages refer to them in a different way, so I'll give you a
comparison table of their names and some explanations.
.NET term | C# keyword | MC++ keywords    | Explanation
----------|------------|------------------|------------
private   | private    | private private  | the member is visible only in the
class it is defined in, and is not visible from other assemblies; note the
double use of keywords in MC++ -- the first one specifies whether the member is
visible from other assemblies, and the second whether it is visible from other
classes within the same assembly
public    | public     | public public    | visible from all assemblies and classes
family    | protected  | public protected | visible from other assemblies, but
can be used only from derived classes
assembly  | internal   | private public   | visible from all classes within the
assembly, but not visible to external assemblies
Because I like the C# keywords most, I #defined and used throughout
the code four macros to avoid typing the double keywords in MC++:
#define PUBLIC public public
#define PRIVATE private private
#define PROTECTED public protected
#define INTERNAL private public
Here comes the more interesting "stuff". You have three options for
communication between a load reporting and monitoring servers: via UDP +
multicasting, UDP-only, or TCP. BTW, if I was writing the article in C#, you
wouldn't have them. Really! C# is so lame in preprocessing, and the compiler
writers were so wrong that they did not include some real preprocessing
capabilities in the compiler, that I have no words! Nevertheless, I wrote the
article in MC++, so I have the cool #define
directives I needed so
badly, when I started to write the communication code of the classes. There are
two macros you can play with, to make the solution use one communication protocol
or another, and/or disable/enable multicasting. Here are their definitions:
#define USING_UDP 1
#define USING_MULTICASTS 1
Now, a C# guru:) will argue that I could still write the protocol-independent
code with several pairs of #ifdef
and #endif
directives.
To tell you the truth, I'm not a fan of this coding style. I'd rather define a
generic macro in such an #if
block, and use it everywhere I need it.
So that's what I did. I've written macros that create TCP or UDP sockets, connect
to remote endpoints, and send and receive data via UDP and TCP. Then I wrote
several generic macros that follow the pattern below:
#if defined(USING_UDP)
# define SOCKET_CREATE(sock) SOCKET_CREATE_UDP(sock)
#else
# define SOCKET_CREATE(sock) SOCKET_CREATE_TCP(sock)
#endif
You get the idea, right? No #ifdef
s inside the real code.
I just write SOCKET_CREATE (socket);
and the preprocessor
generates the code to create the appropriate socket. Here's another good macro,
I use for exception handling, but before that I'll give you some rules (you
probably know) about .NET exception handling:
-
Catch only the exceptions you can handle, and no more. This means that if you
expect the method you're calling to throw
ArgumentNullException
and/or
ArgumentOutOfRangeException
, you should write two catch clauses and
catch only these exceptions.
- Another rule is to never "swallow" an exception you caught, but
cannot handle. You must re-throw it, so the caller of your method knows why it
failed.
- This one relates to the 2nd rule: there are 2 exceptions you can do nothing
about but report them to the user and die: these are the
OutOfMemoryException
,
and ExecutionEngineException
. I don't know which one is worse --
probably the latter, though if you're out of memory, there's almost nothing you
can do about it.
Because I'm not writing production code here, I allowed myself to catch
(in most of the source code) all possible exceptions, when I don't need to handle
them except to know that something went bad. So I catch the base class
Exception
. This violates all the rules I've written above, but I
wrote some code to fit the second and third ones -- if I catch an
OutOfMemoryException
or ExecutionEngineException
, I
re-throw it immediately. Here's the macro I call, after I catch the generic
Exception
class:
#define TRACE_EXCEPTION_AND_RETHROW_IF_NEEDED(e) \
System::Type __gc* exType = e->GetType (); \
if (exType == __typeof (OutOfMemoryException) || \
exType == __typeof (ExecutionEngineException)) \
throw; \
Console::WriteLine ( \
S"\n{0}\n{1} ({2}/{3}): {4}\n{0}", \
new String (L'-', 79), \
new String ((char *) __FUNCTION__), \
new String ((char *) __FILE__), \
__box (__LINE__), \
e->Message);
And finally, a word about assertions. C has the assert macro, VB had the
Debug.Assert method, and .NET has a static Assert method in the Debug class
too. One of the overloads of the method takes a boolean expression and a
string describing the test. C's assert is smarter: it just needs an
expression, and it builds the string containing the expression automatically
by stringizing it. Now, I really hate the fact that C# lacks some real
preprocessing features. However, MC++ (thank God!) was not slaughtered by the
compiler writers (long live legacy code support), so here's my .NET version of
C's assert macro:
#define DEBUG_ASSERT(x) Debug::Assert (x, S#x)
If I were writing the code for this article in C#, I would have had to type
Debug.Assert (null != objRef, "null != objRef");
everywhere I needed to assert. In MC++, I just write
DEBUG_ASSERT (0 != objRef);
and it is automatically expanded into
Debug::Assert (0 != objRef, S"0 != objRef");
Not to mention the __LINE__, __FILE__ and __FUNCTION__ macros I could use in
the DEBUG_ASSERT macro! Now let's all scream loudly together: "C# sucks!":)
Tweaking the configuration file
I know you're all smart guys (otherwise what the heck are you doing on
CodeProject?:), and smart guys don't need lengthy explanations -- all they need
is to take a look at an example. So here it is: the XML configuration file used
by both the Machine Load Monitoring and Reporting Servers. The explanation of
all the elements is given below the file:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <LoadReportingServer>
    <IpAddress>127.0.0.1</IpAddress>
    <Port>12000</Port>
    <ReportingInterval>2000</ReportingInterval>
  </LoadReportingServer>
  <LoadMonitoringServer>
    <IpAddress>127.0.0.1</IpAddress>
    <CollectorPort>12000</CollectorPort>
    <CollectorBacklog>40</CollectorBacklog>
    <ReporterPort>13000</ReporterPort>
    <ReporterBacklog>40</ReporterBacklog>
    <MachineReportTimeout>4000</MachineReportTimeout>
    <RemotingProtocol>tcp</RemotingProtocol>
    <RemotingChannelPort>14000</RemotingChannelPort>
    <PerformanceCounters>
      <counter alias="cpu"
               category="Processor"
               name="% Processor Time"
               instance="_Total"
               load-weight="0.3"
               interval="500"
               maximum-measures="5" />
      <!-- ... -->
    </PerformanceCounters>
  </LoadMonitoringServer>
</configuration>
Even though you're smart, I know that some of you have questions, which I am
about to answer. First, I'm going to explain the purpose of all the elements
and their attributes, and I'll cover some weird settings, so read on...
(To save some space, I'll refer to the element LoadReportingServer as LRS, and
I'll write LMS instead of LoadMonitoringServer.)
- LRS/IpAddress: When you're using UDP + multicasting (the default), the
IpAddress is the IP address of the multicast group that MLMS and MLRS join in
order to communicate. If you're not using multicasting, but are still using
UDP or TCP, this element specifies the IP address (or the host name) of the
MLMS server the MLRS servers report to. Note that because you don't use
multicasting, there's no way for the MLRS servers to "multicast" their machine
loads to all MLMS servers. In any case, this element's text should be equal to
LMS/IpAddress.
- LRS/Port: Whether you're using UDP + multicasting, UDP only or TCP, that's
the port to which MLRS servers send machine loads, and on which MLMS servers
receive them.
- LRS/ReportingInterval: MLRS servers report machine loads to MLMS ones. The
ReportingInterval specifies the interval (in milliseconds) at which an MLRS
server should report its load to one or more MLMS servers. If you paid
attention to the Some Implementation Details section, I said that even if the
interval has elapsed, a machine may not report its load, because it has not
yet gathered the raw data it needs to calculate its load. See the counter
element's interval attribute for more information.
- LMS/IpAddress: In the UDP + multicasting scenario, that's the multicast
group's IP address, as in the LRS/IpAddress element. When you're using UDP or
TCP only, this address is ignored.
- LMS/CollectorPort: The port on which MLMS servers accept TCP connections, or
receive data when using UDP.
- LMS/CollectorBacklog: This element specifies the maximum number of sockets
an MLMS server will use when configured for TCP communication.
- LMS/ReporterPort: If you haven't been reading the article carefully, you're
probably wondering what this element specifies. Well, in my first design, I
didn't think Remoting would serve me so well in building the Load Balancing
Library (LBL). I wrote a mini TCP server, which accepted TCP requests and
returned the least loaded machine. Because LBL had to connect to an MLMS
server and ask which is the fastest machine, you can imagine that I wrote
several overloads of the GetLeastLoadedMachine method, accepting timeouts and
default machines in case there are no available machines at all. The moment I
finished the LBL client, I decided that the design was too lame, so I rewrote
the LBL library from scratch (yeah, shit happens:), using Remoting. Now, I
regret to tell you that I've overwritten the original library's source files.
However, I left the TCP server completely working -- it lives as the
ReporterWorker class, in the ReporterWorker.h/.cpp files in the
LoadMonitoringLibrary project. If you want to write an alternative LBL
library, be my guest -- just write some code to connect to the LMS reporter
worker and it will report the fastest machine's load immediately. Note that
the worker accepts TCP sockets, so you should always connect to it using TCP.
- LMS/ReporterBacklog: It's not difficult to figure out that this is the
backlog of the TCP server I was talking about above.
- LMS/MachineReportTimeout: Now that's an interesting setting. The
MachineReportTimeout is the longest interval (in milliseconds) at which a
machine should report its successive load in order to stay in the load
balancing. This means that if a machine reported 5 seconds ago, and the
timeout interval is set to 3 seconds, the machine is removed from the load
balancing. If it later reports, it is back in business. I think this is a bit
lame, because one would like to configure each machine to report at a
different interval, but I don't have time (now) to fix this, so you should
learn to live with this "feature". One way to work around my "lameness" is to
give this setting a large enough value. Be warned, though, that if a machine
is down, you won't be able to remove it from the load balancing until this
interval elapses -- so don't give it too big a value.
- LMS/RemotingProtocol: Originally, I intended to use Remoting only over TCP.
I thought that HTTP would be too slow (it is one level above TCP in the OSI
stack). Then, after I recalled how complex Remoting was, I realized that the
HTTP protocol is blazingly fast compared to the Remoting machinery itself. So
I decided to give you an option as to which protocol to use. Currently, the
solution supports only the TCP and HTTP protocols, but you can easily extend
it to use any protocol you wish. This setting accepts a string, which is
either "tcp" or "http" (without the quotes, of course).
- LMS/RemotingChannelPort: That's the port MLMS uses to register and activate
the load balancing object with the Remoting runtime.
- LMS/PerformanceCounters: This element contains a collection of performance
counters, used to calculate the machine's load. Given below are the attributes
of the counter XML element, used to describe a CounterInfo object, which I
wrote about somewhere above.
- counter/alias: Though currently not used, this attribute specifies an alias
for the otherwise too long performance counter path. See the TODO(s) section
for the reason I've put this attribute in.
- counter/category: The general category of the counter, e.g. Processor,
Memory, etc.
- counter/name: The specific counter in the category, e.g. % Processor Time,
Page reads/sec, etc.
- counter/instance: If there are two or more instances of the counter, the
instance attribute specifies the exact instance. For example, if you have two
CPUs, then the first CPU's instance is "0", the second one is "1", and the
aggregate over both is "_Total".
- counter/load-weight: The weight that balances the counter's values. E.g. you
can give more weight to the values of Processor\% Processor Time\_Total than
to Processor\% User Time\_Total ones. You get the idea.
- counter/interval: The interval (in milliseconds) at which a performance
counter is asked to return its next sample value.
- counter/maximum-measures: The size of the cyclic queue (I talked about it
above) that stores the transient state of a performance counter. In other
words, the element specifies how many counter values should be collected in
order to get a decent weighted average (WA). The counter does not report its
WA until it has collected at least maximum-measures sample values. If the
CounterInfo class is asked to return its WA before it has collected the
necessary number of sample values, it blocks and waits until it has.
... and the other configuration file:)
What is "the other configuration file"? Well, it is the Web.config file in the
sample load-balanced web application. It has 3 vital keys defined in the
appSettings section: the machine on which MLMS runs, and the Remoting port and
protocol under which that machine has registered its remoted object.
<appSettings>
<add key="LoadBalancingMachine" value="..." />
<add key="LoadBalancingPort" value="..." />
<add key="LoadBalancingProtocol" value="..." />
</appSettings>
You can figure out what the keys mean, as you have seen the code in the Helper
class of the web application. The last key accepts a string, which can be
either "TCP" or "HTTP" and nothing else.
Deployment
There are 7 ways to deploy the solution onto a single machine. That's right --
seven. To shorten the article and lengthen my life, I'll refer to the Machine
Load Monitoring Server as LMS, to the Machine Load Reporting Server as LRS,
and to the Load Balancing Library as LBL. Here are the variations:
- LMS, LRS, LBL
- LMS, LRS
- LMS, LBL
- LMS
- LRS, LBL
- LRS
- LBL
It is you who decide what to install and where. But it is I who developed the
setup projects, so you have to pay some attention to what I'm about to tell
you. There are 4 setups. The first one is for the sample load-balanced web
application. The second one is for the server part of the solution, i.e. the
Machine Load Monitoring and Reporting Servers. They're bundled in one single
setup, but it's your call which one you run once you've installed them. The
3rd setup contains only the load balancing library, and the 4th one contains
the entire source code of the solution, including the source for the setups.
Here is a simple scenario to test whether the code works (you should have set
up a multicast group on your LAN, or ask an admin to do that). We'll use 2
machines -- A and B. On machine A, build the SharedLibrary project first, then
build the whole solution (you may skip the setup projects). Modify the XML
configuration file for MLMS and MLRS, and run the servers. Then deploy the web
application, modify its Web.config file and launch it.
Click the link on the web page. It should work, and the load balancing should
redirect you to the same machine (A). Now deploy only MLRS and the web
application to machine B. Modify the configuration files, but this time, in
Web.config, set the LoadBalancingMachine key to "A". You've just told B's LBL
to use machine A's remoted load balancing object. Run MLRS on machine B. It
should start reporting B's load to A's MLMS. Now do some CPU-intensive
operation on machine A (on pre-WinXP systems, right-click the Desktop and drag
your mouse behind the Task Bar; this should give you about 100% CPU
utilization). A's web application should now redirect you to the web app on
machine B. Now stop B's MLRS server and launch B's web application. It should
redirect you to A's. I guess that's it. Enjoy playing around with all the
possible deployment scenarios:)
Some thoughts about MC++ and C#
Managed C++ to C# translation
There's nothing easier than converting pure managed C++ code to C#. Just press
Ctrl-H in your mind and replace the following sequences (this will work only
for my source files, as other developers may not use whitespace the same way I
do):
MC++                 C#
----                 ----
::                   .
->                   .
__gc*                (nothing -- just remove it)
__gc                 (nothing -- just remove it)
__sealed             sealed
__value              struct
using namespace      using
: public             :
S"                   "
__box (x)            x
While the replacements above will translate 85% of the code, there are several
things you should do manually:
- You have to translate all preprocessor directives, e.g. remove the header
guards (#if !defined (...) ... #define ... #endif), and manually replace the
macros with the code they are supposed to generate.
- You have to convert all C++ casts to C# ones, i.e.
static_cast<SomeType __gc*> (expression) to ((SomeType) expression) or
(expression as SomeType).
- You have to put the appropriate access modifier keyword on every member of a
class, i.e. you should change:
PUBLIC:
... Method1 (...) {...}
... Variable1;
PRIVATE:
... Method3 (...) {...}
to
public ... Method1 (...) {...}
public ... Variable1;
private ... Method3 (...) {...}
- You have to combine the header and the implementation files into a single
C# source file.
C#'s readonly fields vs MC++ non-static const members
It is really frustrating that MC++ does not have an equivalent of C#'s
readonly fields (not properties). In C# one could write the following
class:
public class PerfCounter
{
public PerfCounter (String fullPath, int sampleInterval)
{
Debug.Assert (null != fullPath);
if (null == fullPath)
throw (new ArgumentNullException ("fullPath"));
Debug.Assert (sampleInterval > 0);
if (sampleInterval <= 0)
throw (new ArgumentOutOfRangeException ("sampleInterval"));
FullPath = fullPath;
SampleInterval = sampleInterval;
}
public readonly String FullPath;
public readonly int SampleInterval;
}
You see that the C# programmer doesn't have to implement read-only properties,
because the readonly fields are good enough. In Managed C++, you can simulate
readonly fields by writing the following class:
public __gc class PerfCounter
{
public:
PerfCounter (String __gc* fullPath, int sampleInterval) :
FullPath (fullPath),
SampleInterval (sampleInterval)
{
Debug::Assert (0 != fullPath);
if (0 == fullPath)
throw (new ArgumentNullException (S"fullPath"));
Debug::Assert (sampleInterval > 0);
if (sampleInterval <= 0)
throw (new ArgumentOutOfRangeException (S"sampleInterval"));
}
public:
const String __gc* FullPath;
const int SampleInterval;
};
So far, so good. You're probably wondering why I am complaining about MC++.
It looks like the MC++ version is even cooler than the C# one. Well, the example
class was too simple. Now, imagine that when you find an invalid parameter, you
should change it to a default value, like in the C# class below:
public class PerfCounter
{
public PerfCounter (String fullPath, int sampleInterval)
{
Debug.Assert (null != fullPath);
if (null == fullPath)
throw (new ArgumentNullException ("fullPath"));
Debug.Assert (sampleInterval > 0);
if (sampleInterval <= 0)
sampleInterval = DefaultSampleInterval;
FullPath = fullPath;
SampleInterval = sampleInterval;
}
public readonly String FullPath;
public readonly int SampleInterval;
private const int DefaultSampleInterval = 1000;
}
Now, the corresponding MC++ code will not compile, and you'll see why below:
public __gc class CrashingPerfCounter
{
public:
CrashingPerfCounter (String __gc* fullPath, int sampleInterval) :
FullPath (fullPath),
SampleInterval (sampleInterval)
{
Debug::Assert (0 != fullPath);
if (0 == fullPath)
throw (new ArgumentNullException (S"fullPath"));
Debug::Assert (sampleInterval > 0);
if (sampleInterval <= 0)
SampleInterval = DefaultSampleInterval;
}
public:
const String __gc* FullPath;
const int SampleInterval;
private:
static const int DefaultSampleInterval = 1000;
};
Now, one may argue that we could initialize the const member SampleInterval
in the constructor's initialization list like this:
SampleInterval (sampleInterval > 0 ? sampleInterval : DefaultSampleInterval)
and they would be right. However, if we need to connect to a database first in
order to do the check, or we need to perform several checks on the parameter,
I can't figure out how to do this in the initialization list. Do you? That's
why MC++ sucks compared to C# for readonly fields. Now the programmer is
forced to make the const fields non-const and private, and to write code
implementing read-only properties, like this:
public __gc class LamePerfCounter
{
public:
LamePerfCounter (String __gc* fullPath, int sampleInterval)
{
Debug::Assert (0 != fullPath);
if (0 == fullPath)
throw (new ArgumentNullException (S"fullPath"));
Debug::Assert (sampleInterval > 0);
if (sampleInterval <= 0)
sampleInterval = DefaultSampleInterval;
this->fullPath = fullPath;
this->sampleInterval = sampleInterval;
}
__property String __gc* get_FullPath ()
{
return (fullPath);
}
__property int get_SampleInterval ()
{
return (sampleInterval);
}
private:
String __gc* fullPath;
int sampleInterval;
static const int DefaultSampleInterval = 1000;
};
"Bugs suck. Period."
John Robbins
"I trust that I and my colleagues will use my code correctly. To avoid bugs,
however, I verify everything. I verify the data that others pass into my code, I
verify my code's internal manipulations, I verify every assumption I make in my
code, I verify data my code passes to others, and I verify data coming back from
calls my code makes. If there's something to verify, I verify it. This obsessive
verification is nothing personal against my coworkers, and I don't have any
psychological problems (to speak of). It's just that I know where the bugs come
from; I also know that you can't let anything by without checking it if you want
to catch your bugs as early as you can."
John Robbins
I do what John preaches in his book, and you should do it too. Trust me, but
verify my code. I think I have debugged my code thoroughly, and I haven't met
bugs in it since the day before yesterday (when I started to write the article).
However, if you see one of those nasty creatures, let me know. My e-mail is
stoyan_damov[at]hotmail.com.
Though I think I don't have bugs (statistically, I should have 8 bugs in the
4 KLOC), I'd love to share an amazing Microsoft bug with you. It cost me quite
some time to find, and unfortunately, I was not able to reproduce it after it
disappeared (yes! it disappeared) later. I've wrapped all of my classes in the
namespace SoftwareLoadBalancing. So far so good. Now, I have several shared
classes in the SharedLibrary assembly. The Load Monitoring Library uses one of
these classes to do its job, so it is #using the SharedLibrary.
I was able to build LML several times, and then suddenly the linker complained
that it could not find the shared class I was using in the namespace
SoftwareLoadBalancing. I'll name that class X to save myself some typing. I
closed the solution, went to the Debug folder of the shared library, deleted
everything, deleted all files in the common Bin folder and tried again. Same
result! I let the linker grumble for three more tries and then launched the
ILDasm tool. When I looked at SharedLibrary.dll, I found that the class X was
"wrapped" twice in the namespace SoftwareLoadBalancing, i.e. it was now
SoftwareLoadBalancing::SoftwareLoadBalancing::X. Because I wanted to do some
tests and had no time to deal with the bug, I tried to alias the namespace in
LML like this:
using namespace SLB = SoftwareLoadBalancing;
Then, I tried to access the X class, using the following construct:
SLB::SLB::X __gc* x = new SLB::SLB::X ();
Maybe I don't understand C++ namespace aliasing very well, or maybe the
documentation doesn't explain it, but what happened this time was that the
linker complained again that it could not find the
SoftwareLoadBalancing::SLB::X class!!! The compiler "replaced" SLB with
SoftwareLoadBalancing only once. Needless to say, I was quite embarrassed. Not
only had the compiler wrapped my class in two namespaces, it was not helping
me work around the problem!:) Do you know what I did then? I aliased the
namespace in a way the linker or compiler should understand:
using namespace SLB = SoftwareLoadBalancing::SoftwareLoadBalancing;
Then, I tried to instantiate the X class like this:
SLB::X __gc* x = new SLB::X ();
I'm sure you don't know what happened then, because I was hiding a simple fact
from you: I was rebuilding each time. Now can you guess what happened? The
linker complained again that it could not find class X in the namespace
SoftwareLoadBalancing::SoftwareLoadBalancing. WTF?! I was furious! I went
crazy! I launched ILDasm once again, and looked at the SharedLibrary. The
class was properly wrapped once in the namespace SoftwareLoadBalancing. Now, I
don't know if this is a bug in the compiler, in the linker, or in my mind.
What I do know is that next time I have such a problem, I won't chase
nonexistent bugs in my source files, but will launch my beloved ILDasm and see
whether I'm doing something wrong, or Microsoft are trying to drive me crazy:)
TODO(s)
(So what the heck have you done, when there are so many TODOs?!)
Conclusion
Thank you for reading the article! It was the longest article I've written in
my entire life (I started writing articles several months ago:). I'm really
impressed by how patient you are! Now that I've thanked you, I should also say
"Thanks!" to Microsoft, which brought us the marvelous .NET technology. If
.NET did not exist, I doubt I would have written such an article, and even if
I had, it wouldn't have included the C++ source code. The .NET Framework makes
programming so easy! It just forces you to write all day long:) I feel like
I'm not programming, but rather prototyping. It is easier than VB once was.
Really!
Now let's see what you've learned (or just read) from the article:
- what load balancing is in general
- my idea for dynamic software load balancing, the architecture
and some of the implementation details
- some multithreading issues and how to solve them
- network programming basics, including TCP, UDP and multicasting
- some (I hope) helpful tips and workarounds
- that if COM is love, then .NET is PASSION
- that I'm a pro-Microsoft guy:)
Aaaaaaaaaaaah, I forgot to tell you! Please do not post messages in the
message board below teaching me not to use __gc* when I could just type *. I
just love the __gc keyword, that's it:)
Below are two books, read once upon a time, that served me well in writing
this article's source code. You'll be surprised that they are not .NET books.
I'm not joking -- there are maybe over 250 .NET books, and I've read 10 or so;
that's why I can't recommend you any .NET book, really. It wouldn't be fair to
say "Book X is the best on topic Y" when I haven't read at least half of the
.NET books out there to give you an (authoritative) recommendation. The books
below are not just "Must-Have" and "Must-Read":) ones. They are priceless for
the Windows developer. Stop reading this article, and go buy them now! :)
Programming Server-Side Applications for Microsoft Windows 2000 (ISBN 0-7356-0753-2)
by Jeffrey Richter, Jason D. Clark
"We developers know that writing error-tolerant code is what we should do, but
frequently we view the required attention to detail as tedious and so omit it.
We've become complacent, thinking that the operating system will 'just take care
of us.' Many developers out there actually believe that memory is endless, and
that leaking various resources is OK because they know that the operating
system will clean up everything automatically when the process dies. Certainly
many applications are implemented in this way, and the results are not
devastating because the applications tend to run for short periods of time and
then are restarted. However, services run forever, and omitting the proper
error-recovery and resource-cleanup code is catastrophic!"
Debugging Applications (ISBN 0-7356-0886-5), by John Robbins
"Bugs suck. Period. Bugs are the reason you endure death-march projects with
missed deadlines, late nights, and grouchy coworkers. Bugs can truly make your
life miserable because if enough of them creep in to your software, customers
will stop using your product and you could lose your job. Bugs are serious
business... As I was writing this book, NASA lost a Mars space probe because of
a bug that snuck in during the requirements and design phase. With computers
controlling more and more mission-critical systems, medical devices, and
superexpensive hardware, bugs can no longer be laughed at or viewed as something
that just happens as a part of development."
And the best text on Managed Extensions for C++ .NET (besides the
specification and the migration guide) is not a book, but a Microsoft Official
Curriculum (MOC) course: "Programming with Managed Extensions for Microsoft
Visual C++ .NET" (2558). You should definitely attend this course if you're
planning to do any .NET development using Managed C++.
A (final) word about C#
Many of you are probably wondering why I implemented this solution in Managed
C++ and not in C#, since I'm writing only managed code. I know that most .NET
developers are on the C# bandwagon. So am I: I program in C# all day long --
that's my job. However, I love C++ so much that I prefer to write in Managed
C++. There are probably hundreds of reasons I prefer MC++ to C#, and IJW (and
unmanaged code in general) is probably the last on the list. C# has nothing to
do with C or C++, except for some slight similarities in the syntax, no matter
how hard Microsoft tries to convince us of the opposite. It is way closer to
Java than to C/C++. Do I sound extreme? Well, a C/C++ die-hard colleague of
mine (Boby -- http://606u.dir.bg/) forced himself to learn and use VB.NET in
order not to forget C++ while he was developing a .NET application. Now who's
extreme?:) Microsoft is pushing us very hard to forget C++, so that they and
some open-source C++ die-hards will be the only ones who use it:) Haven't you
noticed? As of today, the ATL list generates a couple of unique posts each
day, compared to at least 10-20 several months ago, before Microsoft suddenly
decided to drop it -- and put it back within a week, when catcalled by the ATL
community. And what does Microsoft say about this, eh? COM is not dead! COM is
here to stay! ATL lives (somewhere in time). Blah, blah:) I adore .NET, but I
don't want to have to search for "The C Programming Language" and "The C++
Programming Language" in the dusty bookshelves in 3 or 4 years, when .NET is
replaced by some even-cooler technology. Long live C++!:) Anyway, I was
tempted to implement the solution in C#, because the monthly prizes for it
were better-looking:)
Disclaimer
This software comes "AS IS" with all faults and with no warranties whatsoever.
If you find the source code or the article useful, wish me a Merry Christmas:)
Wishing you the merriest Christmas, and looking forward to seeing you next year:)
Stoyan