In this blog post, I am going to talk about Microsoft .NET Common Language Runtime (CLR) journey from its early days to latest Windows Runtime (WinRT) platform, which is released with Windows 8. I would suggest you read my previous blog post on WinMD files before reading further to gain a better understanding of WinMD files introduced with Windows 8.
WinRT is the new platform released with Microsoft’s latest operating system Windows 8 and .NET Framework 4.5. It is important to remember that for WinRT APIs to work, it needs at least Windows 8 as operating system (or Windows Server 2012 for server family of operating systems), and hence, WinRT is not available for previous platforms.
Coming back to WinRT, did we need another set of runtime libraries and a new file extension, WinMD? I am sure you are having this basic question in mind for what good this new WinRT library is going to do and why on earth Microsoft added a new file type? I will try and answer it in this blog post.
Windows Runtime (WinMD at Heart)
WinRT is the new platform which exposes underlying operating system APIs in object-oriented manner to developers (across languages [1]) and enables them to make best use of underlying platform capabilities to build great immersive apps.
WinMD files are at the heart of new Windows Runtime (WinRT) concept. These new WinMD files only contain metadata for WinRT types. WinMD files themselves do not contain any code. Hence, when you open any WinMD file using ILDAsm tool, you see that all class methods are marked as "runtime managed" which, as per ECMA-335 CLI specification, means the managed implementation for the method will be provided by CLR. Please read my earlier blog post for more details on WinMD files.
Underneath, CLR bridges the gap of converting managed calls to low-level operating system (native) API calls and vice versa, i.e., invoking managed callback functions from native world. This is a great step forward now because CLR is taking care of a lot of complexities related to making these native API calls from managed code and vice-versa. Native APIs are written as C/C++ libraries, and before WinRT was introduced, these were consumed using Platform Invoke feature of CLR, commonly referred to as PInvoke.
With the introduction of WinRT, there has been a paradigm shift for managed developers for how they consume platform capabilities which otherwise are not already wrapped up in BCL. Let’s understand this evolution starting from its early days to latest WinRT platform. This will help you understand the internal workings of CLR and role of WinMD files too. We have come a long way with the introduction of WinRT, it’s time for a little flashback.
.NET is Born!
To be able to best understand why we need Windows Runtime (WinRT) and WinMD files or how it is helpful, we need to go back a little into history and understand what kind of problems were there, before WinRT was born, which are now addressed by it. With the introduction of CLR, Microsoft developed rich metadata to describe types and their members in libraries (which is defined in ECMA-335 spec). This was a great step forward as existing libraries (DLLs) did not have a way to manifest it in themselves but we had to separately produce type library (TLB) files for components so that these components could be consumed in other applications. For example, you could write a COM component in C++ and then consume it in VB6 application. With the advent of .NET, assemblies themselves contained code as well as rich metadata, which was great as assemblies could describe its own types, members, and the external assemblies that it references (this was key feature which solved lot of DLL referencing related issues along with CLR assembly binder), which made assemblies self-contained and deployable unit. Hence, referencing and consuming these assemblies from other applications was a lot easier as compared to olden days, where referencing correct version of DLL, building code, and deploying binaries itself was a big challenge, commonly known as “DLL Hell”. The single biggest factor which helped solve DLL hell problem in the .NET world was the ability to manifest all the rich metadata within the assembly (library) itself. Traditionally, components had very tight binary linking whereas CLR assemblies had metadata based linking.
PInvoke is Born (Re-use Existing Investment)
To begin with, .NET developers did not have everything in the form of these assemblies, especially when it came to underlying low-level operating system APIs (such as APIs exposed by kernel32.dll) because, these were written and available in the form of C/C++ libraries. Microsoft did wrap certain APIs with managed wrappers over native Windows APIs, such as those for logging entries to event viewer, EventLog*classes in System.Diagnostics namespace
which, are just managed wrappers over underlying native APIs. But, Microsoft could not have done it for all APIs (simply because there would be no value addition with creating and maintaining tons of managed wrappers over native APIs).
Therefore, Microsoft provided a way to enable .NET developers to consume these libraries using PInvoke. The way PInvoke works is that you would define API method signature in code and mark it as external, i.e., the method implementation being available in another DLL (mentioned using attributes). Underneath, CLR would create Runtime Callable Wrapper (RCW) stub which would take care of marshalling calls between .NET Apps and native libraries. CLR would rely on method signature which you re-create in code to take care of marshalling requirements. Methods, marked as external, and type definitions re-created in source code helped CLR get the required metadata for it to be able to generate the RCW stub for marshalling and forwarding calls to libraries. In essence, metadata for native types/structures and methods was key input for CLR to make its magic and connect the two worlds.
PInvoke Limitations
PInvoke was great but it had its own limitations because there was still considerable amount of work to be done for .NET developers to recreate types in code (which were needed for the API call), define marshalling rules for these types, their memory layout, and so on (which was ugly and to make even a basic API call work, it took considerable amount of effort). Lots of these underlying APIs work with pointers whereas .NET developers don't directly work with pointers so they struggled with making API calls using PInvoke (working with IntPtr
did not come easy to VB.NET or C# developers, to say the least). Although, they used familiar programming language such as VB.NET or C# but concepts were alien, such as pointers, memory layout, etc.
With PInvoke, CLR would use types and method signature created in source code to derive metadata, which it needed to create RCW stub, required for marshalling the call to native library. Lot of problems with this approach were derived from the fact that developers found it difficult to re-create method signatures and types in C#/VB code equivalents as they were required by CLR. In fact, there was a site (http://pinvoke.net) available, just for managed developers, to help with signature of these native methods and types such that these can be specified in their C#/VB counterparts, and this site proved to be a great asset for .NET developers trying to make native API calls using PInvoke. I am sure every developer who has created CLR applications, would have used this site at some point in time.
Therefore, Microsoft provided a way to enable .NET developers to consume these libraries using PInvoke. The way PInvoke works is that you would define API method signature in code and mark it as external, i.e., the method implementation being available in another DLL (mentioned using attributes). Underneath, CLR would create Runtime Callable Wrapper (RCW) stub which would take care of marshalling calls between .NET Apps and native libraries. CLR would rely on method signature which you re-create in code to take care of marshalling requirements. Methods, marked as external, and type definitions re-created in source code helped CLR get the required metadata for it to be able to generate the RCW stub for marshalling and forwarding calls to libraries. In essence, metadata for native types/structures and methods was key input for CLR to make its magic and connect the two worlds.
For example:
[DllImport("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
internal static extern IntPtr GetProcAddress([In] IntPtr hModule,
[In, MarshalAs(UnmanagedType.LPStr)] string lpProcName);
WinRT to Rescue (Partially*)
.NET developers had to re-create metadata in source code such as method signatures and type definitions for CLR to do RCW magic. And this was the biggest pain area for most developers. This is where WinRT makes life easy for developers as Microsoft came up with these new WinMD files which define WinRT libraries classes and methods in rich metadata format supported and understood by CLR . Now, developers don’t need to re-create method signatures in code but just add WinMD files as reference to the project.
Now, with following three things falling in place with Windows 8 (and Windows Server 2012):
- underlying operating system providing runtime libraries (WinRT), exposing native APIs in object-oriented manner
- rich metadata available for WinRT classes from WINMD files and
- RCW magic of CLR already in place,
Microsoft added new WinRT Interop functionality (on top of COM Interop infrastructure) to CLR and made life easy for managed developers. Now we can consume WinRT APIs just as you would consume any other managed API (in natural and familiar manner). With Windows Runtime you don’t have to deal with pointers and lot of complexities related with native Windows APIs but they are dealt for you by CLR. You still write code in managed language which you are familiar with.
So, WinMD files enable developers to consume native APIs from .NET apps in a natural and familiar manner (which they otherwise had to do painfully using PInvoke). So this solved an existing pain area for developers and then Microsoft went a step further which enabled developers to consume these native APIs from an array of programming languages, which are traditionally not used for .NET apps development, such as JavaScript by introducing a new feature called as "language projection".
So, to answer the question as WHY Microsoft introduced new WINMD files - it was to manifest rich metadata for operating system APIs so that developers can consume these platform APIs from managed code in a natural and familiar manner. This is the reason why Microsoft says that WinRT is NOT a new layer in itself BUT only exposes underlying operating system APIs to managed world developers.
Does WinRT replace PInvoke?
WinRT does not replace PInvoke as WinRT does not cover every API supported by native Windows libraries and hence, for certain APIs you would still need to use PInvoke. Rather, WinRT is enhanced version of PInvoke minus its complexities. WinRT builds on top of existing PInvoke and COM Interop infrastructure and build on existing concepts of RCW and CCW, which is great as it is re-using existing infrastructure and at the same time reducing complexities of it.
WinRT Only for .NET?
Windows Runtime libraries (e.g., Windows.UI.dll) are not implemented in managed code but some low-level language (my guess is C++) and made available to managed code using Windows Runtime Interop (built on top of COM Interop infrastructure) and .winmd files. The types which we see in WinMD files are defined and actually implemented by these WinRT libraries. But is WinRT only available for .NET developers? Not really, WinRT libraries can be accessed from outside CLR as well, for example, when you use JavaScript to build Metro apps, you are not using CLR to host your application but “Chakra” engine, which is used by Internet Explorer too, and internally it is consuming these WinRT libraries. So that proves the point that WinRT is not just for managed developers.
In fact, you can create classic-style COM components using Windows Runtime C++ Template Library (WRL) which can consume WinRT APIs and the COM component can then be accessed from any COM enabled technology including classic desktop apps.
WinRT Duplicates APIs?
Why Windows Runtime libraries duplicate certain sets of classes, especially those related to UI controls and thread pool when these are already available with .NET Framework? Why this duplication? – This duplication is there for a reason. As we saw above, WinRT is not just for managed developers but is consumed from other application hosts as well and it became necessary to make these basic sets of classes available through WinRT too. Managed developers had some of these classes already available to them and it looks like duplication for them but these classes exist in WinRT for a reason.
WinRT for Desktop Apps
Only a subset of WinRT Classes/APIs are available to desktop apps. Also, it is important to remember that WinRT is only available Windows 8 onwards so your target platform for the desktop app must be at least Windows 8 for the ability to access WinRT APIs.
S Hanslman has written this excellent blog post which shows manual steps for how to access WinRT APIs from desktop apps.
Summary
Microsoft has introduced a new set of runtime libraries for Windows platform, commonly referred to as WinRT library, which expose native functionality in an object oriented manner. These new libraries are based on solid principles of COM but Microsoft has shielded developers from complexities of COM. To make it easy for developers to interact with this underlying platform library, Microsoft developed WinMD files and supporting interop functionality to different runtime environments such as CLR and Chakra engine so that developers can consume these libraries from different languages (Java Script, VB .NET, C#, C++, etc.) in natural and familiar manner. WinRT and its interop with runtime environments makes platform capabilities available to an array of programming languages in natural and familiar manner.
In my next blog post, I will try to go further into this new magical world of WinRT and further explore its internal working.
Happy coding!
Vande Mataram!
(A salute to motherland)
P.S. In addition to blogging, I use Twitter to share tips, links, etc. My Twitter handle is: @girishjjain