Introduction
This is the final post in the initial F# series that I had planned. That doesn’t mean there won’t be more from me in the future, but this will be the final one in the current batch. So what will this one be on?
This one will be on type providers. Type providers are a fairly complex beast, and they certainly do not fit into a beginner’s space (at least not in my opinion), so we will be concentrating on using type providers and not on how to create them (standing on the shoulders of giants, if you like).
So What Are Type Providers
Here is what MSDN has to say about type providers.
An F# type provider is a component that provides types, properties, and methods for use in your program. Type providers are a significant part of F# 3.0 support for information-rich programming. The key to information-rich programming is to eliminate barriers to working with diverse information sources found on the Internet and in modern enterprise environments. One significant barrier to including a source of information into a program is the need to represent that information as types, properties, and methods for use in a programming language environment. Writing these types manually is very time-consuming and difficult to maintain. A common alternative is to use a code generator which adds files to your project; however, the conventional types of code generation do not integrate well into exploratory modes of programming supported by F# because the generated code must be replaced each time a service reference is adjusted.
The types provided by F# type providers are usually based on external information sources. For example, an F# type provider for SQL will provide the types, properties, and methods you need to work directly with the tables of any SQL database you have access to. Similarly, a type provider for WSDL web services will provide the types, properties, and methods you need to work directly with any WSDL web service.
The set of types, properties, and methods provided by an F# type provider can depend on parameters given in program code. For example, a type provider can provide different types depending on a connection string or a service URL. In this way, the information space available by means of a connection string or URL is directly integrated into your program. A type provider can also ensure that groups of types are only expanded on demand; that is, they are expanded if the types are actually referenced by your program. This allows for the direct, on-demand integration of large-scale information spaces such as online data markets in a strongly typed way.
But I actually prefer what one user said on this StackOverflow post.
Say you have some arbitrary data entity out in the world. For this example, let’s say it’s a spreadsheet. Let’s also say you have some way to get/infer schema/metadata for that data – that is, you can know types (e.g., double versus string) and relationships (e.g., this column means ‘salary’) and metadata (e.g., this sheet is for the June 2009 budget). Type providers let you code up a kind of ‘shim library’ that knows about some kind of data entity (e.g., a spreadsheet) and use that library as part of the compiler/IDE toolchain so that you can write code like:
mySpreadsheet.ByRowAndColumn.C4
or something, and get Intellisense (autocompletion) and tooltips (e.g., describing cell C4 as Salary for Bob) and static typing (e.g., have it be a double or a string or whatever it is). Essentially, this gives you the tooling affordances of statically-typed object models with the ease-of-use leverage of various dynamic or code-generation systems, with some improvements on both. The ‘cost’ is that someone has to write the shim library (the ‘type provider’), but many such providers are very general (e.g., one that speaks OData or Excel or WMI or whatnot) and so a small handful of type provider libraries makes vast quantities of the world’s data available in your programming language with static typing and first-class tooling support.
The architecture is an open compiler, where provider-authors implement a small interface that allows them to inject new names/types into the programming context.
What is clear is that type providers must be doing a whole lot of work behind the scenes to create new types from some initial metadata at the compilation stage, presumably using something like Reflection.Emit, which is pretty whack.
Where Can I Get Me Some Type Providers
F# 3.0 comes with a few standard type providers, which have sample usages shown at the links below. I will not, however, be covering these particular type providers, as they all rely on external things (such as SQL Server) that are a bit hard for me to demo in a blog post, and I wanted the examples in this F# series to be ones that users can more or less copy and paste from the code I have posted here. So if you want to try out the type providers listed below, you will have to follow the links, which will take you to the relevant examples.
We will be looking at some other type providers that have some more manageable dependencies, such as an XML file, a CSV file, etc.
So where can you get your hands on these extra type providers? There is an F# Data library that contains many type providers, which you can read more about here. It is also available as a NuGet package, so it is nice and easy to install. When you download this package, you will get the following type providers:
- JSON type provider
- XML type provider
- CSV type provider
- WorldBank type provider
- Freebase type provider
You may have a need for another type provider, and someone may already have written it for you, so it is worth doing a quick Google search before you set off to write your own.
How Can I Use These Here Type Providers
As I say, there is a multitude of type providers out there in the wild, so chances are there is one doing what you need. In this post, I will be concentrating on two type providers found in the F# Data library that I just mentioned. I will not be covering all of the functions of these type providers, as the original authors have some great documentation on them anyway.
XML Type Provider
Here is a simple example that shows how to use the type provider to parse some XML.
open System
open FSharp.Data

// The inline XML sample is only used to infer the shape of the provided type.
type Detailed = XmlProvider<"""<author><name full="true">Karl Popper</name></author>""">

[<EntryPoint>]
let main argv =
    // Parse a different document that has the same structure as the sample.
    let info = Detailed.Parse("""<author><name full="false">Thomas Kuhn</name></author>""")
    printfn "%s (full=%b)" info.Name.Value info.Name.Full
    Console.ReadLine() |> ignore
    0
Which when run gives this result:
Thomas Kuhn (full=false)
The key points to take away from this are:
- We get IntelliSense for data that we do not actually have a concrete type for; the type is simply inferred by the type provider.
- We can use the properties of the type that has been inferred.
This is pretty crazy stuff when you stop and think about it.
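To make the "inferred type" point a bit more concrete, here is a small sketch written as an F# script rather than a console application. It assumes F# 5 or later (so that the `#r "nuget: ..."` directive works in `dotnet fsi`) and the FSharp.Data NuGet package. The type annotations on `name` and `isFull` are my own addition; they only compile if the provider really did infer a string and a bool from the sample.

```fsharp
// Script sketch: run with "dotnet fsi" (assumes F# 5+ for "#r nuget").
#r "nuget: FSharp.Data"
open FSharp.Data

// Same inline sample as in the example above.
type Detailed = XmlProvider<"""<author><name full="true">Karl Popper</name></author>""">

let info = Detailed.Parse("""<author><name full="false">Thomas Kuhn</name></author>""")

// These annotations are checked by the compiler, so a wrong guess
// about the inferred types would be a compile error, not a runtime one.
let name : string = info.Name.Value
let isFull : bool = info.Name.Full

printfn "%s (full=%b)" name isFull
```

If the sample had `full="1889"` instead, I would expect the `bool` annotation to stop compiling, which is the whole point: the sample drives the types.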
Here is a slightly more advanced use of the XML type provider that shows how to deal with multiple nodes and a separate XML sample file:
open System
open FSharp.Data

// The sample file is read at compile time to infer the types.
// A verbatim string (@) keeps the backslashes in the path literal.
type Authors =
    XmlProvider<@"C:\Users\sacha\Desktop\ConsoleApplication1\ConsoleApplication1\Writers.xml">

[<EntryPoint>]
let main argv =
    let authors = """
        <authors topic="Philosophy of Mathematics">
            <author name="Bertrand Russell" />
            <author name="Ludwig Wittgenstein" born="1889" />
            <author name="Alfred North Whitehead" died="1947" />
        </authors> """
    let topic = Authors.Parse(authors)
    printfn "%s" topic.Topic
    for author in topic.Authors do
        printf " – %s" author.Name
        // Born is an option, so the year is only printed when the attribute is present.
        author.Born |> Option.iter (printf " (%d)")
        printfn ""
    Console.ReadLine() |> ignore
    0
Which when run prints the topic, followed by one line per author, with the birth year in brackets for any author that has a born attribute.
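The interesting bit in that example is that author.Born is an option. Here is a hedged sketch of why, again as an F# script (assuming F# 5+ and the FSharp.Data NuGet package). Since I cannot show Writers.xml here, the inline sample below is a hypothetical stand-in for it: the born attribute appears on only one of the two sample authors, so the provider should infer it as an optional int.

```fsharp
// Script sketch: run with "dotnet fsi" (assumes F# 5+ for "#r nuget").
#r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical stand-in for Writers.xml: "born" is missing on one author,
// so it should be inferred as "int option" rather than "int".
type AuthorsSample = XmlProvider<"""<authors topic="Philosophy of Science">
  <author name="Paul Feyerabend" born="1924" />
  <author name="Thomas Kuhn" />
</authors>""">

let topic =
    AuthorsSample.Parse("""<authors topic="Philosophy of Mathematics">
  <author name="Bertrand Russell" />
  <author name="Ludwig Wittgenstein" born="1889" />
  <author name="Alfred North Whitehead" died="1947" />
</authors>""")

// Pattern match on the option instead of using Option.iter.
let lines =
    topic.Authors
    |> Array.map (fun author ->
        match author.Born with
        | Some year -> sprintf "%s (%d)" author.Name year
        | None -> author.Name)

lines |> Array.iter (printfn "%s")
```

Note that the died attribute is not in the sample, so it is not part of the provided type at all; the parsed document can contain it, but there is no property to read it through.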
CSV Type Provider
Here is another example, this time using a CSV file.
open System
open FSharp.Data

// A verbatim string (@) keeps the backslashes in the path literal.
type Stocks = CsvProvider<@"C:\Users\sacha\Desktop\ConsoleApplication1\ConsoleApplication1\MSFT.csv">

[<EntryPoint>]
let main argv =
    // The data lines stay flush left so no leading whitespace ends up in the CSV text.
    let data = "Date,Open,High,Low,Close,Volume,Adj Close
2012-01-27,29.45,29.53,29.17,29.12,44187700,22.23
2012-01-26,29.61,29.70,29.40,29.13,49102800,23.50
2012-01-25,29.07,29.65,29.07,29.14,59231700,24.56
2012-01-24,29.47,29.57,29.18,29.15,51703300,25.34"
    let msft = Stocks.Parse(data)
    let firstRow = msft.Rows |> Seq.head
    let lastDate = firstRow.Date
    let lastOpen = firstRow.Open
    for row in msft.Rows do
        printfn "HLOC: (%A, %A, %A, %A)" row.High row.Low row.Open row.Close
    Console.ReadLine() |> ignore
    0
I also wanted to call out the fact that the data types are inferred correctly:
- Date is a DateTime
- Open is a decimal
This can be seen in the screenshot below, where I was hovering the mouse over the firstRow.Date expression:
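If you would rather have the compiler confirm those types than rely on a tooltip, something like the following script sketch should do it (assuming F# 5+, the FSharp.Data NuGet package, and the provider's default inference settings). The inline one-row sample is a hypothetical stand-in for MSFT.csv, and the annotations on `date` and `openPrice` are mine.

```fsharp
// Script sketch: run with "dotnet fsi" (assumes F# 5+ for "#r nuget").
#r "nuget: FSharp.Data"
open System
open FSharp.Data

// Hypothetical inline sample standing in for MSFT.csv (header plus one row).
type StocksSample =
    CsvProvider<"Date,Open,High,Low,Close\n2012-01-27,29.45,29.53,29.17,29.12">

let msft = StocksSample.Parse("Date,Open,High,Low,Close\n2012-01-26,29.61,29.70,29.40,29.13")
let firstRow = msft.Rows |> Seq.head

// These only compile if Date was inferred as DateTime and Open as decimal.
let date : DateTime = firstRow.Date
let openPrice : decimal = firstRow.Open

printfn "%s opened at %M" (date.ToString("yyyy-MM-dd")) openPrice
```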
I think type providers are pretty crazy, and do a lot of good work behind the scenes. I hope you too can see the value of them, explore what is out there, and give them a try.
That’s It
Like I say, this is the final post in this series, and my, it’s been quite the ride for me. I just hope you guys/gals also enjoyed it. I would love to know, actually. I got a bit of feedback along the way, but it would be nice to know if people have enjoyed this series, and whether it hit the mark or not. So if you feel inclined to leave me a comment, or want to buy me a virtual beer, I would love to hear about that. I also accept traveller’s cheques, free holidays and Lego for my kid. He likes fire engines.
Ha Ha.
Anyway, thanks for listening to my rants on F#. I will now be getting back to my roots, which is writing articles for CodeProject which I really enjoy, so until we all meet again, adios, see you later. Happy F#ing!