Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

Vector: A Concept-Driven Approach to SQL UDT

4.98/5 (30 votes)
26 May 2021CPOL50 min read 1   636  
A walkthrough in UDT programming with primary focus on database concepts
This article discusses UDT from the technical perspective, by taking you step by step in creating one. Along the way, you will discover quirks, inconsistencies, obscure settings, and workarounds.

Contents

Introduction

Have you tried searching for anything about SQL Server User Defined Type (UDT) in your favorite developer portal? More often than not, you’ve ended up finding something that ultimately culminates to debates about the author’s choice of model or whether UDT does have a place in software development at all. Argumentations like these are healthy for experienced developers, but for a beginner, they can be frustrating. I’ve been in those shoes before, and being an open-minded person, I did not care about what other people said, because they would just distract me from my focus on learning something that has the greatest potential of changing the way we develop database applications.

This article provides exactly those things I craved for when I was still writing my first UDT. It discusses UDT from the technical perspective, by taking you step by step in creating one. Along the way, you will discover quirks, inconsistencies, obscure settings, and workarounds. Different ways of implementing your UDT in terms of settings and .NET types are also discussed and pitted against each other, just to find out if common conceptions about them indeed translate to real-world behaviors.

A less conventional approach is employed for this article, by focusing primarily on the concepts that influence the structure and behaviors of UDT rather than the codes that implement them. This makes the text more verbose than your typical technical article, so people with short attention deficit, you’ve been warned. As for those who are happy to know the causes of things, read on!

What is a UDT?

A User Defined Type (UDT) is a SQL Server data type created using your favorite .NET language. It is one of the artifacts you can create with SQL Server CLR Integration (SQL-CLR for short). Let’s just say it’s the closest thing SQL Server can get to object-based programming.

Model

In a blatant gesture that would probably earn ire to the purists, we’re going to model an entity which is, of course, not a scalar – the vector. A vector is a quantity composed of a magnitude and a direction. We confine our discussion to the two dimensional vector which can be represented in the Cartesian coordinate system using two points, with one being understood as the origin (0,0). Our vector is then isomorphic to the polar coordinate and thus, can also be used to represent the said entity. The unit for direction is radian, for easy manipulation in .NET and T-SQL. Examples of vectors are velocity, acceleration, force, and magnetic fields. I won’t elaborate much about vector anymore, because the Internet has lots of good materials about it. For those who need some refresher on the model and other concepts applied here, these are some good resources:

Our vector is a simplified model of its real world counterpart. It is small, and the implementations of the members are naïve. It’s not industrial strength; having only enough members to illustrate the different capabilities of a UDT. We opted for a simplified model so we can concentrate on the main objective of the article, which is just to show you how to create a UDT, and not how to choose the right entity to model, let alone perfect it.

Implementation

Setup

A UDT can be implemented using a struct (structure in VB.NET) or a class. The struct is recommended because it is stored in the stack, and is generally faster. Being a set-based language, SQL can rip this benefit in transactions involving very high numbers of rows. It also requires less coding than the class counterpart. We’ll know later with our informal tests if there is really truth about all these theoretical claims.

The code in this article uses .NET 2.0, and runs against SQL Server 2005 and 2008. I refrained from using features of the later versions of the .NET Framework as much as possible to maintain backwards compatibility. The changes made for UDT in SQL Server 2008 is so subtle and there's a dedicated small section on that at the latter part of the article.

A UDT, like any other artifact in SQL-CLR, is integrated to SQL Server via a .NET assembly. To do this, you should create a class library project. The only relevant assemblies for the project are System, System.Data, and System.Xml, so you can delete the rest that Visual Studio imports by default. The initial implementation of our model is shown below:

VB
Imports System

Namespace MathSqlObj
    Public Structure Vector
        Private _magnitude As Double
        Private _direction As Double

        Public Property Magnitude() As Double
            Get
                Return _magnitude
            End Get
            Set(ByVal value As Double)
                _magnitude = value
            End Set
        End Property

        Public Property Direction() As Single
            Get
                Return _direction
            End Get
            Set(ByVal value As Single)
                _direction = value
            End Set
        End Property
    End Structure 
End Namespace
C#
using System;
namespace MathSqlObj
{    
    public struct Vector 
    {
        double _magnitude; 
        float _direction;
       
        public double Magnitude
        {          
            get { return _magnitude; }            
            set { _magnitude = value; }
        }

        public float Direction
        {
            set { _direction = value; }
            get { return _direction; }
        }        
    }
}
Fig. 1 - Basic vector struct

To turn our struct into a UDT, we need to adorn it with the SqlUserDefinedTypeAttribute found in the Microsoft.SqlServer.Server namespace. This tells the compiler that our type can be integrated to SQL Server as a UDT. It has one required parameter which specifies the kind of serialization format the UDT is using. Specify Format.Native for now. We’ll elaborate on this later.

VB
Imports System
Imports Microsoft.SqlServer.Server

Namespace MathSqlObj
    <SqlUserDefinedType(Format.Native)>
    Public Class Vector
        ' Rest of the codes here...
C#
using System;
using Microsoft.SqlServer.Server;

namespace MathSqlObj
{
    [SqlUserDefinedType(Format.Native)]
    public struct Vector : INullable
    {
        // Rest of the codes here
Fig. 2 - Vector in native format

We got our first taste of attributes in UDT programming, and just a heads-up; brace yourself for the coming attributes galore! UDT involves ample dosage of declarative programming to define some of its behaviors in SQL Server.

Accompanying the first ever adornment are four members that our struct should implement in order to qualify as a full-fledged UDT. We are going to add them one by one, as we explore the concepts of database and SQL that necessitate them.

Initialization and Value Assignment

Before you can use your UDT, you must initialize it by assigning values to its fields. This is not as straightforward as a native SQL type, because SQL does not have the concepts of constructor and property initializer which let you separate the values for each field. The only way to achieve value assignment in UDT while sticking to the standard SQL syntax is by putting all the values in one big chunk of string, like this:

Figure 3 - UDT SQL initialization

Fig. 3 – UDT SQL initialization

What’s happening behind the scenes is SQL Server calling a factory function Parse which accepts a string argument representing the input in SQL. This is responsible for splitting the string input, converting them to the desired types, and ultimately, assigning them to the fields. Of course, you’re the one who should provide the logic for that. You must also provide a delimiting character to facilitate easy splitting. The bulk of the code deals with format validations, and the more fields you have, the more tedious your work would be. If you want to showcase your RegEx prowess, this is the right place!

If you’ve read so many things about validation being one of the most important benefits of UDT, I’m sorry this article disappoints you. For me, it’s enough that you know where to put these validations. The logic is entirely up to you. So, for our UDT, let’s just assume that every input is correct.

VB
Public Shared Function Parse(ByVal input As SqlString) As Vector
    ' Happy Path! No validation!
    Dim inputParts As String() = input.Value.Split(","c)
    Dim v As New Vector()
    v.Magnitude = Double.Parse(inputParts(0))
    v.Direction = Single.Parse(inputParts(1))
    Return v
End Function
C#
static public Vector Parse(SqlString input)
{
    // Happy Path! No validation!
    string[] inputParts = input.Value.Split(',');
    Vector v = new Vector();
    v.Magnitude = double.Parse(inputParts[0]);
    v.Direction = float.Parse(inputParts[1]);
    return v;
}
Fig. 4 - Initial implementation of Parse

The SqlString type of the parameter is mandatory. It allows the use of the SQL keyword NULL, which is actually not a string, but a state that your UDT can assume.

Figure 5 - Assigning null

Fig. 5 - Assigning null

SqlString is just one of the many wrappers for .NET types located in the System.Data.SqlTypes namespace. Its main purpose is to provide the NULL state for the .NET data types; making them isomorphic to their SQL Server counterparts. For example, SqlDouble is isomorphic with the SQL Server double data type. These wrapper types can also be used in UDT, but we opted for the native types for our fields because the model requires it.

The null state we’re talking here is relevant only to SQL, and is different from the null in .NET. We’ll talk more about this when we tackle the other required members.

Display

The UDT should be able to display itself as a distinct entity, in a certain format, using all the values of its fields. You can specify this format in the ToString function. (Yes, this is the ubiquitous member of System.Object that every type should override.) Its implementation is a no brainer. Just make sure that you adhere to good UDT design by ensuring that its return value is readily consumable by Parse, just like this:

Figure 6 - Using ToString for initialization

Fig. 6 - Using ToString for initialization

This is what our vector achieves with its naïve implementation of ToString:

VB
Public Overrides Function ToString() As String
    Return String.Format("{0},{1}", Magnitude, Direction)
End Function
C#
public override string ToString()
{
    return string.Format("{0},{1}", Magnitude, Direction);
}
Fig. 7 - ToString implementation

Null

In the initialization section, I mentioned that our UDT should be capable of assuming an unknown state called null. This is different from our common notion of null in conventional programming, which is a reference to an empty memory. The null state and its logic are implemented by the properties Null and IsNull, respectively. If not properly implemented, it’s possible that SQL Server interprets your UDT as null when, in fact, it’s not. This is because null is actually an instance of your UDT which is understood by SQL Server as unknown. SQL Server uses the static read-only property Null to get this instance. (From this point on, I will refer to the UDT null as SQL-null, while the invalid-memory reference in conventional programming as simply null.)

If both Null and Parse return a valid instance of our UDT, how would SQL Server know which is SQL-null? It simply checks the read-only property IsNull. This member is an implicit implementation of the System.Data.SqlTypes.INullable interface. This interface implementation gives our UDT the SQL-null state, just like the SqlString parameter of Parse.

There are two approaches to implementing the SQL-nullability logic of a UDT. The first approach evaluates the values of your fields, while the second uses a simple flag field. If SQL-null is desired for the first approach, the values of the fields should be something that are not acceptable in the context of the model. Let’s say you have a UDT that has a field mass. It’s universally accepted that mass is never negative. In the UDT, a negative value for this field can convey a SQL-null:

VB
Private _mass As Double

Public Shared ReadOnly Property Null() As SimpleUdt
     Get
          Dim u As New SimpleUdt()
          u._mass = -1  'This is the flag
          Return u
     End Get
End Property

Public ReadOnly Property IsNull() As Boolean Implements INullable.IsNull
     Get
          ' If this instance is from Null property,
          ' the next line must be true
          Return (_mass < 0)
     End Get
End Property
C#
double _mass;

static public SimpleUdt Null
{
     get
     {
         SimpleUdt u = new SimpleUdt();
         u._mass = -1.0  // This is the flag
         return v;
     }
}

public bool IsNull
{
    get 
    { 
        // If this instance is from Null property,
        // the next line must be true
        return (_mass < 0.0;)
    }
}
Fig. 8 - SQL-nullability using member values

Personally, I prefer the second approach because it’s succinct and promotes self-documenting codes. These could also be the reasons why it is the default implementation in Visual Studio. In our UDT, that is a simple flag before we return the value in the Null property. Below are the SQL-nullability and null implementations of our UDT.

VB
Private _isNull As Boolean  ' Will be initialized to False

Public Shared ReadOnly Property Null() As Vector
    Get
        Dim v As New Vector()
        v._isNull = True  ' Here's our flag
        Return v
    End Get
End Property

Public ReadOnly Property IsNull() As Boolean Implements INullable.IsNull
    Get
        Return _isNull
    End Get
End Property
C#
bool _isNull;  // Will be initialized to false

static public Vector Null
{
    get
    {
        Vector v = new Vector();
        v._isNull = true;  // Here's our flag
        return v;
    }
}

public bool IsNull
{
    get { return _isNull; }
}

Serialization

The UDT object is different from your usual .NET object because it is saved and retrieved directly in the disk. You don’t have any other layers sitting between your UDT and the disk which do the saving and retrieval for you. These processes of saving and retrieving objects are called binary serialization and binary deserialization, respectively. The rest of the text may refer to both as simply serialization.

Implementation of serialization in UDT is implicit if the types of your fields are readily serializable. Such types are said to be blittable in programming parlance. They don’t require special handling in .NET because their representation is the same as that in unmanaged code. No additional code is required, and all we have to do is tell SQL Server that it can handle the serialization. We already did this in Figure 2. Our UDT satisfies all the conditions for native serialization, and hence it’s now usable. It should have the following members by now:

Figure 10 – Vector UDT basic members

Fig. 10 – Vector UDT basic members

If there is at least one field that is not blittable, then we must specify Format.UserDefined as the serialization format and implement the serialization manually. We do the latter by providing the implementation of the Microsoft.SqlServer.Server.IBinarySerialize interface. This interface has two members, aptly named Read and Write. You use the BinaryWriter argument to persist the values of your fields, while the BinaryReader to retrieve and assign them back. The order of your writing should be the same as your reading. The implementation ranges from the most trivial read/write to the uber-complex byte manipulations. The latter is beyond the scope of this article.

Suppose we decided to have the highest accuracy for our magnitude. This is the right job for System.Decimal, a type that SQL Server doesn’t know how to serialize. The manual binary serialization is still straightforward:

VB
' These are the fields
' we need to serialize/deserialize
Private _magnitude As Decimal
Private _direction As Single
Private _isNull As Boolean

Public Sub Write(ByVal w As System.IO.BinaryWriter) _
           Implements IBinarySerialize.Write
    w.Write(IsNull)

    ' No point of serializing
    ' the other fields if the UDT is SQL-NULL
    If Not IsNull Then
        w.Write(Magnitude)
        w.Write(Direction)
    End If
End Sub

Public Sub Read(ByVal r As System.IO.BinaryReader) _
       Implements IBinarySerialize.Read
    ' Note the order
    _isNull = r.ReadBoolean()

    ' We do the same check as that 
    ' in the Write
    If Not IsNull Then
        Magnitude = r.ReadDecimal()
        Direction = r.ReadSingle()
    End If
End Sub
C#
// These are the fields
// we need to serialize
decimal _magnitude;
float _direction;
bool _isNull;

void IBinarySerialize.Write(System.IO.BinaryWriter w)
{
    w.Write(IsNull);

    // No point of serializing
    // the other fields if the UDT is SQL-NULL
    if (!IsNull)
    {
        w.Write(Magnitude);
        w.Write(Direction);
    }
}

void IBinarySerialize.Read(System.IO.BinaryReader r)
{
    // Note the order
    _isNull = r.ReadBoolean();

    // We do the same check as that 
    // in the Write
    if (!IsNull)
    {
        Magnitude = r.ReadDecimal();
        Direction = r.ReadSingle();
    }
}
Fig. 11 - Simple IBinarySerialize implementation

Cataloguing

Our vector is now officially a UDT. We are going to copy the assembly to SQL Server and extract the UDT from it. This process of creating objects in SQL Server is known as cataloguing, because incidentally, you are making your objects visible in the catalogue views of SQL Server. You should build the class library, and copy the assembly to a desired location. If you haven’t enabled SQL-CLR Integration, you may execute these statements:

Figure 12 - Enabling SQL-CLR

Fig. 12 - Enabling SQL-CLR

The first step in cataloguing your UDT is to copy the assembly to the database using the CREATE ASSEMBLY command. You specify the path of your assembly and the name that should appear in the database. Your database and the location of your assembly may differ from mine, so change them accordingly. In the commands below, we just use the assembly name as its logical name inside the database:

Figure 13 - Cataloguing the assembly

Fig. 13 - Cataloguing the assembly

Cataloguing our UDT is achieved by the CREATE TYPE command. The fully qualified name of your type, this time, has the format <name of assembly in the database>.[<namespace in the original assembly>.<type name in the original assembly>]. We enclose the namespace and the type with “[]” because currently, SQL-CLR does not support namespaces.

Figure 13 - Cataloguing the UDT

Fig. 14 - Cataloguing the UDT

You may verify our assembly and UDT by querying their corresponding catalogue views:

Figure 15 Verifying assembly and UDT

Fig. 15 - Verifying assembly and UDT

The ALTER ASSEMBLY command is handy when you make modifications in the functions and some parameters in the attributes of the UDT. This saves you one step because you’re not required to re-catalogue your UDT anymore. The syntax is very similar to its CREATE counterpart.

Figure 16 The ALTER ASSEMBLY command

Fig. 16 - The ALTER ASSEMBLY command

You may not use the ALTER ASSEMBLY command if you renamed, added, removed members, or made changes that affect the serialization of the UDT. You must re-catalogue everything. This means dropping all the objects in the assembly dependency tree and then repeating the previous steps. You will be using the commands below for most of the subsequent modifications to our UDT:

Figure 17 Recataloguing commands

Fig. 17 - Re-cataloguing commands

The commands I just presented are the most compact variants that handle our needs. Consult the Microsoft documentation for comprehensive discussions of them.

Dry Run

It’s time to see our UDT in action. We will examine what’s going on inside it as we execute some of the most common SQL statements. We might discover new things and learn more about creating UDT as we go along. Bear in mind though that this is a happy test; we must provide the correct input at all times, otherwise exceptions would be thrown.

Let’s kick-start by declaring a variable of our UDT and then displaying it. As expected, the following statements prints “NULL” because we haven’t initialized it.

Figure 18 Simple UDT usage

Fig. 18 - Simple UDT usage

If we initialize it before selecting it, we get something gibberish:

Figure 19 Hexadecimal display of initialized UD

Fig. 19 - Hexadecimal display of initialized UDT

This weird series of characters is actually the hexadecimal representation of a vector. Somewhere in that hexadecimal is your magnitude and direction. In order to get a meaningful display, we need to call the ToString function:

Figure 20 ToString display of initialized UDT

Fig. 20 - ToString display of initialized UDT

In my opinion, the integration to T-SQL would have been much more seamless if SQL Server calls ToString underneath if we omit ToString. This behavior would have been consistent with the implementation of the initialization with Parse.

Before we proceed, be aware that the members of your UDT are case sensitive, and the following statements illustrate that:

Figure 21 UDT member case-sensitivity

Fig. 21 - UDT member case-sensitivity

Displaying properties does not require a call to ToString unless the type is a UDT. Non-UDT properties can be mapped directly to some SQL native types so SQL Server has no problem displaying them.

Figure 22 Direct call to scalar property

Fig. 22 - Direct call to scalar property

To prove that our UDT adheres to good design, Parse should be able to consume ToString without a problem:

Figure 23 Parsing with ToString

Fig. 23 - Parsing with ToString

And we can now create a table with a Vector column, and insert or update rows:

Figure 24 Using UDT in a column

Fig. 24 - Using UDT in a column

Equality and Sorting

The ubiquitous equals "=" operator in SQL is not compatible with UDT, by default, and using it results in an error.

Figure 25 Equality operator call error

Fig. 25 - Equality operator call error

Being a .NET developer, the first thing that would come to your mind is operator overloading. Unfortunately, operator overloading is not yet supported in SQL-CLR. The good news is turning on this equality operator support in UDT is very simple. We’ll just tell SQL Server to use the persisted binaries of the fields when comparing. We do this by setting the IsByteOrdered property of the SqlUserDefinedTypeAttribute to true. Be aware that if your UDT is a class, there will be additional work involved. We’ll discuss this later.

The IsByteOrdered setting means SQL Server comparing the magnitude followed by the direction and then nullability. This suits our needs since it already conforms to the semantics of vector equality.

VB
<SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native, _
                       IsByteOrdered:=True)> _
Public Structure Vector
    Implements INullable

    ' SQL Server compares magnitude first,
    ' followed by direction, then isNull
    Private _magnitude As Double
    Private _direction As Single
    Private _isNull As Boolean

    ' Rest of the codes here...
C#
[SqlUserDefinedType(Format.Native, IsByteOrdered=true)]
public struct Vector : INullable 
{ 
    // SQL Server compares magnitude first,
    // followed by direction, then isNull
    double _magnitude; 
    float _direction; 
    bool _isNull;
    
    // Rest of the codes here
Fig. 26 - IsByteOrdered adornment

The new adornment is a significant change in the UDT, so a simple alteration is not allowed. You must re-catalogue everything. After the re-catalogue, executing the last command should work just fine.

Figure 27 IsByteOrdered at work

Fig. 27 - IsByteOrdered at work

The negated equal operator "!=" is also implemented as a logical consequence:

Figure 28 Negation operator

Fig. 28 - Negation operator

And even the rest of the equality operators, even though they don’t make any sense in the context of Vector:

Figure 29 Other equality operators

Fig. 29 - Other equality operators

If SQL Server can compare our vectors, definitely it can sort them too. Once again, sorting the Vector does not make sense, but here it is:

Figure 30 Sorting vectors

Fig. 30 - Sorting vectors

Everything is set for the other roles the UDT can assume at this point, like primary key, foreign key, and indexed key. You can indeed create these constraints with UDT, but it doesn’t mean you should. Besides being a blatant violation of fundamental relational principles, UDT was not built for this purpose, and the performance penalty for such an implementation can be very severe. Here’s a table with a UDT as a primary key:

Figure 31 UDT as PK

Fig. 31 - UDT as PK

Validation

Do you notice something odd about our implementation of Parse in Figure 4? The code always returns non-SQL-null, but if we assign null to our UDT, it behaves accordingly.

Figure 32 Using NULL keyword bypasses Parse

Fig. 32 - Using the NULL keyword bypasses Parse

We can infer that maybe SQL Server is bypassing Parse when NULL is assigned as the value. It could be because calling Parse explicitly is another story (calling static functions of UDT in T-SQL uses the :: scope qualifier).

Figure 33 Explicit call to Parse

Fig. 33 - Explicit call to Parse

The only logical conclusion we can arrive at is that SQL Server somehow has its own way of tagging the memory occupied by our UDT as SQL-null whenever we use the keyword NULL instead of calling Parse. It’s talking direct binaries this time instead of string. The error message tells us that we cannot use the value of the argument because it is SQL-null. The exception is thrown the moment we attempt to split the input.

Explicit call to Parse is not the recommended way of assigning values to a UDT. I just showed you that in order to make a point. If you still want to play safe, you can avoid this exception by making sure that you are not splitting a SQL-null argument, as what the next block of code shows:

VB
Public Shared Function Parse(ByVal input As SqlString) As Vector
    ' We return null if SqlString NULL is entered
    If input.IsNull Then
        Return Null
    End If

    ' Happy Path! No validation!
    Dim inputParts As String() = input.Value.Split(","c)
    Dim v As New Vector()
    v.Magnitude = Double.Parse(inputParts(0))
    v.Direction = Single.Parse(inputParts(1))
    Return v
End Function
C#
static public Vector Parse(SqlString input)
{
    // We return null if SqlString NULL is entered
    if (input.IsNull)
        return Null;

    // Happy Path! No validation!
    string[] inputParts = input.Value.Split(',');
    Vector v = new Vector();
    v.Magnitude = double.Parse(inputParts[0]);
    v.Direction = float.Parse(inputParts[1]);
    return v;
}
Fig. 34 - Explicit returning of Null in Parse

We see that the last statements work just fine this time:

Figure 35 Explicit call on a Parse with explicit null return

Fig. 35 - Explicit call on a Parse with explicit null return

This unusual overloading is required to boost performance. Bypassing Parse means SQL Server can implement the native way of handling SQL-nulls just like how it does on non-UDT types. This also means that we can actually assign non-string literals to initialize a UDT:

Figure 36 Direct binary assignment

Fig. 36 - Direct binary assignment

This can be scary. We now have something similar to the dreaded “SQL-injection” attack. In this case, it’s a direct binary deserialization attack. Garbage can be entered as long as its binary is converted to the UDT type. You’re lucky if an error results, or if it results to SQL-null; otherwise, it can wreck havoc to your program. The next set of statements show that inserting 12-bytes (8 bytes for the magnitude, 4 bytes for the direction) binary avoids SQL-null and NAN (Not A Number) while still ensures garbage results. Note that the length of the binary is the total length of our fields: double=8 + float=4 + Boolean=1 = 13.

Figure 37 Assignment using arbitrary string literal

Fig. 37 - Assignment using arbitrary string literal

SQL Server provides a safeguard from this circumvention through a function which is called whenever an external binary deserialization is performed. You specify this in the ValidationMethodName property of the SqlUserDefinedTypeAttribute. This function should be parameter-less, and return a Boolean indicating if the input is valid or not. So, for the sake of illustration, we simplify the model by accepting only a positive magnitude and a direction between 0 and 2PI. Our validation function becomes:

VB
<SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native _
        , IsByteOrdered:=True, ValidationMethodName:="ValidateInput")> _
Public Structure Vector
    Implements INullable
    Private Function ValidateInput() As Boolean
        Dim minDirection As Single = 0.0F
        Dim maxDirection As Single = CSng((2 * Math.PI))
        Return (Magnitude >= 0) AndAlso _
               (Direction >= minDirection AndAlso Direction <= maxDirection)
    End Function

     'Rest of the codes here...
C#
[SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native
    ,IsByteOrdered=true, ValidationMethodName="ValidateInput")]
public struct Vector : INullable
{
    private bool ValidateInput()
    {
        float minDirection = 0f;
        float maxDirection = (float)(2D * Math.PI);
        return (Magnitude >= 0D) &&

            (Direction >= minDirection && Direction <= maxDirection);
    }
    
    // Rest of the codes here...
Fig. 38 - Validation function

The real vector does not have these restrictions. A negative magnitude means that the vector is actually pointing to the opposite direction. Our model forces the user to reverse the direction instead of supplying a negative magnitude. A direction less than 0 or greater than 2PI is also allowed, but can be converted to the equivalent value between 0 and 2PI. We’ll provide the appropriate helper function for this later. With the validation function in place, those unreasonable values derived from arbitrary string literals are caught this time:

Figure 39 Validation function firing

Fig. 39 – Validation function firing

The validation can also be applied in Parse, but we need to transfer the logic to another static function. The validation function for binary deserialization merely delegates the task to the new static function as shown:

VB
<SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native _
        , IsByteOrdered:=True, ValidationMethodName:="ValidateInput")> _
Public Structure Vector
    Implements INullable
    Private Function ValidateInput() As Boolean
        Return Vector.Validate(Me)
    End Function

    Private Shared Function Validate(ByVal v As Vector) As Boolean
        Dim minDirection As Single = 0.0F
        Dim maxDirection As Single = CSng((2 * Math.PI))
        Return (v.Magnitude >= 0) AndAlso _
               (v.Direction >= minDirection AndAlso v.Direction <= maxDirection)
    End Function

    Public Shared Function Parse(ByVal input As SqlString) As Vector
        ' We return null if SqlString NULL is entered
        If input.IsNull Then Return Null

        ' Happy Path! No validation!
        Dim inputParts As String() = input.Value.Split(","c)
        Dim v As New Vector()
        v.Magnitude = Double.Parse(inputParts(0))
        v.Direction = Single.Parse(inputParts(1))
        If Not Validate(v) Then
            Throw New ArgumentException("Magnitude must be positive " & _
                      "and direction must be between 0 and 2PI.")
        Return v
    End Function

    ' Rest of the codes here...
C#
[SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native
    , IsByteOrdered = true, ValidationMethodName = "ValidateInput")]
public struct Vector : INullable
{
    private bool ValidateInput()
    {
        return Vector.Validate(this);
    }

    private static bool Validate(Vector v)
    {
        float minDirection = 0f;
        float maxDirection = (float)(2D * Math.PI);
        return (v.Magnitude >= 0D) &&
            (v.Direction >= minDirection && v.Direction <= maxDirection);
    }

    static public Vector Parse(SqlString input)
    {
        if (input.IsNull)
            return Null;

        // Happy Path! No validation!
        string[] inputParts = input.Value.Split(',');
        Vector v = new Vector();
        v.Magnitude = double.Parse(inputParts[0]);
        v.Direction = float.Parse(inputParts[1]);
        if (!Validate(v))
            throw new ArgumentException("Magnitude must be positive " +
                " and direction must be between 0 and 2PI.");

        return v;
    }    
    
    // Rest of the codes here...
Fig. 40 - Applying validation in Parse

With that code, valid literals are also taken care of:

Figure 41 Business invalid literals

Fig. 41 - Business-invalid literals

In production, malicious binary deserialization from arbitrary literals is virtually impossible unless your system is already vulnerable to other conventional means of attack, like SQL-injection. Preventing attacks like this is the primary responsibility of the front-end applications and the security infrastructure. But, in deference to Murphy's Law, just make sure you have the validation function in your UDT.

Updating with Properties and Functions

We have seen our UDT in an Update statement when we discussed about UDT initialization and value assignment. It was more of “replacing” since we simply create another instance of our UDT and assign it to the column. It needlessly updates all the fields, and this is not what we always want. Sometimes, we want to update just one property. The type of the properties Magnitude and Direction map to SQL types double and float respectively. This makes them readily updateable in SQL. (See the Microsoft documentation for a full list of CLR to SQL type mappings.)

Figure 42 Direct update on a property

Fig. 42 - Direct update on a property

And for some unknown purposes, string literals of valid inputs is also possible. Look!

Figure 43 String literals as property input

Fig. 43 - String literals as property input

Although, it produces errors with invalid literals like ‘Hello there’, I still find this very unnecessary. In the previous version of this article, I express my hope that this feature be removed from SQL Server 2008. Unfortunately, it's still there.

SQL Server does not allow updating of two columns in one UPDATE statement. Attempting so returns an error:

Figure 44 No update for 2 columns

Fig. 44 - No update for two columns

This must be due to the internal mechanics of the database engine, or some other relational principles, but I could care less. If there is a need to change two or more fields at the same time, it could be an indicator for the presence of a function. This function leverages on encapsulation by hiding the intricacies of updating the fields while making our UDT self-documenting. We can apply this to represent some common operations of a vector. Let’s augment our vector with these functions, starting with scalar multiplication. Scalar multiplication is nothing more than increasing and decreasing the magnitude by a certain factor. We can represent this with a less academic name Scale. If the factor is negative, that means the direction is shifted by 180 degrees or 2PI.

VB
<SqlUserDefinedType(Format.Native, ValidationMethodName:="ValidateInput" _ 
                      ,IsByteOrdered:="True"> _
Public Structure Vector
    Implements INullable

    Public Sub Scale(ByVal factor As Double)
        If factor < 0 Then
            ' Preserving PI precision
            Dim dir As Double = Direction

            If dir > Math.PI Then
                dir -= Math.PI
            Else
                dir += Math.PI
            End If

            ' We might lose precision with coersion
            ' but at least we've minimized it
            Direction = CSng(dir)

            ' We don't have to represent negative magnitude
            ' because it's the same as reversing direction
            factor *= -1
        End If
        Magnitude *= factor
    End Sub
    
    ' Rest of the codes here...
C#
[SqlUserDefinedType(Format.Native, ValidationMethodName = "ValidateInput"

                      ,IsByteOrdered = true)]
public struct Vector : INullable
{
    [SqlMethod(IsMutator=true)]
    public void Scale(double factor)
    {
        if (factor < 0)
        {
            // Preserving PI precision
            double direction = Direction;

            if (direction > Math.PI)
                direction -= Math.PI;
            else
                direction += Math.PI;

            // We might lose precision with coersion
            // but at least we've minimized it
            Direction = (float)direction;

            // We don't have to represent negative magnitude
            // because it's the same as reversing direction
            factor *= -1;
        }
        Magnitude *= factor;
    }
    
    // Rest of the codes here...
Fig. 45 - Scale function

If you execute this however, you would get an error:

Figure 46 Error from updating a non-mutator function

Fig. 46 - Error from updating a non-mutator function

What happened was that SQL Server could not recognize the member as a mutator. In programming parlance, a mutator is a function that alters the state of your object. Alter in the sense that the value of at least one of your fields is changed. In a language that does not have a construct for property, mutator is synonymous to a setter function. Therefore, it makes perfect sense that SQL Server considers a property as mutator, by default. You can change this setting through the IsMutator property of SqlMethodAttribute. SQL-CLR goes even beyond the default because IsMutator seems to have no effect on the property. You should try the code below and see for yourself.

VB
Public Property Magnitude() As Double
    Get
        Return _magnitude
    End Get

    ' No effect
    <SqlMethod(IsMutator:=False)> _
    Set(ByVal value As Double)
        _magnitude = value
    End Set
End Property
C#
public double Magnitude
{
    get { return _magnitude; }
    
    // No effect
    [SqlMethod(IsMutator=false)]
    set
    {
        _magnitude = value;
    }
}
Fig. 47 - IsMutator setting for property has no effect

Our Scale operator, being a function, does not enjoy the privilege that Magnitude does. We need to explicitly set the flag in the attribute:

VB
<SqlMethod(IsMustator:=True)>
Public Sub Scale(ByVal factor As Double)
    ' Rest of the codes here...
C#
[SqlMethod(IsMutator=true)]
public double Scale(double factor)
{
    // Rest of the codes here...
Fig. 48 - Scale function as mutator

In a good OOP design, a mutator is usually a public instance function that returns void. Even your property set function is converted to such a function. SQL Server makes sure you adhere to this by restricting mutator calls to UPDATE and SET statements only. If you don’t tag a void function as mutator, you are authorizing SQL Server to call that function in SELECT or PRINT statements. Error results because SELECT and PRINT expect argument values which should have been returned by the UDT function.

In the other hand, a non-mutator or accessor returns a value to the caller. SQL Server restricts accessor calls to SELECT and PRINT statements. It’s perfectly legal to mark an accessor as mutator, but in so doing, you render it useless. SQL Server cannot use the return value because the attribute precludes SELECT and PRINT. You can use UPDATE and SET, but they’re worthless because they do nothing to the return value of the function. The following table summarizes the SQL statement's dependency to the IsMutator property:

Figure 49 SQL statement dependency to IsMutator

Fig. 49 - SQL statement dependency to IsMutator

We add an accessor to return the vector which points to the opposite direction. You may omit the adornment because the default is already false.

VB
' False is also the default
<SqlMethod(IsMutator:=False)> _
Public Function GetReversed() As Vector
    Dim reversed As Vector = Me
    reversed.Scale(-1)
    Return reversed
End Function
C#
// False is also the default
[SqlMethod(IsMutator=false)]
public Vector GetReversed()
{
    Vector reversed = this;
    reversed.Scale(-1);
    return reversed;
}
Fig. 50 - GetReversed accessor function

After flagging our Scale function as a mutator, the last SQL statements using it run just fine:

Figure 51 Scale function at work

Fig. 51 - Scale function at work

And so does the new accessor:

Figure 52 GetReversed function at work

Fig. 52 - GetReversed function at work

We add the other vector operations of our model as static functions. This gives us the ability to use them in both displaying and updating our UDT. If we opted for an instance implementation instead, it would require us to create an equivalent static function if we only want to display the result. And because SQL-CLR does not support function overloading, we would end up with a function with a different name - not elegant at all! We added a helper class that converts all directions less than 0 or greater than 2PI, because it’s possible that intermediate operators can yield values beyond that range.

VB
Public Shared Function Add(ByVal lhs As Vector, ByVal rhs As Vector) As Vector
    Dim xOfMagnitude As Double = (Math.Cos(CDbl(lhs.Direction)) * lhs.Magnitude) _
                               + (Math.Cos(CDbl(rhs.Direction)) * rhs.Magnitude)
    Dim yOfMagnitude As Double = (Math.Sin(CDbl(lhs.Direction)) * lhs.Magnitude) _
                               + (Math.Sin(CDbl(rhs.Direction)) * rhs.Magnitude)
    Dim resultant As New Vector()
    resultant.Magnitude = Math.Sqrt((xOfMagnitude * xOfMagnitude) _
                                  + (yOfMagnitude * yOfMagnitude))
    resultant.Direction = ToFirstRevPositive( _
                          CSng(Math.Atan2(yOfMagnitude, xOfMagnitude)))
    Return resultant
End Function

Public Shared Function Subtract(ByVal lhs As Vector, ByVal rhs As Vector) As Vector
    rhs.Scale(-1)
    Return Add(lhs, rhs)
End Function

Private Shared Function ToFirstRevPositive(ByVal angleInRadian As Single) As Single
    Dim negative As Boolean = angleInRadian < 0.0F
    Dim oneRev As Single = CSng((2 * Math.PI))

    If negative Then angleInRadian *= (-1.0F)

    Dim angleFirstRev As Single = angleInRadian Mod oneRev
    Dim angleFirstRevPositive As Single = If(negative, _
        oneRev - angleFirstRev, angleFirstRev)
    Return angleFirstRevPositive
End Function
C#
public static Vector Add(Vector lhs, Vector rhs)
{
    double xOfMagnitude = (Math.Cos((double)lhs.Direction) * lhs.Magnitude)
                        + (Math.Cos((double)rhs.Direction) * rhs.Magnitude);
    double yOfMagnitude = (Math.Sin((double)lhs.Direction) * lhs.Magnitude)
                        + (Math.Sin((double)rhs.Direction) * rhs.Magnitude);
    Vector resultant = new Vector();
    resultant.Magnitude = Math.Sqrt((xOfMagnitude * xOfMagnitude)
                                   + (yOfMagnitude * yOfMagnitude));
    resultant.Direction = ToFirstRevPositive(
        (float)Math.Atan2(yOfMagnitude, xOfMagnitude));
    return resultant;
}

public static Vector Subtract(Vector lhs, Vector rhs)
{
    rhs.Scale(-1.0D);
    return Add(lhs, rhs);
}

static float ToFirstRevPositive(float angleInRadian)
{
    bool negative = angleInRadian < 0f;
    float oneRev = (float)(2.0 * Math.PI);

    if (negative)
        angleInRadian *= (-1f);

    float angleFirstRev = angleInRadian % oneRev;
    float angleFirstRevPositive = negative ? 
          oneRev - angleFirstRev : angleFirstRev;
    return angleFirstRevPositive;
}
Fig. 53 - Common vector operations with a helper function

We test these operators on two vectors of equal magnitudes: one is 45 degrees while the other is 135. The scope qualifier for the static function is :: which is first shown in Figure 34. Note again that our vector UDT accepts only angles expressed in radians, and that the precision of our answers is dependent on the precision of our inputs.

Figure 54 Common vector operations at work

Fig. 54 - Common vector operations at work

Function overloading and operator overloading are still not part of SQL Server 2008 but they would be a nice addition someday. Operator overloading is a must for mathematical entities like vectors where the overridable CLR operators already have fixed and well-defined semantics. With the current shortcomings of UDT, it’s still impossible to write a 100% SQL-agnostic .NET struct or class. More often than not, UDTs are not mere ports from an existing library. It’s primarily created with SQL Server in mind.

Before we end this section, let me just also point out that SQL Server does not support out and ref parameters. Your codes compile but calling the function in SQL returns this vague message:

Figure 55 Error message for out and ref parameter

Fig. 55 - Error message for out and ref parameters

Initialization Revisited

Our assignment statements for the last test are not efficient. We initialize the direction with 0, only to change it with another value later on. We did this because we wanted to use the PI() variable, but since the default UDT initialization accepts only string literals, we are forced to assign a dummy value for the direction. We can rectify this by another factory function, this time with the properly typed parameters. It’s just natural to initialize the fields using their own types rather than strings, which is very prone to error. From now on, we’ll be using this function a lot more often:

VB
Public Shared Function CreateVector(ByVal magnitude As Double _
                                  , ByVal direction As Single) As Vector
    Dim v As New Vector()
    v.Magnitude = magnitude
    v.Direction = direction
    Return v
End Function
C#
static public Vector CreateVector(double magnitude, float direction)
{
    Vector v = new Vector();
    v.Magnitude = magnitude;
    v.Direction = direction;
    return v;
}
Fig. 56 - A factory much better than Parse

This new factory function even makes the previous SQL statements a tad shorter:

Figure 57 Using a factory to initialize vector

Fig. 57 - Using a factory to initialize vector

Accessing SQL-Null UDT

At this point, we’ve seen all the members of the UDT in action except IsNull. We already know that SQL Server uses this internally to determine if the instance is a SQL-null after deserialization. If this is the case, then using it should be straightforward in SQL, right? Wrong. Take a look at these statements:

Figure 58 Unexpected default behavior of IsNull

Fig. 58 - Unexpected default behavior of IsNull

Examining the properties of our UDT explains why the delete did not happen as expected.

Figure 59 Calling properties of SQL-null UDT

Fig. 59 - Calling properties of SQL-null UDT

This quirk is due to the SQL Server default behavior of bypassing calls to members of a SQL-null UDT. It is a performance-enhancing technique. SQL Server can save processing power since it does not have to deserialize the fields of the UDT anymore. It peeks into the memory just long enough to find out whether or not the UDT is tagged as SQL-null.

In a SQL-null UDT, it’s perfectly logical to get null from the other members except for IsNull. IsNull returning a SQL-null is misleading. An uninitiated SQL developer could use this property as a predicate in a WHERE clause, resulting in a subtle bug. To avoid this, you can tell SQL Server to call a member of a SQL-null UDT by setting the property InvokeIfReceiverIsNull of SqlMethodAttribute to true. This tells SQL Server that if the UDT is SQL-null, it still has to deserialize it so that a call to the member is possible.

VB
' No adornment here
Public ReadOnly Property IsNull() As Boolean Implements INullable.IsNull
    ' Adornment here
    <SqlMethod(InvokeIfReceiverIsNull:=True)> _
    Get
        Return _isNull
    End Get
End Property
C#
// No adornment here
public bool IsNull
{
    // Adornment here
    [SqlMethod(InvokeIfReceiverIsNull=true)]
    get { return _isNull; }
}
Fig. 60 - InvokeIfReceiverIsNull property

This time you get what you wanted with your DELETE:

Figure 61 Expected behavior of IsNull

Fig. 61 - Expected behavior of IsNull

Be aware that this incurs unnecessary processing. Microsoft recommends you stick to the existing SQL way of testing nullability using the IS NULL predicate operator. This also gives you the correct result minus the deserialization of UDT fields, thus offering a significant performance boost.

I conducted an informal test on the performance between IsNull and the IS NULL construct on a 10M-row table with 33% SQL-nulls. The script and a sample output are shown below:

Figure 62 IsNull vs IS NULL script and output

Fig. 62 - IsNull vs. IS NULL script and output

There were so many processes in my machine that could affect the execution of the script, so I made 10 iterations in order to arrive to a much conclusive result. Of course, the more iterations, the more conclusive the result would be, but I didn’t have the time and resources to set a clean testing environment. Now you know why I called them “informal” tests.

The ten iterations yielded consistent values of IS NULL being 6 times faster than its UDT counterpart, as shown in the following graphs:

Figure 63 IS NULL vs IsNull results

Fig. 63 - IS NULL vs. IsNull results

The adverse effect of IsNull is probably negligible on a small number of rows, but it’s always safe to shy away from it on set-based operations because you never know when and how fast your table grows. For non-set-based operations like a condition in your IF statement, you’re pretty much safe with it.

SQL-Null Arguments

While SQL-null is still fresh in our minds, let’s tackle another behavior that has something to do with it. This time it’s not the UDT that is SQL-null, but the argument passed to its function. Like the case of IsNull, SQL Server treatment of SQL-null arguments takes both performance and semantics into consideration. This behavior is controlled by the OnNullCall property of the SqlMethodAttribute. Of course, the purpose of this property is fully realized only if the parameters are SQL-nullable.

Leaving OnCallNull to its default value of true tells SQL Server to call the function even if at least one of the arguments is null. This may seem absurd because it implies that performance is not the primary motivation for this property. We’ll find out the rationale behind this before we end this section.

Let’s start our exploration of this new property by adding a mutator function AddVector with OnNullCall set to false:

VB
<SqlMethod(IsMutator:=True _
                 , OnNullCall:=False)> _
Public Sub AddVector(ByVal v As Vector)
    Dim resultant As Vector = Me
    resultant = Vector.Add(Me, v)
    Magnitude = resultant.Magnitude
    Direction = resultant.Direction
End Sub
C#
[SqlMethod(IsMutator=true
    ,OnNullCall=false)]
public void AddVector(Vector v)
{
    this = Vector.Add(this, v);
}
Fig. 64 - OnNullCall adornment

If we pass a null argument to this function, we get a vague error message:

Figure 65 Error from a mutator supplied with SQL-null argument

Fig. 65 - Error from a mutator supplied with a SQL-null argument

We know that, generally, SQL operations involving a SQL-null return SQL-null. SQL Server assumes that by tagging your function as a mutator, you’re going to use the SQL-null argument to an operation involving your fields. In theory, this in turn can render your UDT SQL-null or your fields in an inconsistent state. This is what the previous message is all about.

SQL Server prevents the unintentional rendering of your UDT to SQL-null by restricting OnNullCall only to true in the case of mutators. Because the values in your fields are vital, SQL Server wants you to take the appropriate actions instead of just rendering your UDT SQL-null right away. You have the options of exiting the function or throwing an exception. If ever you want to render your UDT SQL-null, you may do so by setting the nullability flag field to true. SQL Server imposes a rule that setting a data to null should be done through explicit null assignment, as shown in Figure 5, and not as a mere effect of an operation to which the UDT is an operand. We fix our codes by setting OnNullCall to true, and exit the function if the argument is SQL-null:

VB
<SqlMethod(IsMutator:=True, OnNullCall:=True)> _
Public Sub AddVector(ByVal v As Vector)
    If v.IsNull Then
        Return
    End If

    Dim resultant As Vector = Me
    resultant = Vector.Add(Me, v)
    Magnitude = resultant.Magnitude
    Direction = resultant.Direction
End Sub
C#
[SqlMethod(IsMutator = true, OnNullCall = true)]
public void AddVector(Vector v)
{
    if (v.IsNull)
        return;

    this = Vector.Add(this, v);
}
Fig. 66 - Handling SQL-null arguments

The previous statements should succeed by now:

Figure 67 Accessor returning an instance when SQL-null argument is supplied

Fig. 67 - Accessor returning an instance when a SQL-null argument is supplied

The values of the fields might be valid for a particular operation even if SQL-nullability is ignored. This is the case of our vector UDT where the SQL-null values of zero magnitude is a special kind of vector – the zero vector. The zero vector is also an additive identity which means that adding that to a non-zero vector simply yields the same non-zero vector. The addition did happen, but we are given the impression otherwise, because the result is the same as the non-zero vector operand. You can confirm this by running the same statement, but this time on a function that does not return if the argument is SQL-null. We can go even further by changing the values of the fields in the Null property’s get function.

VB
<SqlMethod(IsMutator:=True _
            , OnNullCall:=True)> _
Public Sub AddVector(ByVal v As Vector)
    ' We don't care if it's SQL-Null

    Dim resultant As Vector = Me
    resultant = Vector.Add(Me, v)
    Magnitude = resultant.Magnitude
    Direction = resultant.Direction
End Sub

Public Shared ReadOnly Property Null() As Vector
    Get
        Dim v As New Vector()
        v._isNull = True
        v.Magnitude = 15  ' We change the default intentionally
        v.Direction = CSng(Math.PI)
        Return v
     End Get
End Property
C#
[SqlMethod(IsMutator=true
         , OnNullCall=true)]
public void AddVector(Vector v)
{
    // We don't care if it's SQL-Null

    this = Vector.Add(this, v);
}

static public Vector Null
{
    get
    {
        Vector v = new Vector();
        v._isNull = true;
        v.Magnitude = 15.0;  // We change the default intentionally
        v.Direction = (float)Math.PI;
        return v;
    }
}
Fig. 68 - Changing the default values of fields

The previous statements this time will yield a different result, which clearly shows that the addition does happen:

Figure 69 Default values affecting addition

Fig. 69- Default values affecting addition

During development, you must double check all your mutators to see if all the null arguments are explicitly handled. Ignoring them does not return an error, and this might lead you to believing that SQL Server does the job of returning the current instance for you. The operation is actually performed using the SQL-null arguments, and it succeeded because .NET has no concept of SQL-null; the SQL-null object is still a valid .NET object!

Accessor is a different story. SQL Server allows setting of OnNullCall to false on it because your UDT is in no way affected by the operation. And since almost all operations involving SQL-null yield SQL-null, SQL Server can safely return the SQL-null right away. If ever there are special cases, you are still allowed to set OnNullCall to true and handle the SQL-null argument accordingly.

I conducted another 10-iteration test to find out how much performance benefit you can get from setting OnNullCall to false. For this test, I made two variants of our vector UDT, each having an accessor GetScaled with a SQL-nullable parameter. The implementation of GetScaled for the variant that allows calling of this method is shown below:

VB
<SqlMethod(OnNullCall:=True)> _
Public Function GetScaled(ByVal factor As SqlDouble) As Vector
    If factor.IsNull Then _
        Return Vector.Null

    Dim v As New Vector()
    v = Me
    v.Scale(factor.Value)
    Return v
End Function
C#
[SqlMethod(OnNullCall = true)]
public Vector GetScaled(SqlDouble factor)
{
    if (factor.IsNull)
        return Vector.Null;

    Vector v = new Vector();
    v = this;
    v.Scale(factor.Value);
    return v;
}
Fig. 70 - An accessor version of Scaled which is called on SQL-null arguments

The test simulated insertion to two dummy tables with the SQL-null return values of GetScaled from the two variants. The following shows the script and a sample result:

Figure 71 Script for testing OnNullCall

Fig. 71 - Script for testing OnNullCall

There might be some processes in my machine that caused an unusual spike in one of the iterations, but it is still clear that there is indeed a performance benefit with bypassing the function call. My machine yielded a 21% performance advantage as shown here:

Figure 72 OnNullCall test results

Fig. 72 - OnNullCall test results

The OnNullCall setting precedes other validation checks that SQL Server does to the argument of a function. If the parameter of your accessor is a non-SQL-nullable, SQL Server does not return an error even if you supply a null value to it, because the check is already suppressed by the OnNullCall setting. To illustrate this, let’s create another accessor variant of our Scale, but unlike Figure 70, we retain the non-SQL-nullable parameter which is central for this discussion.

VB
' You may remove this later
Public Function ScaleAccessor(ByVal factor As Double) As Vector
    Dim scaled As Vector = Me
    scaled.Scale(factor)
    Return scaled
End Function
C#
// You may remove this later
public Vector ScaleAccessor(double factor)
{
    Vector scaled = this;
    scaled.Scale(factor);
    return scaled;
}
Fig. 73 - OnNullCall set to true for function with non-SQL-nullable parameter

As I’ve said earlier, setting OnNullCall to true on this function does not make sense because you would never be allowed to supply a SQL-null. SQL-Server does not know how to convert the argument of the function to SQL-null, and you’ll get an error similar to this:

Figure 74 Error from calling an accessor with non-SQL-nullable parameter

Fig. 74 - Error from calling an accessor with a non-SQL-nullable parameter

Don’t let the message fool you. Converting our parameter to out or ref doesn’t work either as shown in Figure 55. To avoid this error, you set OnNullCall to false. SQL Server never bothers to check if the parameter is SQL-nullable. There’s no point since it will not call the function anymore.

It’s clear at this point that OnNullCall is very much dependent on IsMutator. We’ll end this discussion by another table that summarizes this dependency. The table shows why SQL Server differs when it comes to the default OnNullCall setting, having performance as an important consideration notwithstanding. We can surmise that it’s just a natural consequence of SQL Server adherence to fundamental database concepts and its regard to the safety of your data.

Figure 75 OnNullCall dependency to IsMutator

Fig. 75 - OnNullCall dependency to IsMutator

More on Indexing

Figure 26 shows how easy it is to make our UDT index-ready through a simple flag in the UDT attribute. It’s not even our intention then, but just a mere side-effect of telling SQL Server how to compare two UDTs. In production however, you will find yourself indexing functions (property-gets are also functions) much more often than UDT, because generally, they are scalar. But unlike scalar outside a UDT, a scalar function is not index-ready by default, as shown here:

Figure 76 Scalar function is not index-ready

Fig. 76 - Scalar function is not index-ready

The message says that SQL Server requires the function should be deterministic. A function is deterministic if there is only one possible output for a particular input. In other words, no two inputs produce the same output. For example, the equation x+1 is deterministic because it produces a unique value for every value of x. On the other hand, x^2 (x squared) is not deterministic because it can produce the same output from two numbers; e.g. +2 and -2. In mathematics, a deterministic equation is actually called a function. Do not confuse this with how we use the term in the context of programming.

SQL Server cannot determine whether or not a function is deterministic. It’s your job to tell it by using the property IsDeterministic of the SqlMethodAttribute. It’s obvious that our candidate property qualifies since it does nothing but expose our field.

VB
Public Property Magnitude() As Double
    <SqlMethod(IsDeterministic:=True)> _
    Get 
         Return _double
    End Get
    
    ' Rest of the codes here...
C#
public double Magnitude
{
    [SqlMethod(IsDeterministic=true)]
    get 
    {
        return _double;
    }
    // Rest of the codes here...
Fig. 77 - A deterministic property

SQL Server can now grant our request:

Figure 78 UDT column as index key

Fig. 78 - UDT column as index key

To see how our index works, we perform two queries, with the first one leveraging on the clustered index. The execution plan shows that we get a significant performance boost in the first query because SQL Server doesn’t have to perform the sort anymore.

Figure 79 Indexed property at work

Fig. 79 - Indexed property at work

Before we proceed, let’s take a brief look at the statements for creating an index. The statement says that creating an index from a UDT property entails saving the value for that property on a derived column. This is achieved by the keyword PERSISTED, which also implies that the column is intended to be indexed.

You can still enhance the performance of a property- or function- based index by telling SQL Server that there is no floating point operation involved in yielding the return value. You should assure SQL Server that the result is precise. Unfortunately, none of our properties guarantee exact precision. If magnitude were precise, we could have set the IsPrecise property of the SqlMethodAttribute attribute like this:

VB
Public Property Magnitude() As Double
    <SqlMethod(IsDeterministic:=True, IsPrecise:=True)> _
    Get 
         Return _double
    End Get
    
    ' Rest of the codes here...
C#
public double Magnitude
{
    [SqlMethod(IsDeterministic=true, IsPrecise=true)]
    get 
    {
        return _double;
    }
    // Rest of the codes here...
Fig. 80 - A property flagged as precise

Precision Consideration

When dealing with floating point values, one would later find out that the ubiquitous equals “=” operator may be rendered unreliable. This is because the chance of two values to be exactly the same is very slim. The number of significant figures can vary from one input to another, and a certain degree of imprecision is introduced in the intermediate computations. Our UDT is very susceptible to this anomaly because all of the fields used in comparison are floating point types. The next set of statements simulates an equality test anomaly between two vectors conceived to be equal but proven otherwise by the difference in input presentation. The second vector accepts an approximate value for the direction, which leads to a very minute yet significant difference.

Figure 81 Different input presentation causes equality anomaly

Fig. 81 - Different input presentation causes equality anomaly

If we have reasonable number of significant figures, we can tolerate a certain difference between the two operands. We still consider them equal if the difference falls within this range. It is usually very small and insignificant to the model. We can call this range value our margin of tolerance. We augment our UDT with another member to cater for this:

VB
Public Shared Function ApproxEquals(ByVal lhs As Vector, _
       ByVal rhs As Vector, ByVal marginOfTolernace As Double) As Boolean
    Dim diffMagnitude As Double = Math.Abs(lhs.Magnitude - rhs.Magnitude)
    Dim diffDirection As Double = Math.Abs(CDbl((lhs.Direction - rhs.Direction)))
    Return (diffDirection < marginOfTolernace) AndAlso _
           (diffMagnitude < marginOfTolernace)
End Function
C#
public static bool ApproxEquals(Vector lhs, Vector rhs, double marginOfTolerance)
{
    double diffMagnitude = Math.Abs(lhs.Magnitude - rhs.Magnitude);
    double diffDirection = Math.Abs((double)(lhs.Direction - rhs.Direction));
    return (diffDirection < marginOfTolerance) && 
           (diffMagnitude < marginOfTolerance);
}
Fig. 82 - Flexible and lenient equal operator

You can achieve similar results by rounding off the two operands to a certain decimal place. In this case, the parameter is the number of decimal places to which the values will be rounded to. I like the first approach because it centers on the culprit value. We can easily zero in to that value by displaying the difference of the direction between the two vectors.

Figure 83 Delta due to precision lost

Fig. 83 - Difference due to precision lost

It’s a very minute quantity indeed, and yet very capable of wrecking havoc to your query. This is rectified by the new operator we just added:

Figure 84 ApproxEquals at work

Fig. 84 - ApproxEquals at work

It is recommended that you have an approximate equality operator for a byte-ordered UDT. This gives you a flexible workaround for the putative equality anomalies when dealing with floating point values. Some developers choose to implement this in the Equals override to make the UDT a little compact. This approach has two disadvantages. First, it dilutes the true meaning of “equals”. It could be easily interpreted as the absolute equality operator because of its name. Second, it does not give you the flexibility of adjusting the margin of tolerance. It would require code modification just for changing the value of the margin.

At this point, our UDT is complete, and you should be able to take it for the ride. The illustration below shows a rundown of its members. You may verify its functionalities using some of the sites you can get from Googling "vector calculator". Remember that you won't get exactly the same answers because approximation might be involved in those calculators.

Figure 85 Struct UDT complete member list

Fig. 85 - Struct UDT complete member list

Class UDT

Class implementation is a neglected aspect of UDT discussion. Most of the time, developers become dogmatic about the inherent benefits that value types, like struct, have when it comes to performance. We’ll find out if this is a reason enough to shy away from class implementation.

In this section, we also explore the implementation consideration entailed by a reference type such as a class. We’ll tackle them one by one as we create a class version of our struct UDT. To start, you need to copy and paste all the code from the struct UDT into a class named CVector.

Variable Assignment Modification

One quirk worth mentioning before we go further is that in SQL-CLR, assigning an instance to a variable always creates a new instance regardless of if the UDT is implemented as a class or a struct. This is a departure from our usual understanding of the class implementation where assignment simply points the variable to the instance. It is still recommended that any implementation that is specific to the struct should be changed accordingly to maintain the intended behavior of the UDT outside SQL Server. This means changing our implementation of GetReversed such that it creates a new instance rather than point the new variable to the current instance. Our code does not compile yet because we haven’t defined our constructors.

VB
<SqlMethod(IsMutator:=False)> _
Public Function GetReversed() As Vector
    Dim reversed As New Vector(Magnitude, Direction)
    reversed.Scale(-1)
    Return reversed
End Function
C#
[SqlMethod(IsMutator=false)]
public Vector GetReversed()
{
    Vector reversed = new Vector(Magnitude, Direction);
    reversed.Scale(-1.0);
    return reversed;
}
Fig. 86 - GetReversed class implementation

Access Modifier Modification

The next set of modifications has something to do with the way we access our fields. If you look at Figure 9, we directly access our field even though it’s private. A class is strict with access modifiers, and this is no longer permitted. We’ll write constructors to cater for this need.

VB
<SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native _
                     , IsByteOrdered:=True)> _
Public Class CVector
    Implements INullable
    Public Sub New()
        Me.New(0, 0.0F, False)
    End Sub

    Public Sub New(ByVal magnitude As Double, ByVal direction As Single)
        Me.New(magnitude, direction, False)
    End Sub

    Private Sub New(ByVal magnitude As Double, _
            ByVal direction As Single, ByVal isNull As Boolean)
        _direction = direction
        _magnitude = magnitude
        _isNull = isNull
    End Sub
    
    Public Shared Function Parse(ByVal input As SqlString) As CVector
        Dim inputParts As String() = input.Value.Split(","c)
        Dim magnitude As Double = Double.Parse(inputParts(0))
        Dim direction As Single = Single.Parse(inputParts(1))
        Dim v As New CVector(magnitude, direction)
        Return v
    End Function
        
    Public Shared ReadOnly Property Null() As Vector
        Get
            Return New CVector(0, 0.0F, True)
        End Get
    End Property

    ' Rest of the codes here...
C#
[SqlUserDefinedType(Format.Native
                  , IsByteOrdered = true)]    
public class CVector : INullable
{
              
    public CVector() : this(0D, 0F, false) {}

    public CVector(double magnitude, float direction) 
        : this(magnitude, direction, false) {}

    private CVector(double magnitude, float direction, bool isNull)
    {
        _direction = direction;
        _magnitude = magnitude;
        _isNull = isNull;
    }
    
    public static CVector Parse(SqlString input)
    {
        string[] inputParts = input.Value.Split(',');
        double magnitude = double.Parse(inputParts[0]);
        double direction = float.Parse(inputParts[1]);
        CVector v = new CVector(magnitude, direction);
        return v;
    }
    
    public static CVector Null
    {
        get { return new CVector(0, 0.0F, true); }
    }
    
    
    // Rest of the codes here...
Fig. 87 - Class UDT constructors

The private overload of these constructors is used just for setting the SQL-nullability field. It’s private in order to prevent setting of the SQL-nullability outside the SQL where it doesn’t have any meaning. The default overload is added to validate the lines that were originally calls to the constructor of the struct. The public and private operators are used by the Parse and Null functions respectively.

Memory Matters

Our class UDT compiles without a glitch, but if you catalogue it, you would receive the following error:

Figure 88 Class UDT native serialization error

Fig. 88 - Class UDT native serialization error

It’s obvious that this one is another case of a missing attribute. Don’t bother using the good ol’ Intellisense to rummage for the property in the existing attributes, because this one is arcane. If you’ve written considerable interop codes, then you must have encountered the StructLayoutAttribute in the System.Runtime.InteropServices namespace, which is what we need. It is relevant only to a class or struct type that is passed to unmanaged code. This is exactly what happens when SQL Server takes over the binary serialization of our class during persistence. The attribute has a Value property that is of type LayoutKind. This tells the unmanaged code how the members of the class are laid out in the memory. The LayoutKind.Sequential means the fields of our class UDT should be arranged in the memory in the same order we declare them. A struct type has an implicit LayoutKind.Sequential adornment, which is a good reason to prefer it over a class. Heeding the SQL Server request, our code becomes:

VB
<SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native _
                  , IsByteOrdered:=True)> _
<StructLayout(LayoutKind.Sequential)> _
Public Class CVector
    Implements INullable
    ' SQL Server compares magnitude first,
    ' followed by direction, then IsNull
    Private _magnitude As Double
    Private _direction As Single
    Private _isNull As Boolean

    ' Rest of the codes here...
C#
[SqlUserDefinedType(Microsoft.SqlServer.Server.Format.Native
                  , IsByteOrdered = true)]
[StructLayout(LayoutKind.Sequential)]
public class CVector : INullable
{
    // SQL Server compares magnitude first,
    // followed by direction, then IsNull
    double _magnitude;
    float _direction;
    bool _isNull;
    
    // Rest of the codes here...
Fig. 89 - StructLayout attribute

Cataloguing should succeed by now, and your class UDT is ready for action. There’s no other better way of doing this than a head to head match up against its struct counterpart, so I conducted another test. This time, I also included the user-defined-formatted class UDT. The test script simulates insertion and deletion of three tables, one for each serialization format of the UDT. The two transactions are contiguous, but I decided to show the results separately for clarity. Shown below is a fragment of the script intended for the table that uses the user-defined-formatted class UDT.

Figure 90 UDT insertion sample test script

Fig. 90 - UDT insertion sample test script

The results for the insertion test were pretty much what we’ve expected. The difference between the struct and the struct-wannabe class was very insignificant, with the struct getting a meager 1.1% advantage. On the other hand, the manual binary serialization of the user-defined-formatted class led to its 32% lag from the struct-wannabe.

Figure 91 Insertion test result

Fig. 91 - Insertion test result

If the insertion was pretty much expected, the deletion was a total surprise. The graph below doesn’t show any trend at all. I was fully aware that some other processes in my machine somehow affected the results, but what I couldn’t figure is how come the adjoining insertions seemed to be impervious. Another big surprise was the user-defined-formatted besting the native struct and class with advantage margins of 4.5% and 8.9%, respectively; not big, but still a point to ponder for an inquisitive mind. I’m not a SQL Server or CLR internal guy, so I can’t offer you a clear explanation for this.

Figure 92 Deletion test results

Fig. 92 - Deletion test results

Based on the results, it looks like serialization is the performance clincher, and not the implementation type as probably most of us have assumed. We just saw how a class UDT achieved almost the same performance as that of a native struct by setting its serialization to native. Likewise, a user-defined-formatted struct performance should be no different from a user-defined-formatted class. But, if there’s one thing that should deter you from using a class, it has to be the fact that it involves more code than its struct counterpart.

Other UDT Characteristics

Size

If your UDT is user-defined, SQL Server has no idea how much space is needed to store it. You must specify the maximum possible byte size in the MaxByteSize property of the SqlUserDefinedTypeAttribute. The maximum amount for SQL Server 2005 is 8000, but this restriction is removed in SQL Server 2008. I'll elaborate on this momentarily.

We illustrate the usage of MaxByteSize by creating a user-defined-formatted version of our struct UDT. We add a property Unit of type string which accepts values “rad” for radian and “deg” for degree. You should also add the implementation of the IBinarySerialize similar to Figure 11, and the necessary modifications to the other members. The code fragments below show the adornment for the new UDT.

VB
<SqlUserDefined(Format.UserDefined, IsByteOrdered:=True _
              , ValidationMethodName:="ValidateInput", MaxByteSize:=17)> _
Public Structure Vector
    Implements INullable
    Implements IBinarySerialize
    
    Private _unit As String
    
    ' Rest of the codes here...
C#
[SqlUserDefinedType(Format.UserDefined, IsByteOrdered=true
                  , ValidationMethodName="ValidateInput", MaxByteSize=17)]
public struct Vector : INullable, IBinarySerialize
{
    string _unit;

    // Rest of the codes here...
Fig. 93 - MaxByteSize property

The maximum byte size of 17 was arrived by adding all the bytes needed for our fields. We already have 13 bytes, and the additional 4 is the maximum number of characters of the input: 3 bytes for the string plus the 1-byte overhead of string serialization. The size should be increased if you have plans for supporting non-US ASCII characters.

You may add validation routines to the property if you want this to become part of your UDT. Our current implementation simply accepts any literal of not more than 3 characters. Beyond this number, SQL Server would run out of memory for the serialization, as shown below:

Figure 94 Error from serialization buffer overflow

Fig. 94 - Error from serialization buffer overflow

Length

Another SqlUserDefinedAttribute property pertinent to user-defined-formatted UDT is IsFixedLength. This flag is a misnomer in my opinion, because it specifies whether or not you want SQL Server to allocate the maximum byte size for all the instances of your UDT. This is optional and defaults to true. If set otherwise, SQL server allocates only the exact bytes needed for the values of the fields for the instance. The decision involved here is usually influenced by the developer’s regard to performance and disk space. A UDT with fixed length is usually faster in queries, while one with variable length is space-efficient. Similar pros and cons apply to UDT as that for scalars, e.g., char vs. varchar, binary vs. varbinary, and others.

Value Presentation

SQL-CLR allows you to control the characteristics of return values of your functions and properties thorough the SqlFacetAttribute. This attribute has properties corresponding to the different characteristic inherent to non-blittable types and the types found in the System.Data.SqlTypes namespace. You can find the list of supported data types and the properties they support in the Microsoft documentation. Our UDT does not contain any of the supported data types, but we can still illustrate this by changing the magnitude type to decimal. Be warned that this has an extensive effect in your existing codes. The snippet below shows an adornment of the attribute in a property.

VB
<SqlUserDefinedType(Format.UserDefined, IsFixedLength:=False _
                  , MaxByteSize:=8000)> _
Public Structure Vector
    Implements INullable
    Implements IBinarySerialize
    
    Private _magnitude As Decimal   
    <SqlFacet(Precision:=15, Scale:=8)> _
    Public Property Magnitude() As Decimal   
        Get
            Return _magnitude
        End Get
        ' Rest of the codes here...
C#
[SqlUserDefinedType(Format.UserDefined, IsFixedLength=false
                   ,MaxByteSize=8000)]
public struct Vector : INullable, IBinarySerialize
{
    decimal _magnitude
    [SqlFacet(Precision=15, Scale=8)]
    public decimal Magnitude
    {
        get
        {
            return _magnitude;
        }
        // Rest of the codes here...
Fig. 95 - Precision and Scale property of SqlMethodAttribute

In the code above, Precision is the total number of digits, while Scale is the number of digits to the right of the decimal point. We use the values in Figure 54 to illustrate the effects of the adornment. Note in the statements below, we call the property instead of ToString because SqlFacetAttribute has no effect inside the UDT. If you compare the results below with Figure 54, you can see that SQL Server indeed rounds off the magnitude as specified.

Figure 96 Precision and Scale at work

Fig. 96 - Precision and Scale at work

The placement of SqlFacetAttribute is misleading as far as a property is concerned. It suggests that the attribute encompasses both input and output. This can lead you to thinking that it has validation functionalities, when in fact, it only deals with presentation specification. We can confirm this by entering values that do not conform to the specifications of our adornment and see that it will still succeed:

Figure 97 SqlFacetAttribute has no validation specification

Fig. 97 - SqlFacetAttribute has no validation specification

For me, the SqlFacetAttribute has limited role in UDT. I even avoid using the properties we just illustrated, because they can introduce precision loss if those adorned UDT properties are used in intermediate computations.

The remaining properties of SqlFacetAttribute are information that SQL Server can use when defining a database object based on the member being adorned. This is very rare in UDT, but common among other CLR artifacts like managed procedures, functions, and user defined aggregates. We will not explore them anymore.

SQL Server 2008 Change

With the rare use of UDT and the shipping of Microsoft's own UDT's in the form of spatial data types, it's no surprise that UDT received only one enhancement in SQL Server 2008. As I mentioned a while ago, UDT is no longer limited to 8000 bytes. This might just be an afterthought during the development of the spatial data types which are nothing more than UDT's themselves but it's still better than nothing. When I first apply this feature though, I was left scratching my head. The documentation is misleading. If you want a UDT bigger than the 8000 bytes, you are only allowed the value -1 in SqlUserDefinedAttribute.MaxSize property. Specifying anything greater than 8000 results to cataloguing error.

On Exceptions

You’ve probably noticed that we haven’t delved deep into exceptions. Worse, we never even had any exception handling at all! I pointed out that Parse would be teeming with validation, and of course, exception handlers, but I completely left them out for brevity. Besides, a UDT is already in the core of an application, and the best thing you can do in an exception handler is re-throw the exception, accompanied by a friendly message. There might be situations specific to your domain where you should handle an exception in a different manner. As for the other unhandled exceptions, don’t fret; someone, somewhere has to catch them for you. The bottom line is that there are no other considerations when it comes to UDT exception handling. The best practices in .NET programming also apply here.

Digression with Enumeration

During the early days of my UDT tinkering, I thought that since enumeration is nothing more than a named numeric type, I would have no problem using it in a UDT. The tests I made then proved nothing could be farther from the truth. In a native format UDT, SQL Server insists that you should adorn the enumeration with LayoutKind.Sequential just like our native class UDT. This is not possible because StructLayoutAttribute cannot be applied to an enumeration. While in the user-defined format, your code will compile, but SQL Server cannot recognize the type. You will get a message similar to this:

Figure 98 Vague message from using enumeration

Fig. 98 - Vague message from using an enumeration

The idea of having enumeration support for UDT was once attractive to me. Enumeration helps minimize errors through its type-safety feature while enhancing the readability of your SQL. I realized later that this is too much to ask for, considering that it’s actually tantamount to implementing an entirely new SQL-CLR object. If implemented, it could have rendered the lookup (name-value) table obsolete, and caused another barrage of debates; and we’ve had enough of that.

Summary

The User Defined Type (UDT) is probably the most significant among the SQL-CLR artifacts. No other artifact has effects on SQL Server as extensive and profound as UDT does. UDT became instrumental to the evolution of SQL Server beyond its relational foundation through the introduction of concepts like “object” and properties. Along with these are new T-SQL features required to interact with a UDT.

Unfortunately, UDT is also the most complex among the SQL-CLR objects. Its implementation is on one language, but its structure, characteristics, and behaviors are defined by another. This dual nature necessitates a slight departure from our conventional understanding of some common concepts in programming; something that can be attributed to the UDT's obscure nature. Adding to the complexity is the ample dosage of declarative programming involved to achieve the intended behaviors of the UDT.

UDT can be implemented using a class or a structure .NET construct. The common conception that value types like structures have performance advantage over reference types when it comes to high volume transactions is inapplicable in UDT. The performance determinant is not the type, but the serialization format used in the UDT. Native serialization is faster than user defined. If there is a good reason to shy away from a class, it's the fact that it's more verbose than a structure implementation.

Perhaps, the greatest benefit of UDT has yet to be realized. By allowing us to store data of virtually any structure, it can pave the way for an entirely new class of applications. Microsoft seems to be heading that way, and it’s just a matter of time before UDT would finally gain prominence.

References

History

  • 23rd August, 2008 - Initial draft
  • Included SQL Server 2008 changes

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)