Introduction
The three chief virtues of a programmer are: Laziness, Impatience and Hubris.
Larry Wall
I've heard it said that laziness is a virtue of an effective programmer. Now while the literal interpretation of this is dubious, there is a measure of truth in it. I have found that laziness can sometimes be a driving force towards innovation. Well, not laziness exactly, but an inclination for finding the solution of least effort. As Lee (2002) points out, it is a "natural disposition that results in being economic".
Case in point: a little while ago I needed to create an email-driven registration confirmation subsystem, and I didn't want to go through the trouble of creating and managing a table of users with pending registrations. I had this idea: what if one could encode all the information required to complete the registration into the actual confirmation link? The downside is that you end up with a long and not-so-pretty URL; the upside is that you gain a new level of ease and flexibility. The technique is useful not only for confirming account registrations, but also for passing data from emails and between pages.
This project consists of a URL Object Serialization component that provides serialization, compression, and encryption of CLR objects so that they can be embedded within URLs, a user-account purging component that performs the periodic removal of unconfirmed user accounts, and a demonstration website that shows the use of the components in an ASP.NET user-account confirmation system.
Serializing CLR Objects to Query Strings
The URL specification, RFC 1738 (Berners-Lee, Masinter & McCahill, 1994), restricts URLs to a subset of US-ASCII characters. Any character outside that subset must be encoded. Percent-encoding does this by writing each byte of the character as a percentage symbol followed by two hexadecimal digits (Wikipedia, 2008). We won't use percent-encoding, though; instead we will use Base64 encoding. The standard Base64 alphabet consists of the characters A-Z, a-z, and 0-9, plus the two symbols + and / (Wikipedia, 2008). Because + and / have special meanings in URLs, HttpServerUtility.UrlTokenEncode uses a URL-safe variant of this format to make data transmissible within a URL.
As an aside, note that IIS also supports non-standard %u encoding, allowing all Unicode characters to be represented, which is more than the UTF-8 escape encoding described in the standard (Ollmann, 2007). We do not, however, make use of this fact. Instead we stick to the Base64 encoding.
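To give a feel for what a URL-safe Base64 token looks like, here is a minimal sketch that approximates the format produced by HttpServerUtility.UrlTokenEncode as I understand it: standard Base64 with + and / swapped for - and _, and the trailing = padding replaced by a single digit that records how many padding characters were removed. The UrlToken class and its members are hypothetical names of my own and are not part of the download; treat the exact format as an assumption rather than a specification.

```csharp
using System;

// Hypothetical helper (not part of the download) that approximates the
// URL-token format: URL-safe Base64 plus a trailing padding-count digit.
public static class UrlToken
{
    public static string Encode(byte[] input)
    {
        if (input == null) throw new ArgumentNullException("input");
        if (input.Length == 0) return string.Empty;

        string base64 = Convert.ToBase64String(input);
        string trimmed = base64.TrimEnd('=');
        int padCount = base64.Length - trimmed.Length; // 0, 1 or 2

        // '+' and '/' are not safe inside a query string, so swap them out.
        return trimmed.Replace('+', '-').Replace('/', '_') + padCount;
    }

    public static byte[] Decode(string token)
    {
        if (token == null) throw new ArgumentNullException("token");
        if (token.Length == 0) return new byte[0];

        // The last character records how many '=' characters were stripped.
        int padCount = token[token.Length - 1] - '0';
        if (padCount < 0 || padCount > 2)
        {
            throw new FormatException("Invalid padding digit.");
        }

        string base64 = token.Substring(0, token.Length - 1)
            .Replace('-', '+')
            .Replace('_', '/')
            + new string('=', padCount);
        return Convert.FromBase64String(base64);
    }
}
```

For example, the two bytes 0xFB 0xFF are "+/8=" in standard Base64, which this sketch turns into the token "-_81".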
URL Length Limitations
When generating URLs, we must be aware that some browsers and Web servers limit URL length. In Internet Explorer, a URL is limited to 2,083 characters for both GET and POST requests; form name/value pairs submitted with POST, however, are not sent in the URL and so do not count towards the limit (http://support.microsoft.com/kb/208427). This point is important when intending to serialize large object graphs, or instances with a lot of member data. Safari, Firefox, and Opera (version 9 and above) appear to have no such limit. Older browsers, such as Netscape 6, support around 2,000 characters.
As far as Web servers go, IIS supports up to 16,384 characters. For those using Mono and Apache, however, Apache supports up to 4,000 characters (Boutell, 2006).
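So, the short story is: if you wish to maintain compatibility with most browsers, you should ensure that all URLs remain under 2,000 characters. Because Base64 represents every 3 bytes as 4 characters, this leaves at most roughly 1,500 bytes (about 1.5 KB) for the compressed and encrypted payload, less whatever the rest of the URL consumes. That is not a lot, but it is enough for a small object graph.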
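The payload budget can be estimated with a little arithmetic. The sketch below assumes standard padded Base64, which emits four characters for every three bytes, and subtracts the fixed part of the URL first. UrlBudget and MaxPayloadBytes are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper: estimates how many raw bytes fit in a URL once
// they have been Base64 encoded.
public static class UrlBudget
{
    public static int MaxPayloadBytes(int maxUrlLength, string urlPrefix)
    {
        if (urlPrefix == null) throw new ArgumentNullException("urlPrefix");

        // Characters left after the scheme, host, path and "?Data=" part.
        int available = maxUrlLength - urlPrefix.Length;
        if (available <= 0) return 0;

        // Padded Base64 uses 4 characters per group of up to 3 bytes,
        // so only whole groups of 4 characters are counted.
        return available / 4 * 3;
    }
}
```

With a 2,000 character limit and an empty prefix this gives 1,500 bytes; a realistic prefix of a few dozen characters reduces the figure accordingly.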
Serializing a CLR Object for URL Embedding
The process of serializing an object to a URL comprises four stages. First, we serialize the object using a BinaryFormatter. We then compress the resulting byte array using a GZipStream, and encrypt the compressed bytes. Finally, we use the HttpServerUtility to URL encode the bytes. This results in a Base64 encoded string that can be used as a query string parameter value.
The following diagram illustrates the URL object serialization and deserialization processes in more detail. We see that the compression and encryption strategies are used to place the serialized object into a format that is readily transmissible.
Figure: URL Object Encode/Decode sequence.
The UrlEncoder serializes the object using a BinaryFormatter. The resulting byte array is then compressed using the ICompressionStrategy.
The following class diagram shows the composition of the UrlEncoder. We can see that it is composed of an ICompressionStrategy and an IEncryptionStrategy, both of which may be replaced at runtime with alternate custom strategies.
Figure: UrlEncoder class diagram.
The default encryption strategy uses symmetric encryption to encrypt the byte array via a RijndaelManaged instance. The default CompressionStrategy class performs compression as shown in the following excerpt:
public byte[] Compress(Stream stream)
{
    using (MemoryStream resultStream = new MemoryStream())
    {
        // leaveOpen = true keeps resultStream usable after the GZipStream is disposed.
        using (GZipStream writeStream = new GZipStream(
            resultStream, CompressionMode.Compress, true))
        {
            CopyBuffered(stream, writeStream);
        }
        // Disposing the GZipStream above flushes the remaining compressed data.
        return resultStream.ToArray();
    }
}

static void CopyBuffered(Stream readStream, Stream writeStream)
{
    byte[] bytes = new byte[bufferSize];
    int byteCount;
    // Read returns 0 once the end of the stream has been reached.
    while ((byteCount = readStream.Read(bytes, 0, bytes.Length)) != 0)
    {
        writeStream.Write(bytes, 0, byteCount);
    }
}
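The excerpt above covers only compression. A matching Decompress method might look something like the following sketch, which is my own reconstruction and not taken from the download. It relies on Stream.CopyTo, which requires .NET 4 or later; on earlier versions a buffered copy loop like CopyBuffered would be used instead. The GZipSketch class name and the byte-array Compress helper are hypothetical and exist only so the sketch can be run on its own.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical, self-contained counterpart to the Compress excerpt.
public static class GZipSketch
{
    // Inflates GZip data read from the given stream into a byte array.
    public static byte[] Decompress(Stream stream)
    {
        using (MemoryStream resultStream = new MemoryStream())
        {
            using (GZipStream readStream = new GZipStream(
                stream, CompressionMode.Decompress, true))
            {
                readStream.CopyTo(resultStream);
            }
            return resultStream.ToArray();
        }
    }

    // Helper for exercising Decompress; mirrors the article's Compress method.
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream resultStream = new MemoryStream())
        {
            using (GZipStream writeStream = new GZipStream(
                resultStream, CompressionMode.Compress, true))
            {
                writeStream.Write(data, 0, data.Length);
            }
            return resultStream.ToArray();
        }
    }
}
```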
The HttpServerUtility is used to encode the resulting bytes into a Base64 string that we can then insert into a URL. The order of the sequence, compress first and then encrypt, was chosen because unencrypted data is more amenable to compression. Compression relies on patterns within the data, and encryption removes them: encrypted output is close to random in appearance, so compressing it afterwards gains little or nothing.
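The effect of the ordering is easy to observe. The following sketch compares the output size of the two orderings, using GZipStream and an AES instance with a throwaway random key. The OrderingDemo class is hypothetical and is not part of the download; AES here stands in for the article's RijndaelManaged-based strategy.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

// Hypothetical demo comparing compress-then-encrypt with encrypt-then-compress.
public static class OrderingDemo
{
    static byte[] GZip(byte[] data)
    {
        using (MemoryStream result = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(result, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            }
            return result.ToArray();
        }
    }

    // Encrypts with a random key and IV; we only care about the output size.
    static byte[] Encrypt(byte[] data)
    {
        using (Aes aes = Aes.Create())
        using (ICryptoTransform encryptor = aes.CreateEncryptor())
        {
            return encryptor.TransformFinalBlock(data, 0, data.Length);
        }
    }

    public static int CompressThenEncryptLength(byte[] data)
    {
        return Encrypt(GZip(data)).Length;
    }

    public static int EncryptThenCompressLength(byte[] data)
    {
        return GZip(Encrypt(data)).Length;
    }
}
```

For highly repetitive input, such as a few kilobytes of identical bytes, CompressThenEncryptLength should return a small fraction of the input size, while EncryptThenCompressLength should return slightly more than the input size.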
The following shows the encoding process within the UrlEncoder:
public string Encode(object data)
{
    if (data == null)
    {
        throw new ArgumentNullException("data");
    }

    // 1. Serialize the object graph to bytes.
    BinaryFormatter formatter = new BinaryFormatter();
    byte[] dataBytes;
    using (MemoryStream stream = new MemoryStream())
    {
        formatter.Serialize(stream, data);
        dataBytes = stream.ToArray();
    }

    // 2. Compress before encrypting, while patterns are still present.
    byte[] compressedBytes;
    using (MemoryStream stream = new MemoryStream(dataBytes))
    {
        compressedBytes = compressionStrategy.Compress(stream);
    }

    // 3. Encrypt, then 4. encode as a URL-safe token.
    byte[] encryptedBytes = encryptionStrategy.Encrypt(
        compressedBytes, encryptionPassPhrase);
    return HttpServerUtility.UrlTokenEncode(encryptedBytes);
}
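The encryption strategy itself is not shown in this article. As a rough idea of what a pass-phrase-based Encrypt/Decrypt pair could look like, here is a sketch of my own that derives a key from the pass phrase with PBKDF2 and encrypts with AES, prepending the random salt and IV to the output. It is not the download's RijndaelManaged implementation; PassPhraseCipher, the constants, and the output layout are all assumptions made for illustration. Note also that it provides confidentiality only and does not authenticate the data.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical pass-phrase cipher (not the download's implementation).
// Output layout, an assumption of this sketch: salt | IV | ciphertext.
public static class PassPhraseCipher
{
    const int SaltSize = 16;
    const int IvSize = 16;        // AES block size in bytes
    const int KeySize = 32;       // 256-bit key
    const int Iterations = 100000;

    static byte[] DeriveKey(string passPhrase, byte[] salt)
    {
        using (Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(
            passPhrase, salt, Iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(KeySize);
        }
    }

    public static byte[] Encrypt(byte[] plain, string passPhrase)
    {
        byte[] salt = new byte[SaltSize];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        using (Aes aes = Aes.Create())
        {
            aes.Key = DeriveKey(passPhrase, salt);
            aes.GenerateIV();
            byte[] iv = aes.IV;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
                byte[] output = new byte[SaltSize + IvSize + cipher.Length];
                Buffer.BlockCopy(salt, 0, output, 0, SaltSize);
                Buffer.BlockCopy(iv, 0, output, SaltSize, IvSize);
                Buffer.BlockCopy(cipher, 0, output, SaltSize + IvSize, cipher.Length);
                return output;
            }
        }
    }

    public static byte[] Decrypt(byte[] data, string passPhrase)
    {
        if (data.Length < SaltSize + IvSize)
        {
            throw new CryptographicException("Input is too short.");
        }
        byte[] salt = new byte[SaltSize];
        byte[] iv = new byte[IvSize];
        Buffer.BlockCopy(data, 0, salt, 0, SaltSize);
        Buffer.BlockCopy(data, SaltSize, iv, 0, IvSize);
        using (Aes aes = Aes.Create())
        {
            aes.Key = DeriveKey(passPhrase, salt);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                return decryptor.TransformFinalBlock(
                    data, SaltSize + IvSize, data.Length - SaltSize - IvSize);
            }
        }
    }
}
```

Because the salt and IV are random, encrypting the same bytes twice yields different output, so two confirmation links for the same data will not look alike.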
To recover the serialized object, we simply reverse the process: Decode -> Decrypt -> Uncompress -> Deserialize.
public object Decode(string value)
{
    if (value == null)
    {
        throw new ArgumentNullException("value");
    }
    byte[] decoded = HttpServerUtility.UrlTokenDecode(value);
    byte[] unencrypted = encryptionStrategy.Decrypt(decoded, encryptionPassPhrase);
    byte[] uncompressedBytes;
    using (MemoryStream stream = new MemoryStream(unencrypted))
    {
        uncompressedBytes = compressionStrategy.Decompress(stream);
    }
    BinaryFormatter formatter = new BinaryFormatter();
    object deserialized;
    using (MemoryStream stream = new MemoryStream(uncompressedBytes))
    {
        deserialized = formatter.Deserialize(stream);
    }
    return deserialized;
}
URL Object Serialization: A Practical Example
As mentioned above, I came up with URL object serialization in order to implement a fire-and-forget user-account confirmation system.
The example website included in the download demonstrates the use of the user-account confirmation system.
Figure: User-account confirmation.
A user begins by registering his or her account. Once the user submits the data via the CreateUserWizard, an object containing information about the user's account is encoded and sent in an email to the user. The user then clicks the link in the email (which contains the encoded object) and is directed to the confirmation page. The confirmation page decodes the object and completes the registration by setting the account's IsApproved property to true. The following sequence diagram provides an overview of this process.
Figure: User registration sequence.
When a user completes the first step in creating an account, via the CreateUserWizard, the IsApproved property of the new account is set to false. This differs from the default behavior, in which IsApproved is true once the account is created. The default behavior is altered by using the Visual Studio Properties window for the CreateUserWizard control, as shown in the following image:
Figure: CreateUserWizard designer properties.
The IsApproved property of the new account remains false until the account is confirmed via email. That's where our URL Object Serialization component comes in again.
Once the user navigates back to the CompleteRegistration page, the URL encoded EmailConfirmation instance is deserialized, as shown in the following excerpt:
protected void Page_Load(object sender, EventArgs e)
{
    if (!IsPostBack)
    {
        string cipherText = Request.QueryString["Data"];
        if (cipherText == null)
        {
            ShowConfirmationFailed();
            return;
        }

        // Decode the confirmation object embedded in the link.
        EmailConfirmation confirmation;
        UrlEncoder encoder = new UrlEncoder(Settings.PassPhrase);
        try
        {
            confirmation = (EmailConfirmation)encoder.Decode(cipherText);
        }
        catch (Exception ex)
        {
            Page.Trace.Write("Default", "Unable to deserialize confirmation: "
                + cipherText, ex);
            ShowConfirmationFailed();
            return;
        }

        if (confirmation.UserId == Guid.Empty)
        {
            Page.Trace.Write("User trying to confirm registration failed. "
                + "The guid UserId is empty. providerUserKey: "
                + cipherText);
            ShowConfirmationFailed();
            return;
        }

        MembershipUser user = Membership.GetUser(confirmation.UserId);
        if (user == null)
        {
            Page.Trace.Write("User attempted confirmation of registration "
                + "and MembershipUser was null. UserId: "
                + confirmation.UserId);
            ShowConfirmationFailed();
            return;
        }

        switch (confirmation.ConfirmationType)
        {
            case ConfirmationType.UserRegistration:
                if (user.IsApproved)
                {
                    ShowUserAlreadyConfirmed();
                    return;
                }
                // Approve the account to complete the registration.
                user.IsApproved = true;
                Membership.UpdateUser(user);
                break;
        }

        ShowConfirmationSuccess();

        bool rememberUser = Request.Cookies[FormsAuthentication.FormsCookieName] != null;
        if (rememberUser)
        {
            FormsAuthentication.SetAuthCookie(user.UserName, true);
        }

        // Offer a link back to the page the user originally wanted.
        if (!string.IsNullOrEmpty(confirmation.ContinueUrl))
        {
            Panel_Continue.Visible = true;
            HyperLink_Continue.Text = confirmation.ContinueTitle;
            HyperLink_Continue.NavigateUrl = string.Format("{0}?Data={1}",
                confirmation.ContinueUrl, cipherText);
        }
    }
}
A nice feature of this approach is that we are able to record arbitrary information in the URL encoded object, such as the page that the user was attempting to navigate to, before being rudely interrupted with a registration required demand. It almost makes for a light-weight workflow.
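To make the shape of the encoded object concrete, here is a sketch of what the EmailConfirmation class might look like, inferred purely from the members used in Page_Load above (UserId, ConfirmationType, ContinueUrl, and ContinueTitle). The actual class in the download may differ, and the ConfirmationLinks helper is a hypothetical addition that simply mirrors the "{0}?Data={1}" format used in the excerpt.

```csharp
using System;

// Inferred from the Page_Load excerpt; other values may exist in the download.
public enum ConfirmationType
{
    UserRegistration
}

// Must be marked [Serializable] to be accepted by a BinaryFormatter.
[Serializable]
public class EmailConfirmation
{
    public Guid UserId { get; set; }
    public ConfirmationType ConfirmationType { get; set; }

    // Where to send the user after confirmation, and the link text to show.
    public string ContinueUrl { get; set; }
    public string ContinueTitle { get; set; }
}

// Hypothetical helper that builds a link carrying an encoded token.
public static class ConfirmationLinks
{
    public static string Build(string pageUrl, string token)
    {
        return string.Format("{0}?Data={1}", pageUrl, token);
    }
}
```

Any additional state that the workflow needs can be added as further serializable properties, bearing in mind the URL length budget discussed earlier.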
User Purging Subsystem
What do you do when someone signs up and never completes their registration? Left unchecked, it amounts to a kind of denial-of-service (DoS) attack against new registrants: the username remains unavailable until the account is purged.
To solve this problem, we have a class named UserPurger that periodically removes user accounts that have not been confirmed.
The UserPurger uses the provider model. As part of its configuration, an IUserPurger class is specified. The IUserPurger implementation determines how users are removed from the system.
Figure: UserPurging class diagram.
In the provided example, and with the AspNetMembershipUserPurger, we use ASP.NET Membership to remove unconfirmed users. If you use some other user management system, then simply provide your own implementation of the IUserPurger interface.
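The members of IUserPurger are not listed in this article, so the following is only a guess at the general shape of a custom implementation. The PurgeUnconfirmedUsers method, its signature, and the in-memory user store are all assumptions made for illustration; consult the download for the real interface.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the interface; the real member list may differ.
public interface IUserPurger
{
    // Removes unconfirmed users created before the cut-off and returns the count.
    int PurgeUnconfirmedUsers(DateTime createdBeforeUtc);
}

// Hypothetical user record for a custom user management system.
public class PendingUser
{
    public string UserName { get; set; }
    public DateTime CreatedUtc { get; set; }
    public bool IsApproved { get; set; }
}

// Hypothetical implementation backed by an in-memory list.
public class InMemoryUserPurger : IUserPurger
{
    readonly List<PendingUser> users;

    public InMemoryUserPurger(List<PendingUser> users)
    {
        if (users == null) throw new ArgumentNullException("users");
        this.users = users;
    }

    public int PurgeUnconfirmedUsers(DateTime createdBeforeUtc)
    {
        // Approved accounts are never removed, however old they are.
        return users.RemoveAll(
            u => !u.IsApproved && u.CreatedUtc < createdBeforeUtc);
    }
}
```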
The configuration for the UserPurger is located in the web.config.
<UserPurging
    defaultProvider="AspNetMembershipUserPurger"
    purgeOlderThanMinutes="5.5"
    periodMinutes="5">
    <providers>
        <clear />
        <add name="AspNetMembershipUserPurger"
            type="Orpius.Web.AspNetMembershipUserPurger, Orpius.Web.UserPurging" />
    </providers>
</UserPurging>
The UserPurger class is static and thus retains the same lifetime as the Web application. Why not use an IHttpModule? An IHttpModule may be periodically recycled, or multiple instances pooled, thus giving it a lifetime, on almost all occasions, less than that of the application. The UserPurger relies on a timer to schedule purges, and therefore would not work well if that were the case.
My rule of thumb is: if you need something to hang around then use a static class or an instance stored in the application context, not an IHttpModule.
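A static, timer-driven scheduler along these lines can be sketched as follows. This is not the UserPurger from the download; PurgeScheduler, Start, and Stop are hypothetical names, and the purge work is passed in as a delegate so that the sketch stays independent of ASP.NET Membership.

```csharp
using System;
using System.Threading;

// Hypothetical static scheduler that runs a purge action periodically.
public static class PurgeScheduler
{
    static readonly object sync = new object();

    // Held in a static field so the timer is not garbage collected.
    static Timer timer;

    public static void Start(TimeSpan period, Action purge)
    {
        if (purge == null) throw new ArgumentNullException("purge");
        lock (sync)
        {
            if (timer != null)
            {
                return; // already running
            }
            // The first run happens after one period, then once per period.
            timer = new Timer(state => purge(), null, period, period);
        }
    }

    public static void Stop()
    {
        lock (sync)
        {
            if (timer == null)
            {
                return;
            }
            timer.Dispose();
            timer = null;
        }
    }
}
```

In a Web application, Start would typically be called once from Application_Start. Bear in mind that the callback runs on a thread pool thread, so a purge that takes longer than the period can overlap with the next run, and an unhandled exception inside it can bring down the process.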
Testing
I have included a unit testing project as part of the download. Apart from the user-account confirmation example, the unit test demonstrates the serialization of a larger object instance that has a child instance.
[TestMethod()]
public void SerializeTest()
{
    string parentName = "Parent";
    string childName = "Child";
    UrlEncoder target = new UrlEncoder("Password");
    SerializableTestClass data = new SerializableTestClass()
        { Name = parentName };
    data.Child = new SerializableTestClass() { Name = childName };
    string serialized = target.Encode(data);
    SerializableTestClass deserialized =
        (SerializableTestClass)target.Decode(serialized);
    Assert.AreEqual(parentName, deserialized.Name);
    Assert.AreEqual(childName, deserialized.Child.Name);
}
Conclusion
Serializing objects to URLs is a novel approach to passing data to and from ASP.NET applications, and while there exist URL length constraints in some browsers, such constraints do not prohibit its use in scenarios where object graphs are not overly large and contain only a moderate amount of member data. This approach provides a secure yet flexible way to encapsulate and relay private workflow information to and from clients.
I hope you find this project useful. If so, then I'd appreciate it if you would rate it and/or leave feedback below. This will help me to make my next article better.
References
- Lee, X. 2002, Laziness, Perl, and Larry Wall
Retrieved 26 January 2008 from Xah Lee Web
- Berners-Lee, T. & Masinter, L. & McCahill, M. 1994, RFC Specification
Retrieved 26 January 2008 from The Internet Engineering Task Force
- Wikipedia, 2008, Percent-encoding
Retrieved 26 January 2008 from Wikipedia.org
- Wikipedia, 2008, Base64
Retrieved 26 January 2008 from Wikipedia.org
- Ollmann, G. 2007, URL Embedded Attacks
Retrieved 24 January 2008 from technicalinfo.net
- Boutell, 2006, WWW FAQs: What is the maximum length of a URL?
Retrieved 24 January 2008 from boutell.com
History
January 2008