
Understanding the Random URLs and Creating them in ASP.NET

27 Nov 2014
This article explains the concept of random URLs and how you can generate them in ASP.NET for your application.

Introduction

This article is about random URLs, which let application owners create short URLs that can be saved in a database or passed on to another application as short text instead of a long, boring URL.

Many companies, such as Twitter, Facebook and Tumblr, use this technique to provide short links to their own resources or to any other resource the user would like to add. They allow users to send a link to a document that might be present at:

http://www.example.com/post/category/postId/some-other-text-from-post

.. as just the following link:

http://www.examp.le/pSij284Fj

Now let us understand the concept and how we can implement this feature in our ASP.NET websites. Remember that the generator is pure C# code that uses nothing from the ASP.NET namespaces or framework, which means you can also use it in other .NET applications such as WPF, Win Forms, etc.

Longer URLs - The Problem

A URL is a link to a specific document on the internet that a user can access at any time with any software that can reach the internet. Sometimes these URLs are long enough to irritate users, such as the ones that have all of their information appended to the URL itself. Take the following URL as an example:

http://www.example.com/page/category/sub-category/month-date/some-general-fancy-text

This is already a long piece of text, and many URLs are even longer; some span more than 500 characters. Browsers do not handle very long URLs well, and Internet Explorer, for example, is reported to have trouble with mailto: URLs of more than 500 characters.

It is not only browsers that dislike long URLs. Search engines dislike them too, and it has been noted that Google does not index URLs longer than 1855 characters or so. - Reference.

So it is not just we, the users, who dislike long URLs, which we cannot revisit after closing the tab because we do not remember them and have to dig through the History instead. Software also limits how many characters can be used in a URL.

Why it is not preferred

There is usually more than one reason not to use long URLs. Everyone has their own pet peeve, and so do I. I do not like my URL bar filled with a lot of data; I prefer to see only the website's domain and my current location in the sitemap, that is, the page only.

Shorter URLs are also cheaper to send and to store. On the wire, a URL is sent as ASCII text, so each character takes one byte. In memory, a .NET string stores each char as a 16-bit UTF-16 code unit, so each character takes 2 bytes. An overview of char in .NET can be read here.

For smooth transfer of data between computers and devices, it is better to send less overhead so that the actual message arrives faster. A URL of 2048 characters held in a .NET string takes 4 KB, and roughly 2 KB when sent as ASCII, just to identify the resource. That data travels with every request for the resource and serves no purpose other than addressing it. For a user on a high-speed connection this is no trouble, but for mobile users or users on a slow data plan it matters, because the address has to be sent before anything else can happen.

Mathematics of total data by SoMad: 2048 * 2 Bytes = 4096 Bytes = 32768 bits = 4 KB - Thanks!
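The arithmetic above can be checked with a small sketch. It assumes the URL contains only ASCII characters; the class and method names (UrlSize, InMemoryBytes, OnTheWireBytes) are made up for illustration and are not part of the article's project.

```csharp
using System;
using System.Text;

public static class UrlSize
{
    // Size of the character data when the URL is held in a .NET string:
    // every char is a 16-bit UTF-16 code unit, so 2 bytes per character.
    public static int InMemoryBytes(string url)
    {
        return url.Length * sizeof(char);
    }

    // Size of the same URL when written out as ASCII text,
    // one byte per character (assumes the URL is pure ASCII).
    public static int OnTheWireBytes(string url)
    {
        return Encoding.ASCII.GetByteCount(url);
    }
}
```

For a 2048-character URL, InMemoryBytes gives 4096 and OnTheWireBytes gives 2048.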

What might be done

Usually URLs are not shortened and the long URL is sent as it is, but some services need them to be short. The best-known example is Twitter, the blogging website with a 140-character limit: it stores the long URL in its own system and embeds only a short URL in the Tweet. This way any URL, no matter how long, becomes just a few characters long: 15 characters on Twitter, and maybe fewer on other services such as Goo.gl (the Google URL shortener) and fb.me (Facebook's domain for short URLs).

This enables people to write short URLs such as http://exm.pl/AfzaalBlog, and the user is able to remember that it redirects to my blog. This URL is better, because:

  1. It is short.
  2. It uses less data in transfer.
  3. It is easy to remember.
  4. It is more semantic and tells the user where the link actually leads, more like a sentence of 2 - 4 words.

These are a few reasons why people should use short URLs, and speed is only one of them (short and long URLs seem no different on a fast internet connection, but they make a huge difference on a slow one).

Random URLs

If the statement above, "short URLs are better", is correct, then why would people want random URLs, which bring the extra trouble of generating a new random URL for each and every visitor?

Why use Random URLs

A short URL is a better URL, as the points above explain. But there are usually many users who want a short URL for all of their resource files. Choosing each one by hand would mean filling in a form to get a short URL on your server, then signing a petition to agree to the terms, and only then receiving the short URL. That is a long process (described with a little exaggeration).

Instead, a website can provide a function that creates a random alphanumeric string, meaning a new one is generated for each and every user on each and every request for a new URL. It stays unique by being checked against the previously created strings for a match.

The number of URLs such an alphanumeric string can represent depends on how many characters are allowed in the URL and how many different symbols each position can hold. This is the idea of permutations with repetition: every position can take any allowed symbol, so the total is the number of symbols raised to the length of the string. For example, a string of 10 characters that can only hold the 10 digits (0-9) gives 10^10, that is 10 billion, different strings, which is enough to cover a huge company's documents without duplicating any link.

You can make this even stronger by using lowercase and uppercase letters in your URL string too. This does far more than triple the count: with 62 symbols (10 digits plus 52 letters), 10 characters give 62^10, roughly 8.4 x 10^17 strings, so none of your users should ever see an error such as "Sorry, all out of URLs".
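To put numbers on this, here is a minimal sketch that counts how many different strings a given alphabet size and length allow. The KeySpace class is a hypothetical helper written for this illustration only.

```csharp
using System;

public static class KeySpace
{
    // Number of distinct strings of the given length when every position
    // can hold any of alphabetSize symbols: alphabetSize ^ length.
    // checked() makes the method throw instead of silently overflowing.
    public static ulong Count(int alphabetSize, int length)
    {
        ulong total = 1;
        for (int i = 0; i < length; i++)
        {
            total = checked(total * (ulong)alphabetSize);
        }
        return total;
    }
}
```

KeySpace.Count(10, 10) is 10,000,000,000 for digits only, KeySpace.Count(62, 10) is 839,299,365,868,340,224 for digits plus both letter cases, and adding "-" and "_" for 64 symbols gives 1,152,921,504,606,846,976.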

Making a stronger mixture - Really Random URL

This section was started in the comments. A random URL needs to be genuinely random, and the idea was to generate short URLs that do not span more than 7 - 10 characters. The real goal was to make these short URLs random in practice, and not just an arbitrary-looking string used in the URL.

In this example, I have added two special characters to make the randomness a little more powerful. The URL has to be random enough that, in the near or even far future, the chance of generating a duplicate stays small. If duplicates become frequent, the application's main logic for generating URLs breaks down and no more short URLs can be handed out.

In the code shown later in this article, pay attention to the condition that selects between a character and a number: rand.Next(0, 3) returns 0, 1 or 2, and a number is used only when the result is 1. So numbers are selected in a ratio of 1 : 2, about a 33% chance of a number and a 67% chance of a character. Why did I do that, and how does it affect the URL in practice?

The string we are going to create is a permutation of the 52 letters (26 lowercase and 26 uppercase) plus two special characters (-, _), which slightly lower the chance of duplication. Another list contains the numbers. Mixing the two lists creates a random URL. But for how long? That depends on the mixture. Ever seen a chemist at work? He chooses the ingredients in specific amounts to create a compound that is perfectly stable.

Similarly, adding some numerics to the letter set gives a good result and a huge number of possible random URLs. A statistics or mathematics student could help us find the best mixture of these lists to minimize the chance of duplication. If the share of numbers is increased to 50%, half of all positions are drawn from only 10 symbols, which makes duplicates appear sooner. Similarly, 0% numbers (or only a rare chance of them) shrinks the set of symbols in use, so duplicates would start to appear after (a lot of) URLs have been added to the database, and your application logic would need to be replaced with a stronger one.
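One way to compare mixtures, offered here as an illustration rather than as the article's own method, is to measure the randomness of each position in bits (Shannon entropy). The sketch assumes the two-step choice used in this article: first pick the digit list with some probability, then pick uniformly inside the chosen list. SymbolMix and EntropyBits are hypothetical names.

```csharp
using System;

public static class SymbolMix
{
    // Entropy, in bits, of one generated position when a digit is chosen
    // with probability digitShare and a letter/special character otherwise,
    // each uniformly within its own list.
    public static double EntropyBits(double digitShare, int digitCount, int letterCount)
    {
        double perDigit = digitShare / digitCount;            // chance of one specific digit
        double perLetter = (1.0 - digitShare) / letterCount;  // chance of one specific letter
        double bits = 0.0;
        if (perDigit > 0) bits -= digitCount * perDigit * Math.Log(perDigit, 2);
        if (perLetter > 0) bits -= letterCount * perLetter * Math.Log(perLetter, 2);
        return bits;
    }
}
```

With 10 digits and 54 other symbols, a 50% digit share gives about 5.54 bits per position, 0% gives about 5.75, and a one-in-three share gives about 5.86. The maximum, 6 bits, is reached when all 64 symbols are equally likely, which corresponds to a digit share of 10/64. More bits per position means duplicates take longer to appear.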

One last layer for removing chances of duplication is to respect the case of the string. For example, B is not equal to b. You could compare the two strings character by character, or you can use the .Equals() method, which is available on all objects (because it is inherited from System.Object) and which, for strings, checks that the two strings are equal in both characters and case. For example, the following code,

// creating the variables
string a = "Love for all Hatred for none!";
string b = "LOVE FOR ALL HATRED FOR NONE!";
string c = "love for all hatred for none!";
string d = "Love for all Hatred for none!";
        
// testing the objects, whether they're equal or not.
if(a.Equals(b)) {
   Console.WriteLine("a matches b.");
} else {
   Console.WriteLine("a doesn't match b");
}
if(a.Equals(c)) {
   Console.WriteLine("a matches c");
} else {
   Console.WriteLine("a doesn't match c");
}
if(a.Equals(d)) {
   Console.WriteLine("a matches d");
} else {
   Console.WriteLine("a doesn't match d");
}

// Output of the program
// a doesn't match b
// a doesn't match c
// a matches d

.. shows how to compare strings case-sensitively, which adds another layer of protection against duplication. The fiddle can be tested here.
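If you prefer to make the case-sensitivity explicit in code, a small sketch could name the comparison rule directly. ShortUrlComparer and SameShortUrl are hypothetical names used only for this illustration.

```csharp
using System;

public static class ShortUrlComparer
{
    // Ordinal comparison matches character by character and treats
    // 'B' and 'b' as different, which is what short URLs need.
    public static bool SameShortUrl(string left, string right)
    {
        return string.Equals(left, right, StringComparison.Ordinal);
    }
}
```

SameShortUrl("pSij284Fj", "pSij284Fj") is true, while SameShortUrl("pSij284Fj", "psij284fj") is false.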

Generating Random URLs in ASP.NET

.NET lets you create a unique string using a GUID (Guid.NewGuid()), a long string that is claimed to "never duplicate". But you can also write a short snippet of code that gives you a string with a random set and sequence of characters on every call.
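For comparison, a GUID-based string might be produced as in the sketch below. GuidToken is a hypothetical wrapper; Guid.NewGuid() and the "N" format specifier are part of .NET.

```csharp
using System;

public static class GuidToken
{
    // The "N" format returns the 32 hexadecimal digits of the GUID
    // without hyphens or braces.
    public static string Next()
    {
        return Guid.NewGuid().ToString("N");
    }
}
```

The result is always 32 characters long, which is unique enough but much longer than the 10-character strings this article aims for.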

Code for function

As the first step, we are going to generate a random URL built from letters (characters) and numeric data (numbers). To do that, we create two lists: one contains all of the characters for our URL, and the other contains the numbers 0-9.

To the characters I have added "-" and "_" just to lower the chance of duplication a little more. You can add anything else that is valid in your URLs, and each addition gives an extra layer of protection against duplicate URLs in your application.

The second thing you need is random numbers. The Random class can only generate numeric values; it cannot create characters directly. So we use the indexer, which lets us select any one element from a collection (List<T> is a collection of elements of type T), pick an element at a random index, and concatenate it to the string of the actual URL that we are going to write to the stream, or anywhere else.

// List of characters and numbers to be used...
string URL = "";
List<int> numbers = new List<int>() {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
List<char> characters = new List<char>() 
{'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '-', '_'};

// Create one instance of the Random
Random rand = new Random();
// run the loop until the string has 10 characters
for (int i = 0; i < 10; i++) {
    // Get random numbers, to get either a character or a number...
    int random = rand.Next(0, 3);
    if(random == 1) {
        // use a number
        random = rand.Next(0, numbers.Count);
        URL += numbers[random].ToString();
    } else {
        // Use a character
        random = rand.Next(0, characters.Count);
        URL += characters[random].ToString();
    }
}

Using a separate Class

Remember to write this function in a class separate from where you will be calling it. The class can have just one public member, the GetURL() method, and you can make the class static so that no one can create an instance of it.

This minimizes code repetition, which goes against good programming practice (a topic outside the scope of this article). If you create a class, it would look like this:

// Required namespaces
using System;
using System.Collections.Generic;

/// <summary>
/// RandomURL class generates Random URLs for applications.
/// </summary>
public static class RandomURL
{
    // List of characters and numbers to be used...
    private static readonly List<int> numbers = new List<int>() {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
    private static readonly List<char> characters = new List<char>() 
    {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 
    'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 
    'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 
    'Q', 'R', 'S',  'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '-', '_'};

    // One shared Random instance, so that calls made in quick succession
    // do not start again from the same time-based seed.
    private static readonly Random rand = new Random();
    // Random is not thread-safe, so access to it is serialized.
    private static readonly object randLock = new object();

    public static string GetURL () {    
        string URL = "";
        lock (randLock) {
            // run the loop until the string has 10 characters
            for (int i = 0; i < 10; i++) {
                // Get 0, 1 or 2; a number is used one time in three...
                int random = rand.Next(0, 3);
                if(random == 1) {
                    // use a number
                    random = rand.Next(0, numbers.Count);
                    URL += numbers[random].ToString();
                } else {
                    // use a character
                    random = rand.Next(0, characters.Count);
                    URL += characters[random].ToString();
                }
            }
        }
        return URL;
    }
}

... you will notice that the members (the method and both lists) are now static. The lists never change while the application runs, and there is never a need for more than one instance of this class, so the members can be static and used without creating an instance each time.
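As a possible variation, and not the method used in this article, the sketch below swaps System.Random for RandomNumberGenerator from System.Security.Cryptography and draws every position from a single 64-symbol alphabet. This makes all symbols equally likely and the output hard to predict. SecureRandomUrl and GetToken are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SecureRandomUrl
{
    // 64 symbols: 26 lowercase, 26 uppercase, 10 digits, '-' and '_'.
    private const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_";

    public static string GetToken(int length)
    {
        // Fill a buffer with one cryptographically random byte per character.
        byte[] buffer = new byte[length];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(buffer);
        }

        StringBuilder token = new StringBuilder(length);
        foreach (byte b in buffer)
        {
            // 256 is a multiple of 64, so keeping the low 6 bits
            // selects each of the 64 symbols with equal probability.
            token.Append(Alphabet[b & 63]);
        }
        return token.ToString();
    }
}
```

SecureRandomUrl.GetToken(10) returns a 10-character string. The duplicate check against previously stored URLs is still needed, because random strings can repeat.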

Result of this code

I ran the same code both as a plain function and inside the class, and it worked! The result on my ASP.NET website's page, which showed the generated URL, looked like this:

Result

Each run of the code gives you a new random URL that you can send in your short messages or use for other tasks in your application.

Saving and Re-using the Original URL

This topic was started in the comments: "How to get the original URL back for re-use". It was a good topic to talk about, and it led me to rework the project, add a few more code blocks, and add a feature that gets the original URL from the database and uses a redirect to put this article into practice.

Saving the URL

The first stage is to store the URL in the database. You can design your table to store the long URL together with the short URL that you are going to use. The following schema can be used; note that the ID is not required for this method.

Database Schema

Important things to note here are as follows:

  1. The long URL's size is set to 4000, the maximum, because a URL can be very long.
  2. The short URL's size is set to 50, which is more than enough for a short URL.
  3. The ID is not required, as mentioned above.

Now you can simply write the back-end code to save the data and generate a new short URL to be associated with the long URL in your application.

Note: I did not change the class or its C# code. I only updated the UI and the application logic to save the URLs, and created a new page that redirects the user to the original URL that was shortened. So there is no need to wonder whether the code posted above still applies; this is a new feature built on top of it.

Create an HTML form in the web page, to accept the URLs from the User.

<form method="post">
    <input type="text" name="url" />
    <input type="submit" value="Submit" />
</form>

Next comes the server-side code that saves the URL in the database.

string URL = "";
string longUrl = "";
if(IsPost) {
    longUrl = Request.Form["url"];
    // Change the UI of the web page.
    // Open the connection
    var db = Database.Open("StarterSite");

    // Get the URL
    URL = RandomURL.GetURL();
    // Keep generating until the short URL has no match in the database.
    while(db.Query("SELECT * FROM RandomUrls WHERE ShortUrl = @0", URL).Count() > 0) {
        URL = RandomURL.GetURL();
    }

    // Now the URL is unique, so save it...
    db.Execute("INSERT INTO RandomUrls (UrlString, ShortUrl) VALUES (@0, @1)", longUrl, URL);
}

To explain the code above briefly: it runs only when the request is a POST. In that case it captures the URL that the user posted in the form and looks in the database; if the generated short URL already has a match, it generates a new random URL (this should rarely be needed), and then it saves both URLs in the database.
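The check-and-retry idea can be sketched without a database. In the illustration below, an in-memory set stands in for the RandomUrls table, and a retry limit (an assumption added here, not something the article's code has) guards against looping forever. UniqueShortUrl and Reserve are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueShortUrl
{
    // Calls generate() until it yields a value that is not yet in taken,
    // records that value, and returns it. Gives up after maxAttempts tries.
    public static string Reserve(Func<string> generate, ISet<string> taken, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string candidate = generate();
            // Add returns false when the value is already in the set.
            if (taken.Add(candidate))
            {
                return candidate;
            }
        }
        throw new InvalidOperationException("No unused short URL found within the attempt limit.");
    }
}
```

A call could look like UniqueShortUrl.Reserve(RandomURL.GetURL, existingUrls, 5), where existingUrls is a HashSet<string> holding the short URLs already in use. A HashSet<string> compares strings case-sensitively by default.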

Once executed, it will save the URLs like this:

URLs in the Database.

You can attach your own conditions too, for example checking whether the long URL already exists in the database before creating a new short URL for it.
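One such condition, offered here only as a suggestion, is to accept nothing but absolute http and https URLs before saving, so that the later redirect cannot be pointed at other schemes. LongUrlCheck and IsAcceptable are hypothetical names; Uri.TryCreate, Uri.UriSchemeHttp and Uri.UriSchemeHttps are part of .NET.

```csharp
using System;

public static class LongUrlCheck
{
    // Returns true only for well-formed absolute URLs using http or https.
    public static bool IsAcceptable(string candidate)
    {
        Uri parsed;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out parsed))
        {
            return false;
        }
        return parsed.Scheme == Uri.UriSchemeHttp || parsed.Scheme == Uri.UriSchemeHttps;
    }
}
```

In the save code this could be used as a guard around the insert, skipping the save and showing an error message when IsAcceptable(longUrl) is false.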

Extracting the original URL and redirecting the user

The main part of this process is retrieving the real URL to which the user will be redirected. It is as simple as a query to the database. First, create a new ASP.NET page where you can write the code to query the database. I queried the database and listed the URLs like this:

List of URLs

You have saved the generated short URL string in your database, associated with the long URL. You can pass that short URL to the page and perform different tasks with it; in this article, the task is redirection. Write this code in your new ASP.NET page:

// Get the short url
var shortUrl = UrlData[0];

// Find it in the database with a single query
var db = Database.Open("StarterSite");
var row = db.QuerySingle("SELECT * FROM RandomUrls WHERE ShortUrl = @0", shortUrl);
if(row != null) {
    // Redirect to the original long URL
    Response.Redirect(row.UrlString);
} else {
    // Unknown short URL, go back to the home page
    Response.Redirect("~/");
}

The above code looks in the database and finds the actual long URL that was shortened. I named the file redirect, so in my application the URL was passed as:

Redirect page URL

Since this short URL was mapped to Google, the result was this page:

Google page redirect

This way, you can redirect the user to the actual web page whose long URL was shortened.

Tip: Why use only one instance of Random

I found it interesting enough to expand on what I say in the article, and in the Points of Interest section below: "use only one Random instance and use the .Next() method to create a new random number". The key point is that these random numbers are generated from a seed. If you pass a seed to the constructor, it is used to generate the sequence. Otherwise, in the .NET Framework, the seed is taken from the system clock, so instances created at practically the same time (when the code runs at full speed, without a debugger or any other fancy hack slowing it down) get the same seed and all produce the same random numbers.

To avoid this error, use one Random instance and keep getting new values from it with the .Next() method, passing the range in which you want the random number.
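The effect of a shared seed can be reproduced on purpose by passing the same seed to two instances. The sketch below is only a demonstration of that behaviour; SeedDemo and its methods are hypothetical names.

```csharp
using System;

public static class SeedDemo
{
    // Draws count values in the range 0-99 from the given Random instance.
    public static int[] Draw(Random source, int count)
    {
        int[] values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = source.Next(0, 100);
        }
        return values;
    }

    // Creates two separate instances with the same seed and reports
    // whether they produce exactly the same sequence.
    public static bool SameSequence(int seed, int count)
    {
        int[] first = Draw(new Random(seed), count);
        int[] second = Draw(new Random(seed), count);
        for (int i = 0; i < count; i++)
        {
            if (first[i] != second[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

SeedDemo.SameSequence(42, 20) returns true: two instances with the same seed are two copies of the same sequence, which is why a single shared instance is the safer choice.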

Points of Interest

Creating a simple class to generate these random URLs is easy, and it also keeps you from violating the "do not repeat code" rule in your application.

Creating only one Random instance is better: once it is created, keep calling the .Next() method on that instance to get a new random number in each and every function.

History

  • First post on this topic

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.