In SQL, the columns of a table are defined to store a specific kind of value such as numbers, dates, or text; these are called data types. With over thirty types of SQL server data types to choose from, data types can seem intimidating, but in reality, there are just a few commonly used in databases.
Once you understand data types in general, it isn’t a complicated subject. You’ll also come to understand why there is a need for data types and the reason there so many varieties.
In this article, we’re going to go over the most common datatypes and the best practices for their use.
All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012
database. You can get started using these free tools using my Guide Getting Started Using SQL Server.
What are SQL Server Data Types?
Before we get into the data types used in SQL server, I think it is important for everyone to understand what a data type is, and why they are needed for use in computers.
For our example, we’ll talk about a person. If you were thinking about information you could use to describe a person you may think of collecting their name, birthdate, weight, and number of children. For each of these attributes, the data generally falls into several categories.
Names are stored as text, birthdates as calendar dates, and weight as decimal values, and number of children as integers. Each of these categories of values is a data type.
Data types define the characteristics of the data that can be stored in a location such as a database column. A data type defines the possible set of values that are accepted. For example, a type of INT
which stands for integer in SQL server, can only accept whole numbers, decimal values are not allowed.
Here is a listing of some people and their ages:
Age – Just Text Values
Can you see the issue with these ages? They’re all in an inconsistent format. Not only are they hard to read, but sorting and calculating the data is difficult. By using data types, which enforce the expected type of data to be stored, we can achieve a much better result. Here is the same table with the ages shown as integers.
Ages conforming to INT data type
By defining age as an integer, you’re setting up the expectation that data entered are whole numbers.
Also, without getting into technical details, you can see storing the age as an integer takes up much less space than the age in the first table. This may not seem like a huge consideration with small databases, but when working with data on smartphones or “big data” scenarios, these considerations add up.
Once the system understands the data type, it can then sort the data in an order that makes sense as well as perform calculations.
Here are some reasons why data types are important:
- The data is stored in the database in a consistent and known format.
- Knowing the data type allows you to know which calculations and formulations you can use on the column.
- Data types affect storage. Some values take up more space when stored in one data type versus another. Take our age tables above for example.
- Data types affect performance. The less time the database has to infer values or convert them the better. “Is December, 32, 2015 a date?”
Commonly Used SQL Server Data Types
In SQL, each table’s columns are meant to hold a specific type of data. Examples of these types are text and numeric data. This is an important property to the table and is called the column’s data type. You find that the data type is one of the most defining attributes for a column. It alone can determine whether a value is valid and to be stored in a column.
There are over thirty different data types you can choose from when defining columns. Many of these are set up for very specific jobs such as storing images, and others more suitable to general use.
In this article, we’re going to cover the seven data types you’ll most frequently encounter in your everyday use of SQL. These are:
INT
VARCHAR
, NVARCHAR
DATETIME
DECIMAL
, FLOAT
BIT
INT – Integer Data Type
The integer data type is used to store whole numbers. Examples include -23, 0, 5, and 10045. Whole numbers don’t include decimal places. Since SQL server uses a number of computer words to represent an integer, there are maximum and minimum values which it can represent. An INT
datatype can store a value from -2,147,483,648
to 2,147,483,647
.
Practical uses of the INT
data type include using it to count values, store a person’s age, or use as an ID
key to a table.
But INT
wouldn’t be so good to keep track of a terabyte hard drive address space, as the INT
data type only goes to 2 billion and we would need to track into the trillions. For this, you could use BIGINT
.
The INT
data type can be used in calculations. Since DaysToManufacture
is defined as INT
, we can easily calculate hours by multiplying it by 24
:
SELECT Name,
DaysToManufacture,
DaysToManufacture * 24 as HoursToManufacture
FROM Production.Product
Here you can see the results:
Results of INT calculations
There are many operations and functions you can use with integers which we’ll cover once we dig into functions.
VARCHAR and NVARCHAR – Text Values
Both VARCHAR
and NVARCHAR
are used to store variable length text values. “VARCHAR
” stands for variable length character.
The number of characters to store in a VARCHAR or NVARCHAR are defined within the column. For instance, as you can see in the following column definition from the object explorer, the product name is defined to hold fifty characters.
VARCHAR definition shown in SSMS Object Explorer
What makes VARCHAR
popular is that values less than fifty characters take less space. Only enough space to hold the value is allocated. This differs from the CHAR
data type which always allocates the specified length, regardless of the length of the actual data stored.
The VARCHAR
datatype can typically store a maximum of 8,000 characters. The NVARCHAR
datatype is used to store Unicode text. Since UNICODE characters occupy twice the space, NVARCHAR
columns can store a maximum of 4,000 characters.
The advantage NVARCHAR
has over VARCHAR
is it can store Unicode characters. This makes it handy to store extended character sets like those used for languages such as Kanji.
If your database was designed prior to SQL 2008, you’ll most likely encounter VARCHAR
; however, more modern databases or those global in nature tend to use NVARCHAR
.
DATETIME – Date and Time
The DATETIME data type is used to store the date and time. An example of a DATATIME
value is:
1968-10-23 1:45:37.123
This is the value for October 23rd, 1968 at 1:45 AM. Actually, the time is more precise than that. The time is really 45 minutes, 37.123 seconds.
In many cases, you just need to store the date. In these cases, the time component is zeroed out. Thus, November 5th, 1972 is:
1972-11-05 00:00:00.000
A DATETIME
can store dates from January 1, 1753, through December 31, 9999. This makes the DATETIME
good for recording dates in today’s world, but not so much in William Shakespeare’s.
As you get more familiar with the various SQL built-in functions, you’ll be able to manipulate the data. To give you a glimpse, we’ll use the YEAR
function to count employees hired each year. When given a DATETIME
value, the YEAR
function returns the year.
The query we’ll use is:
SELECT YEAR(HireDate),
Count(*)
FROM HumanResources.Employee
GROUP BY YEAR(HireDate)
ORDER BY YEAR(HireDate)
And here are the results
Using YEAR function on DATETIME data type.
The benefit is the DATETIME
type ensures the values are valid dates. Once this is assured, we’re able to use a slew of functions to calculate the number of days between dates, the month of a date and so on.
We’ll explore these various functions in detail in another blog article.
DECIMAL and FLOAT – Decimal Points
As you may have guessed, DECIMAL
and FLOAT
datatypes are used to work with decimal values such as 10.3.
I lumped DECIMAL
and FLOAT
into the same category, since they both can handle values with decimal points; however, they both do so differently:
If you need precise values, such as when working with financial or accounting data, then use DECIMAL
. The reason is the DECIMAL datatype allows you to define the number of decimal points to maintain.
DECIMAL
DECIMAL data types are defined by precision and scale. The precision determines the number of total digits to store; whereas, scale determine the number of digits to the right of the decimal point.
A DECIMAL
datatype is specified as DECIMAL(precision,scale)
.
A DECIMAL
datatype can be no more than 38 digits. The precision and scale must adhere to the following relation
0 <= scale <= precision <= 38 digits
In the Production.Product
table, the weight column’s datatype is defined as DECIMAL(8,2)
. The first digit is the precision, the second the scale.
Weight is defined to have eight total digits, two of them to the right of the decimal place. We’ll see the following sample query to illustrate this data type.
SELECT DISTINCT Weight
FROM Production.Product
WHERE Weight BETWEEN 29.00 and 189.00
ORDER BY Weight DESC
The results follows:
Example of DECIMAL Data Type
FLOAT
Where DECIMAL
datatypes are great for exact numbers, FLOAT
s are really good for long numeric values. Though a DECIMAL
value can have 38 digits total, in many engineering and scientific applications, this is inadequate. For scientific applications where extreme numeric values are encountered, FLOAT
rises to the top!
FLOAT
s have a range from – 1.79E+308 to 1.79E+308. That means the largest value can be 179 followed by 306 zeros (large indeed!).
Because of the way float
data is stored in the computer (see IEEE 754 floating point specification), the number stored is an extremely close approximation. For many applications, this is good enough.
Because of the approximate behavior, avoid using <>
and =
operators in the WHERE
clause. Many a DBA has been burned by the statement.
WHERE mass = 2.5
Their expectations are dashed when mass is supposed to equal 2.5
, but really, in the computer it is stored as 2.499999999999999
; therefore, not equal to 2.500000000000000
!
That is the nature of floating points and computers. You and I see 2.499999999999999
and think for practical purposes it is 2.5
, but to the computer, we're off just a bit.
BIT – Boolean or Yes/No values
There’s times when you just need to store whether something “is” or “is not.” For instance, whether an employee is active. It is in these cases that the BIT datatype comes to its own. This data type be one of three states: 1
, 0
, or NULL
.
The value of 1
signifies TRUE
and 0 FALSE
.
In this query, we’re listing all salaried position job titles:
SELECT DISTINCT JobTitle
FROM HumanResources.Employee
WHERE SalariedFlag = 1
Here are the results:
BIT by BIT we learn More SQL!
We could have also used ‘True
’ instead of 1
. Here is the same example using ‘True
’.
SELECT DISTINCT JobTitle
FROM HumanResources.Employee
WHERE SalariedFlag = 'True'
And the opposite using ‘False
’:
SELECT DISTINCT JobTitle
FROM HumanResources.Employee
WHERE SalariedFlag = 'False'
I tend to stick with 1
and 0
, since it is easier to type, but if you’re going for readability, then ‘True
’ and ‘False
’ are good options.
Finally
In this article, I’ve just given you a taste of some of the more popular data types. Don’t limit your awareness to just these. There are many more data types you can use in SQL Server. Rather than reiterate their names, allowed values, and ranges, please go here for excellent reference material.
The post Commonly used SQL Server Data Types appeared first on Essential SQL.