Deep Dive: SQL Data Types
In this deep dive we'll analyze the key SQL data types, including how you can use them to configure each of the columns in your database.
Guide Tasks
  • Read Tutorial
  • Watch Guide Video
Video locked
This video is viewable to users with a Bottega Bootcamp license

Excellent job in getting SQL installed and up and running. We also walked through how to set up our database schema and how to create tables. We didn't really go into detail yet on data types and that's what we're going to do in this deep dive.

So we're going to talk about all of the key data types that you're going to be using in SQL. We will use varchars which are really strings. What would you use with words, decimals, floats, and integers? We're going to talk about how to pick and choose which data type is right for your database.

So let's get started on the deep dive into SQL data types. The very first thing to look at is all of the available data types that are out there. Now these are the main ones for each version of SQL such as Oracle's, MS SQL, and MySQL. They each have a few variations and some are more customized versions. They essentially all relate to one of these data types.

medium

The datatypes are characters, var cars, booleans, small ints, integers, decimals, floats, date time, clobs, and blobs. We're going to talk about each one of these data types and we're also going to walk through examples on them. Some of the examples would be when you would use them and when you probably shouldn't.

large

Starting off with characters which in most variations of SQL are called chars. Chars is just short for character. Now a character is a fixed length string. Which means that whatever number of characters that you tell the database that you want in that specific column. It is going to have to reserve that in memory.

Now that doesn't mean that you need to take up that full amount. However it's considered a best practice because let's say that you want to have a character set or a column that has 10 characters in it. The way that SQL is going to work is it's going to take up 10 characters in that column in regards to memory space.

Regardless if you have one character in there or if you have all ten. So when you would want to use a char data type is when you have a pretty good idea of the exact length that is going to be filling up that column. So like I said it takes 100 percent of the space allocated and you have from 1 to 255 characters in order to define it. So looking over on the right hand side are the “proper use” of the char data type.

Something such as a 10 digit phone number where you always know there's got to be exactly 10 digits or a social security number. Putting something like that in there is also going to be a good idea because you know the Social Security and phone numbers are always the exact same length.

Now improper use is you wouldn't put a straight integer in there. Notice how we don't have any quotation marks on either side. That means that that is an integer that would break it and that would cause an error. Even though the next one has Jon Snow’s name in it. It is not going to be a good idea to use because if you have a column that you put names in it. It’s a pretty good chance that those names are all going to be a different set of character links. So with that in mind you don't want to do it.

large

Any type of column that is going to vary in character length really needs a special type of data type. It's one we're going to be using for most of this course and that is the varchar. Now the varchar is just short for variable characters so it's a variable length string. This is one of the most popular data types in all of SQL.

In many scenarios you don't know the number of characters that are going to go into a column. So in those cases a varchar is a great option. Now the one thing to keep in mind is if you compare a varchar with a regular character it is going to be slower because it has dynamic memory allocation.

What that means is because SQL doesn't know how many characters to put up. It has to dynamically generate and allocate that amount of space so it doesn't know if the next varchar it hits is going to be three characters or if it's going to be a book and it's 64,000 characters. That’s because SQL has to be able to adjust on the fly and that's what our varcars do.

So the proper use is pretty much any type of string when you don't know how long it's going to be. Assuming that type of string is under 65,535 characters. If it's over that we'll talk about a different data type to use later on in proper use any type of integers. Notice we don't have quotation marks.

Then also the other two a set of phone numbers or Social Security numbers something where you know that column is always going to have the exact same number of characters in that column. Then that's when a character data type is a better idea.

large

Next on the list is the boolean data type. Now booleans are just true and false values. You just call the boolean data type and then the use here is you have four different variations are really two for each. You have true and false where all of the letters are lowercase and then you have true or false where everything is capitalized.

Those are your only two options when you're using the boolean data type. Now the improper use. This is something that if you are not familiar with it you may think that a zero or one would work. In some programming languages a zero or a one is actually considered true or false.

In fact in machine languages that's exactly the way it works. However in SQL that is not going to work and it's going to throw an error. You only have true and false and those are your only two options. You also can't put something like T and F, yes and no, string or a number. It's pretty basic. This is a great data type whenever you want to do something such as saying “is this post archived or is it not archived?”. You know something that is always going to be true or false.

large

Next on the list is the small int data type. So a small int is going to be any integer that ranges from the number 32,768 all the way through negative 32768. Decimals will not work and if you look in the proper use side. You can pass a decimal to the small int column such as doing 42.5 but it's automatically going to be converted down to 42.

Now you can have positive and negative numbers up to that 32,768. If you put any kind of string in there you're going to get an error. If you have a number one higher than the max or one lower than the max negative, you're going to get an error. Then if you have a number that's bigger than the small int. The integer data type might be a better fit. This goes much larger. So with this one you can go up to 2,147,483,648.

large

The next datatype we will be covering is integer. So a int is a pretty decent size of numbers when you're performing your calculations and storing some type of integer value. The integer data type is a very popular one and you will be using that one quite a bit. Now in the proper use and in the improper use I put commas to show what the numbers are.

Inside of an actual database you would never put the commas because that would make it look like a string and that wouldn't work. I put them in there simply so you could actually count the numbers a little bit more easily. On the left hand side is how you would actually inject it into the database.

Also another thing to keep in mind integers are called either full integers and they're also called int’s. Now every single one of our numeric data types we've looked at so far such as the integer or the small int. Whenever it comes across a decimal value it automatically truncates it.

large

Whenever we need to have decimals. Well that is where the decimal data type comes into play. So here we have a very, very high level control over decimals. It works with two main components which are precision and scale. Now if you've never heard of that then don't worry, they're not really that scary of terms and we'll walk through exactly what that means.

So precision as you can see there on the bottom left hand side. The precision is essentially one argument and scale is the second one.
The precision is going to be the total number of digits that are stored inside of that column. The scale is going to be how many decimal places that you're going to allocate for it.

So if you look over at the “proper use” side. On the top side you have four total digits for 9.834. You can see that the three represents how many digits are right of the decimal point. On the other example it's 7 and 2 and right there you have 42215.85. That's because the precision is going to be for seven characters and two of those are going to be to the right of the period. Now this is a way of being able to have very strict control over the type of decimals that you're using.

If you're using a financial type of application then the Decimal data type is something you're going to be using a lot of. If you picked the wrong data type when it comes to performing calculations, so if you use the float data type which is what we're going to talk about next. It is not anywhere near as precise.

You could end up in a situation where you could be running calculations for a financial application and those financial numbers are not accurate because of what you did. It was not precise because you were dealing with numbers that were approximated with decimals and it's going to be exact. Anytime you're dealing with financial data or any time as to be to the exact value then decimals the way you want to go.

So I walked through the proper use, improper use obviously any type of string. This is only for decimal type data and it can only have 38 digits. It can't go over 38. Now if you need to have a number over 38 then you can go with the floating point.

large

Floating point includes decimals but it also allows you to have very large numeric values. In fact it gets so big you can have a number of 179 followed by 306 zeros. It's a very large number. That is a great option as long as you don't need very detailed, or I should say perfect, financial values when it comes to computation.

Floating points uses something that is called Floating Point approximation. That means that when you're dealing with decimals you're going to have very subtle differences. A great example is in the “improper use” column there. You can see where 9.5 equals 9.5 and it may not actually be equal as far as a floating data point sees it.

So in other words if you run a Querrey and you say I want this value of 9.5 to bring me all the values that are 9.5. The way that the float data type works it may actually be storing one of those numbers that you think is 9.5. It may be storing one of those as something like 9.4999999.

So that it's very subtle but it can definitely trip up users especially if you're expecting perfectly accurate data. So that's something to keep in mind. It's a pitfall specially for not just new developers for any developers working with financial data and large computations. Definitely play around with both of those. Run some comparisons to see which one is the right fit for your exact scenario.

large

Next we're going to talk about the date time data type. So specifically in the SQL you have the ability not just to store dates and times as strings, but instead you can actually put them as date objects. So you have date and time values and the range for those right now in most databases.

This one specifically for a MS SQL is January 1st 1753 through December 31 9999. That's a pretty big range. So if you have a question on what happens if you need to store a date that is older than that. That is where you can get in to storing dates as strings. When you want to be able to perform dates and sort through them, and then be able to say I want you to bring me all the dates starting from January 1st of last year up till now.

Then sort them by date and time which is something that would be a very natural thing to do when you're working with timestamps in your database. This is a great data type to use.

Now the “proper use” is there on the right hand side. It is the four digit year followed by a dash followed by the month, followed by a dash followed by the day, then the full timestamp all the way into seconds, Then also percents of seconds. Now there are many times you don't need the full date time.

In that case you can just store the date and then the time is all at zero. Both of those are very common implementations of date time. Many of the SQL databases don't just have a datetime object or data type. They also have a time data type and even just a date by itself data type. If you want to become very specific. Also for “improper use” don't store anything that is a string just a regular integer or anything.

If you notice you may not see what the key difference is on that bottom “Improper use” example. That is actually an improperly formatted date. So if you throw this into SQL you're going to get some very buggy behavior because it has the month, day, and then year. Every SQL engine has a specific format that it can use to parse the date.

large

Next on the list is the clob. Now clob stands for character large object. So we talked about var chars being able to go up nearly to about 67,000 characters. a slob is what you use when you need something even bigger. So if you're to store the collection of the Library of Congress or something just massive and you needed to do it as a single record which would be a horrible idea.

But needless to say if you needed to do some very very large storage of textual data a clob is a great option. I can tell you from experience this is something you'll use most likely very rarely. It's pretty rare that you need to store text data that is up to two gigabytes in size.

That is a gigantic text record but that is there in case you need to use it. So when would you use it? Very large strings. You would not use it pretty much for any small strings. Blog posts and all those kind of things can fit inside of of varchar very nicely. You wouldn't store any integers or any values like that

large

Last on the list of what we're going to cover is the blob. A blob that stands for binary large object and a blob is where you can store very large binary base data. In fact it can go up to two gigabytes in size. The most standard use of a blob is when you want to store an actual image in the database. It's pretty rare that you store images in the database.

The common pattern now is to upload an image to a server or to your local system. Then you would simply have a reference with a url to that image and then you can go find it whenever you need it to display on a Website or in a mobile application. What a blob does is a blob actually takes that image and sticks it inside of your database.

So whenever you call the database you have access to the file itself. Now there are times where that is necessary. In fact I've used a the blob data type probably more often than I've used the clob. The reason is because there are times when you don't want to or you may not have access to a third party system.

So you need to be able to store that in the local database and that is an important one to be aware of. Just know you have to be able to have the image and be able to actually treat it like a true piece of binary based data. So that is the full list of your main data types in SQL.

Resources