|   | ![]() |
|
SQL Server 7.0 Datatypes 1 2 3 4 5 6 SQL Server DatatypesChosing datatypes for your tables is one of the most basic decisions you will make in database design. Many people discount this part of database planning and take a "close enough" approach. The decisions you make at this stage have far reaching effects that you might not have intended. The difference in a single character can have a staggering effect on performance and storage requirements. It is absolutely vital that you do a little extra research and select the most appropriate data type in terms of the range of values it will contain, the storage space required, and the impact within the database. To illustrate the proper selection of datatypes, I'll outline a few real world examples that I have seen implemented on systems and the effects that it had. One large and well known commercial application that I ran across utilized a varchar(34) column as the primary key across all tables in the database. It further used this in multiple columns within some tables as tracking specific attributes. Now you might look at this and wonder why anyone would select that datatype. The best answer I could come up with was marketing. What was being stored in this column was a randomly generated value that was simliar to the ID created for each OLE object registered on your system. The thought was good, because you could store a very large number of these without ever running out of values. The reality was a completely different story. They should have selected an integer for this column instead. An integer would have given them over 2 billion distinct values if only the positive side was used. This is more than sufficient. All of the DBMS manufacturers will claim they can store many more rows than this in a single table. The reality is that no DBMS on the planet can manage and maintain 2 billion rows in a single table. I originally saw this application, because a client was having very large performance and blocking problems. The blocking problems were due simply to extremely poor application architecture. The performance problems were due entirely to the datatype selection. This varchar(34) column was replaced with an integer across the database. By making this one simple change, the performance of all queries across the system increased by almot 97% and the storage space required was reduced by more than 60%. Why? Each integer comsumes 4 bytes of space. Each varchar(34) comsumed 34 bytes of space. Many people might say that a number is a number, so you might as well choose the largest one just in case. This type of thinking is absolutely false and leads to a similar problem. I was brought in by a major software vendor to fix some performance problems in an internal application. There were lots of things that were done within the SQL to get much better performance. Datatypes did factor into this. Many of the tables in this database utilized an identity column. That was good. The bad part was that each one of them was a numeric(18,0), but they only contained integer values. By changing the numeric(18,0) to an integer column, the performance increased by approximately 30% and the storage required decreased by approximately 20%. These problems are not limited to numeric types of columns. Everyone has heard about all of the year 2000 issues. If you haven't, you should really take a break and look outside. Has anyone heard about the June 6, 2079 problem? Most likely not. MS SQL Server and Sybase databases are effected by this. What does June 6, 2079 have to do with datatypes? June 6, 2079 is the maximum value that can be stored in a smalldatetime datatype. Don't get caught by this. If you are designing an application that will be tracking near point in time values, then a smalldatetime would be appropriate since it requires half the storage space as a datetime datatype. If you are designing any application for banking, accounting, insurance, financial markets, or utilizing any forecasting, you will seriously limit the lifetime of that application by using a smalldatetime datatype. Accountants routinely forecast and depreciate over 40 years. This means many applications are already pushing dates out as far as 2040. Some financial forecasting applications will push out as far as 50 or 60 years for calculating retirement income and benefits. Would you want an application that is planning and forecasting your retirement to be limited to June 6, 2079? A smalldatetime leaves very little room for these types of applications. Now that you are throughly apprised of the effect that datatypes can have on your environment, which ones should you use? The simple answer to that is plain common sense. Use the most appropriate datatype for the job. It might sound flippant, but it is the way I design databases. If you are storing numeric data that doesn't have any decimals, then you need an integer datatype. If you are storing character data whose width varies widely, then you need a varchar datatype. When designing your tables, you simply have to be completely aware of the effect a specific datatype will have on your environment. Datatype selection simply boils down to the type of data you need to store and the minimum amount of space it can be stored in. The datatypes available in SQL Server 7.0 will not come as a surprise, because they are all relatively standard across all database platforms. |
All content on this site, except where noted, represents an original work of Michael R. Hotek and is protected by applicable copyright laws. The SQL Server FAQ is the sole work of Neil Pike. No page, portion of a page, or download may be used for commercial purposes in whole or in part without the express, written permission of the applicable author.