Char & varchar are the string or character data types in SQL Server. We use them to store the string data which includes letters. numbers, symbols, special characters, etc. Char is a fixed width data type while Varchar is a variable-length dataType. In this tutorial, we will learn what is char & varchar data types, how to use them. Also, learn similarities and differences between them.
Table of Contents
Char
Char is a fixed width data type.
The syntax for declaring char variable is char(n), where n defines the string size in bytes.
Note that n is string size in bytes and not the number of characters. The number of characters may vary depending on the code page used.
For Example, when we declare as char(10), The string will occupy 10 bytes of storage. The value of n can be from 1 to 8000 bytes.
Data stored in the char column always occupies the full width of the column. This is irrespective of the size of the string. If we insert a string, whose size is less than the column width, then the SQL server will pad the string with extra blank spaces.
The number of characters that you can store in char(n) is n characters, if single-byte encoding character sets such as Latin (SQL_Latin1_General_CP1_CI_AS) is used, which is the default for SQL Server.
For Multibyte character sets such as Japanese or Chinese, the number of characters is less than n characters.
The following example, creates the table with two char
columns.
1 2 3 4 5 6 | create table charExample ( col_char1 char(2) , col_char2 char(10) ) |
Varchar
Varchar (for variable character) is a flexible width data type.
The syntax for declaring char variable is varchar(n), where n defines the string size in bytes and not the number of characters. Internally varchar variable will also take extra 2 bytes to store the length information of the string. Hence the size of the varchar in the disk is n+2 bytes.
For Example, when we declare as varchar(10), The string will occupy 10 + 2 bytes of storage. The value of n can be from 1 to 8000 bytes.
Unlike the char data type, varchar does not pad the string with blank spaces. Hence the data stored in the varchar column always occupies only the number of bytes that are needed to store the string. Hence they are more storage efficient than the char strings.
The number of characters that you can store in varchar(n) is n characters, if single-byte encoding character sets such as Latin (SQL_Latin1_General_CP1_CI_AS) is used, which is the default for SQL Server.
For Multibyte character sets such as Japanese or Chinese, the number of characters is less than n characters.
The following example, creates the table with two varchar
columns.
1 2 3 4 5 6 | create table varcharExample ( col_varchar1 varchar(2) , col_varchar2 varchar(10) ) |
Varchar(max)
Varchar(max) is similar to varchar except that you can store a string of size up to 2^31-1 bytes (2 GB). The actual space occupied in the disk is the same as the size of the string in bytes + 2 bytes for length information.
Char & Varchar are limited to 8000 bytes of storage space
The SQL Server stores data in units of pages. Page size is 8 KB or 8192 bytes. Out of which 192 bytes the data is used to metadata information related to the page header, row offset, etc. That leaves us 8000 bytes to store the data. Since column data cannot exceed a page (otherwise it will be part of two pages), the maximum size you can use for a column is 8000 bytes.
But in the case of varchar(max), the SQL server checks if the size of the string. If it is less than 8000 bytes then it stores it in the page itself. But if it is more than 8000 bytes then it moves it to a separate page known as LOB_DATA and stores the pointer in the page. In this way, varchar(max) columns can store data up to 2 GB.
Char Vs Varchar
Char | Varchar |
---|---|
Used to store a string of fixed length | Used to store a string of variable length |
The syntax is char(n), where n is the string size in bytes to use. | The syntax is varchar(n) where n is the maximum number of string size in bytes to use. |
If the string length is less than the size of the column then the char adds trailing spaces to the data | The string is stored as it is. Does not add trailing spaces to the data |
Storage size is n bytes. i.e the same as the size of the column and not that of the string | The Storage size is the actual size of the string + 2 bytes. The additional 2 bytes to store length information |
The performance of the char is better than varchar. | Performance is slower compared to the char as the SQL Server needs to calculate the size of the string |
The maximum data that you can store is 8000 bytes | The maximum data that you can store is 8000 bytes |
The number of characters you can store in a char(n) is n characters if a single-byte character set is used, which is the default in SQL Server | The number of characters you can store in a varchar(n) is n characters if a single-byte character set is used, which is the default in SQL Server |
A Multibyte Character Sets such as Japanese Kanji or Chinese can occupy more than 2 bytes. In such cases, the maximum storage size in bytes will remain as n. But the number of characters that can be stored may be smaller than n | A Multibyte Character Sets such as Japanese Kanji or Chinese can occupy more than 2 bytes. In such cases, the maximum storage size in bytes will remain as n+2 bytes. But the number of characters that can be stored may be smaller than n |
Starting with SQL Server 2019, the char & varchar may also use two bytes per character if we use the UTF-8 enabled collation.
Which one to Use
If String is always going to contain full Length like country code, Zip Codes, etc. then use char data type.
If the string length is less than or equal to 4, it is better to use char. This is because the varchar will always add extra 2 bytes irrespective of its size.
In all other use cases, use the varchar
References
Read More