Data Types
In a relational database, every column is assigned a specific data type. This tells the database what kind of data it can store in that column, which is crucial for two main reasons:
- Data Integrity: It ensures that only valid data is stored. For example, you can't accidentally store a person's name in a column meant for a birth date.
- Storage Efficiency: By knowing the data type, the database can allocate the exact amount of space needed to store that information.
- Predictable Size: An
INTEGERtypically takes a fixed 4 bytes. If the database knows it is storing an integer, it reserves exactly that much space. Storing "1937" as text instead of an integer would require more space and more processing power to perform calculations. - Optimized Retrieval: When data has a predictable size, the database can calculate exactly where a specific record is located on the disk. This allows for much faster searches and data retrieval.
- Compact Storage: Specialized types like
BOOLEAN(True/False) can be stored extremely efficiently (often as a single bit), whereas storing the word "TRUE" as text would take much more space.
- Predictable Size: An
Example: The Library Database
Let's examine the data types for our Authors and Books tables.
Authors Table
| AuthorID | AuthorName | Nationality |
|---|---|---|
| 101 | J.R.R. Tolkien | British |
| 102 | George Orwell | British |
| 103 | Frank Herbert | American |
The data types for the columns in this table would be:
AuthorID:INTEGERAuthorName:VARCHAR(100)(A variable-length string with a max of 100 characters)Nationality:VARCHAR(50)
Books Table
| BookID | Title | AuthorID | Genre | PublishedYear |
|---|---|---|---|---|
| 1 | The Hobbit | 101 | Fantasy | 1937 |
| 2 | 1984 | 102 | Dystopian | 1949 |
The data types for these columns would be:
BookID:INTEGERTitle:VARCHAR(255)AuthorID:INTEGER(This must match the type ofAuthorIDin theAuthorstable)Genre:VARCHAR(50)PublishedYear:INTEGER
Common SQL Data Types
Different database systems support different data types, but most share a common set. Here are some of the most frequently used types:
Character Strings
VARCHAR(n): A variable-length string that can hold up toncharacters. Use this for text where the length varies, like names or titles.CHAR(n): A fixed-length string ofncharacters. If the string is shorter thann, it will be padded with spaces. Good for codes of a known, fixed length (e.g., state abbreviations like 'CA', 'NY').TEXT: For holding very long strings of text, like the body of an article or a detailed description.
Numeric Types
INTEGER(orINT): A whole number. Can be signed (positive or negative).DECIMAL(p, s)orNUMERIC(p, s): A fixed-point number withptotal digits andsdigits after the decimal point. Ideal for currency (DECIMAL(10, 2)), where precision is critical.FLOATorREAL: A floating-point number, used for scientific calculations where absolute precision is less important than a wide range of values.
Date and Time Types
DATE: Stores a date (year, month, day). Example:2025-09-24.TIME: Stores a time (hour, minute, second). Example:15:30:00.TIMESTAMPorDATETIME: Stores both a date and a time. Often used to track when a record was created or last modified.
Large Object Types (LOBs)
BLOB(Binary Large Object): Used for storing large amounts of binary data, such as images, audio files, or videos.- Usage Note: While databases can store these files, it is often better to store them in a dedicated file system or cloud storage (like AWS S3) and only store the file path (a
VARCHAR) in the database. Storing large files directly in the database can significantly increase backup sizes, slow down database performance, and make scaling more expensive.
- Usage Note: While databases can store these files, it is often better to store them in a dedicated file system or cloud storage (like AWS S3) and only store the file path (a
CLOB(Character Large Object): Used for storing very large blocks of character data. WhileVARCHARhas limits (often 4,000 to 65,000 characters), aCLOBcan hold gigabytes of text, such as entire books.
Boolean Type
BOOLEAN: Can holdTRUE,FALSE, or sometimesNULL.
Specialized and Modern Data Types
Modern databases often include additional types to handle complex data requirements:
JSON/JSONB: Stores semi-structured data (like objects and arrays). This allows you to combine the flexibility of NoSQL with the structure of a relational database.UUID(Universally Unique Identifier): A 128-bit number used as a unique identifier. Unlike anINTEGERthat increments, aUUIDis unique across all tables and databases, making it ideal for distributed systems.ENUM: A list of allowed values. For example, astatuscolumn might only allow'draft','published', or'archived'.GEOMETRY/GEOGRAPHY: Used for spatial data, such as GPS coordinates, points on a map, or boundary lines.INTERVAL: Stores a duration of time (e.g., "5 hours" or "2 days") rather than a specific date or time.
Example Table with Various Data Types
Here is an Employees table that demonstrates the use of several data types:
| EmployeeID | FirstName | LastName | HireDate | IsActive | Salary |
|---|---|---|---|---|---|
INTEGER |
VARCHAR(50) |
VARCHAR(50) |
DATE |
BOOLEAN |
DECIMAL(10, 2) |
| 1 | Alice | Smith | 2022-03-15 | TRUE | 75000.00 |
| 2 | Bob | Johnson | 2023-01-20 | TRUE | 82550.50 |
| 3 | Charlie | Brown | 2021-11-01 | FALSE | 68000.00 |