When designing a database it’s a good idea to follow some sort of naming convention. This will involve a little planning in the early design stages but can save significant time when maintaining the finished system.
For a system that is built to be deployed from the command line, you may have a set of SQL scripts that build every table, view, index, constraint etc from scratch. When updating or refactoring the design it can be an enormous advantage to search and replace one text item for another. However for this to be at all possible the names have to be unique.
An example, you have the tables
T_Audit_staff
T_Staff
If you want to search/replace the name of T_Staff to T_HR, this will fail because it will also match T_Audit_staff
If you had chosen the name T_Aud_staff then this search/replace will work.
In reality few people will blindly run Search and Replace across a large code base, but having reasonably unique names for things does make it faster to find and update all the references to it. If you have a table called staff with a staff column and a staff index and staff foreign keys, it will be harder to maintain.
Whatever convention you decide to follow, keep it simple or no one will use it consistently.
- Table names are plural, field name is singular
Joe Celko, world-famous database expert and one of the authors of the SQL-89 and SQL-92 standards, states in 'SQL for Smarties', that table names should be plural because they represent collections, not individual entities.
Collective nouns which name a group of people, creatures, or objects can also work well as table names.
Some examples: Catalogue, Family, Library, People, Staff, Store, Team.While this is partly a stylistic choice, SQL is a language and following english grammar norms will simplify writing queries.
Plural table names are less likely to conflict with reserved keywords.
- Table names should not contain spaces or special characters, words should be split_up_with_underscores.
While it is possible to use mixed case, typically pascal case, most databases are case insensitive (and the SQL language is case-insensitive) so what happens is people get lazy and write SQL (or use automated query tools) and you end up with all lower-case or all upper-case SQL that is hard to read.
Whether your table for Order Items is called ORDER_ITEMS or OrderItem or Order_Items is less important than that you do it the same way throughout the same database.
The maximum length of a table name is limited to between 30 and 128 characters, it varies with database vendor. If portability is important then keep table names shorter than about 60 characters.
If the table name contains serveral words, only the last one should be plural:
APPLICATIONS
APPLICATION_FUNCTIONS
APPLICATION_FUNCTION_ROLES
- Prefix for all tables?
There are pros and cons to adding a prefix or suffix to identify tables:
Pros: If most access will be made via VIEWS then prefixing all the tables with T_ and all the views with V_ keeps things organised neatly, you will never accidentally query the wrong one.
Cons: Suppose, your naming convention is to have the '_TAB' suffix for all tables. According to that naming convention, the APPLICATIONS table would be called APPLICATIONS_TAB. If as time goes by, your application gets a second login, perhaps for auditing, or for security reasons. To avoid code changes, you will then have to create a View or Synonym that points at the original tables and is confusingly called APPLICATIONS_TAB.If all your table names are plural and fields singular, then there is less chance of anyone confusing a table and field name, so less need for an indicator that this is a table.
Field names should not contain spaces or special characters, words should be split_up_with_underscores.
Column names, should be a noun describing the field or object. Avoid using words that are just data types.
Ideally each field name should be unique within the database schema. This makes it easy to search through a large set of code (or documentation) and find all occurences of the field name.
The convention is to prefix the fieldname with a 2 or 3 character contraction of the table name e.g.PATIENT_OPTIONS would have a field called po_patient_option
PATIENT_RELATIVES would have a field called pr_relative_name
APPOINTMENTS would have a field called ap_date
An argument against this is that every field should always be explicitly qualified by the table, but people are lazy or use tools which write unqualified SQL and so that won't always happen.
Also having a prefix makes it easy to spot an incorrect table/column combination
Spot the error:
SELECT
PATIENTS.pa_patient_pk,
PATIENTS.pa_patient_name,
PATIENTS.ap_appointment_date,
APPOINTMENTS.ap_clinic_name
FROM
PATIENTS, APPOINTMENTS;In a large schema you will often find two tables having similar names which could result in the same prefix. You can avoid this by thinking carefully about the name & prefix you give each table and documenting prefixes as they are used.
One advantage of adding this prefix is that you are very unlikely to choose a reserved word by accident.
For very complex systems (thousands of tables) consider alternatives e.g. a prefix/suffix to identify the Application module.Keeping names short: When running an SQL expression, particularly dynamic SQL in production, the sum of the sizes of all such expressions can start to add up. When all table and field names are reasonably short then more of them will fit into a given area of RAM.
Indicate by appending _pk
e.g.
PATIENTS would have a primary key called pa_patient_id_pk
REGIONS would have a primary key called re_region_id_pkAnd so on…the name of the primary key field being a singular version of the table name. Other tables containing this as a foreign key would omit the _PK
so
CLINIC_ATTENDANCE might then have a foreign key called ca_patient_id
or alternatively: ca_patient_id_fk
Tables with Compound PK's, use _ck in place of _pk
Notice that where several tables use the same PK as part of a compound foreign key then the only unique part of the FK fieldname will be the table prefix.If all your primary keys are id's, which they tend to be, then you can simplify this by replacing the _id_pk suffix with just _pk
One possible downside of using this suffix is that if you ever decide to change the primary key to a different column, then you will have some refactoring to do. However this is rare and would still involve refactoring even if the column did not have a _pk suffix.
View names are plural, field name is singular
View names should not contain spaces or special characters, words should be split_up_with_underscores.While it is common to prefix (or suffix) all views with V_ or VW_, an argument can be made that neither are really needed.
Giving your Views friendly easy names will promote their use by developers and end-users, this in turn will mean fewer badly written queries and more use of 'shared SQL' which will improve the cache hit ratio.
For very large systems, it can make sense to prefix tables/views with the application module name, so a database holding data for both Widget Production and Human Resources data might prefix everything with either HR_ or WP_
Name the Primary Key index as idx_<TableName>_pk
e.g.
PATIENTS would have a primary key index called idx_patients_pkName a Unique Index as idx_<TableName>_uk
e.g.
PATIENTS would have a unique index called idx_patients_ukWhere more indexes are added to the same table, append a numeric:
idx_<TableName>_##
Where ## is a simple number
e.g.
PATIENTS would have additional indexes called idx_patients_01, idx_patients_02,…
Primary and Unique constraints will be explicitly named i.e. not given random auto-generated names. Constraint names often appear in error messages so make it something that will be helpful.
Name the Primary Key Constraint as pk_<TableName>e.g.
PATIENTS would have a primary key index called pk_patientsName a Foreign Key Constraint as fk_<TableName>
e.g.
PATIENTS would have a Foreign Key constraint called fk_patientsIn general each constraint should have a similar name to the index used to support the constraint.
Without getting carried away, you can also apply a suffix to non key fields where this is helpful in describing the type of data being stored.
e.g.
A column used to store boolean (Yes/No) values can be given a _yn suffix: retired_yn, superuser _yn, driver_yn
In lookup tables a good way to identify the main text field is to name it as a singular version of the tablename:asset_types.at_asset_type_pk (Primary Key) asset_types.at_asset_type (Text field) asset_types.at_network_yn (boolean)
Type all SQL statements in lowercase, being consistent with capitalisation improves the caching of SQL statements. A common variant is to put only SQL keywords in capitals.
SELECT em_employee_pk, em_employee, ab_start_date FROM employees em, absences ab WHERE absences.ab_employee_id=employees.em_employee_pk;You already have a unique prefix worked out for every table, so use the same thing when an ALIAS is required, this makes the SQL much easier to read.
Listing tables in the FROM / WHERE clause in the desired join order, will give the Query Optimiser less work to do.
Name Data files so that the name includes the FileGroup/instance and the database.
For non-trivial databases, each filename should end with a two digit numeric value starting with 01, that is incremented by 1 for each new datafile added.File extensions, use the platform norms for file extensions e.g. Microsoft SQL database:
Primary data file = .mdf
Secondary data file = .ndf
Transaction log file = .ldfOthewise use the generic extension ".dbf"
Avoid naming these according to time periods.
For a partitioned datawarehouse, try to adopt a strategy of recycling the Filegroup/tablespace names.
Naming conventions can become the subject of endless debate, no scheme is going to be perfect.
Does your naming convention make names both consise and readable?
Is the naming convention documented somewhere that everyone can find?Write and maintain a data dictionary for all data elements, rather than just dumping the data dictionary into a text document or an Entity relationship diagram - you should also define the business meaning of each data item.
If your existing project already has a standard approach to naming database objects then keep using it.
“We are braver and wiser because they existed, those strong women and strong men... We are who we are because they were who they were. It's wise to know where you come from, who called your name” ~ Maya Angelou
SQL formatter - Online SQL beautifier
Standard ISO-11179 - Rules for defining data elements
List of Naughty Strings - Strings which have a high probability of causing issues.
IT System Documentation Levels