How-to: Database Naming Conventions

When designing a database it’s a good idea to follow some sort of naming convention. This will involve a little planning in the early design stages but can save significant time when maintaining the finished system.

For a system that is built to be deployed from the command line, you may have a set of SQL scripts that build every table, view, index, constraint etc from scratch. When updating or refactoring the design it can be an enormous advantage to search and replace one text item for another. However for this to be at all possible the names have to be unique.

An example, you have the tables

T_Audit_staff
T_Staff

If you want to search/replace the name of T_Staff to T_HR, this will fail because it will also match T_Audit_staff
If you had chosen the name T_Aud_staff then this search/replace will work.

In reality few people will blindly run Search and Replace across a large code base, but having reasonably unique names for things does make it faster to find and update all the references to it. If you have a table called staff with a staff column and a staff index and staff foreign keys, it will be harder to maintain.

Whatever convention you decide to follow, keep it simple or no one will use it consistently.

Table Names

Table names are plural, field name is singular

Joe Celko, world-famous database expert and one of the authors of the SQL-89 and SQL-92 standards, states in 'SQL for Smarties', that table names should be plural because they represent collections, not individual entities.

Collective nouns which name a group of people, creatures, or objects can also work well as table names.
Some examples: Catalogue, Family, Library, People, Staff, Store, Team.

While this is partly a stylistic choice, SQL is a language and following english grammar norms will simplify writing queries.

Plural table names are less likely to conflict with reserved keywords.

Table names should not contain spaces or special characters, words should be split_up_with_underscores.

While it is possible to use mixed case, typically pascal case, most databases are case insensitive (and the SQL language is case-insensitive) so what happens is people get lazy and write SQL (or use automated query tools) and you end up with all lower-case or all upper-case SQL that is hard to read.

Whether your table for Order Items is called ORDER_ITEMS or OrderItem or Order_Items is less important than that you do it the same way throughout the same database.

The maximum length of a table name is limited to between 30 and 128 characters, it varies with database vendor. If portability is important then keep table names shorter than about 60 characters.

If the table name contains serveral words, only the last one should be plural:
APPLICATIONS
APPLICATION_FUNCTIONS
APPLICATION_FUNCTION_ROLES

Prefix for all tables?

There are pros and cons to adding a prefix or suffix to identify tables:

Pros: If most access will be made via VIEWS then prefixing all the tables with T_ and all the views with V_ keeps things organised neatly, you will never accidentally query the wrong one.

Cons: Suppose, your naming convention is to have the '_TAB' suffix for all tables. According to that naming convention, the APPLICATIONS table would be called APPLICATIONS_TAB. If as time goes by, your application gets a second login, perhaps for auditing, or for security reasons. To avoid code changes, you will then have to create a View or Synonym that points at the original tables and is confusingly called APPLICATIONS_TAB.

If all your table names are plural and fields singular, then there is less chance of anyone confusing a table and field name, so less need for an indicator that this is a table.

Some early databases would perform faster when accessed via a view, so it became common to access everything via a set of views, mapped 1:1 with the tables.
That is no longer needed with modern systems.

Column/Field Names

Field names should not contain spaces or special characters, words should be split_up_with_underscores.

Column names, should be a noun describing the field or object. Avoid using words that are just data types.

Ideally each field name should be unique within the database schema. This makes it easy to search through a large set of code (or documentation) and find all occurences of the field name.

The convention is to prefix the fieldname with a 2 or 3 character contraction of the table name e.g.

PATIENT_OPTIONS would have a field called po_patient_option

PATIENT_RELATIVES would have a field called pr_relative_name

APPOINTMENTS would have a field called ap_date

An argument against this is that every field should always be explicitly qualified by the table, but people are lazy or use tools which write unqualified SQL and so that won't always happen.

Also having a prefix makes it easy to spot an incorrect table/column combination

Spot the error:

SELECT
  PATIENTS.pa_patient_pk,
  PATIENTS.pa_patient_name,
  PATIENTS.ap_appointment_date,
APPOINTMENTS.ap_clinic_name
FROM
  PATIENTS, APPOINTMENTS;

In a large schema you will often find two tables having similar names which could result in the same prefix. You can avoid this by thinking carefully about the name & prefix you give each table and documenting prefixes as they are used.

One advantage of adding this prefix is that you are very unlikely to choose a reserved word by accident.

For very complex systems (thousands of tables) consider alternatives e.g. a prefix/suffix to identify the Application module.

Keeping names short: When running an SQL expression, particularly dynamic SQL in production, the sum of the sizes of all such expressions can start to add up. When all table and field names are reasonably short then more of them will fit into a given area of RAM.

Primary Key Fields

Indicate by appending _pk

e.g.

PATIENTS would have a primary key called pa_patient_id_pk

REGIONS would have a primary key called re_region_id_pk

And so on…the name of the primary key field being a singular version of the table name. Other tables containing this as a foreign key would omit the _PK

so
CLINIC_ATTENDANCE might then have a foreign key called ca_patient_id
or alternatively: ca_patient_id_fk

Tables with Compound PK's, use _ck in place of _pk

Notice that where several tables use the same PK as part of a compound foreign key then the only unique part of the FK fieldname will be the table prefix.

If all your primary keys are id's, which they tend to be, then you can simplify this by replacing the _id_pk suffix with just _pk

One possible downside of using this suffix is that if you ever decide to change the primary key to a different column, then you will have some refactoring to do. However this is rare and would still involve refactoring even if the column did not have a _pk suffix.

View names

View names are plural, field name is singular

View names should not contain spaces or special characters, words should be split_up_with_underscores.

While it is common to prefix (or suffix) all views with V_ or VW_, an argument can be made that neither are really needed.

Giving your Views friendly easy names will promote their use by developers and end-users, this in turn will mean fewer badly written queries and more use of 'shared SQL' which will improve the cache hit ratio.

For very large systems, it can make sense to prefix tables/views with the application module name, so a database holding data for both Widget Production and Human Resources data might prefix everything with either HR_ or WP_

Index names

Name the Primary Key index as idx_<TableName>_pk

e.g.
PATIENTS would have a primary key index called idx_patients_pk

Name a Unique Index as idx_<TableName>_uk

e.g.
PATIENTS would have a unique index called idx_patients_uk

Where more indexes are added to the same table, append a numeric:

idx_<TableName>_##

Where ## is a simple number

e.g.
PATIENTS would have additional indexes called idx_patients_01, idx_patients_02,…

Conventions that attempt to use the column name as part of the index name become unmanageable as soon as you have multiple columns appearing in multiple indexes.

Constraint names

Primary and Unique constraints will be explicitly named i.e. not given random auto-generated names. Constraint names often appear in error messages so make it something that will be helpful.

Name the Primary Key Constraint as pk_<TableName>

e.g.
PATIENTS would have a primary key index called pk_patients

Name a Foreign Key Constraint as fk_<TableName>

e.g.
PATIENTS would have a Foreign Key constraint called fk_patients

In general each constraint should have a similar name to the index used to support the constraint.

Other Fields

Without getting carried away, you can also apply a suffix to non key fields where this is helpful in describing the type of data being stored.
e.g.
A column used to store boolean (Yes/No) values can be given a _yn suffix: retired_yn, superuser _yn, driver_yn

In lookup tables a good way to identify the main text field is to name it as a singular version of the tablename:
asset_types.at_asset_type_pk   (Primary Key)
asset_types.at_asset_type         (Text field)
asset_types.at_network_yn         (boolean) 

SQL

Type all SQL statements in lowercase, being consistent with capitalisation improves the caching of SQL statements. A common variant is to put only SQL keywords in capitals.
SELECT
   em_employee_pk,
   em_employee,
   ab_start_date
FROM
   employees em,
   absences ab
WHERE
   absences.ab_employee_id=employees.em_employee_pk;
You already have a unique prefix worked out for every table, so use the same thing when an ALIAS is required, this makes the SQL much easier to read.

Listing tables in the FROM / WHERE clause in the desired join order, will give the Query Optimiser less work to do.

Data Files

Name Data files so that the name includes the FileGroup/instance and the database.

For non-trivial databases, each filename should end with a two digit numeric value starting with 01, that is incremented by 1 for each new datafile added.

File extensions, use the platform norms for file extensions e.g. Microsoft SQL database:

Primary data file = .mdf
Secondary data file = .ndf
Transaction log file = .ldf

Othewise use the generic extension ".dbf"

FileGroup/Tablespace names

Avoid naming these according to time periods.
For a partitioned datawarehouse, try to adopt a strategy of recycling the Filegroup/tablespace names.

Final thoughts

Naming conventions can become the subject of endless debate, no scheme is going to be perfect.

Does your naming convention make names both consise and readable?

Is the naming convention documented somewhere that everyone can find?

Write and maintain a data dictionary for all data elements, rather than just dumping the data dictionary into a text document or an Entity relationship diagram - you should also define the business meaning of each data item.

If your existing project already has a standard approach to naming database objects then keep using it.

“We are braver and wiser because they existed, those strong women and strong men... We are who we are because they were who they were. It's wise to know where you come from, who called your name” ~ Maya Angelou

SQL formatter - Online SQL beautifier
Standard ISO-11179 - Rules for defining data elements
List of Naughty Strings - Strings which have a high probability of causing issues.
IT System Documentation Levels