Big and small organizations alike use relational databases to store, manage, and analyze critical information. But what makes a database “relational?” A relational database organizes data in predefined relationships, letting you easily understand how your data is connected.
A well-designed database provides several benefits:
- The database structure is easy to modify and maintain. Workflows rarely stay the same forever–you’ll likely have to make some adjustments to your core relational data model in the future. Fortunately, a well-designed database ensures that any modifications you make to fields in one table will not adversely affect other tables.
- It’s easier to find the information that you need. With a consistent, logical database structure (that avoids duplicate fields and tables), it’s much easier to query your database.
- You can avoid redundant, duplicate, and invalid data. This data can undermine the validity of your database, but you can design your relational database to minimize the risks posed by low-quality data.
- You can avoid situations where you are missing required data. If you can identify ahead of time which types of data are most critical to your workflow, you can structure your database in such a way that it enforces proper data entry, or alerts users when records are missing critical data.
If you build a house without finalizing the blueprints, odds are when it’s finally built the house will have questionable structural integrity. Similarly, taking time to think carefully about the design of your relational database before implementing it can save you a lot of trouble in the long run. In this tutorial, we’ll cover several design principles that you can follow to help you build better databases.
What is a "well-designed" database?
Good database design is important. But what does it actually mean for a database to be well-designed?
Fast fact: The first person to devise an abstract model for database management was Edgar Frank Codd, an English computer scientist who joined IBM in the 1940s.
A well-designed database enforces data integrity
Data integrity refers to the accuracy, completeness, and consistency of the data in your database.
Data integrity includes three specific technical aspects of a relational database’s structure:
- Entity integrity (or table-level integrity) ensures that a table has no duplicate records, and that the values of the table’s primary keys are all unique and not null.
- Domain integrity (or field-level integrity) ensures that the purpose of every field is clear and identifiable, and that the values in each field are valid, consistent, and accurate.
- Referential integrity (or relationship-level integrity) ensures that the relationships between pairs of tables are sound, so that the records in the tables are synchronized whenever data is entered into, updated in, or deleted from either table.
A well-designed database maintains data integrity by implementing the processes and standards proposed during the design phase.
A well-designed database enforces relevant business rules
Every organization does its work a little differently, and as such, each organization has its own business rules. Ideally, your team’s database should enforce your unique business rules.
Let’s say your content marketing team requires that timestamps for video files be noted in milliseconds, and each asset is assigned an eight-character alphanumeric code. If the video files were noted in seconds instead of milliseconds and then entered into the company database, it could create a major snag in video production—and a mess for a database administrator to clean up later.
How to design your relational database, step by step
If this all sounds unfamiliar or overwhelming, don’t worry—there’s a systematic process you can follow so your relational database management system (or RDBMS) follows good design principles, is well-suited to your organization’s needs, and avoids common pitfalls.
Step 1: Define your purpose and objectives
Before beginning your database design journey, understand why you’re making it.
Are you making this database to manage transactions? To store customer IDs? To solve a specific organizational problem? Whatever the case, it’s worth taking the time to identify the exact purpose of the database you’ll be creating.
You may even want to work with stakeholders, executive leadership, and end users to jointly write out a mission statement for your database, like: “The purpose of the Mingei International Museum database is to maintain the data for our art collection,” or “Zenbooth’s database will store all of the data for our manufacturing resource planning.”
Additionally, you should define the objectives that the end users of the database will have: which specific tasks will the end users need to perform to accomplish their work? Develop an explicit list of objectives—like “Know the status and location of each of the pieces of art in our collection at all times,” or “Maintain a complete customer table that shows records for each of our clients.” This will help you determine an appropriate structure, or database schema, for your information as you work through this design process.
Step 2: Analyze data requirements
Before you design your database, you’ll need to assess how your team currently does its work, and identify what kind of data is most important to that work.
You can do this by closely examining existing processes and by interviewing team members—both management and end users. Some questions to ask as you conduct your research:
- How is your organization currently collecting data? Are you using spreadsheets? Paper templates? Another database? Find the most complete samples of work that you can, and look through them to find as many different attributes as you can. For example, your editorial calendar might currently be living in a spreadsheet, and have columns for “Author,” “Due Date,” “Editor,” and so on.
- How is your organization currently presenting data? What kinds of reports does your organization use? PDFs? Slide decks? Web pages? Carefully examine any types of presentations that incorporate data from your current data collection methods and use them to identify potential fields.
- How are your team members currently using data? Talk to team members—both management and end users—to identify their current data use patterns and case studies, as well as any gaps in the current system. You can ask questions like, “What types of data are you currently using?” and have them review the samples you collected. These interviews can also illuminate plans for the future growth of the organization, which will give you insight into the type of relational database model that would be the best fit.
Step 3: Create a list of entities and a list of attributes
The next steps are to extract a list of entities and a list of attributes from the research you’ve compiled.
In the context of relational databases, an entity is an object, person, place, event, or idea—like “clients,” “products,” “projects,” or “sales reps.” These entities will eventually turn into your tables later on in the design process.
Start by picking out entities from your research and putting them in a list. For example, if you were developing a talent database for a big record label, your entities list might look something like this:
Next, create a separate list containing the relevant attributes for each of the entities you’ve identified. Attributes are the defining characteristics of those entities, like “name,” “quantity,” “address,” “phone number,” or “genre.” These attributes will become the fields for your tables.
Think of entities as nouns, and attributes as the adjectives that describe those nouns. Again, for the talent database example, your attributes list might look something like this:
- Artist Name
- Agent Name
- Agent Phone Number
- Agent Email Address
- Venue Name
- Venue Address
- Gig Dates
- If multiple attributes have different names but actually represent the same concept, consolidate them into one. For example, if you have both “Product No.” and “Product Number” on your list, you should remove one of them.
- If multiple attributes have similar names but actually represent different concepts, rename the attributes to be more specific. For example, you could rename two different “Name” attributes into the more specific “Artist Name” and “Venue Name.”
After refining your lists, it’s a good idea to review them with the team members you interviewed to confirm that you’ve accounted for every necessary type of data.
Step 4: Model the tables and fields
After listing your entities and attributes, use them to design the structure of your relational database. Your list of entities will become separate tables in your base, and the list of attributes will become the fields for these tables.
Take your lists and assign each of the attributes to your tables. For example, after we finish assigning our listed attributes to our new tables, our talent management database-in-planning might look something like this:
Next, you want to figure out how to name your records in each table. This requires that you pick an appropriate primary field.
A primary field is a major component of ensuring data integrity, as it uniquely identifies each record within a table and is used to establish different types of relationships between tables. And in Airtable, the primary field is always the first column in any table (and it can't be deleted, moved, or hidden).
Each table’s primary field should meet the following criteria:
- It must contain unique identifiers. This will prevent you from creating duplicate records and redundancy within a table.
- It cannot contain null values. A null value is the absence of a value, and as such, you cannot use a null value to identify a record.
- It should not be a value that will need to be modified often. Ideally, primary field values will remain relatively static over time and only be changed under rare circumstances.
- Ideally, it uses the table name as part of its own name. While not strictly necessary, having the table name in the primary field name can make it easier to identify the table from which the primary field originated. For example, “Employee Name” would be obviously identifiable as coming from the related table, “Employees.”
Let’s return to our talent management database example. For the “Artists” table, the “Artist name” field is already a pretty good candidate for the primary field, as it’s pretty unlikely that your record label will sign two artists with the same name. We can also pick “Venue name” as the primary field for the “Venues” table.
For the other tables, however, it would probably be better to make new fields that concatenate values from existing fields. In the “Agents” table, we might make a new field—“Agent full name”—that concatenates the values of the “Agent given name” and “Agent surname” fields.
For the “Gigs” table, an artist could perform at the same venue on multiple occasions, so we should make a new field that gives a unique name to the specific combination of an artist at a venue on a specific date.
You could potentially concatenate the name of the artist, the venue, and the date to create values like “2 Linkz at the Gotham City Metro Club 02/13/2019,” but that can get long and unwieldy fast. Alternatively, you could try creating a new field—“Gig code”—with unique alphanumeric code values (like “E0023”).
A poorly designed database makes it difficult to access the information you need and can jeopardize the accuracy of your data. If you take the time to define your goals, research how your organization collects data, and identify what you want to store, you can create a structured database that anyone in your org can navigate and use.