Attributes are the characteristics that describe entities and relationships. For example, a Student entity may be described by attributes including:
date of birth
An Invoice entity may be described by attributes including:
A common convention for naming attributes is to use singular nouns. Further, a naming convention may require one of:
All characters are in upper case.
All characters are in lower case.
Only the first character is in upper case.
All characters are lower case but each subsequent part of a multipart name has the first character capitalized
Using the last convention mentioned, some examples of attribute names:
lastName for last name
empLastName for employee last name
deptCode for department code
prodCode for product code
invNum for invoice number
In practice a naming convention is important, and you should expect the organization you are working for to have a standard approach for naming things appearing in a model. A substantial data model will have tens, if not into the hundreds, of entity types, many more attributes and relationships. It becomes important to easily understand the concept underlying a specific name; a naming convention can be helpful.
There are many ways we can look at attributes including whether they are atomic, composite, single-valued, etc. We consider these next.
A simple, or atomic, attribute is one that cannot be decomposed into meaningful components. For example, consider an attribute for gender – such an attribute will assume values such as Male or Female. Gender cannot be meaningfully decomposed into other smaller components.
As another example consider an attribute for product price. A sample value for product price is $21.03. Of course, one could decompose this into two attributes where one attribute represents the dollar component (21), and the other attribute represents the cents component (03), but our assumption here is that such decompositions are not meaningful to the intended application or system. So we would consider product price to be atomic because it cannot be usefully decomposed into meaningful components.
Similarly, an attribute for the employee’s last name cannot be decomposed, because you cannot subdivide last name into a finer set of meaningful attributes.
Exercises (Set 1)
Consider that a Human Resources system must keep track of employees. If we are only including atomic attributes, what attributes would you include for the employee’s name. Some possibilities are first name, last name, middle name, fullname.
In some large organizations where there are several buildings and floors we see room numbers that encode information about the building, floor, and room number. For example in case the room 3C13 stands for room 13 on the third floor of the Centennial building. Suppose we need to include Room in an ERD. How would you represent the room number given that you must include atomic attributes only?
Consider an attribute such as employee name which is to represent an employee’s complete name. For example, suppose an employee’s name is John McKenzie; the first name is John and the last name is McKenzie. It is easy to appreciate that one user may only need employee last names, and another user may need to display the first name followed by the last name, and yet another user may display the last name, a comma, and then the first name. If it’s reasonable for one to refer to the complete concept of employee name and to its component parts, first name and last name, then we can use a composite attribute. An attribute is composite if it comprises other attributes. To show that an attribute is composite and contains other attributes we show the components as attribute ovals connected to the composite as in:
Attributes can be composite and some of its component attributes may be composite as well (see exercise 3).
Exercises (Set 2)
How would you use a composite attribute to model a phone number.
Consider the previous exercise set. Show how we can include room number as a composite attribute that has multiple components.
Consider an address attribute. Show that this can be modeled as a multi-level composite attribute where the component attributes include street, city, province, country and where street includes apartment number, street number, street name.
We characterize an attribute as being single-valued if there is only one value at a given time for the attribute.
Consider the Employee entity type for a typical business application where we need to include a gender attribute. Each employee is either male or female, and so there is only one value to store per employee. In this case, we have an attribute that is single-valued for each employee. Single-valued attributes are shown with a simple oval as in all diagrams up to this point. In all of our examples so far, we have assumed that each attribute was single-valued.
Exercises (Set 3)
A college or university will keep track of several addresses for a student, but each of these can be named differently: for example, consider that a student has a mailing address and a home address. Create an ERD for a student entity type with two composite attributes for student addresses where each comprises several single-valued attributes.
Consider a marriage entity type and attributes marriage date, marriage location, husband, wife. Each marriage will only have one value for each of these attributes. Illustrate the marriage entity and its single-valued attributes in an ERD .
Now, suppose someone proposes to track each employee’s university degrees with an attribute named empDegree. Certainly many employees could have several degrees and so there are multiple values to be stored at one time. Consider the following sample data for three employees: each employee has a single employee number and phone number, but they have varying numbers of degrees.
BA, BSc, PhD
For a given employee and point in time, empDegree could have multiple values as is the case for the last two employees listed above. In this case we say the attribute is multi-valued.
Multi-valued attributes are illustrated in an ERD with a double-lined oval.
We can use multi-valued attributes to (at least) document a requirement, and at a later time, refine the model replacing the multi-valued attribute with a more detailed representation. The presence of a multi-valued attribute indicates an area that may require more analysis; multi- valued attributes are discussed again in Chapter 10.
Exercises (Set 4)
Consider the employee entity type.
- Suppose the company needs to track the names of dependents for each employee. Show the dependent name as a multi-valued attribute.
- Modify your ERD to show empDependentName as a composite multi-valued attribute comprising first and last names and middle initials.
Create an ERD that avoids the multi-valued attribute empDegrees in the previous example.
Consider including another entity type and a relationship for keeping track of degrees.
If an attribute’s value can be derived from the values of other attributes, then the attribute is derivable, and is said to be a derived attribute. For example, if we have an attribute for birth date then age is derivable. Derived attributes are shown with a dotted lined oval.
Sometimes an attribute of one entity type is derived from attributes from other entity types. Consider the attribute for the total of an Invoice. A value of InvTotal is derivable; it can be computed from invoice lines. Someone who implements a database and applications that access the database would need to decide whether the value of a derivable attribute should be computed when the entity is stored or updated versus computing the value (on-the-fly) when it is needed.
Exercises (Set 5)
Consider an educational environment where the institution tracks the performance of each student. Often this is called the students overall average, or overall grade point average. Is such an attribute a derived attribute? How is its value determined?
Consider a library application that needs to keep track of books that have been borrowed. Suppose there is an entity type Loan that has attributes bookID, memberID, dateBorrowed and dateDue. Suppose the due date is always 2 weeks after the borrowed date. Show Loan and its attributes in an ERD.
Some attributes, or combinations of attributes, serve to identify individual entities. For instance, suppose an educational institution assigns each student a student number that is different from all other student numbers. We say the student number attribute is a key attribute; student numbers are unique and distinguish students.
In an ERD, keys are shown underlined:
We define a key to be a minimal set of attributes that uniquely identify entities in an entity type. By minimal we mean that all of the attributes are required – none can be omitted. For instance, a typical key for an invoice line entity type would be the combination of invoice number and invoice line number. Both attributes are required to identify a particular invoice line.
It is not unusual for an entity type to have several keys. For instance suppose an educational institution has many departments such as Mathematics, Physics, and Computer Science. Each department is given a unique name and as well the institution assigns each one a unique code: MATH, PHYS, and CS. Both attributes would be underlined to show this in the ERD:
Exercises (Set 6)
Suppose a company that sells products has a product entity type with the following attributes: prodNum, prodDesc, prodPrice. Suppose all three attributes are single-valued and that prodNum is a key attribute - each product has a different product number. Illustrate this information in an ERD.
Consider a banking application where each account is identified first by an account number and then by its type (Savings, Chequing, and Loan). This scheme allows the customer to remember just one number instead of three, and then to pick a specific account by its type. Other attributes to be considered are the date the account was opened and the account’s current balance. Draw an ERD for the entity type Account with the attributes account number, account type, date opened, current balance. What is the key of the entity type? Is there an attribute that is likely a derived attribute? Show these attributes appropriately in the ERD.
Sometimes we have attributes that distinguish entities of an entity type from other entities of the same type, but only relative to some other related entity. This situation arises naturally when we model things like invoices and invoice lines. If invoice lines are assigned line numbers (1, 2, 3, etc.), these line numbers distinguish lines on a single invoice from other lines of the same invoice. However, for any given line number value, there could be many invoice lines (from separate invoices) with that same line number.
A partial key (also called a discriminator) is an attribute that distinguishes instances of a weak entity type relative to a strong entity. Invoice line number is a partial key for invoice lines; each line on one invoice will have different line numbers. Using the Peter Chen notation the discriminator attribute is underlined with a dashed line:
Later when relationships are covered it will be clearer that attributes for relationships can be discriminators too. Consider that a library has books that members will borrow. Any book could be borrowed many times and even by the same member. However when a member borrows the same book more than once the date/time will distinguish those events. Consider the following ERD for this case:
Exercises (Set 7)
Consider an educational institution that has departments and where each department offers courses. Suppose departments are assigned unique identifiers and so deptCode is a key for the department entity type. Courses are identified within a department by a course number; course numbers are unique within a department but not across departments. So, History may have a course numbered 215, and English could have a course numbered 215 too. In order to identify a particular course we need to know the department and we need to know the course number. Illustrate an ERD including department and course entity types. Include attributes for the Department (department code and department name), and for Course (course number, title, and description).
Consider a company that owns and operates parking lots. Develop an ERD with two entity types Parking Lot and Space and where:
- The address of a parking lot serves to identify the lot.
- Each space within a lot is rented at the same monthly rental charge.
- Each parking space is known by its number within the lot (within a lot these always start at 1).
- Each parking space is rented out to at most one vehicle. The vehicle’s identifier must be recorded. The identifier comprises a province code and license plate number.
When a key specified for an entity is meaningless to the entity and to end-users (it doesn’t describe any characteristic of an entity), the key is referred to as a surrogate key. A key that is not a surrogate key is often referred to as a natural key. Often a surrogate key is just a simple integer value assigned by the database system.
When database designs are implemented surrogate keys can be useful to simplify references from one table to another (referential integrity) and the associated joins when tables are referenced in queries.
Exercises (Set 8)
Assuming you have experience with some database system, what data type would you use for surrogate keys?
Non-key attributes are attributes that are not part of any key. Generally, most attributes are simply descriptive, and fall into this category. Determining key and non-key attributes is an important modeling exercise, one that requires careful consideration. Consider an Employee entity type that has attributes for first name, last name, birth date; these attributes would serve to describe an employee but would not serve to uniquely identify employees.
People may join an organization and their name is not likely unique for the organization; we expect many people in a large organization to have the same first name, same last name, and even the same combination of first and last name. Names cannot usually be used as a key.
However, names chosen for entities such as departments in an organization could be keys because of the way the company would choose department names - they wouldn't give two different departments the same name.
Exercises (Set 9)
Consider an employee entity type and its attributes, and decide which attributes are key attributes and which ones are non-key attributes. Illustrate with an ERD.
A birthdate attribute would appear for many entity types – for example students, employees, children. What is a birthdate likely to be: key or non-key?
Consider a library and the fact that books are loaned out to library members. Dates could be used heavily for the date a book was borrowed, the date the book was returned, and the due date for a book. Consider an entity type Loan that has attributes book identifier, member identifier, date borrowed, date due, date returned. What combination of attributes would be a key? Which attributes are key attributes? Which attributes are non-key attributes?
When a database design is implemented one of the important things to know for each attribute of an entity type is whether or not that attribute must have a value. For example when a book is borrowed from a library the date the book is borrowed is known, but the returned date is not known. Sometimes you will not know the value of an attribute until a certain event occurs.
Consider an educational environment and when a student registers for a course. The date the student registers would be known, but the grade is yet to be determined.
When an entity is created but some attribute does not have a value we say it is null. Null represents the absence of a value; null is different from zero or from blank.
To complete the analysis for a database design it is necessary to determine what constitutes a valid value for an attribute. A domain for an attribute is its set of valid values which includes a choice of datatype, but a full specification of domain is typically more than that.
For instance, analysis for student identifiers may lead one to state that a student identifier is a positive whole number of exactly 7 digits with no leading zeros. The analysis of requirements for person names may lead one to state that the values stored in a database for a first name, last name, or middle name will not be more than 50 characters in length, and that names will not have any spaces at the beginning or end.
For each attribute one must determine its domain. More than one attribute can share the same domain. Knowing the underlying domains in your model is important. They help to complete your analysis, they are indispensable for coding programs, and they are useful for defining meaningful error messages.
Attribute domains are not usually shown in an ERD. Rather, domains are included in accompanying documentation which can be referred to when the database is being implemented.