The data dictionary

25.1 Purpose

A data dictionary is a collection of data about the data (meta-data). Its purpose is to define rigorously every data element, data structure, and data transform in the system.

25.2 Strengths, weaknesses, and limitations

A data dictionary improves communication between analysts and users, and among technical personnel, by establishing a set of consistent data definitions. If programmers develop data descriptions from a common data dictionary, several potentially serious module interface problems can be avoided. At a higher level, different systems must often be linked or interfaced, and a common set of data definitions helps to minimize misunderstandings.

By highlighting already existing data elements, a data dictionary helps the analyst avoid data redundancy. If all programs using a given data element are cross-referenced in the data dictionary, assessing the ripple effects of a change in the data is simplified.
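The cross-referencing idea can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any particular data dictionary product; the element names, program names, and the `programs_affected` function are all hypothetical.

```python
# Hypothetical cross-reference: data element name -> programs that use it.
# Every name below is invented for illustration.
USED_BY = {
    "customer_id": {"order_entry", "billing", "shipping"},
    "unit_price": {"order_entry", "billing"},
    "ship_date": {"shipping"},
}


def programs_affected(elements):
    """Return the sorted names of programs that use any of the given elements."""
    affected = set()
    for name in elements:
        # An element with no cross-reference entry contributes nothing.
        affected |= USED_BY.get(name, set())
    return sorted(affected)


print(programs_affected(["unit_price", "ship_date"]))
# prints ['billing', 'order_entry', 'shipping']
```

With such a cross-reference in place, the ripple effect of changing a data element reduces to a lookup rather than a search through source code.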

25.3 Inputs and related ideas

The first step in creating a data dictionary is to identify the system’s data elements and composites, a key objective of the information gathering phase of the system development life cycle (Part II). The data dictionary is an important adjunct to several analysis tools, such as data flow diagrams (Chapter 24), entity-relationship diagrams, and data normalization. Creating a data dictionary is an important step in designing and developing traditional files or a database. The data dictionary often serves as a foundation for the requirements specification. Data structures are described in Chapter 43. Inverted-L charts (Chapter 27) and Warnier-Orr diagrams (Chapter 33) are useful for visualizing a data structure.

25.4 Concepts

A data dictionary is a collection of data about the data in which each and every data element, data structure, and data transform is rigorously defined.

25.4.1 Data elements

The data dictionary defines each data element, assigns it a meaningful name, specifies both its logical and physical characteristics, and records information concerning how it is used. Table 25.1 summarizes the type of information that might be recorded in the data dictionary. Figure 25.1 shows a few partial (generic) data dictionary entries.
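As a rough sketch, a single data element entry might be modeled as a small record. The field names below are assumptions chosen to match the categories named in the text (name, definition, logical and physical characteristics, usage); they are not the contents of Table 25.1 or Figure 25.1.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One hypothetical data dictionary entry for a single data element."""
    name: str                 # meaningful, standardized name
    definition: str           # what the element means and how it differs from others
    data_type: str            # logical characteristic, e.g. "numeric" or "text"
    length: int               # physical characteristic: stored length in characters
    aliases: list = field(default_factory=list)   # other names for the same element
    used_by: list = field(default_factory=list)   # programs that reference it


# Illustrative entry; all values are invented.
customer_id = DataElement(
    name="customer_id",
    definition="Unique number assigned to a customer when the account is opened.",
    data_type="numeric",
    length=8,
    aliases=["cust_no"],
    used_by=["order_entry", "billing"],
)

print(customer_id.name, customer_id.data_type, customer_id.length)
# prints customer_id numeric 8
```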

25.4.1.1 Data names

It is important to follow a consistent standard when assigning data names. For example, an organization might use the rules imposed by its primary programming language, database management system, data dictionary software, or CASE product.

Some data elements are known by two or more names. This often happens when different groups use the same data for different purposes or when several analysts work concurrently on the system. Rather than creating redundant data dictionary entries, resolve any differences in the definitions of the equivalent data elements, merge them, and record the alias name on the primary description.
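The merge step can be sketched as follows, assuming the definitions have already been reconciled by the analysts. The dictionary layout and the `merge_alias` and `resolve` helpers are hypothetical.

```python
def merge_alias(entries, primary, alias):
    """Fold the redundant entry `alias` into `primary` and record the alias name."""
    entries.pop(alias, None)  # drop the redundant entry if one exists
    aliases = entries[primary].setdefault("aliases", [])
    if alias not in aliases:
        aliases.append(alias)


def resolve(entries, name):
    """Return the primary name for `name`, or None if it is not in the dictionary."""
    if name in entries:
        return name
    for primary, entry in entries.items():
        if name in entry.get("aliases", []):
            return primary
    return None


# Two groups defined the same element under different names (invented example).
entries = {
    "customer_number": {"definition": "Unique identifier assigned to a customer."},
    "cust_no": {"definition": "Unique identifier assigned to a customer."},
}
merge_alias(entries, "customer_number", "cust_no")

print(resolve(entries, "cust_no"))
# prints customer_number
```

After the merge only one entry remains, and a lookup by either name leads to the same definition.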

If two clearly different data elements have similar names, change at least one of them because similar names can be confusing.

25.4.1.2 Definitions

A good definition precisely indicates the data element’s purpose and clearly distinguishes it from the system’s other data elements. Examples are useful, particularly for identifying exceptions to a general rule.

25.4.2 Data structures or composites

Data structures (Chapter 43), also called group or composite data items, are defined by showing the data elements and substructures that compose them. The symbols depicted in Table 25.2 (or their equivalents) are sometimes used to document (or partition) composite items. Figure 25.2 shows how the data on a sales receipt might be defined using the symbols. Inverted-L charts (Chapter 27) and Warnier-Orr diagrams (Chapter 33) are other tools for visualizing a data structure.

Note that a data structure can contain both composite items and data elements. In the data dictionary, composite items are decomposed or partitioned down to the data element level, and each data element is fully defined (as described earlier).
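Decomposition down to the data element level can be sketched as a recursive expansion. The composite below is an invented sales receipt, not a transcription of Figure 25.2, and the sketch assumes that no composite contains itself, directly or indirectly.

```python
# Hypothetical composites: name -> ordered list of components.
# Any name that is not a key here is treated as a data element.
STRUCTURES = {
    "sales_receipt": ["receipt_number", "sale_date", "customer", "line_item"],
    "customer": ["customer_name", "customer_address"],
    "customer_address": ["street", "city", "postal_code"],
    "line_item": ["stock_number", "quantity", "unit_price"],
}


def elements_of(name, structures=STRUCTURES):
    """Partition a composite into its data elements, in order of appearance."""
    if name not in structures:
        return [name]  # already a data element
    result = []
    for part in structures[name]:
        result.extend(elements_of(part, structures))
    return result


print(elements_of("customer"))
# prints ['customer_name', 'street', 'city', 'postal_code']
```

Each name returned by `elements_of` would then need its own fully defined data element entry.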

25.4.3 Keys and relationships

In a database, an entity is a thing about which data are stored, and an occurrence is a single instance of an entity. Each occurrence is composed of data elements (or attributes).

Figure 25.1  Some typical data dictionary entries.

Figure 25.2  Documenting a data structure.

Physically, entities map to files, occurrences map to records, and attributes map to fields.

Occurrences (records) are composite data structures. In addition to the attributes that make up the composite, the key (the attribute or group of attributes that uniquely distinguishes one occurrence of the entity) is documented in the data dictionary.

A database is composed of a set of related files (or entities). Typically, the files are linked (or related) by storing an entity’s key in the related entity. These relationships are also documented in the data dictionary.
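One way to picture keys and relationships in a data dictionary is the sketch below. The entities and attributes are invented, and the sketch assumes a stored key keeps the same attribute name in the related entity, which real designs do not always guarantee.

```python
# Hypothetical entity definitions: attributes plus the identifying key.
ENTITIES = {
    "customer": {
        "attributes": ["customer_id", "name"],
        "key": ["customer_id"],
    },
    "invoice": {
        "attributes": ["invoice_number", "invoice_date", "customer_id"],
        "key": ["invoice_number"],
    },
}


def relationships(entities):
    """List (referencing, referenced, key) triples where one entity stores another's key."""
    found = []
    for child, child_def in entities.items():
        for parent, parent_def in entities.items():
            if child == parent:
                continue
            key = parent_def["key"]
            # The parent's whole key must appear among the child's attributes.
            if all(attr in child_def["attributes"] for attr in key):
                found.append((child, parent, tuple(key)))
    return found


print(relationships(ENTITIES))
# prints [('invoice', 'customer', ('customer_id',))]
```

Here `invoice` stores the key of `customer`, so the dictionary would record a relationship between the two, with `customer_id` acting as a foreign key in `invoice`.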

25.4.4 Transforms

A transform is a process or operation that modifies data. Many data dictionary systems allow the analyst to name, define, and record data about the transforms in the data dictionary.
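A transform entry might be sketched as a record that names the operation, defines it, and lists the data it consumes and produces. The `Transform` class, its fields, and the example are hypothetical and do not reflect any specific data dictionary system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transform:
    """One hypothetical data dictionary entry describing a transform."""
    name: str
    definition: str
    inputs: tuple      # names of the data elements consumed
    outputs: tuple     # names of the data elements produced
    apply: Callable    # executable form of the rule, for illustration only


extended_price = Transform(
    name="compute_extended_price",
    definition="Multiply quantity by unit price to get a line item's extended price.",
    inputs=("quantity", "unit_price"),
    outputs=("extended_price",),
    apply=lambda quantity, unit_price: quantity * unit_price,
)

# Unit price is given in cents to avoid floating-point rounding.
print(extended_price.apply(3, 250))
# prints 750
```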

25.5 Key terms

Alias: An alternate name for a data element.
Attribute: A property of an entity.
Composite: A set of related data elements.
Data dictionary: A collection of data about the data.
Data element: An attribute that cannot be logically decomposed.
Data structure: A set of related data elements.
Database: A set of related files.
Entity: An object (a person, group, place, thing, or activity) about which data are stored.
Field: A data element physically stored on some medium.
File: A set of related records.
Foreign key: A key to some other entity stored with the target entity.
Key: The attribute or group of attributes that uniquely distinguishes one occurrence of an entity.
Meta-data: The contents of the data dictionary.
Occurrence: A single instance of an entity.
Record: The set of fields associated with an occurrence of an entity.
Relationship: A link between two data structures.
Transform: A process or operation that modifies data.

25.6 Software

Numerous data dictionary software packages are commercially available. Some are associated with a specific database management system; others are more general. Most provide data entry support. Some can prepare at least part of the entry from programmer source code or generate source code directly from the data dictionary. Data usage reports and queries are common features. Additionally, CASE software (Chapter 5) often incorporates a data dictionary within the CASE repository.
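To give a feel for generating source code from a data dictionary, the sketch below builds a SQL `CREATE TABLE` statement from a few entries. It is an illustration only: the entry layout, the `create_table_sql` function, and the sample elements are assumptions, and no commercial package is being modeled.

```python
# Hypothetical data element entries for one entity.
ELEMENTS = [
    {"name": "customer_id", "sql_type": "INTEGER", "required": True},
    {"name": "customer_name", "sql_type": "VARCHAR(40)", "required": True},
    {"name": "phone", "sql_type": "VARCHAR(20)", "required": False},
]


def create_table_sql(table, elements, key):
    """Build a CREATE TABLE statement from data dictionary entries."""
    lines = []
    for element in elements:
        not_null = " NOT NULL" if element["required"] else ""
        lines.append(f"    {element['name']} {element['sql_type']}{not_null}")
    lines.append(f"    PRIMARY KEY ({', '.join(key)})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


sql = create_table_sql("customer", ELEMENTS, key=["customer_id"])
print(sql)
```

Because the column definitions come from the dictionary entries, a change to an element's type or length needs to be made in only one place.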

