Data Mapping Best Practices

The healthcare industry collects vast amounts of electronic data. These data are captured in a wide variety of formats using various collection methods. Using and reusing health data for multiple purposes can maximize efficiency, minimize discrepancies and errors caused by multiple data entry processes, reduce costs of data acquisition and storage, and support health information exchange and interoperability. When data are collected in a specific format or coding system and the same or similar information is needed for a different purpose, data maps from one system to another facilitate the reuse of data.

In order for the data to be useful for all their intended purposes, semantic interoperability is required to achieve meaningful exchange across settings, data sets, and standards. Maps are one approach organizations are considering to achieve this goal.

The International Organization for Standardization's preferred definition of mapping is "the process of associating concepts or terms from one coding system to concepts or terms in another coding system and defining their equivalence in accordance with a documented rationale and a given purpose."1 The term "coding system" is used to depict encoded data, generically including clinical terminologies, administrative codes, vocabularies, classification systems, and any type of schema used to represent data in health information systems.

This practice brief defines key data mapping concepts and outlines best practices related to the development and use of data maps.

Basic Mapping Concepts

Mapping features a number of terms such as source, target, forward, and reverse or backward maps (e.g., General Equivalence Mappings). In order to understand mapping, it is important to understand these terms.

The source is the origin of the map or the data set from which one is mapping. The target is the data set in which one is attempting to find equivalence or define the relationship.

As with other types of maps, each data item linked in the process has a starting place and a final destination. In a map linking codes from ICD-9-CM to ICD-10-CM, ICD-9-CM is the source and ICD-10-CM is the target. This type of map is called a forward map because it maps an older version of a code set to a newer version. A reverse map links two systems in the opposite direction, going from the newer version of a code set to the older version.2

Each map produces unique results due to the disparity between the two versions of the systems. It is important to know the map's direction (forward or reverse) since the results depend on the direction.
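
The effect of direction can be sketched in a few lines of Python. This is a minimal illustration, not a published map: the codes (OLD-1, NEW-1A, and so on) are hypothetical placeholders rather than real ICD codes, and the helper name `invert` is an assumption. The sketch shows that mechanically inverting a forward map turns a one-to-many link into several one-to-one links and merges shared targets into many-to-one links, so a lookup gives different results depending on direction.

```python
from collections import defaultdict

# Hypothetical forward map: older code set (source) -> newer code set (target).
# These codes are placeholders for illustration, not real ICD codes.
FORWARD_MAP = {
    "OLD-1": ["NEW-1A", "NEW-1B"],  # one source code links to two targets
    "OLD-2": ["NEW-2"],
    "OLD-3": ["NEW-2"],             # two source codes share one target
}


def invert(direction_map):
    """Swap source and target so each target code lists its source codes."""
    inverted = defaultdict(list)
    for source, targets in direction_map.items():
        for target in targets:
            inverted[target].append(source)
    return dict(inverted)


REVERSE_CANDIDATE = invert(FORWARD_MAP)

# Looking up related codes gives different answers depending on direction.
print(FORWARD_MAP["OLD-1"])        # targets reachable from the older code
print(REVERSE_CANDIDATE["NEW-2"])  # older codes that point at the newer code
```

A reverse map issued by a map developer may be authored separately and need not equal this simple inversion, which is one more reason to confirm a map's direction before using it.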

Mapping Equivalence Examples

Single Mapping between ICD-9-CM and SNOMED CT with Equivalent Concepts in Both Systems

ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
001.0 | Cholera due to Vibrio cholerae | Equal | 63650001 | Cholera

Code 001.0 has a single relationship to a single SNOMED CT concept. The SNOMED CT concept "Cholera" (63650001) is clinically equivalent as it contains the attribute "causative agent" as Vibrio cholerae.


Single Mapping between ICD-9-CM and SNOMED CT with Related Concepts in Both Systems

ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
282.7 | Other hemoglobinopathies | Related | 80141007 | Hemoglobinopathy

The concepts in each system vary, creating an approximation. In this case the closest concept in SNOMED CT is hemoglobinopathy since the type is not identifiable in the ICD-9-CM classification.


Map Based on the Rules and Guidelines of the Source and Target Systems

ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
774.6 | Fetal/Neonatal jaundice (NNJ) |  | 387712008 | Neonatal jaundice

Code 774.6, Unspecified fetal and neonatal jaundice, specifically excludes jaundice in preterm infants, which has its own code of 774.2, Neonatal jaundice associated with preterm delivery. The SNOMED CT concept of "Neonatal jaundice associated with preterm delivery" (73749009) is therefore excluded as a potential mapping target.


One-to-Many Mappings between ICD-9-CM and SNOMED CT

ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
070.23 | Viral hepatitis B with hepatic coma, chronic, with hepatitis delta | Related | 235869004 | Chronic viral hepatitis B with hepatitis D
070.23 | Viral hepatitis B with hepatic coma, chronic, with hepatitis delta | Related | 26206000 | Viral hepatitis B with hepatic coma

Code 070.23 requires two SNOMED CT codes for correct representation.


Two Single Mappings between ICD-9-CM and SNOMED CT

ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
008.3 | Proteus (Mirabilis/Morganii) enteritis | Related | 30493003 | Intestinal infection due to Proteus mirabilis
008.3 | Proteus (Mirabilis/Morganii) enteritis | Related | 36529003 | Intestinal infection due to Morganella morganii

Code 008.3 has relationships to two SNOMED CT concepts. Either of the SNOMED CT codes could be properly linked to the classification. In this case it is important to consider how the resulting map affects its use.


ICD-9-CM and Local Code Mappings to a Single SNOMED CT Target

ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
002.9 | Paratyphoid fever | Equal | 85904008 | Paratyphoid fever
002.9ZZ | Paratyphoid fever (severe) | Related | 85904008 | Paratyphoid fever

In this case 002.9ZZ is a local code mapped to a related SNOMED CT code, because the SNOMED CT concept does not capture the severity of the condition.

Mapping Relationships

A map should describe how the source and target are related. Degree of equivalency, match rating, and a rating scale using numbers with designated values are used to describe the relationship between the source and the target.

Equivalence describes the relationship between the source and target and informs users how close or distant the two systems are. A map's degree of equivalence affects its utility and reliability.

While one source code may map to one target code, the two codes may not have exactly the same meaning. This is especially true when mapping a terminology to a classification. Map developers must identify the degree of equivalence for each map and document how it was determined. When maps are used for clinical care, the designation of equivalence is critical because it removes ambiguity about how closely the source and target match.

From a data integrity perspective it is equally important for the statement of equivalence to be easily understood and consistently applied by all map users. Because the map developer generates this documentation, the exact terms may vary; however, the general concept is the same. Common terms include:

  • No match, no map, no code
  • Approximate match, approximate map, related match
  • Exact match, exact map, equivalent match, equivalent map, equal
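
Because the wording varies by map developer, a map user may want to fold these terms into one canonical label before comparing maps. The following Python sketch is one possible approach, assuming only the three groups of terms listed above; the names `Equivalence` and `normalize_equivalence` are hypothetical and not part of any standard.

```python
from enum import Enum


class Equivalence(Enum):
    NO_MATCH = "no match"
    APPROXIMATE = "approximate match"
    EXACT = "exact match"


# Synonyms taken from the list above, grouped under one canonical label each.
_SYNONYMS = {
    Equivalence.NO_MATCH: ["no match", "no map", "no code"],
    Equivalence.APPROXIMATE: ["approximate match", "approximate map",
                              "related match"],
    Equivalence.EXACT: ["exact match", "exact map", "equivalent match",
                        "equivalent map", "equal"],
}
_LOOKUP = {term: label for label, terms in _SYNONYMS.items() for term in terms}


def normalize_equivalence(term):
    """Return the canonical label for a developer-specific equivalence term."""
    key = " ".join(term.lower().split())  # ignore case and extra spaces
    if key not in _LOOKUP:
        raise ValueError(f"Unrecognized equivalence term: {term!r}")
    return _LOOKUP[key]


print(normalize_equivalence("Equal"))
print(normalize_equivalence("Related match"))
```

Raising an error for an unrecognized term, rather than guessing, keeps the statement of equivalence consistent for every user of the map.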

The International Organization for Standardization's technical report illustrates a 1 to 5 rating scale to determine the degree of equivalence.3 For example, "no match or map" means a concept exists in one of the coding systems without a similar concept in the other system. In the rating scale, "1" represents equivalent meaning, while "5" indicates that no map is possible between the source and target.

Other designations associated with mapping identify how many concepts in each system are necessary to achieve the closest approximation possible. This may be referred to as the "cardinality of the map." Frequently used terms to describe the elements of the map set include:

  • One to one: one concept is mapped for both the source and target system
  • One to many: one concept in the source is mapped to multiple concepts in the target
  • Many to one: multiple concepts in the source are mapped to one in the target
  • Many to many: multiple concepts are mapped in both the source and target system

For instance, one to one expresses the degree of equivalency between two systems, indicating a single concept in the target system has a relationship to one concept in the source system.
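
Cardinality can be derived from the links themselves. The sketch below is a minimal illustration in Python, using source and target codes from the ICD-9-CM to SNOMED CT examples in this brief; the function name `classify_cardinality` and the labeling rule (count the distinct partners on each side of a link) are assumptions made for illustration.

```python
from collections import defaultdict

# (source code, target code) links taken from the ICD-9-CM to SNOMED CT
# examples in this brief.
LINKS = [
    ("001.0", "63650001"),
    ("070.23", "235869004"),
    ("070.23", "26206000"),
    ("002.9", "85904008"),
    ("002.9ZZ", "85904008"),
]


def classify_cardinality(links):
    """Label each link by how many partners its source and target have."""
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, target in links:
        targets_of[source].add(target)
        sources_of[target].add(source)

    labels = {}
    for source, target in links:
        many_targets = len(targets_of[source]) > 1
        many_sources = len(sources_of[target]) > 1
        if many_targets and many_sources:
            labels[(source, target)] = "many to many"
        elif many_targets:
            labels[(source, target)] = "one to many"
        elif many_sources:
            labels[(source, target)] = "many to one"
        else:
            labels[(source, target)] = "one to one"
    return labels


for link, label in classify_cardinality(LINKS).items():
    print(link, label)
```

Note that cardinality alone does not say whether multiple targets are alternatives (as with code 008.3) or must be used together (as with code 070.23); that distinction belongs in the map's rules.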

"Mapping Equivalence Examples" above provides equivalence examples for data maps. The first example illustrates a map from ICD-9-CM (a classification system) to SNOMED CT (a clinical terminology) to support the conversion of information captured in ICD to populate an electronic health record system using SNOMED CT. It indicates the degree of equivalency as "equal" clearly between the source and target system. The subsequent examples illustrate other types of equivalency.

Identifying the degree of equivalency within a map facilitates proper use of data map types for the intended result.

Data Map Used by Healthcare Providers Offering Laboratory Services

A map for healthcare providers that offer laboratory services is typically a proprietary map linking local data (service codes) with national standards (CPT and LOINC codes).

Service Code | Description | CPT Code | LOINC Code | Charge
123456789 | Hemoglobin Glycosylated (A1c) | 83036 | 55454-3 | $15

Map Types

There are three general types of maps:

  • Standards development organization maps. These maps are either created or adopted by a standards development organization or in cooperation between organizations. An example of this type of cooperative project is the current harmonization agreement between the International Health Terminology Standards Development Organisation and the World Health Organization. These organizations are working together to ensure maps between SNOMED CT and ICD-10 are properly developed and SNOMED CT and the classification codes in the 10th revision of ICD are linked where possible.
  • Government-recognized maps. These maps may be developed by a standards development organization or other authorized source; however, with government recognition these maps move from a voluntary standard to a government standard. Examples of this type of map are the ICD-9-CM to ICD-10-CM/PCS General Equivalence Mappings (GEMs), which were added to the HIPAA code set standards.
  • Proprietary or customized maps. These maps are usually developed within an organization or by consultants working for them to meet their needs. In some instances they are developed as a product for purchase to meet specific needs. A proprietary map could consist of any data mapping important to achieve specific objectives. An example is local (organization-specific) terms or vocabularies mapped to standard terminologies or classifications.

Healthcare organizations use many types of maps. Some include clinical content, while others address links between administrative data elements and standard code sets. The sidebars above and below provide examples of general types of maps familiar to most HIM professionals.

If an organization uses local, homegrown, or proprietary coded data elements, it is recommended those data elements map to a known standard data element, as shown in sidebar "Data Map Used by Healthcare Providers Offering Laboratory Services," above. This will enable data sharing and interoperability. A full migration to standard data elements should be considered to discontinue the use of local codes.
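
One way to check this recommendation is to list the local codes that still lack a standard code. The Python sketch below assumes a simple dictionary layout; the first row is the laboratory example from this brief, while the second local code, the field names, and the function name `unmapped_local_codes` are hypothetical.

```python
# Local service code -> standard codes. The first row is the laboratory
# example from this brief; the second local code is hypothetical and has
# deliberately been left without standard codes.
SERVICE_MAP = {
    "123456789": {"description": "Hemoglobin Glycosylated (A1c)",
                  "cpt": "83036", "loinc": "55454-3"},
    "LOCAL-0001": {"description": "Hypothetical in-house test",
                   "cpt": None, "loinc": None},
}


def unmapped_local_codes(service_map, required=("cpt", "loinc")):
    """Return local codes missing any of the required standard codes."""
    return sorted(
        code for code, row in service_map.items()
        if any(not row.get(field) for field in required)
    )


print(unmapped_local_codes(SERVICE_MAP))
```

A report like this gives the team responsible for the map a worklist of local codes to map or retire.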

Multiple maps may exist between vocabularies, classification systems, data dictionaries, and EHR content standards. Multiple maps require careful management to keep track of versions and the purpose of each map. Failure to use the right map or apply updates when the source or target system changes can create data integrity problems.

Maps Related to Patient Demographics

These examples show maps that support administrative functions using current Health Level Seven (HL7) standards. The following table shows the current HL7 standard for designated patient race in collecting demographic data.

Permissible Value | Domain Meaning Name
1002-5 | American Indian or Alaska Native
2028-9 | Asian
2054-5 | Black or African American
2076-8 | Native Hawaiian or Other Pacific Islander
2106-3 | White
2131-1 | Other Race

Source: United States Health Information Knowledgebase: Health Level Seven


A Data Map to Support Administrative Functions including Patient Attributes or Identification
Using the current HL7 standard referenced above, a map is created to the alternative data set.

Permissible Value | Domain Meaning Name
R1 | American Indian/Alaska Native
R2 | Asian
R3 | Black/African American
R4 | Native Hawaiian or Other Pacific Islander
R5 | White
R9 | Other Race
UNKNOW | Unknown/Not Specified

Source: United States Health Information Knowledgebase: Division of Healthcare Finance and Policy


A Data Map to Support Administrative Functions That Use the Current HL7 Standards
This map demonstrates how local data concepts may be mapped to a standard. This is another form of a proprietary map.

Local Data Value | Local Meaning Name | Mapped Result Closest to the Standard
NATAM | Native American | American Indian or Alaska Native
eski | Eskimo | American Indian or Alaska Native
negro | Negro (Black) | Black or African American
poly | Polynesian | Native Hawaiian or Other Pacific Islander
cauc | Caucasian | White
MULTI | Mixed Race | No Map

This example illustrates the necessity of guidelines or rules for maps. Guidelines or rules are needed to build a useful map to the standard. In this instance the guidance is provided by the United States Office of Management and Budget.
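
The local-to-standard map above can be applied with a small lookup. The following Python sketch uses the values from the demographics tables in this brief; the function name `map_race` and the choice to represent "No Map" as `None` are assumptions made for illustration.

```python
# HL7 permissible values from the first demographics table in this brief.
HL7_RACE_CODES = {
    "American Indian or Alaska Native": "1002-5",
    "Asian": "2028-9",
    "Black or African American": "2054-5",
    "Native Hawaiian or Other Pacific Islander": "2076-8",
    "White": "2106-3",
    "Other Race": "2131-1",
}

# Local value -> closest standard meaning, from the proprietary map above.
# None records an explicit "No Map" decision.
LOCAL_TO_STANDARD = {
    "NATAM": "American Indian or Alaska Native",
    "eski": "American Indian or Alaska Native",
    "negro": "Black or African American",
    "poly": "Native Hawaiian or Other Pacific Islander",
    "cauc": "White",
    "MULTI": None,
}


def map_race(local_value):
    """Return (HL7 code, meaning), or None when the map says "No Map"."""
    if local_value not in LOCAL_TO_STANDARD:
        raise KeyError(f"Local value not covered by the map: {local_value!r}")
    meaning = LOCAL_TO_STANDARD[local_value]
    if meaning is None:
        return None
    return HL7_RACE_CODES[meaning], meaning


print(map_race("eski"))
print(map_race("MULTI"))
```

The sketch separates two cases that a map user should not confuse: a local value the map deliberately leaves unmapped ("No Map") and a local value the map never considered, which raises an error.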

Determining Appropriate Maps

There is no one-size-fits-all map for healthcare data. Such a map would be like a single interstate that attempts to link all the towns and cities across the United States.

Just as drivers must select the proper road map for a trip, healthcare professionals must select the correct data maps to manage health information. When evaluating maps users must consider the following principles to ensure the maps they select provide results they expect:

  • Every map must have a defined purpose. Mapping for clinical decision support will necessarily be different than mapping for reimbursement, which would be different than mapping for public health surveillance. Users should choose the map that matches their needs.
  • Authoritative maps save development costs. Maps supported by standards development organizations or mandated by government agencies usually have been validated and tested to ensure they work for the purposes for which they were developed. This saves the cost of creating and testing locally developed maps. It is important to keep in mind the purpose for which these maps were developed before considering using them for other purposes.
  • Maps that are not standards-based or mandated must be validated before being adopted. Maps that are proprietary or locally developed may not have been validated by objective third parties.
  • Every map must include guidelines or rules (heuristics) that govern its creation and use. These rules and guidelines should be consistent with the stated purpose of the map and include guidance developed by authoritative sources of the systems.
  • An identified organization, department, or individual should be in charge of implementing, maintaining, and updating the map. Usually a team of qualified personnel is responsible for maintenance. This unit should be given the resources to acquire the necessary skills and knowledge, which helps ensure consistent content and use of the map.
  • Maps must be updated when the source and targets are updated. This may require updating maps multiple times per year. Each update must be clearly identified as a different version, and documentation should detail the revisions for both the source and target versions.
  • The map update process, including timing, must be clearly defined and documented. A reimbursement mapping for ambulatory care settings lagging 30 days behind the source or target update could negatively affect the financial state of an organization.

Additional items may need to be evaluated depending upon the uses of the map.
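
The principles on purpose, versioning, and updates suggest recording version metadata with every map release. The Python sketch below is one possible layout, not a prescribed format; the class `MapRelease`, the function `stale_components`, and all version strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapRelease:
    """Version metadata that travels with one release of a map."""
    map_version: str
    source_version: str
    target_version: str
    purpose: str


def stale_components(release, current_source, current_target):
    """List which side of the map has moved on since this release was built."""
    stale = []
    if release.source_version != current_source:
        stale.append("source")
    if release.target_version != current_target:
        stale.append("target")
    return stale


# All version strings here are hypothetical labels.
release = MapRelease("map-v3", "source-2010", "target-2010-07", "reimbursement")
print(stale_components(release, "source-2011", "target-2010-07"))
```

A check like this can run whenever a source or target system publishes an update, so that an out-of-date map is flagged before it is used.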

The Importance of Map Maintenance

The graphic below demonstrates the importance of updating maps. Prior to October 2005, ICD-9-CM code 799.0 was mapped to SNOMED concept 70067009.

In October 2005 the valid ICD-9-CM code was expanded to five digits, 799.01. An existing rule of the map required that all valid billable codes must have a map to SNOMED CT, and therefore new ICD-9-CM code 799.01 was mapped to 70067009.

In November 2010, SNOMED CT retired the target concept (70067009), and ICD-9-CM code 799.01 was re-mapped to 66466001.

Mapping Update (ICD-9 to SNOMED mapping)

Period | ICD-9-CM (source) | SNOMED CT (target)
Prior to October 2005 | Asphyxia (799.0) | Asphyxia (70067009)
October 2005 | Asphyxia (799.0); Asphyxia and hypoxemia (799.0); Asphyxia (799.01) | Asphyxia (70067009)
November 2010 | Asphyxia (799.01) | Asphyxia (70067009); Asphyxiation (66466001)
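
The November 2010 step described in this sidebar can be sketched in Python as follows. The codes come from the example; the function name `apply_target_update` and the idea of a replacement table are assumptions. The replacement is a decision made by the map developer, not something the code derives on its own.

```python
# Map state after October 2005: new code 799.01 points at 70067009.
icd9_to_snomed = {"799.01": "70067009"}

# November 2010 SNOMED CT update, as described above: the target concept is
# retired and the map developer chooses 66466001 as the replacement.
retired_concepts = {"70067009"}
replacements = {"70067009": "66466001"}


def apply_target_update(mapping, retired, replacement_for):
    """Return an updated map plus the source codes that need manual review."""
    updated, needs_review = {}, []
    for source, target in mapping.items():
        if target not in retired:
            updated[source] = target
        elif target in replacement_for:
            updated[source] = replacement_for[target]
        else:
            needs_review.append(source)  # retired with no chosen replacement
    return updated, needs_review


new_map, review = apply_target_update(icd9_to_snomed, retired_concepts,
                                      replacements)
print(new_map, review)
```

Source codes whose target was retired without a chosen replacement are returned for manual review rather than silently dropped.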

Decision to Map

After evaluating existing maps, an organization may decide to create its own customized map. If it does, it must first clearly define the business use case of the map.

The use case is a scenario describing how intended users will interact with the map, and it should include "the 'actors,' priorities, pre- and post-conditions (including input and output), flow of events, user interface issues, and more."4 Examples of different use cases include a map for clinical decision support or a map to support billing and reimbursement.

The use case may address issues such as:

  • What problem is the map trying to solve
  • How the organization will create, use, and maintain the map
  • What cost center pays for the map

Organizations that choose to create their own data maps must also carefully consider the resources necessary for maintenance and updating once the initial mapping has been completed. Various systems update on asynchronous cycles, and the effort required to maintain maps is often underestimated. For example, ICD-9-CM has the potential to update twice a year, in April and October; CPT may update twice a year, also. However, terminologies such as LOINC, RxNorm, and SNOMED CT have their own update schedules.

The sidebar "The Importance of Map Maintenance," above, demonstrates the importance of maintaining maps. It illustrates how routine updates to ICD-9-CM and SNOMED CT over the years required changes within the map.

Keeping a map current can prove to be unwieldy and expensive in times of lean business practices. An organization may choose not to create or use maps due to time, resource, or financial constraints.

However, when an organization decides to develop or implement a map, there are fundamental steps that must be taken to ensure reliable, expected results.

General Mapping Steps

Organizations that decide to map should take the following basic steps. Additional process steps may be deemed necessary in order to attain useful, reproducible, and understandable maps.

Develop a business case first. Questions to ask include:

  • What is the reason for the project?
  • What is the expected business benefit?
  • What are the expected costs of the project?
  • What are the expected risks?

Define a use case for how the content will be used within applications. Questions to ask include:

  • Who will use the maps?
  • Is the mapping between standard terminologies or between proprietary (local) terminologies?
  • Are there delivery constraints or licensing issues?
  • What systems will rely on the map as a data source?

Develop rules (heuristics) to be implemented within the project. Questions to ask when developing the rules include:

  • What is the version of source and target schema to be used?
  • What is included or excluded?
  • How will the relationship between source and target be defined (e.g., are maps equivalent, related, etc.)?
  • What procedures will be used for ensuring intercoder/inter-rater reliability (reproducibility) in the map development phase?
  • What parameters will be used to ensure usefulness? (For example, a map from the SNOMED CT concept "procedure on head" could be mapped to hundreds of CPT codes, making the map virtually useless.)
  • What tools will be used to develop and maintain the map?
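
The usefulness rule in the list above can be turned into an automated check. In this minimal Python sketch the concept name comes from the example in the list, but the target identifiers, the limit of 10, and the function name `overly_broad_sources` are hypothetical; each project would set its own limit in its rules.

```python
def overly_broad_sources(mapping, max_targets):
    """Return source concepts whose target count exceeds the agreed limit."""
    return {
        source: len(set(targets))
        for source, targets in mapping.items()
        if len(set(targets)) > max_targets
    }


# Hypothetical data: the concept name comes from the example above, but the
# target identifiers and the limit of 10 are invented for illustration.
candidate_map = {
    "procedure on head": [f"TARGET-{n}" for n in range(250)],
    "specific procedure": ["TARGET-1"],
}
print(overly_broad_sources(candidate_map, max_targets=10))
```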

Plan a pilot phase to test the rules. Maps must be tested and deemed "fit for purpose," meaning they are performing as desired. This may be done using random samples of statistically significant size. Additional pilot phases may be needed until variances from the expected results are resolved. Reproducibility is a fundamental best practice when mapping.
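
A pilot review of this kind might be sketched in Python as follows. The sketch draws a reproducible random sample and lists the sampled codes whose mapped target differs from the expected one. All codes, the seed, and the function names `draw_pilot_sample` and `variances` are hypothetical, and the sketch does not calculate what sample size is statistically adequate; that decision belongs in the project rules.

```python
import random


def draw_pilot_sample(source_codes, sample_size, seed):
    """Draw a reproducible simple random sample of source codes."""
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn
    return rng.sample(sorted(source_codes), sample_size)


def variances(sample, produced, expected):
    """Return sampled codes whose produced target differs from the expected one."""
    return [code for code in sample if produced.get(code) != expected.get(code)]


# Hypothetical pilot data: codes S000..S099, with one deliberate discrepancy.
codes = [f"S{n:03d}" for n in range(100)]
expected = {code: f"T{code[1:]}" for code in codes}
produced = dict(expected, S042="T999")

sample = draw_pilot_sample(codes, sample_size=20, seed=7)
print(variances(sample, produced, expected))
```

Fixing the seed lets a second reviewer redraw exactly the same sample, which supports the reproducibility this step calls for.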

Develop full content with periodic testing throughout the process. Organizations should perform a final quality assurance test for the maps and review those data items unable to be mapped to complete the mapping phase.

Organizations should release the map results to software configuration management where software and content are integrated. They should then perform quality assurance testing on the content within the software application (done in a development environment). They can then deploy the content to the production environment, or go-live.

Communicate with source and target system owners when issues are identified with the systems that require attention or additional documentation for clarity.

Whether an organization decides to create a map or chooses to use a map created outside of the organization, maps should be validated by an objective, qualified third party.

Map Validation

Organizations should follow the understandable, reproducible, and useable principle to develop appropriate data maps. This principle stipulates that:

  • The links between data elements should be understood by the user without benefit of a user guide or 100-page manual.
  • The process to develop the data links must be straightforward enough to be reproduced so that the same results occur no matter who (human) or what (machine or software program) is creating the links.
  • The map is not valuable if it is not useable for the use case it was designed to support.
  • The validation of maps must be conducted by an entity that has not been involved with the map development and has no financial or political interest in its use.

Just as there are steps to create a map, there are best practice steps for validation. Organizations should begin the validation by asking these questions about the map's characteristics:

  • Is the map easy to understand?
  • Can the results be reproduced?
  • Is the use case support evident in the map results?

Organizations should then examine and compare the use case or the purpose of the map for consistency. A map designed for a specific purpose may not be suitable for a different purpose, so a comparison with the stated use case using a proper sample of map records is necessary.

Organizations should also complete additional general validation reviews, including:

  • Using authoritative sources from the standards development organization to link the codes when the map is between two official standards (e.g., SNOMED CT to ICD or LOINC to CPT)
  • Drawing a statistically valid sample from the map record population to review the validity of the map results
  • Reviewing map heuristics
  • Enlisting a qualified person without access to the previous work to perform the mapping independently (blind comparison)
  • Comparing results and explaining any discordance in detail in a full report
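
The blind comparison in the last two steps can be summarized with a short script. This Python sketch compares two independently built maps and reports the raw agreement rate along with every discordant record. The source and target codes come from the examples in this brief, but the second mapper's choice for 008.3 is a hypothetical disagreement, and the function name `compare_maps` is an assumption. Raw agreement is not corrected for chance, so it is only a starting point for the full report.

```python
def compare_maps(original, independent):
    """Compare two independently built maps over the same source codes."""
    shared = sorted(set(original) & set(independent))
    discordant = {
        code: (original[code], independent[code])
        for code in shared
        if original[code] != independent[code]
    }
    agreement = 1 - len(discordant) / len(shared) if shared else 0.0
    return agreement, discordant


# Source and target pairs from the examples in this brief. The second mapper's
# choice for 008.3 is a hypothetical disagreement added for illustration.
first_mapper = {"001.0": "63650001", "282.7": "80141007",
                "774.6": "387712008", "008.3": "30493003"}
second_mapper = {"001.0": "63650001", "282.7": "80141007",
                 "774.6": "387712008", "008.3": "36529003"}

agreement, discordant = compare_maps(first_mapper, second_mapper)
print(f"{agreement:.0%} agreement; to explain: {discordant}")
```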

Organizations should perform an "in use" review and validation check, including:

  • Comparing map performance in translating the source to the target for its intended purpose (e.g., does it produce the same results every time the map record is used?)
  • Assessing the concordance between software programs using the map and noting any discordance
  • Presenting a full report of findings with a qualified panel from the organization using the maps

Vendor-Developed Maps

It is now common for terminology developers and distributors to develop their own mappings or to extend standards development organization or government maps in what the vendor terms a "value add." EHR vendors may also develop and incorporate maps into their products. Before an organization uses a vendor-developed map, it must evaluate the map according to the mapping principles outlined above, just as it would any nonauthoritative map.

In addition, an organization's due diligence should include requesting references from organizations currently using the map, as well as finding out which and how many other organizations use it. Vendor-developed maps can help enormously with map implementation and use if the organization understands the product and assesses it within the framework of its needs and requirements.

There are no standards for health data maps, nor is there a certification program for health information or health data maps such as exists for many other technology standards. Thus, it is up to users to ensure that the maps are appropriate for their purposes and meet their needs.

Notes

  1. International Organization for Standardization. "Mapping of Terminologies to Classifications." 05-31-2010 ISO TC 215/SC N, ISO 2010.
  2. Giannangelo, Kathy. Transitioning to ICD-10-CM/PCS: The Essential Guide to General Equivalence Mappings (GEMs). Chicago, IL: AHIMA, 2011.
  3. International Organization for Standardization. "Mapping of Terminologies to Classifications."
  4. Imel, Margo, Kathy Giannangelo, and Brian Levy. "Essentials for Mapping from a Clinical Terminology." 2004 IFHRO Congress and AHIMA Convention Proceedings, October 2004. Available online in the AHIMA Body of Knowledge at www.ahima.org.

Resources

AHIMA. "Putting the ICD-10-CM/PCS GEMs into Practice." Journal of AHIMA 81, no. 3 (Mar. 2010): 46-52. Available online in the AHIMA Body of Knowledge at www.ahima.org.

Foley, Margaret, et al. "Translation Please: Mapping Translates Clinical Data between the Many Languages That Document It." Journal of AHIMA 78, no. 2 (Feb. 2007): 34-38. Available online in the AHIMA Body of Knowledge at www.ahima.org.

Prepared By

June Bronnert, RHIA, CCS, CCS-P
Jill Clark, MBA, RHIA
Jane Cook, CPC
Susan Fenton, PhD, RHIA
Rita Scichilone, MHSA, RHIA, CCS, CCS-P
Margaret Williams, AM
Pat Wilson, RT(R), CPC

Acknowledgments

Kathy Giannangelo, MA, RHIA, CCS, CPHIMS, FAHIMA
AHIMA House of Delegates Best Practice and Standards Team
AHIMA Quality and Secondary Data Practice Council


The information contained in this practice brief reflects the consensus opinion of the professionals who developed it. It has not been validated through scientific research.



Article citation:
AHIMA. "Data Mapping Best Practices." Journal of AHIMA 82, no. 4 (April 2011): 46-52.