Delving into Computer-assisted Coding

Appendix G: Glossary of Terms

The terms defined here are meant to elucidate those used within the context of the computer-assisted coding practice brief and appendices. Online glossary resources are also provided to assist the reader in locating and understanding terms contained within additional articles and papers referenced in this practice brief.

Artificial intelligence (AI): Computational techniques to automate tasks that require human intelligence and the ability to reason.

Automated NLP engine: The part of a natural language processing application that performs the syntactic (structure, morphology) and semantic (meaning) processing of the text or speech.

Clinical coding engine: The part of a clinical coding application that contains the logic (algorithms) and knowledge (relationships) necessary to perform the encoding of concepts.

Code-assist model: A model of applying a computer-assisted coding tool that involves an initial screen against clinical documentation by the software tool, producing a preliminary set of draft codes, which are then reviewed, edited, and revised by a human coder to generate the final set of codes.

Cognitive linguistics: The study of the relationship between language and the human mind. Workers in this field seek to understand language as it relates to models of human thinking, interpreting language in light of the social and psychological contexts in which it is generated and understood. Cognitive linguistics can be contrasted with computational linguistics, in which workers use algorithmic or computer-based approaches to interpret language. Natural language processing typically employs the techniques of computational linguistics, although cognitive linguistics may also inform the process.

Computer-assisted coding (CAC): The use of computer software that automatically generates a set of medical codes for review/validation and/or use based upon clinical documentation provided by healthcare practitioners.

Computer linguistic competence: A useful concept for organizing a description of different types of applications, ranging from simple keyword look-up (at the low-competence end) to semantic and contextual interpretation using a grammar and vocabularies (at the high-competence end).
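
As an illustration of the low-competence end of this range, the following is a minimal Python sketch of keyword look-up. The keyword table and the concept labels are hypothetical placeholders, not real codes or part of any actual product. Because the look-up has no notion of context, both sentences below produce the same result.

```python
# Hypothetical keyword table; the labels are placeholders, not real codes.
KEYWORD_TO_CONCEPT = {
    "pneumonia": "CONCEPT_PNEUMONIA",
    "fracture": "CONCEPT_FRACTURE",
}

def keyword_lookup(text):
    """Return the concepts whose keyword appears anywhere in the text."""
    # Lower-case the text and treat periods and commas as word separators.
    words = text.lower().replace(".", " ").replace(",", " ").split()
    # Collect each matched concept once and return them in sorted order.
    return sorted({KEYWORD_TO_CONCEPT[w] for w in words if w in KEYWORD_TO_CONCEPT})

print(keyword_lookup("Chest film shows pneumonia."))
print(keyword_lookup("No evidence of pneumonia."))
```

An application at the high-competence end would interpret the surrounding words as well, which this sketch makes no attempt to do.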

Corpus: A large body of natural language text used for accumulating statistics on natural language text. The plural is corpora.

Electronic health record (EHR): The current term used to refer to computerization of health record content and associated processes.

Electronic medical record (EMR): A term that may be treated synonymously with computer-based patient record and/or electronic health record; often used in the US to refer to an electronic health record in a physician office setting or a computerized system of files (often scanned via a document imaging system) rather than individual data elements.

Free text: Alphanumeric data that are unstructured, typically in narrative form. Unstructured data cannot be processed as discrete data elements by a computer system unless natural language processing tools are applied. Free text provides the benefit of expressivity and flexibility. However, information that is recorded as free text is significantly more difficult to use for data analysis, aggregation, and comparison. (See also structured data.)

Granularity: The level of detail.

Health Level Seven (HL7): An ANSI-accredited standards development organization created in the 1980s to develop standards for healthcare computer applications to share data. (See www.hl7.org.)

Informatics: A field of study that focuses on the use of technology for improving access to and utilization of information. Health informatics is the systematic study of information in the healthcare delivery system—how it is captured, retrieved, and used in making decisions—as well as the tools and methods used to manage this information and support decisions.

Knowledge management: Capturing, organizing, and storing knowledge and experiences of individual workers and groups within an organization and making this information available to others in the organization.

Levels of knowledge (in NLP): Natural language is described as encompassing the following levels of knowledge:

  • Phonetic: constructing words from basic sounds
  • Morphological: constructing words from subunits (e.g., friend + ly = friendly)
  • Syntactic: constructing sentences from words
  • Semantic: deriving meaning from sentences
  • Pragmatic: adding meaning from a sentence's context
  • World: background, cultural, common sense information that adds meaning to a sentence

Linguistics: The scientific study of language, which may be approached from many different aspects, such as sounds (phonetics), word structure (morphology), or meaning (semantics).

Machine learning: A subspecialty of artificial intelligence concerned with developing methods for software to learn from experience or extract knowledge from examples in a database.

Natural language processing (NLP): A range of computational techniques for analyzing and representing naturally occurring text (free text) at one or more levels of linguistic analysis (e.g., morphological, syntactic, semantic, pragmatic) for the purpose of achieving human-like language processing for knowledge-intensive applications. (See also Levels of knowledge in NLP.)

Normalization: A formal process to standardize various representational forms so that expressions that have the same meaning will be recognized by computer software as synonymous in a data search. This may involve elimination of various kinds of punctuation signs and coordinating conjunctions, conversion of letters from lower to upper case, and so on. The process must be formalized with consistent rules applied systematically to maintain data integrity. For example, "fracture of heel" may be normalized as "heel fracture."
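
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list is hypothetical, and sorting the remaining words is one simple way to make word order irrelevant, not a rule taken from any particular coding product.

```python
import re

# Hypothetical stop list: connecting words dropped during normalization.
STOP_WORDS = {"OF", "THE", "AND", "OR"}

def normalize(expression: str) -> str:
    """Return a canonical key for a free-text expression."""
    # Convert to upper case so that case differences do not matter.
    upper = expression.upper()
    # Replace anything that is not a letter, digit, or space with a space.
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", upper)
    # Drop stop words, then sort the remaining tokens so that
    # word order does not matter either.
    tokens = sorted(t for t in cleaned.split() if t not in STOP_WORDS)
    return " ".join(tokens)

print(normalize("fracture of heel"))
print(normalize("Heel fracture."))
```

Both calls produce the same key, so a search on either expression would find the other. Sorting is a blunt instrument, since expressions whose meaning depends on word order would also be merged, which is one reason the rules must be formalized and applied consistently.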

Semantic grammar: A formal definition of a language that uses concepts from a particular domain of discourse to specify acceptable expressions in that language. This is distinct from a syntactic grammar, in which a language is defined in terms of the parts of speech that comprise it. For example, a semantic grammar intended for use in the interpretation of chest x-ray reports may specify that valid expressions involving the word "lung" may include reference to laterality (e.g., left, right, or both).
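
A very small Python sketch of such a rule follows. It assumes a single hypothetical rule of the form laterality followed by "lung" and expresses it as a regular expression; a real semantic grammar would contain many such rules, and the names used here are illustrative only.

```python
import re

# Hypothetical semantic categories for a chest x-ray domain.
LATERALITY = r"(?P<laterality>left|right|both)"
ANATOMY = r"(?P<anatomy>lungs?)"

# Rule: <laterality> <anatomy>, for example "left lung" or "both lungs".
LUNG_EXPRESSION = re.compile(rf"\b{LATERALITY}\s+{ANATOMY}\b", re.IGNORECASE)

def parse_lung_reference(text):
    """Return the concept found by the rule, or None if the rule does not match."""
    match = LUNG_EXPRESSION.search(text)
    if match is None:
        return None
    return {"anatomy": "lung", "laterality": match.group("laterality").lower()}

print(parse_lung_reference("Opacity in the left lung base."))
print(parse_lung_reference("Both lungs are clear."))
print(parse_lung_reference("The heart is normal in size."))
```

The rule is written in terms of domain concepts (laterality, anatomy) rather than parts of speech, which is the distinction the definition draws.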

Semantics: The meaning of a word or term. (See also Levels of knowledge in NLP.)

Statistical NLP: A group of techniques that rely on mathematical statistics and are used in natural language processing, for example, to find the most likely lexical categories or parses for a sentence. The techniques are often based on frequency information collected by analyzing large corpora of sentences in a single language to find out, for example, how many times a particular word ("dog," perhaps) has been used with a particular part of speech. The sentences in the corpus have usually been tagged in some way (sometimes manually) so that the part of speech is known for each occurrence of each word. Statistical NLP may also be referred to as Boolean NLP.
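
The frequency counting described above can be sketched in Python as follows. The tagged corpus is a tiny invented sample, and the tag names (DET, NOUN, VERB) are assumptions chosen for illustration; real corpora are far larger.

```python
from collections import Counter, defaultdict

# Toy tagged corpus (invented for illustration): (word, part-of-speech) pairs.
TAGGED_CORPUS = [
    ("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"),
    ("reporters", "NOUN"), ("dog", "VERB"), ("the", "DET"), ("mayor", "NOUN"),
    ("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB"),
]

def count_tags(corpus):
    """Count how often each word occurs with each part-of-speech tag."""
    counts = defaultdict(Counter)
    for word, tag in corpus:
        counts[word.lower()][tag] += 1
    return counts

def most_likely_tag(counts, word):
    """Return the tag seen most often with the word, or None if the word is unseen."""
    tags = counts.get(word.lower())
    if not tags:
        return None
    return tags.most_common(1)[0][0]

counts = count_tags(TAGGED_CORPUS)
print(dict(counts["dog"]))
print(most_likely_tag(counts, "dog"))
```

In this sample "dog" appears twice as a noun and once as a verb, so the noun reading is chosen as the most likely one.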

Structured data: Documentation of discrete data using controlled vocabulary rather than narrative text.

Structured input: A form of data entry that captures data in a structured manner (e.g., point-and-click fields, pull-down menus, structured templates, macros).

Syntax: The format or structure of data. (See also Levels of knowledge in NLP.)

Unstructured data: See free text.

References

Amatayakul, Margaret K. Electronic Health Records: A Practical Guide for Professionals and Organizations, 2nd Edition. Chicago: AHIMA, 2004.

Coiera, Enrico. "Health Informatics Glossary," in Guide to Health Informatics, 2nd Edition, 1997. Available online at www.coiera.com/glossary.htm.

Open Clinical. Available online at www.openclinical.org.

Wilson, Bill. "The Natural Language Processing Dictionary." Updated June 25, 2004. Available online at www.cse.unsw.edu.au/%7Ebillw/nlpdict.html.

Additional Online Resources

Health Level Seven. "Glossary of Terms." January 2002. Available online at www.hl7.org.

Partnership for Health Information Standards, Canadian Institute for Health Information. "Glossary of Terms." Available online at www.cihi.ca or http://secure.cihi.ca/cihiweb/en/partner_glossary_e.html.

Tabar, Pamela. "The Latest Word." Healthcare Informatics Online, January 1998. Available online at www.healthcare-informatics.com/issues/1998/01_98/glossary.htm.

van Bemmel, J. H., and Mark A. Musen, Editors. "Handbook of Medical Informatics." Web Site Version 3.3, 1999. Available online at www.mieur.nl/mihandbook/r_3_3/glossary/us/glossary.htm.

Wilson, Bill. "The AI Dictionary." Updated June 15, 2004. Available online at www.cse.unsw.edu.au/%7Ebillw/aidict.html.


Article citation:
AHIMA e-HIM™ Work Group on Computer-Assisted Coding. "Delving into Computer-assisted Coding. Appendix G: Glossary of Terms." Journal of AHIMA 75, no. 10 (Nov-Dec 2004): web extra.