Methods Inf Med 1998; 37(03): 260-265
DOI: 10.1055/s-0038-1634526
Original Article
Schattauer GmbH

Automatic Coding of Diagnostic Reports

L. M. de Bruijn (1), A. Hasman (1), J. W. Arends (2)
1   Dept of Medical Informatics, Maastricht University
2   Dept of Pathology, Academic Hospital Maastricht, The Netherlands

Publication History

Publication Date: 14 February 2018 (online)

Abstract

A method is presented for assigning classification codes to pathology reports by searching an archive collection for similar reports. The search key is textual similarity, which serves as an estimate of the true, semantic similarity. The method does not require explicit modeling and can be applied to any language or application domain that uses natural language reporting. A number of simulation experiments were run to assess the accuracy of the method and to indicate the role of the size of the archive and of the transfer of document collections across laboratories. In at least 63% of the simulation trials, the most similar archive text offered a suitable classification on organ, origin and diagnosis. In 85 to 90% of the trials, the archive's best solution was found within the first five similar reports. The results indicate that the method is suitable for its purpose: suggesting potentially correct classifications to the reporting diagnostician.
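
The retrieval idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or term weighting the authors used, so the code below assumes a common choice from information retrieval: TF-IDF term weights compared by cosine similarity. All function names, the toy archive, and the code labels are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def weight(counts, idf):
    """Turn raw term counts into a unit-length TF-IDF vector (dict term -> weight).

    Terms that never occur in the archive have no IDF value and are ignored.
    """
    vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def build_index(archive):
    """Index an archive given as a list of (report_text, code) pairs."""
    docs = [Counter(tokenize(text)) for text, _ in archive]
    n = len(docs)
    # Document frequency: in how many archive reports each term occurs.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed inverse document frequency (an assumed weighting scheme).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [weight(doc, idf) for doc in docs]
    return idf, vectors


def suggest_codes(query, archive, idf, vectors, k=5):
    """Return the k archive codes whose reports are textually most similar.

    Each result is a (similarity, code) pair, most similar first.
    """
    q = weight(Counter(tokenize(query)), idf)
    scored = []
    for (_, code), vec in zip(archive, vectors):
        # Both vectors have unit length, so the dot product is the cosine.
        sim = sum(w * vec.get(t, 0.0) for t, w in q.items())
        scored.append((sim, code))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Invented toy archive; the texts and code labels are not real data.
TOY_ARCHIVE = [
    ("skin biopsy with benign intradermal naevus", "skin / naevus"),
    ("colon biopsy showing tubular adenoma", "colon / adenoma"),
    ("skin excision with basal cell carcinoma", "skin / basal cell carcinoma"),
    ("gastric biopsy with chronic gastritis", "stomach / gastritis"),
]

if __name__ == "__main__":
    idf, vectors = build_index(TOY_ARCHIVE)
    new_report = "biopsy of skin showing a benign naevus"
    for sim, code in suggest_codes(new_report, TOY_ARCHIVE, idf, vectors, k=3):
        print(f"{sim:.2f}  {code}")
```

Returning a short ranked list rather than a single code mirrors the use described in the abstract: the diagnostician is shown a handful of candidate classifications and chooses among them, which is why the fraction of trials with a suitable code among the first five similar reports is the relevant figure.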

 