Open Access
CC BY 4.0 · Methods Inf Med 2024; 63(05/06): 183-194
DOI: 10.1055/a-2590-6348
Original Article

TCMSF: A Construction Framework of Traditional Chinese Medicine Syndrome Ancient Book Knowledge Graph

Authors

  • Ziling Zeng*

    1   Materia Medica Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Lin Tong*

    2   The Ancient Book Resources Research Office, Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Bing Li

    3   Materia Medica Digital Intelligence Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Wenjing Zong

    4   Integrated Research Center for Chinese Materia Medica, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Qikai Niu

    4   Integrated Research Center for Chinese Materia Medica, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Sihong Liu

    5   Department of Special Collections Research and Development, Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Lei Zhang

    2   The Ancient Book Resources Research Office, Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Jialun Wang

    1   Materia Medica Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Siqi Zhang

    1   Materia Medica Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Siwei Tian

    1   Materia Medica Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Jing'ai Wang

    1   Materia Medica Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Wei Zhang

    1   Materia Medica Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
  • Huamin Zhang

    6   Institute of Basic Theory for Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, People's Republic of China

Funding This study was supported by the National Key R&D Program of China (2023YFC3502900).

Abstract

Background

Syndrome is a unique and crucial concept in traditional Chinese medicine (TCM). However, much syndrome knowledge lacks systematic organization and correlation, and current information technologies are poorly suited to ancient TCM texts.

Objectives

We aimed to develop a knowledge graph that presents this knowledge in a more orderly, structured, and semantically oriented manner, providing a foundation for computer-aided diagnosis and treatment.

Methods

We developed TCMSF, a framework that uses pretrained models and rules to construct a TCM syndrome knowledge graph from ancient books. For named entity recognition (NER), we fine-tuned two pretrained language models, Enhanced Representation through Knowledge Integration (ERNIE) and Bidirectional Encoder Representations from Transformers (BERT), as well as the large language model ChatGLM3-6B. We then applied a progressive entity relationship extraction method based on a dual pattern feature combination to extract and standardize the entities in these books and the relationships between them.
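The abstract does not spell out how the pattern layer restricts relationships, so the following is only a minimal sketch of the general idea, not the authors' method: candidate relationships between entities found in one passage are kept only when the pair of entity types is permitted by a schema, in contrast to linking every pair. The entity types, relation names, and example entities are all hypothetical.

```python
from itertools import permutations

# Hypothetical pattern layer: (head type, tail type) -> relation name.
# These types and relations are illustrative assumptions, not from the paper.
ALLOWED = {
    ("Syndrome", "Symptom"): "has_symptom",
    ("Syndrome", "Formula"): "treated_by",
    ("Formula", "Herb"): "contains",
}


def fully_connected(entities):
    """Link every ordered pair of distinct entities found in one passage."""
    return [(h[0], "related_to", t[0]) for h, t in permutations(entities, 2)]


def schema_constrained(entities):
    """Keep only pairs whose (head type, tail type) appears in ALLOWED."""
    triples = []
    for (h_name, h_type), (t_name, t_type) in permutations(entities, 2):
        relation = ALLOWED.get((h_type, t_type))
        if relation is not None:
            triples.append((h_name, relation, t_name))
    return triples


# Hypothetical entities extracted from a single passage: (name, type).
passage_entities = [
    ("Yin deficiency syndrome", "Syndrome"),
    ("night sweats", "Symptom"),
    ("Liuwei Dihuang Wan", "Formula"),
    ("Rehmannia root", "Herb"),
]

if __name__ == "__main__":
    print(len(fully_connected(passage_entities)), "fully connected candidates")
    for triple in schema_constrained(passage_entities):
        print(triple)
```

With four entities, full connection yields twelve ordered pairs, while the schema keeps only the three that match an allowed type pair. This is the sense in which a constrained pattern layer can cut down spurious links.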

Results

We selected Yin deficiency syndrome as a case study and constructed a model layer suited to expressing the knowledge in these books. Among the NER methods compared, the combination of ERNIE and Conditional Random Fields (CRF) performed best. Using this combination, we completed entity extraction for Yin deficiency syndrome with an average F1 value of 0.77. The proposed relationship extraction method produced fewer incorrectly connected relationships than a fully connected pattern layer. The resulting knowledge graph of ancient books on Yin deficiency syndrome contains over 120,000 entities and over 1.18 million relationships.
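For readers unfamiliar with how an F1 value is usually obtained for NER, here is a minimal sketch of entity-level scoring over BIO tag sequences. It assumes a BIO tagging scheme and exact span matching; the abstract does not state the tagging scheme or how the average was taken, and the tag names SYM and HERB are hypothetical.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, type) spans.

    `end` is exclusive. An I- tag that does not continue a span of the
    same type is treated like O (an assumption of this sketch).
    """
    spans = set()
    start, current = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and current == tag[2:]
        if continues:
            continue
        if current is not None:
            spans.add((start, i, current))
        if tag.startswith("B-"):
            start, current = i, tag[2:]
        else:
            start, current = None, None
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: a predicted span counts only on an exact match."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["B-SYM", "I-SYM", "O", "B-HERB"]
    pred = ["B-SYM", "I-SYM", "O", "O"]
    print(round(entity_f1(gold, pred), 3))
```

In this toy case the prediction finds one of two gold entities with no false positives, so precision is 1.0, recall is 0.5, and F1 is 2/3. An average F1 like the one reported would then be a mean of such scores, for example across entity types.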

Conclusion

We developed TCMSF in line with the knowledge characteristics of ancient TCM books and improved the accuracy of knowledge graph construction.

* These authors contributed equally to this work.




Publication History

Received: 09 May 2024

Accepted: 09 April 2025

Accepted Manuscript online: 17 April 2025

Article published online: 15 May 2025

© 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany