CC BY-NC-ND 4.0 · Methods Inf Med 2023; 62(05/06): 174-182
DOI: 10.1055/s-0043-1771378
Original Article

Machine Learning Classification of Psychiatric Data Associated with Compensation Claims for Patient Injuries

Martti Juhola
1   Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
,
Tommi Nikkanen
1   Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
,
Juho Niemi
2   Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
,
Maiju Welling
3   Patient Insurance Centre, Helsinki, Finland
,
Olli Kampman
2   Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
4   Department of Psychiatry, Tampere University Hospital, Pirkanmaa Hospital District, Tampere, Finland
5   Department of Clinical Sciences (Psychiatry), Umeå University, Umeå, Sweden and Västerbotten Welfare Region, Umeå, Sweden
6   Department of Clinical Sciences (Psychiatry), University of Turku, Turku, Finland
7   The Wellbeing Services County of Ostrobothnia, Department of Psychiatry, Vaasa, Finland

Abstract

Background Adverse events are common in health care. In psychiatric treatment, compensation claims for patient injuries appear to be less common than in other medical specialties. The most common types of patient injury claims in psychiatry include diagnostic errors, failure to prevent suicide, and coercive treatment deemed unnecessary or harmful.

Objectives The first objective was to study whether distinct categories of patient injury types can be formed from the psychiatric evaluations of compensation claims and whether machine learning classification can be based on these categories. The second objective was the binary classification of compensation claims into those with positive and negative decisions.

Methods Finnish psychiatric specialists' evaluations of compensation claims for patient injuries were classified into six categories, called classes, using machine learning methods. In addition, the same data were classified into two classes to test whether cases could be classified according to their known decisions, i.e., accepted or declined compensation claims.

Results The six-class task produced relatively good results in separating the classes, whereas the binary task proved more difficult. The classification accuracy of both tasks could, however, be improved by generating artificial data cases in a preprocessing phase before classification. With random forests as the classifier, this preprocessing raised the accuracy of the six-class task to 88% and that of the binary task to 89%.

Conclusion The results show that both objectives could be solved reasonably well.



Introduction

Adverse events are common in health care. Large international reviews have estimated that around 10% of hospital patients experience an adverse event and that half of these events are preventable.[1] Patient harm has been estimated to be the 14th leading cause of disease burden globally, and up to 15% of total hospital expenditure in OECD (Organization for Economic Co-operation and Development) countries results from adverse events.[2] A better understanding of the nature of adverse events in different health care settings is needed to improve the quality and safety of care.[3]

In Finland, all health care providers are obliged to have patient insurance. Patients can claim compensation for injuries incurred in connection with treatment by filing a notice of injury. The notice must be filed within 3 years of the date on which the patient became aware of the injury. All notices are handled by the Patient Insurance Centre (PIC) on the basis of the legislation. The PIC obtains all necessary clarifications, including patient documents, from the relevant health care providers. Experienced medical experts evaluate the cases, and juridical experts are consulted when necessary. The PIC administers an extensive patient injury database that has been widely used for medical research. Research has concentrated on surgical specialties such as orthopaedics,[4] [5] otorhinolaryngology,[6] and dental care.[7] [8] A recent article[9] drew attention to psychiatric patient injuries, which had not been investigated earlier.

Psychiatric treatment does not always go as planned, but compared with many other specialties, claims for patient injury appear to be less common in psychiatry.[10] [11] [12] Common claims for patient injury in psychiatry include misdiagnosis and delayed diagnosis, unprevented suicide, involuntary treatment deemed wrongful, and medication deemed harmful.[9] However, there are so far few data on the likelihood of particular types of injuries in psychiatric care and no international comparisons, despite large-coverage statistics existing in many countries. An accurate classification of individual cases according to the type of injury helps to better understand the types of injuries and their distributions in psychiatric care. Such a classification could further help to establish a monitoring system that detects trends in patient injuries, with the goal of improving patient safety and preventing adverse outcomes in psychiatric treatment. To the best of our knowledge, this is the first study applying machine learning methods to data associated with patient compensation claims.

Currently, the statistical data from patient injury claims and compensation decisions made in Finland include information such as the nature of the disease treated, the medical specialty, and event descriptions in free-text form; there is no specific coding system referring to the type of injury or the content of the treatment. Classifying such data requires laborious and time-consuming manual work, whereas machine learning algorithms can classify past and future data efficiently. The application of machine learning in psychiatry has already been studied for the prediction of treatment,[13] [14] prognosis,[15] and diagnosis.[16] This study aimed to develop and test an accurate machine learning algorithm, which could not only help in the classification process but also potentially improve treatment outcomes in the future.

The current study involves two problems applying psychiatric data: the classification of data associated with compensation claim evaluations for patient injuries into six predefined categories, and the binary classification of compensation claim decisions into two classes (accepted or declined claims). The original data contained 328 compensation claims and their medical evaluations written by specialists in psychiatry. The data used for machine learning originated from the specialists' evaluations, including the argumentation supporting the decisions. In addition, other information was available to the specialists, such as the applicant's age, sex, and claim decision (accepted or declined, i.e., positive or negative).



Methods

The data for the study were collected from the claims register of the PIC, which approved the use of the data (May 7, 2020). The original data consisted of all psychiatric patient injury claim decisions made between 2012 and 2016 and the corresponding specialists' evaluations. The first preprocessing task was a light cleaning of the data, in which some cases were removed because they contained an insufficient number of psychiatric or other medical phrases; cases with 3 to 15 phrases were included in the final data. Some phrases could be split into parts, for example, "falling serious concussion" into "falling" and "serious concussion" (all texts were originally written in Finnish, but the phrases mentioned here are translated). Some phrases were quite similar, for example, "clinical research and treatment procedure" and "clinical research or treatment procedure." Three investigators (authors J.N. and O.K., and J.V., see Acknowledgments) considered all the complicated phrases and categorized them into six classes. The categorization of phrases into classes was based on the first 50 cases, which two investigators (J.N. and J.V.) classified independently; the inter-rater reliability for these 50 cases was 100%. The hypothesis of six classes was based on one investigator's (O.K.) clinical experience with an earlier sample of approximately 80 cases with compensation claims. Information on the applicant's illness, treatment descriptions, and injury details was used as indicative phrases. After this first preprocessing task, 308 cases remained in the dataset of patient compensation claim evaluations.

As the second preprocessing task, all psychiatric or neurological terms or phrases were extracted from the evaluation documents. The phrases chosen from the documents contained, for example, diagnoses, symptoms, or otherwise meaningful issues, such as "inappropriate medical treatment," "appropriate care during hospitalization," "anxiety," and "medication discontinuation," which were categorized into phrase groups such as {"nursing," "hospital care," "depression"} or {"drugs and medication (not psychosis)"}. Phrases were divided into different groups, and phrases closely related to each other were later combined. In this way all phrases were grouped.

Altogether, 35 phrase groups were manually categorized from 1,591 phrases. These groups are shown in [Table 1]. As an example, the phrase group "hospital care" is described in [Table 2], where some entries, for example "hospitalization," appear more than once because of the declension of Finnish nouns: the Finnish term "osastohoito" (literally, ward care) and its genitive "osastohoidon" were both translated as "hospitalization." Likewise, the synonyms "osastohoito" and "sairaalahoito" (literally, hospital care) were both translated as "hospitalization."

Table 1

Numbers of phrases in phrase groups (translated from Finnish language) as classification attributes

Phrase group | Category | Number of phrases
1 | Patient's demeanor or state | 200
2 | Psychosis, delusions | 12
3 | ADHD and other neurological diseases | 52
4 | The patient's behavior | 39
5 | Interaction in a treatment setting | 22
6 | Brain tumors and other organic neurological diseases and symptoms | 23
7 | Intoxicants | 22
8 | Bipolar disorder | 19
9 | Other organic diseases, symptoms | 17
10 | Electroconvulsive therapy | 154
11 | Neuroleptics and neuroleptic treatment | 20
12 | Suicide | 38
13 | Hospitalization | 40
14 | Suicidality | 44
15 | Therapy | 16
16 | Imaging | 21
17 | Monitoring | 9
18 | Tests and examination | 21
19 | Tests and treatment together | 31
20 | Diagnostics | 43
21 | Medicines and medication (not psychosis) | 64
22 | Other psychiatric diagnoses and symptoms | 216
23 | Depression | 54
24 | Death, decease | 26
25 | Anxiety, anxiousness | 18
26 | Treatment | 12
27 | Involuntary | 156
28 | Patient harm | 53
29 | Procedure | 23
30 | Adverse effects | 14
31 | Accident | 25
32 | Medicine in general | 20
33 | Other, unclassified | 26
34 | Compensation, damages | 30
35 | Otherwise related to patient treatment | 11

Abbreviation: ADHD, attention deficit hyperactivity disorder.


Table 2

Phrase group "hospital care" containing 40 phrases

A short hospital observation period
Acting like this would not have completely avoided hospitalization, but the duration would have been shorter
Acting like this would not have prevented hospitalization
After being discharged from the hospital
After hospitalization
Appropriate care during hospitalization
Appropriate medical treatment
Being left untreated at the psychiatric ward
Dispatchment to the hospital
(1) During hospitalization
(2) During hospitalization
Felt unsafe at the hospital ward
(1) Hospitalization
(2) Hospitalization
(3) Hospitalization
Hospitalization at the psych. ward
Hospitalization at the psychiatric ward
Hospitalization period
Inpatient stay and entitled to compensation
In hospital care
In respite care
In the acute psychiatric ward
In the hospital
In the rehabilitation ward
Inpatient stay to maintain general condition
More inpatient stays
On-call hospital care
Psychiatric hospitalization
Psychiatric hospitalization for depression
Psychiatric hospitalization was justified
Psychiatric inpatient stay
Referral to psychiatric hospitalization was justified
Several inpatient stays
(1) Treatment at a psychiatric ward
(2) Treatment at a psychiatric ward
Was hospitalized
Was immediately taken to crisis therapy period
Was not admitted to the hospital
Was not given appropriate treatment for shortness of breath during hospitalization
When alone in a hospital room

Finally, normalization was performed attribute by attribute: first the minimum of each attribute was subtracted from its values, and then the differences were divided by the difference between the maximum and minimum of that attribute, scaling the values of each attribute to the interval [0, 1]. This was particularly important for classifications applying the k-nearest neighbor searching method. Here, an attribute is the same as a phrase group, and the attribute value of a document equals the total number of phrases of that phrase group occurring in the document.
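To make this attribute construction concrete, the following is a minimal sketch in Python (NumPy) of such min-max scaling. It assumes a hypothetical count matrix in which each row is an evaluation document and each column one of the 35 phrase groups; it is not the authors' original implementation.

import numpy as np

# Hypothetical attribute matrix: rows = evaluation documents, columns = 35 phrase groups;
# each value is the number of phrases of that group occurring in the document.
rng = np.random.default_rng(0)
counts = rng.integers(0, 6, size=(308, 35)).astype(float)

# Min-max normalization attribute by attribute: (x - min) / (max - min),
# scaling every attribute to the interval [0, 1].
col_min = counts.min(axis=0)
col_max = counts.max(axis=0)
col_range = np.where(col_max > col_min, col_max - col_min, 1.0)  # guard against constant attributes
X = (counts - col_min) / col_range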

Since supervised machine learning methods were applied, all cases were first manually divided into six different classes. The classes were formed according to the types or contents of the medical or otherwise relevant phrases found in the psychiatric evaluation documents. The six categories, or classes, are characterized in [Table 3].

Table 3

Distribution of the classes

Class | Description | Number of cases
1 | Psychosis, involuntary treatment; care or medication deemed unwarranted or harmful in the complaint | 84
2 | A complaint about a suicide attempt or completed suicide; care is deemed to be insufficient or faulty | 38
3 | A complaint about diagnostic error or a prolonged diagnostic process | 40
4 | Harm due to medication or another form of biological treatment, or incorrect medication (not related to psychosis) | 87
5 | Harm due to some other aspect of treatment, e.g., therapy, problems in communication | 32
6 | Incidents during hospitalization, e.g., falling down, errors in administering medication | 27

For binary classification, data cases were distributed into two classes: accepted (1 or positive) or declined (0 or negative) decisions of compensation claims. There were 36 positive and 272 negative cases.

Since the number of cases, 308, was small in the machine learning sense and the smallest class consisted of only 27 cases, K-fold cross-validation with K equal to 5 and leave-one-out (LOO) cross-validation were applied to divide the data cases into training and test sets for constructing models. Several classification methods were used: the k-nearest neighbor searching method with different distance or similarity functions and k-values, linear and quadratic discriminant analysis, Naïve Bayes,[17] [18] [19] and random forests.[20] Random forests were run with numbers of trees from 10 to 100; numbers of trees above 100 did not improve results. In the following, the results produced by 10, 30, and 100 trees are given. For k-nearest neighbor searching (k-NN), values of k from 3 to 25 were computed using only LOO.

We chose the above machine learning methods since they are appropriate for small datasets, such as the present one with only 308 cases but as many as six classes. More complicated classification algorithms, e.g., neural networks, could require more data to build good models. The chosen methods follow different principles: random forests, nearest neighbor searching with various distance measures, Naïve Bayes based on probabilities, and discriminant analysis. We did not include single decision trees, since random forests, being an ensemble method built from sets of several decision trees, are typically better.
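As an illustration of how such a comparison could be set up with scikit-learn, the sketch below runs a few of the listed classifiers under LOO and 5-fold cross-validation. It assumes a feature matrix X (the normalized phrase-group attributes) and a label vector y (the six classes) are already available, and it is not the authors' original code; for Naïve Bayes the Gaussian variant is used here as one possible choice.

from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB

# X: 308 x 35 normalized attribute matrix, y: class labels 1-6 (assumed to exist).
classifiers = {
    "Random forests, 100 trees": RandomForestClassifier(n_estimators=100, random_state=0),
    "Cosine k-NN, k = 7": KNeighborsClassifier(n_neighbors=7, metric="cosine"),
    "Linear discriminant analysis": LinearDiscriminantAnalysis(),
    "Naive Bayes (Gaussian)": GaussianNB(),
}

loo = LeaveOneOut()
k5 = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    acc_loo = cross_val_score(clf, X, y, cv=loo).mean()   # accuracy over single-case test sets
    acc_k5 = cross_val_score(clf, X, y, cv=k5).mean()     # mean accuracy over 5 folds
    print(f"{name}: LOO {acc_loo:.2f}, K = 5 {acc_k5:.2f}")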


#

Results

The classification accuracies given by the listed methods are presented in [Table 4], where each k-NN result is shown with the k-value that gave the best result for that k-NN method. The best results were given by random forests with 100 decision trees; thus, only their results are given as a confusion matrix in the following. The confusion matrix of this model is presented in [Table 5]. Next, the SMOTE algorithm[21] was applied to balance the classes by generating artificial cases for all classes other than Class 4, which comprised the greatest number of cases (87). SMOTE first searches for a sufficient number of nearest neighbors for the original cases in the classes other than the majority class and then generates each artificial case randomly on the line between an original case and one of its nearest neighbors. For example, the minority class of 27 cases was extended with 60 artificial cases. Thereafter, all classes consisted of 87 cases. This improved the classification accuracy of random forests with 100 trees (LOO) to 88%. The modeling increased the true positive rates of Class 2 to 93%, Class 3 to 92%, Class 5 to 91%, and Class 6 to 89%, but decreased those of Class 1 to 85% and Class 4 to 76%. Compared with [Table 5], the improved results concerned the classes that were originally small, whereas the slightly worsened results hit the two largest classes.
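The interpolation step described above can be sketched as follows. This is a generic SMOTE-style oversampler written for illustration under the same assumptions as before (X holds the normalized attributes, y the class labels); it is not the authors' exact modification.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_like_oversample(X_class, n_new, k=5, seed=0):
    """Generate n_new synthetic cases for one class by interpolating
    between an original case and one of its k nearest neighbors."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_class)
    _, idx = nn.kneighbors(X_class)          # idx[:, 0] is the case itself
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_class))        # pick an original case
        j = idx[i, rng.integers(1, k + 1)]    # pick one of its nearest neighbors
        gap = rng.random()                    # random position on the connecting line
        synthetic.append(X_class[i] + gap * (X_class[j] - X_class[i]))
    return np.asarray(synthetic)

# Example: extend the smallest class (27 cases) with 60 artificial cases so that it
# reaches the size of the largest class (87 cases):
# X_new = smote_like_oversample(X[y == 6], n_new=60)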

Table 4

Classification accuracies in decreasing order given by the classifiers built with leave-one-out (LOO) and K-fold cross-validation with K equal to 5

Method | Classification accuracy %
Random forests, LOO, 100 trees | 77
Random forests, K = 5, 100 trees | 76
Random forests, LOO, 30 trees | 74
Random forests, K = 5, 30 trees | 74
Random forests, LOO, 10 trees | 73
Random forests, K = 5, 10 trees | 72
Linear discriminant analysis, LOO | 71
Spearman k-NN, k = 9, LOO | 71
Cosine k-NN, k = 7, LOO | 71
Correlation k-NN, k = 7, LOO | 69
Linear discriminant analysis, K = 5 | 69
Jaccard k-NN, k = 7, LOO | 69
Chi-squared distance k-NN, k = 7, LOO | 66
Mahalanobis k-NN, k = 25, LOO | 66
Hamming k-NN, k = 7, LOO | 65
Manhattan (city block) k-NN, k = 25, LOO | 63
Euclidean k-NN, k = 5, LOO | 63
Minkowski distance k-NN, dimension 3, k = 5, LOO | 63
Minkowski distance k-NN, dimension 35, k = 5, LOO | 62
Quadratic discriminant analysis, LOO | 56
Naïve Bayes, K = 5 | 51
Quadratic discriminant analysis, K = 5 | 50
Naïve Bayes, LOO | 49
Chebyshev k-NN, k = 3, LOO | 46

Table 5

Results of random forest with 100 trees for the original data; the numbers of correctly classified cases are on the diagonal

True class | Predicted 1 | Predicted 2 | Predicted 3 | Predicted 4 | Predicted 5 | Predicted 6 | True % | False %
1 | 75 | 1 | 5 | 2 | 1 | 0 | 89 | 11
2 | 3 | 31 | 1 | 2 | 0 | 1 | 82 | 18
3 | 8 | 0 | 21 | 6 | 3 | 2 | 53 | 47
4 | 4 | 3 | 4 | 67 | 7 | 2 | 77 | 23
5 | 0 | 1 | 1 | 9 | 20 | 1 | 63 | 37
6 | 7 | 1 | 1 | 3 | 5 | 10 | 37 | 63
True % | 77 | 84 | 64 | 75 | 56 | 63 | |
False % | 23 | 16 | 36 | 25 | 44 | 37 | |

Finally, the binary classification into accepted or declined compensation claims was run. The class distribution was very imbalanced, as the great majority (272 of 308) of the cases had been declined (Class 0). Since random forests run with 100 trees (LOO) gave the best result in [Table 4], we also used random forests for the classification of the compensation claim decisions. The class-specific results of this binary classification are presented in [Table 6]. Random forests lost almost all cases of the minority class, whereas those of the majority class were classified almost entirely correctly. Modeling with nearest neighbor searching gave rather similar results. Obviously, the very imbalanced class distribution meant that the minority class could not be separated from the majority class. Thus, the SMOTE algorithm was also run for this classification, increasing the size of Class 1 up to 272 cases (a sketch of such a balancing step is given after [Table 6]). After balancing the minority Class 1, its cases were separated much better from those of Class 0: 88% of Class 0 and 89% of Class 1 were classified correctly in the extended dataset. Nonetheless, the share of correctly classified cases of the originally majority class was lower than before balancing, which is rather common in binary classification, where the two classes "oppose" each other.

Table 6

Results of random forest with 100 trees for the binary classification of the original data

True class | Predicted 0 | Predicted 1 | True % | False %
0 | 270 | 2 | 99 | 1
1 | 34 | 2 | 6 | 94
True % | 89 | 50 | |
False % | 11 | 50 | |
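For the binary task, a comparable balancing step can be written with the imbalanced-learn library, which by default oversamples every class other than the majority class up to the majority size. The sketch below assumes binary labels y_bin (0 = declined, 1 = accepted) and the attribute matrix X, and illustrates only the general idea, not the authors' modified SMOTE.

from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import confusion_matrix

# y_bin: 272 declined (0) and 36 accepted (1) cases; X as before (assumed to exist).
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y_bin)   # Class 1 grows to 272 cases
print(Counter(y_bal))

clf = RandomForestClassifier(n_estimators=100, random_state=0)
pred = cross_val_predict(clf, X_bal, y_bal,
                         cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))
print(confusion_matrix(y_bal, pred))   # rows: true class, columns: predicted class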

The machine learning classification showed accurate results in comparison with the clinical judgement. The original data source was a set of psychiatrists' evaluations of compensation claims for patient injuries associated with psychiatric diseases and disorders. All in all, 35 phrase groups were formed from 1,591 phrases by combining phrases that were fully, or at least partly, conceptually or semantically similar. This was necessary to create suitable attributes (phrase groups) for machine learning, because many phrases occurred only once or a few times in the dataset, which would not have made a reasonable basis for computation. In addition, some phrase pairs were completely or virtually identical. We designed six different classes of patient types or characterizations.

Random forests produced the highest classification accuracy of 77% based on the LOO technique, which divides the data into training sets of n − 1 cases and test sets of single cases. Furthermore, we modified the SMOTE algorithm: instead of oversampling the minority classes by fixed multiples as in basic SMOTE, we balanced all classes other than the majority class up to the size of the majority class. This increased the classification accuracy by approximately 10 percentage points. Ultimately, the binary classification of the declined and accepted claims of the same data was performed. Since 272 cases were in the class "declined" (Class 0), the binary class distribution was very biased, and random forests lost almost all cases of Class 1. Running the modified SMOTE algorithm first, however, leveled out the two classes and raised the classification accuracy to 89%.

Finally, in association with random forests, we computed receiver operating characteristic (ROC) curves and area under the curve (AUC) values, presented in [Fig. 1] for the classification of the six classes before applying the SMOTE algorithm and in [Fig. 2] after its use. The AUC values range from 0.899 to 0.962 before SMOTE and are higher after it. These values were also computed for the binary classification, reaching AUC values of 0.685 for both classes before the use of SMOTE and 0.992 after it. All these results were computed with random forests of 100 trees following the LOO principle.
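Per-class ROC curves and AUC values of this one-vs-rest kind could, for example, be computed from cross-validated class probabilities as sketched below, again assuming X and y as above rather than reproducing the authors' code.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.preprocessing import label_binarize

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# Class membership probabilities for every case, each predicted by a model
# trained on the remaining n - 1 cases (LOO).
proba = cross_val_predict(clf, X, y, cv=LeaveOneOut(), method="predict_proba")

classes = np.unique(y)
y_onehot = label_binarize(y, classes=classes)
for i, c in enumerate(classes):
    auc = roc_auc_score(y_onehot[:, i], proba[:, i])        # one-vs-rest AUC for class c
    fpr, tpr, _ = roc_curve(y_onehot[:, i], proba[:, i])    # points of the ROC curve
    print(f"Class {c}: AUC = {auc:.3f}")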

Fig. 1 ROC curves and AUC values for the classification of six classes. AUC, area under the curve; ROC, receiver operating characteristic.
Fig. 2 After generating artificial cases for balancing the class distribution, ROC curves and AUC values for the classification of six classes. AUC, area under the curve; ROC, receiver operating characteristic.


Discussion

Obviously, thus far, computational methods other than statistics have hardly ever been applied to psychiatric claims data, according to our literature search; the following examples illustrate this. Health care claims were studied by applying knowledge discovery to massive data in order to find fraudulent health care providers using text mining, social network analysis, and particularly temporal analysis.[22] However, the main computational results presented concerned only straightforward statistics such as log-likelihood scores, and the types of data were clinical data without specified specialties, patient behavior data, pharmaceutical research data, and health insurance data. Medical malpractice claims in an extensive dataset were studied statistically with logistic regression to predict whether a claim is closed with no compensation.[23] In addition, the covariates of the accepted compensation cases were studied statistically; the eight specialties considered (not including psychiatry) were named for only 27% of all 3,179 claims. Claims, liabilities, injuries, and compensation payments of medical malpractice were described with numbers of cases and associated with drugs, different diseases, and different types of hospitals,[24] but no statistical or other computational results were shown, and psychiatry was not mentioned. Workers' compensation claims and payments were studied with descriptive statistics containing numbers of cases and their means, without any psychiatric cases.[25] Compensation research on population-based injury data has also been reported in which the term data analytics was mentioned,[26] but it consisted merely of two estimated probabilities of work-related injury claims calculated for a period of approximately 7 years. Compensation claims for psychiatric injury and the severity of physical injuries associated with motor vehicle accidents were considered statistically; 19.5% of all 522 cases included a claim for psychiatric injury.[27] This small dataset of 105 patients was analyzed with multivariate logistic regression, computing odds ratios for five different categories, e.g., injury severity score and days of hospital stay. In summary, compensation claims are only infrequently studied in the field of psychiatry, and in terms of computational means, only statistical methods have been applied.

The results of the current study are in line with earlier reports in which the rates of compensation claims related to malpractice in psychiatric treatment have been low compared with other medical specialties. In an American study, the annual rate of compensation claims for psychiatrists was only 2.6%, whereas in neurosurgery the corresponding rate was almost 20%.[11] In Spain, the annual rate among psychiatrists in Catalonia was found to be 0.9%.[12]

Despite the relatively low claim rates, treatment flaws may nevertheless be more common in psychiatric treatment. For example, in both a Swedish and an American study, adverse events were found in approximately a fifth of treatments.[28] [29]



Strengths and Limitations

The comprehensive national data, with coverage from the very beginning of the electronic database of the Finnish Patient Insurance Centre, can be regarded as a strength of the study. The clinician-based classification that was used as a comparison had a 100% agreement rate between researchers, so it can be considered a good validation tool for the algorithm. Since the database used in the study was completely encrypted and it was not possible to use the entire database for, e.g., text mining, we searched the database for as comprehensive a selection of treatment focus and content-related phrases as possible. The researcher who selected the phrases was trained to use the database, and an experienced psychiatrist acted as a backup in this process. It is possible that, with the help of text mining, we could have obtained a wider sample of phrases, which might have resulted in an even better-performing algorithm. However, we believe that the most important text contents were captured by extracting the phrases.

Obviously, our current study is among the first to use machine learning for psychiatric patient injury claim data.

Adverse events in health care are a global concern. Although patient safety improvement efforts have increased over the past 20 years, new ways to enhance the safety of care are needed. Learning from patient injuries requires understanding of injury types and causes. Traditionally, this must be done manually case by case, and emerging trends in the patient injury data may not be recognized. The use of machine learning in classifying such data can solve these problems, sustain an up-to-date classification of injuries, and be applied in prospective risk analyses for developing processes in health care systems.

Natural language processing was not used, because this was our first classification study on the current data. In the future, it would naturally be reasonable to apply it, at least for the preprocessing of phrases. Nevertheless, the final consideration, e.g., how to form the phrase groups, requires deep psychiatric expertise that can hardly be automated. In the future, it is also important to collect more corresponding data, since this would probably produce better classification results. It could further be possible to extend this type of classification study to other medical specialties.



Conclusion

It can be concluded that the classification into six classes is as such reasonable and potentially useful. Furthermore, particularly when the modified SMOTE algorithm was used, the classification task with the six classes was successful. The binary classification of the compensation claim decision data was more difficult because of its skewed class distribution. Nevertheless, this approach is also reasonable, but only after the modified SMOTE algorithm has been used, as described, to balance the two classes of the current data.

Machine learning classification appears to be a promising method for detecting different types of patient claims and injuries. This kind of modelling could be used on larger long-term data for monitoring and predicting temporal trends and for developing quality indicators for different dimensions of clinical treatment.



Conflict of Interest

None declared.

Acknowledgment

We wish to thank Dr. Joona Vintturi for participating in the data collection and analysis.

  • References

  • 1 Rafter N, Hickey A, Condell S. et al. Adverse events in healthcare: learning from mistakes. QJM 2015; 108 (04) 273-277
  • 2 Slawomirski L, Auraaen A, Klazinga N. The economics of patient safety: strengthening a value-based approach to reducing patient harm at national level. Paris: OECD; 2015. Accessed February 22, 2022 at: http://www.oecd.org/els/health-systems/The-economics-of-patient-safety-March-2017.pdf
  • 3 Jonsson PM, Øvretveit J. Patient claims and complaints data for improving patient safety. Int J Health Care Qual Assur 2008; 21 (01) 60-74
  • 4 Järvelin J, Häkkinen U, Rosenqvist G, Remes V. Factors predisposing to claims and compensations for patient injuries following total hip and knee arthroplasty. Acta Orthop 2012; 83 (02) 190-196
  • 5 Vallila N, Sommarhem A, Paavola M, Nietosvaara Y. Pediatric distal humeral fractures and complications of treatment in Finland: a review of compensation claims from 1990 through 2010. J Bone Joint Surg Am 2015; 97 (06) 494-499
  • 6 Nokso-Koivisto J, Blomgren K, Aaltonen LM, Lehtonen L, Helmiö P. Patient injuries in pediatric otorhinolaryngology. Int J Pediatr Otorhinolaryngol 2019; 120: 36-39
  • 7 Swanljung O, Vehkalahti MM. Root canal irrigants and medicaments in endodontic malpractice cases: a nationwide longitudinal observation. J Endod 2018; 44 (04) 559-564
  • 8 Vehkalahti MM, Swanljung O. Trends in endodontic malpractice claims and their indemnity in Finland in the 2000s. J Dentistry & Oral Health 2017; 4: 103
  • 9 Vintturi J, Niemi J, Welling M, Kampman O. Psykiatristen potilasvahinkojen yleisyys ja luokittelu (in Finnish). Duodecim 2022; 138: 84-90
  • 10 Gómez-Durán EL, Martin-Fumadó C, Benet-Travé J, Arimany-Manso J. Malpractice risk at the physician level: claim-prone physicians. J Forensic Leg Med 2018; 58: 152-154
  • 11 Jena AB, Seabury S, Lakdawalla D, Chandra A. Malpractice risk according to physician specialty. N Engl J Med 2011; 365 (07) 629-636
  • 12 Martin-Fumadó C, Gómez-Durán EL, Rodríguez-Pazos M, Arimany-Manso J. Medical professional liability in psychiatry. Actas Esp Psiquiatr 2015; 43 (06) 205-212
  • 13 Chekroud AM, Zotti RJ, Shehzad Z. et al. Cross-trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry 2016; 3 (03) 243-250
  • 14 Patel MJ, Andreescu C, Price JC, Edelman KL, Reynolds III CF, Aizenstein HJ. Machine learning approaches for integrating clinical and imaging features in late-life depression classification and response prediction. Int J Geriatr Psychiatry 2015; 30 (10) 1056-1067
  • 15 Schmaal L, Marquand AF, Rhebergen D. et al. Predicting the naturalistic course of major depressive disorder using clinical and multimodal neuroimaging information: a multivariate pattern recognition study. Biol Psychiatry 2015; 78 (04) 278-286
  • 16 Lin E, Lin CH, Lai YL, Huang CH, Huang YJ, Lane HY. Combination of G72 genetic variation and G72 protein level to detect schizophrenia: machine learning approaches. Front Psychiatry 2018; 9: 566
  • 17 Bishop CM. Pattern Recognition and Machine Learning. New Delhi: Springer Science + Business Media; 2000
  • 18 Cios KJ, Pedrycz W, Swiniarski RW, Kurgan LA. Data Mining, A Knowledge Discovery Approach. New York, NY: Springer Science + Business Media; 2007
  • 19 Flach P. Machine Learning, The Art and Science of Algorithms that Make Sense of Data. New York, NY: Cambridge University Press; 2012
  • 20 Breiman L. Random forests. Mach Learn 2001; 45 (01) 5-32
  • 21 Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16: 321-357
  • 22 Chandola V, Sukumar SR, Schryver J. Knowledge discovery from massive healthcare claims data. Paper presented at: Proceedings of the 19th ACM SIGKDD International Conference Knowledge Discovery and Data Mining; August 11, 2013–Aug 14, 2013, Chicago, United States; 2013: 1312-1320
  • 23 Bonetti M, Cirillo P, Musile Tanzi P, Trinchero E. An analysis of the number of medical malpractice claims and their amounts. PLoS One 2016; 11 (04) e0153362
  • 24 Li H, Wu X, Sun T. et al. Claims, liabilities, injures and compensation payments of medical malpractice litigation cases in China from 1998 to 2011. BMC Health Serv Res 2014; 14: 390
  • 25 Baidwan NK, Carroll NW, Ozaydin B, Puro N. Analyzing workers' compensation claims and payments made using data from a large insurance provider. Int J Environ Res Public Health 2020; 17 (19) 7157
  • 26 Prang KH, Hassani-Mahmooei B, Collie A. Compensation research database: population-based injury data for surveillance, linkage and mining. BMC Res Notes 2016; 9 (01) 456
  • 27 Large MM. Relationship between compensation claims for psychiatric injury and severity of physical injuries from motor vehicle accidents. Med J Aust 2001; 175 (03) 129-132
  • 28 Nilsson L, Borgstedt-Risberg M, Brunner C. et al. Adverse events in psychiatry: a national cohort study in Sweden with a unique psychiatric trigger tool. BMC Psychiatry 2020; 20 (01) 44
  • 29 Marcus SC, Hermann RC, Frankel MR, Cullen SW. Safety of psychiatric inpatients at the veterans health administration. Psychiatr Serv 2018; 69 (02) 204-210

Address for correspondence

Martti Juhola, PhD
Faculty of Information Technology and Communication Sciences, Tampere University
33014 Tampere
Finland   

Publication History

Received: 08 April 2022

Accepted: 25 May 2023

Article published online:
24 July 2023

© 2023. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

