Research Article | Volume 8, 100018, December 2022

# Machine-learning-based analytics for risk forecasting of anaphylaxis during general anesthesia

Open Access | Published: September 08, 2022

## Abstract

Perioperative anaphylaxis has a risk of mortality and compromised quality of patient care. It is difficult to design an evaluation system for risk of anaphylaxis using preoperative tests available in clinical practice. To develop a personalized risk forecast platform for general anesthesia-related anaphylaxis, as a first step, we aimed to investigate the feasibility of machine-learning-based classification using clinical features of patients for risk prediction of anesthesia-related anaphylaxis. After data pre-processing, the performance of five classification methods (Logistic Regression Analysis, Support Vector Machine, Random Forest, Linear Discriminant Analysis, and Naïve Bayes), which were integrated with four feature selection methods (Recursive Feature Elimination, Chi-Squared Method, Correlation-based Feature Selection, and Information Gain Ratio), was evaluated using two-layer cross-validation. Seventy-four features, which were defined from 225 participants, were applied for model fitting. Linear Discriminant Analysis in conjunction with Recursive Feature Elimination showed good performance, with accuracy of 0.867 and Matthews correlation coefficient (MCC) of 0.558 with 25 features used in the classification. The Logistic Regression model in conjunction with Recursive Feature Elimination also showed adequate performance, with accuracy of 0.858 and MCC of 0.541 with six features used in the classification. This study presents initial proof of the capability of a machine-learning-based strategy for forecasting low-prevalence anesthesia-related anaphylaxis from a clinical perspective. It could provide a basis for establishing an effective risk-scoring and predictive system for perioperative anaphylaxis that would help identify preoperatively whether anaphylaxis will occur and could be used to predict unstable patient states preceding anaphylactic shock.

#### Abbreviations:

CFS (correlation-based feature selection), LDA (linear discriminant analysis), IGR (information gain ratio), MCC (Matthews correlation coefficient), NB (naïve Bayes), RFE (recursive feature elimination), SVM (support vector machine), WBC (white blood cell count)

## Introduction

Anaphylactic reactions – both antigen dependent and non-antigen dependent – are a risk of general anesthesia, although the condition remains poorly characterized in this context. Among the various agents used in general anesthesia, neuromuscular blocking agents, latex, antibiotics, and chlorhexidine are considered the dominant causative agents that may provoke anaphylactic reactions [
• Mertes P.M.
• Tajima K.
• Regnier-Kimmoun M.A.
• Lambert M.
• Iohom G.
• Gueant-Rodriguez R.M.
• Malinovsky J.M.
Perioperative anaphylaxis.
,
• Horiuchi T.
• Takazawa T.
• Orihara M.
• Sakamoto S.
• Nagumo K.
• Saito S.
Drug-induced anaphylaxis during general anesthesia in 14 tertiary hospitals in Japan: a retrospective, multicenter, observational study.
]. Although allergen screening tests, such as skin-prick and intradermal tests, are commonly used in postoperative evaluations to identify the probable agent involved, postoperative analysis alone has not contributed significantly to reducing the incidence of anaphylaxis during general anesthesia [
• Fisher M.M.
• Doig G.S.
Prevention of anaphylactic reactions to anaesthetic drugs.
,
• Laguna J.J.
• Archilla J.
• Doña I.
• Corominas M.
• Gastaminza G.
• Mayorga C.
• Berjes-Gimeno P.
• Tornero P.
• Martin S.
• Planas A.
• Moreno E.
• Torres M.J.
Practical guidelines for perioperative hypersensitivity reactions.
]. The drug-induced basophil activity test has been indicated to have positive predictive value for an anaphylactic reaction during general anesthesia, with specificity of 93.3% [
• Aalberse R.C.
• Kleine Budde I.
• Mulder M.
• Stapel S.O.
• Paulij W.
• Hollmann M.W.
Differentiating the cellular and humoral components of neuromuscular blocking agent-induced anaphylactic reactions in patients undergoing anaesthesia.
,
• Kvedariene V.
• Kamey S.
• Ryckwaert Y.
• Rongier M.
• Bousquet J.
• Demoly P.
• Arnoux B.
Diagnosis of neuromuscular blocking agent hypersensitivity reactions using cytofluorimetric analysis of basophils.
,
• Kalangara J.
• Vanijcharoenkarn K.
• Lynde G.C.
• McIntosh N.
• Kuruvilla M.
Approach to perioperative anaphylaxis in 2020: updates in diagnosis and management.
]. However, it is difficult to include this test in routine preoperative screening. The lack of predictive tests for anesthesia-related anaphylaxis may be due to the low prevalence of the condition, the diversity of agents used in anesthesia, and the high incidence of false-positive and false-negative test results [
• Fisher M.M.
• Doig G.S.
Prevention of anaphylactic reactions to anaesthetic drugs.
]. Suggested risk factors are very complicated and involve several aspects of the patient background such as sex, allergic history, medication, and previous exposure to anesthesia. Therefore, the development of effective preoperative evaluation tools could be expected to reduce the incidence of anesthesia-related anaphylaxis.
Although machine-learning-based analytics are already widely used for various medical data-processing tasks, an approach involving the use of artificial intelligence for the prediction of anesthesia-related anaphylaxis has not yet emerged. Attempts to identify an appropriate model for reliable prediction could use preoperative clinical data as training datasets, along with anesthesia records and a specified outcome (in this case, anaphylaxis). This might identify previously unappreciated but useful relationships between clinical features and anesthesia-related anaphylaxis. However, several obstacles may hinder the establishment of an effective algorithm aimed at improving patient management and the decision-making process. First, the patient data used for model fitting may be too diverse across the dataset for successful learning. Furthermore, there is no single risk factor that leads to the outcome, and the data are predictably unbalanced due to the low prevalence of the condition. The present study aimed to investigate the feasibility of machine-learning-based predictive analytics for preoperative prediction of the risk of anesthesia-related anaphylaxis.

## Methods

### Study setting and participants

The study was conducted in accordance with the 2013 Declaration of Helsinki [
World medical association declaration of Helsinki: ethical principles for medical research involving human subjects.
]. The Institutional Review Board at Ehime University School of Medicine approved this study (approval no. 1702012) and waived informed consent because of the retrospective design. The data did not contain any direct patient identifiers, and no direct interaction with human subjects was involved.
Two independent researchers manually reviewed all anesthetic charts of cases between January 2017 and January 2019. Because of the extremely low prevalence of anesthesia-related anaphylaxis, the complete preoperative clinical records of the 45 anaphylaxis-positive patients were preferentially collected. Anaphylactic reactions were graded using Ring and Messmer's 4-point grading system [
• Ring J.
• Laubenthal H.
• Messmer K.
Incidence and classification of adverse reactions to plasma substitutes.
,
• Reitter M.
• Petitpain N.
• Latarche C.
• Cottin J.
• Massy N.
• Demoly P.
• Gillet P.
• Mertes P.M.
Fatal anaphylaxis with neuromuscular blocking agents: a risk factor and management analysis.
], where grade 1 refers to skin symptoms, a mild fever reaction, or both, and grade 4 refers to cardiac or respiratory arrest. The following patient baseline data were collected (a total of 148 variables): (1) basic information such as sex, age, body mass index, Brinkmann index, drinking history, allergic history, history of anesthetic exposure, and medical conditions (21 variables); (2) ongoing drug treatment (seven variables); (3) preoperative test data such as biochemical screening, complete blood count, plasma coagulation tests, and pre-blood transfusion tests (90 variables); and (4) anesthesia-related agents (30 variables). We also collected clinical data from 247 anaphylaxis-negative patients who underwent general anesthesia between January 2017 and May 2017.

### Feature selection methods

To identify which features contribute most to the classification performance, the following feature selection methods were used: support vector machine (SVM)-based recursive feature elimination (RFE), chi-squared method, correlation-based feature selection (CFS), and information gain ratio (IGR).
The wrapper method of SVM-RFE involves an iterative algorithm that fits SVM classification models, discards features with a small impact on the classification, and selects the smallest subset of features that participate in the best-performing classification model. Each iteration trains the SVM classifier, computes the ranking criterion of all features, and removes the feature with the lowest ranking criterion. We used the e1071 R package (version 1.7.1) for SVM-RFE [
• Liu W.
• Meng X.
• Xu Q.
• Flower D.R.
• Li T.
Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models.
,
• Heagertu P.
• Liang K.
• Zeger S.
The e1071 Package: Misc Functions of the Department of Statistics.
].
The chi-squared method is a filter method that evaluates features individually by measuring their chi-squared statistic with respect to the class and ranking all features accordingly. Chi-squared values are divided into several intervals using an entropy-based discretization method. The CFS method is a filter method that uses a correlation-based heuristic to measure the correlation between attributes and rewards feature subsets in which each feature is highly correlated with the class but uncorrelated with other subset features. Given a feature subset S consisting of k features, the merit of the subset is defined by:
$Merit_{S_k}=\frac{k\,\overline{r_{cf}}}{\sqrt{k+k(k-1)\,\overline{r_{ff}}}}$

where $\overline{r_{cf}}$ is the average value of all feature-classification correlations and $\overline{r_{ff}}$ is the average value of all feature-feature correlations. The CFS criterion is defined as follows:
$CFS=\max_{S_k}\left[\frac{r_{cf_1}+r_{cf_2}+\dots+r_{cf_k}}{\sqrt{k+2\left(r_{f_1f_2}+\dots+r_{f_if_j}+\dots+r_{f_kf_l}\right)}}\right]$

where $r_{cf_i}$ and $r_{f_if_j}$ refer to correlations.
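To make the merit heuristic concrete, the following is a minimal Python sketch (not the FSelector implementation used in the study). It assumes the standard form of the CFS merit, in which the denominator is under a square root, and the correlation values are hypothetical numbers chosen only for illustration.

```python
import math

def cfs_merit(k, mean_r_cf, mean_r_ff):
    """Merit of a k-feature subset from its average feature-class
    correlation (mean_r_cf) and average feature-feature correlation
    (mean_r_ff)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return (k * mean_r_cf) / math.sqrt(k + k * (k - 1) * mean_r_ff)

# Hypothetical correlations, not taken from the study data.
single = cfs_merit(1, 0.40, 0.0)          # one feature: merit equals its class correlation
redundant = cfs_merit(3, 0.40, 0.90)      # three strongly inter-correlated features
complementary = cfs_merit(3, 0.40, 0.10)  # three weakly inter-correlated features
```

With equal class correlation, the weakly inter-correlated subset receives the higher merit, which is the behavior the heuristic is designed to reward.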
The IGR method is an information-theory-based metric of the usefulness of a feature, which utilizes the information gain ratio – the ratio of information gain to intrinsic information – and is often used in decision-tree training. It reduces bias towards multi-valued attributes by taking the number and size of branches into account when choosing an attribute. The FSelector (version 0.21) R package was used for the chi-squared, CFS and IGR methods [

P. Romanski, L. Kotthoff, FSelector: selecting attributes, (2016) R package version 0.21.

].
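As an illustration of the ratio described above, the following Python sketch computes the information gain ratio of a discrete feature. It is a simplified stand-in, not the FSelector routine, and the labels and feature values are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its intrinsic information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    intrinsic = entropy(feature_values)
    return info_gain / intrinsic if intrinsic > 0 else 0.0

# Hypothetical toy data: two positive and six negative cases.
labels = ["Y", "Y", "N", "N", "N", "N", "N", "N"]
informative = ["1", "1", "0", "0", "0", "0", "0", "0"]  # mirrors the class exactly
patient_id = ["a", "b", "c", "d", "e", "f", "g", "h"]   # unique value per case

ratio_informative = gain_ratio(informative, labels)
ratio_id = gain_ratio(patient_id, labels)
```

Both toy features have the same raw information gain, but the identifier-like feature is penalized by its large intrinsic information, which shows how the ratio reduces bias towards multi-valued attributes.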
Since the CFS method involves automatic selection of ‘optimal’ features, this method was used directly in the classification. The other three methods (RFE, chi-squared, and IGR) output the ranks of all features based on their criterion; therefore, model performance was tested for these methods while removing features one at a time.
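The one-at-a-time removal procedure can be sketched as follows. This is an illustrative Python outline, not the authors' R script: `evaluate` is a hypothetical callback that would return a cross-validated score (for example, MCC) for a given feature list, and the feature names and contributions in the toy example are invented.

```python
def best_subset_by_rank(ranked_features, evaluate):
    """Score the full ranked list, then repeatedly drop the lowest-ranked
    feature and re-score, keeping the best-scoring subset.

    ranked_features: feature names, most important first.
    evaluate: callable taking a list of features and returning a score.
    """
    best_score, best_features = float("-inf"), []
    remaining = list(ranked_features)
    while remaining:
        score = evaluate(remaining)
        if score >= best_score:  # ties favour the smaller subset
            best_score, best_features = score, list(remaining)
        remaining.pop()          # remove the lowest-ranked feature
    return best_features, best_score

# Hypothetical per-feature contributions standing in for model performance.
contribution = {"f1": 0.3, "f2": 0.2, "f3": -0.05, "f4": -0.05}
toy_evaluate = lambda feats: sum(contribution[f] for f in feats)

selected, score = best_subset_by_rank(["f1", "f2", "f3", "f4"], toy_evaluate)
```

In this toy case the two noisy features are discarded and the two informative ones are kept.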

### Classification methods

Five classification methods, linear discriminant analysis (LDA), logistic regression (LR), random forest (RF), naïve Bayes (NB), and SVM, were utilized in the present study.
The LR method is widely used and models the log-odds of one of the two classes as a linear combination of independent variables (or features). The glm() function in R was used to construct the LR models in the present study [
• Friedman J.
• Hastie T.
• Tibshirani R.
Regularization paths for generalized linear models via coordinate descent.
].
The RF method is a classification algorithm that uses an ensemble of unpruned decision trees, each of which is built on a bootstrap sample of the training data using a randomly selected subset of variables. Random forest uses bagging (bootstrap aggregation) for combining unstable learners and random variable selection for tree building. The ensemble learning idea (in particular, boosting and bagging) is another hallmark development in the field of machine learning over the past 20 years. The RF classification algorithm has been demonstrated to have excellent predictive performance even with feature variables with high noise, such as biological data [
• Guyon I.
• Weston J.
• Barnhill S.
• Vapnik V.
Gene selection for cancer classification using support vector machines.
]. The randomForest (version 4.6-12) and e1071 R packages were used for RF model construction [

A. Liaw, M. Wiener, Classification and Regression by randomForest, R News 2 (2002) 5.

]. Values of ntree (ntree = 100, 300, 1000, 3000, and 10,000) were evaluated.
The NB method is a rule generator based on Bayes’ rule of conditional probability. The NB method uses all attributes and allows them to contribute to a decision, assuming they are all equally important and independent of one another. The e1071 R package was used for NB model construction.
As a generalization of Fisher's linear discriminant, LDA is used in statistics, pattern recognition, and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events. The MASS (version 7.3-47) R package was employed for LDA model construction in the present study [
• Venables B.
• Ripley B.D.
Modern Applied Statistics with S-Plus.
].
As an established classification method focusing on the definition of an optimal hyperplane to separate samples of different classes, SVM introduces the ‘margin concept’ to overcome overfitting problems. In this study, the SVM implementation in the e1071 R package was used with the radial basis function (RBF) kernel. The RBF kernel function is defined by Eq. (1):
$K(x,y)=e^{-\gamma\lVert x-y\rVert^{2}}$
(1)

where x and y are feature vectors and γ is the kernel parameter. The kernel parameter γ and SVM penalty parameter C were optimized by nested cross-validation over the γ values (1/#Features) × (0.01, 0.1, 1/√10, 1, √10, 10, 100) and the C values (0.01, 0.1, 1/√10, 1, √10, 10).
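For illustration, the RBF kernel and the parameter grid described above can be written out in a few lines of Python. This is a sketch rather than the e1071 implementation; the function names are hypothetical, and the feature count of 74 is an assumption taken from the number of independent variables reported after data cleaning.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def parameter_grid(n_features):
    """All (gamma, C) pairs searched, following the values listed in the text."""
    gamma_multipliers = [0.01, 0.1, 1 / math.sqrt(10), 1, math.sqrt(10), 10, 100]
    c_values = [0.01, 0.1, 1 / math.sqrt(10), 1, math.sqrt(10), 10]
    gammas = [m / n_features for m in gamma_multipliers]
    return [(g, c) for g in gammas for c in c_values]

grid = parameter_grid(n_features=74)             # assumed feature count
same_point = rbf_kernel([0, 1], [0, 1], 0.5)     # identical vectors
far_point = rbf_kernel([0, 0], [1, 1], 0.5)      # squared distance of 2
```

Seven γ values and six C values give 42 candidate parameter pairs per inner cross-validation run.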

### Two-layer cross-validation design

Each of the feature selection methods was applied in conjunction with each of the classification methods. For each combination involving two adjustable parameters, a two-layer cross-validation scheme was used to establish and evaluate model performance (Fig. 1) [
• Liu W.
• Meng X.
• Xu Q.
• Flower D.R.
• Li T.
Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models.
]. Four-fold inner-layer cross-validation was performed to select optimal parameter values. At the outer layer, 5-fold cross-validation was used to estimate overall classification performance. The classification model with the best parameter values was then established on the original training set. Nested cross-validation has been increasingly recognized as an effective procedure that curbs the overfitting problem and objectively evaluates the performance of classification models for unobserved data [
• Liu W.
• Meng X.
• Xu Q.
• Flower D.R.
• Li T.
Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models.
]. The methods implemented in the present study were performed with a locally developed R script.
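The two-layer scheme can be outlined as follows. This Python sketch is not the locally developed R script: `fit_and_score` is a hypothetical callback that trains a model with a given parameter on the training indices and returns a score on the held-out indices, and the folds here are unstratified, which may differ from the study.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nested_cv(n_samples, param_grid, fit_and_score, outer_k=5, inner_k=4, seed=0):
    """Return one outer-fold score per outer split; parameters are chosen
    using inner folds drawn only from the outer training portion."""
    rng = random.Random(seed)
    outer_folds = k_fold_indices(n_samples, outer_k, rng)
    outer_scores = []
    for i, test_idx in enumerate(outer_folds):
        train_idx = [j for f, fold in enumerate(outer_folds) if f != i for j in fold]
        inner_folds = k_fold_indices(len(train_idx), inner_k, rng)

        def inner_score(param):
            scores = []
            for v, val_pos in enumerate(inner_folds):
                val = [train_idx[p] for p in val_pos]
                fit = [train_idx[p] for w, fold in enumerate(inner_folds)
                       if w != v for p in fold]
                scores.append(fit_and_score(param, fit, val))
            return sum(scores) / len(scores)

        best_param = max(param_grid, key=inner_score)  # inner layer: tuning
        outer_scores.append(fit_and_score(best_param, train_idx, test_idx))  # outer layer: assessment
    return outer_scores

# Toy callback: checks that no held-out index leaks into training and
# scores a parameter by its closeness to 1.0 (purely illustrative).
def toy_fit_and_score(param, train_idx, test_idx):
    assert not set(train_idx) & set(test_idx)
    return -abs(param - 1.0)

scores = nested_cv(225, [0.1, 1.0, 10.0], toy_fit_and_score)
```

With 225 samples, each outer fold holds out 45 cases and each inner fold validates on 45 of the remaining 180, so the outer test cases never influence parameter selection.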

### Classification performance evaluation metrics

The accuracy and Matthews correlation coefficient (MCC) were used to evaluate the performance of established classification models [
• Byvatov E.
• Schneider G.
Support vector machine applications in bioinformatics.
]. Accuracy and MCC are defined by Eqs. (2) and (3):
$Accuracy=\frac{TP+TN}{TP+FN+FP+TN}$
(2)

$MCC=\frac{TP\times TN-FP\times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$
(3)

where TP stands for true positive, TN for true negative, FP for false positive, and FN for false negative. The model that showed the best performance in predicting the occurrence of anaphylaxis, according to the values of accuracy and MCC, was selected and packaged as a standalone R program to be used for further prediction with unseen data.

## Results

### Data pre-processing

The dataset was pre-processed for machine-learning-based analysis, which involved filtering, cleaning, and imputation. We included adult patients over 20 years of age with ≥80% complete preoperative baseline data in the present study. After data cleaning, data sets of 45 anaphylaxis-positive patients were selected for analysis. Among anaphylaxis-negative participants, 12 were eliminated because of incomplete information in the extracted medical records. The resulting dataset included 74 independent variables (Supplementary Table 1) for 280 patients, 235 of whom experienced no anaphylactic reaction and were classified as Group N, and 45 with a reported anaphylactic reaction (grade 1–4) who were classified as Group Y. For two of the discrete variables, “ABO” and “Sex”, a commonly applied “sparse coding” scheme was employed to define multiple features from each variable [
• Deo R.C.
Machine learning in medicine.
]. For the discrete variable “ABO”, which could take one of five possible values (“ABO”, “A”, “B”, “AB”, or “O”), we defined five features: “ABO==ABO”, “ABO==A”, “ABO==B”, “ABO==AB”, and “ABO==O”. The values for these five features were equal to 0, 1, 0, 0, and 0, respectively, for participants whose “ABO” value was “A”, and 0, 0, 0, 0, and 1, respectively, for participants whose “ABO” value was “O”. For the discrete variable “Sex”, whose value was either “M” or “F”, we defined two features: “Is male” and “Is female”. The values were equal to 0 and 1, respectively, for participants whose “Sex” value was “F”. A total of seven independent variables were defined based on these two discrete independent variables. Notably, since there was only one patient with the value “ABO” under the discrete variable “ABO”, we removed the feature “ABO==ABO”. For the other discrete variables with a value of either “Y” or “N”, we substituted “Y” with “1” and “N” with “0”. A summary of the independent variables before and after data pre-processing is provided in Table 1.
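The coding scheme described above can be illustrated with a short Python sketch. The function names are hypothetical, and the “ABO==ABO” feature is omitted because the text states that it was removed.

```python
def encode_patient(abo, sex):
    """Expand the discrete variables ABO and Sex into 0/1 indicator features."""
    features = {f"ABO=={group}": int(abo == group) for group in ("A", "B", "AB", "O")}
    features["Is male"] = int(sex == "M")
    features["Is female"] = int(sex == "F")
    return features

def encode_yes_no(value):
    """Map the remaining two-valued variables: "Y" -> 1, "N" -> 0."""
    return {"Y": 1, "N": 0}[value]

example = encode_patient("A", "F")
```

For a female participant with blood group A, only “ABO==A” and “Is female” are set to 1.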
Table 1. Summary of independent variables from patient information before and after data cleaning.

| Major category | Variables extracted before cleaning | Variables extracted after cleaning |
| --- | --- | --- |
| Basic information | 21 | 20 |
| Ongoing drug treatment | 7 | 7 |
| Biochemical test | 41 | 14 |
| Complete blood count | 20 | 8 |
| Coagulation test | 19 | 0 |
| Plasma test | 6 | 1 |
| Blood infusion test | 4 | 4 |
| Anesthesia-related agents | 30 | 20 |

At the patient level, to make the number of patients in different groups more balanced, 180 patients randomly selected from Group N and all 45 in Group Y were included in the analysis. For each variable, values were scaled across all participants to the range [0,1]. The values of independent variables, both those generated from discrete variables and those generated from continuous variables, were therefore within the same range. Blank values were imputed using the mean value of all other participants.
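The balancing, imputation, and scaling steps can be sketched in Python as follows. This is an illustration rather than the authors' R code: the order of imputation and scaling is an assumption (the text does not specify it), the function names are hypothetical, and the example values are invented.

```python
import random

def undersample(majority_ids, n, seed=0):
    """Randomly keep n cases from the majority group (Group N in the study)."""
    return random.Random(seed).sample(majority_ids, n)

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values to [0, 1] as (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]

kept = undersample(list(range(235)), 180)   # 235 Group N cases reduced to 180
column = [2.0, None, 4.0, 6.0]              # hypothetical laboratory values
scaled = min_max_scale(impute_mean(column))
```

In the toy column, the missing entry is filled with the mean of the observed values (4.0) before scaling.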

### Testing optimal feature numbers

Of the four feature selection methods applied in the present study, CFS was not considered for construction of the learning model, as it selected only one ‘optimal’ feature: ‘WBC’ (white blood cell count). For the chi-squared and IGR methods, features other than the primary feature (‘WBC’) were considered to make little contribution to the separation, adding noise rather than contributing to the predictive accuracy of the model.

### Performance of predictive models

A total of 75 features were included in model construction, and two-layer cross-validation-based model assessment was performed. Changes in the performance of each classifier across the number of features remaining based on different feature selection methods were evaluated using accuracy (Fig. 2a–c) and MCC (Fig. 2d–e). Compared with accuracy, MCC is more frequently used in the machine-learning field as a measure of the quality of two-class classifications [
• Boughorbel S.
• Jarray F.
• El-Anbari M.
Optimal classifier for imbalanced data using Matthews correlation coefficient metric.
,
• Matthews B.W.
Comparison of the predicted and observed secondary structure of T4 phage lysozyme.
] as it is a more objective measure of predictive performance with an imbalanced data set. It returns a value between −1 and +1, where +1 indicates perfect prediction, 0 no better than random prediction, and −1 total disagreement between prediction and observation. Table 2 presents the best predictive performance achieved for each classification method in combination with each of the feature selection methods. The SVM+RFE model showed the best performance among the three SVM models, with accuracy of 0.853 and MCC of 0.516 when four features (‘Nicardipine’, ‘Rocuronium’, ‘Ampicillin’, and ‘Bupivacaine’) remained for classification (Supplementary Table 2). The LDA+RFE model showed the best performance of all 15 models, with accuracy of 0.867 and MCC of 0.558 with 25 features used in the classification. The LR+RFE model also showed adequate performance, with accuracy of 0.858 and MCC of 0.541 with six features used in classification. The numbers of true positives, false positives, true negatives, and false negatives for each of the three models with good performance, and for each of the five subsets in the 5-fold cross-validation, are provided in Table 3.
Table 2. Best performance of different classification methods in combination with feature selection methods.

| Classifier | Feature selection method | Accuracy | MCC | Remaining features |
| --- | --- | --- | --- | --- |
| SVM | RFE | 0.853 | 0.516 | 4 |
| | Chi-squared | 0.818 | 0.285 | 53 |
| | IGR | 0.827 | 0.344 | 75 |
| NB | RFE | 0.729 | 0.395 | 22 |
| | Chi-squared | 0.800 | 0.359 | 72 |
| | IGR | 0.711 | 0.264 | 20 |
| RF | RFE | 0.836 | 0.403 | 17 |
| | Chi-squared | 0.836 | 0.384 | 68 |
| | IGR | 0.827 | 0.324 | 46 |
| LDA | RFE | 0.867 | 0.558 | 25 |
| | Chi-squared | 0.809 | 0.388 | 71 |
| | IGR | 0.782 | 0.336 | 73 |
| LR | RFE | 0.858 | 0.541 | 6 |
| | Chi-squared | 0.822 | 0.302 | 5 |
| | IGR | 0.822 | 0.302 | 2 |

Abbreviations: IGR, information gain ratio; MCC, Matthews correlation coefficient; NB, naïve Bayes; LDA, linear discriminant analysis; LR, logistic regression; RF, random forest; RFE, recursive feature elimination; SVM, support vector machine.

Table 3. Validation of model combinations with good performance.

| Model combination | Subset | TP (%) | FP (%) | TN (%) | FN (%) |
| --- | --- | --- | --- | --- | --- |
| LR:RFE | 1 | 6 (13.3%) | 2 (4.4%) | 32 (71.1%) | 5 (11.1%) |
| | 2 | 3 (6.7%) | 6 (13.3%) | 33 (73.3%) | 3 (6.7%) |
| | 3 | 9 (20.0%) | 2 (4.4%) | 31 (68.9%) | 3 (6.7%) |
| | 4 | 5 (11.1%) | 2 (4.4%) | 34 (75.6%) | 4 (8.9%) |
| | 5 | 4 (8.9%) | 2 (4.4%) | 36 (80.0%) | 3 (6.7%) |
| | Mean ± SD | 5.4 ± 2.30 (12.0 ± 5.12 %) | 2.8 ± 1.79 (6.2 ± 3.98 %) | 33.2 ± 1.92 (73.8 ± 4.27 %) | 3.6 ± 0.89 (8.0 ± 1.99 %) |
| LDA:RFE | 1 | 7 (15.5%) | 3 (6.7%) | 32 (71.1%) | 3 (6.7%) |
| | 2 | 4 (8.9%) | 0 (0.0%) | 38 (84.4%) | 3 (6.7%) |
| | 3 | 5 (11.1%) | 1 (2.2%) | 33 (73.3%) | 6 (13.3%) |
| | 4 | 5 (11.1%) | 4 (8.9%) | 35 (77.8%) | 1 (2.2%) |
| | 5 | 5 (11.1%) | 3 (6.7%) | 31 (68.9%) | 6 (13.3%) |
| | Mean ± SD | 5.2 ± 1.09 (11.55 ± 2.43 %) | 2.2 ± 1.64 (4.89 ± 3.65 %) | 33.8 ± 2.78 (75.11 ± 6.17 %) | 3.8 ± 2.17 (8.44 ± 4.81 %) |
| SVM:RFE | 1 | 2 (4.4%) | 2 (4.4%) | 37 (82.2%) | 4 (8.9%) |
| | 2 | 8 (17.8%) | 3 (6.7%) | 29 (64.4%) | 5 (11.1%) |
| | 3 | 5 (11.1%) | 1 (2.2%) | 37 (82.2%) | 2 (4.4%) |
| | 4 | 7 (15.6%) | 4 (8.8%) | 30 (66.7%) | 4 (8.9%) |
| | 5 | 3 (6.7%) | 3 (6.7%) | 34 (75.5%) | 5 (11.1%) |
| | Mean ± SD | 5.0 ± 2.54 (11.11 ± 5.67 %) | 2.6 ± 1.14 (5.78 ± 2.53 %) | 33.4 ± 3.78 (74.2 ± 8.4 %) | 4 ± 1.22 (8.89 ± 2.72 %) |

Models with good performance were logistic regression:recursive feature elimination, linear discriminant analysis:recursive feature elimination, and support vector machine:recursive feature elimination.
A total of 225 patients were randomly split into five subsets, each containing 45 patients for validation.
Abbreviations: FN, false negative; FP, false positive; LDA, linear discriminant analysis; LR, logistic regression; RF, random forest; TN, true negative; TP, true positive; RFE, recursive feature elimination; SVM, support vector machine.

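As a consistency check, the overall accuracy and MCC in Table 2 can be recomputed from the per-subset counts in Table 3. The Python sketch below uses the standard MCC definition (numerator TP×TN − FP×FN) and assumes that the overall values were obtained by pooling the counts of the five subsets; the counts are transcribed from Table 3.

```python
import math

def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Per-subset (TP, FP, TN, FN) counts transcribed from Table 3.
table3 = {
    "LR:RFE":  [(6, 2, 32, 5), (3, 6, 33, 3), (9, 2, 31, 3), (5, 2, 34, 4), (4, 2, 36, 3)],
    "LDA:RFE": [(7, 3, 32, 3), (4, 0, 38, 3), (5, 1, 33, 6), (5, 4, 35, 1), (5, 3, 31, 6)],
    "SVM:RFE": [(2, 2, 37, 4), (8, 3, 29, 5), (5, 1, 37, 2), (7, 4, 30, 4), (3, 3, 34, 5)],
}

# Pool the five subsets (assumption) and compute both metrics per model.
pooled = {name: tuple(map(sum, zip(*folds))) for name, folds in table3.items()}
summary = {name: (round(accuracy(*c), 3), round(mcc(*c), 3)) for name, c in pooled.items()}
```

Under the pooling assumption, the recomputed values agree with Table 2: 0.858 and 0.541 for LR:RFE, 0.867 and 0.558 for LDA:RFE, and 0.853 and 0.516 for SVM:RFE.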
The features of the LDA:RFE model used to establish the final predictive R program, and the correspondence between original values and scale factors, are summarized in Table 4. Given previously unseen preoperative patient data as input, this R program predicts anaphylaxis during general anesthesia with a label of ‘Yes’ or ‘No’.
Table 4. Features used in the prediction program that showed the best performance (linear discriminant analysis-recursive feature elimination).

| Feature | Original value | Scale factor |
| --- | --- | --- |
| A/G | X | (X−0.4)/1.9 |
| Age | X | (X−1.4)/91.6 |
| ALP | X | (X−12)/1309 |
| Albumin | X | (X−0.4)/4.9 |
| Ampicillin | Y or N | 1 or 0 |
| Cefazolin | Y or N | 1 or 0 |
| Diabetes | Y or N | 1 or 0 |
| Drinking history | Y or N | 1 or 0 |
| Hyperlipidemia | Y or N | 1 or 0 |
| Hypertension | Y or N | 1 or 0 |
| Infection | Y or N | 1 or 0 |
| Female sex | Y or N | 1 or 0 |
| Male sex | Y or N | 1 or 0 |
| Thiamylal sodium | Y or N | 1 or 0 |
| Malignant tumor | Y or N | 1 or 0 |
| Bupivacaine | Y or N | 1 or 0 |
| Metabolic disease | Y or N | 1 or 0 |
| Na | X | (X−141.9)/5.1 |
| Phenylephrine | Y or N | 1 or 0 |
| Nicardipine | Y or N | 1 or 0 |
| PLT | X | (X−7.6)/46.4 |
| Flurbiprofen | Y or N | 1 or 0 |
| Rocuronium | Y or N | 1 or 0 |
| Xylocaine | Y or N | 1 or 0 |

Abbreviations: A/G, albumin/globulin ratio; ALP, alkaline phosphatase; N, no; PLT, platelet count; X, value of the feature; Y, yes.

## Discussion

In the present study, with the aim of developing a personalized risk forecast platform for general-anesthesia-related anaphylaxis, we applied the clinical features of surgical patients to a machine-learning-based strategy. Based on the results of cross-validation, the LDA+RFE and LR+RFE models showed the best performance among all 15 combinations of feature selection and classification. A predictive R program was therefore established using the LDA+RFE model, which will be used for further validation of predictive performance by inputting previously unseen preoperative patient data.
Perioperative anaphylaxis has a heterogeneous clinical presentation that ranges from mild to catastrophic. Although mild presentations are easily managed and may not require specific treatment, the cluster of multisystem derangement seen at the severe end of the spectrum can cause cardiac arrest and inability to oxygenate. Perioperative mortality rates of 4.76% and 4% have been reported for all causative drugs in Japan and the United States, respectively [
• Reitter M.
• Petitpain N.
• Latarche C.
• Cottin J.
• Massy N.
• Demoly P.
• Gillet P.
• Mertes P.M.
Fatal anaphylaxis with neuromuscular blocking agents: a risk factor and management analysis.
]. Higher mortality is often associated with frailty and the absence of preoperative preparation. Inadequate preoperative evaluation or delayed recognition of anaphylaxis may lead to missed opportunities for early intervention and compromised quality of patient care. It is difficult to design an evaluation system for anaphylaxis using preoperative tests available in clinical practice. Therefore, establishing an effective risk-scoring and predictive system for perioperative anaphylaxis that can predict unstable patient states preceding anaphylactic shock and help identify whether anaphylaxis will occur would be highly valuable.
As the first step in exploring this paradigm, we investigated the characterization capability of a machine-learning technique in patients who had undergone general anesthesia, based on very diverse preoperative information. The extremely imbalanced data structure and overfitting problems were a significant concern prior to data analysis. The noteworthy MCC value indicates the good performance of this model considering the small sample size. Although good model performance is highly desirable, models with reasonably good performance but a smaller number of features (such as SVM+RFE and LR+RFE) are also valuable, because these models are less complex and are likely to be more robust when applied to future, unseen data. The results of this pilot study demonstrate the feasibility and utility of a machine-learning technique for low-prevalence conditions such as anesthesia-related anaphylaxis.
In the best clinical feature combination (Table 4), the final set of risk factors, which included sex-based factors, metabolic factors (e.g., the presence of hyperlipidemia or having diabetes), anesthetic-related factors (e.g., use of rocuronium, lidocaine, or sugammadex), and antibiotic-related factors (e.g., use of ampicillin), was relatively comprehensive. In addition to major risk factors, such as neuromuscular blocking agents and antibiotics, sex hormones and hypercholesterolemia also represent potential risk factors for severe anaphylaxis or allergic response [
• Keselman A.
• Heller N.
Estrogen signaling modulates allergic inflammation and contributes to sex differences in asthma.
,
• Jones B.G.
• Penkert R.R.
• Surman S.L.
• Sealy R.E.
• Pelletier S.
• Xu B.
• Neale G.
• Maul R.W.
• Gearhart P.J.
• Hurwitz J.L.
Matters of life and death: how estrogen and estrogen receptor binding to the immunoglobulin heavy chain locus may influence outcomes of infection, allergy, and autoimmune disease.
,
• Pastorello E.A.
• Borgonovo L.
• Preziosi D.
• Schroeder J.W.
• Pravettoni V.
• Aversano M.G.
• Pastori S.
• Bilò M.B.
• Piantanida M.
• Losappio L.M.
• Nichelatti M.
• Rossi C.M.
• Farioli L.
Basal tryptase high levels associated with a history of arterial hypertension and hypercholesterolemia represent risk factors for severe anaphylaxis in hymenoptera venom-allergic subjects over 50 years old.
]. Some unexpected factors (e.g., serum sodium level, drinking history, and platelet count) were also present in the final feature set. Results from the pre-trained predictive model indicated that these factors were highly correlated with anaphylaxis caused by an unknown mechanism. For example, it was unclear why drinking history, but not allergic history, was a key factor in the final learning model, and the model did not explain how serum sodium level affects intra-operative anaphylaxis risk. Anesthesia-related anaphylaxis could be triggered by mast cell degranulation via high-affinity IgE receptor- and/or G protein-coupled receptor-mediated signalling pathways [
• Suzuki Y.
• Liu S.
• Takasaki Y.
• Yorozuya T.
• Mogi M.
Association between mutated mas-related g protein-coupled receptor-X2 and rocuronium-induced intraoperative anaphylaxis.
]. A number of G protein-coupled receptors exhibit sensitivity to the presence of sodium-binding pockets, where sodium ions occupy a site at the centre of a square-pyramidal network of hydrogen bonds [
• Newton C.L.
• Wood M.D.
• Strange P.G.
Examining the effects of sodium ions on the binding of antagonists to dopamine D2 and D3 receptors.
]. Binding of sodium ions within the pocket is believed to cause a change in conformation that may allosterically modulate ligand binding at the orthosteric site. Systemically, mast cell activation was found to be correlated with the release of aldosterone via the renin–angiotensin axis in response to sodium depletion [
• Boyer H.G.
• Wils J.
• Renouf S.
• Arabo A.
• Duparc C.
• Boutelet I.
• Lefebvre H.
• Louiset E.
Dysregulation of aldosterone secretion in mast cell-deficient mice.
]. This evidence suggests that serum sodium has an undefined relationship with systemic cellular responses during anaphylaxis. Besides preoperative prediction, a machine learning strategy could also derive insights from data across fields, which could further our understanding of anaphylaxis.
Herein, we report the results of an initial attempt to predict intra-operative anaphylaxis, which has a low prevalence, using machine learning. In considering the feasibility of machine learning in clinical applications, whether such models – black boxes – can or cannot be trusted is a very important question. Therefore, increasing model trustworthiness by clarifying opaque results with unexplained mechanisms is essential for future intra-operative anaphylaxis prediction studies, as well as those in other medical fields. Also, the small sample size and extremely imbalanced data structure limited the accuracy and reliability of the trained model. The difference in rate of anaphylaxis between the training dataset (15.4%) and the whole population (approximately 1%) may have caused instability between iterations during learning. We believe that a larger dataset would significantly improve the accuracy and reliability of the model. More importantly, the use of deep learning to clarify the underlying mechanisms when anaphylaxis is predicted may address the lack of model transparency (i.e., providing explainable predicted results). Such a model may offer a promising clinical data-driven option for use in improving intra-operative patient care.
For perioperative anaphylaxis, nowcasting diagnoses and forecasting prognoses are both important for patient outcomes. Beyond the predictive function, a dynamic risk-based scoring and decision-support system is expected to be established. Prediction based on personalized parameters is expected to be combined with methods that retain a memory of evolving real-time monitoring trends in vital signs (e.g., heart rate, respiratory rate, blood pressure, pulse oximeter oxygen saturation, and ventilator monitoring patterns).
In the literature on real-time forecasting, a pioneering electroencephalogram-based deep learning framework implemented on a microcomputer has been introduced as a novel index for accurately predicting the depth of anesthesia [
• Park Y.
• Han S.H.
• Byun W.
• Kim J.H.
• Lee H.C.
• Kim S.J.
A real-time depth of anesthesia monitoring system based on deep neural network with large EDO tolerant EEG analog front-end.
,
• Khan F.H.
An EEG-based hypnotic state monitor for patients during general anesthesia.
,
• Khan F.H.
• Ashraf U.
• Altaf M.A.B.
A patient-specific machine learning based EEG processor for accurate estimation of depth of anesthesia.
]. Also, as an extension of binary classification, a weighted directed acyclic graph SVM algorithm can be used to fuse multiple trained SVMs, enabling mixed-data clusters to be classified with high precision [
• Xie X.
• Chang Z.
Intelligent wearable occupational health safety assurance system of power operation.
]. Thus, our planned future study involves extending the present modeling framework to a dynamic forecasting system that uses personalized preoperative information and the perioperative monitoring data stream. Applying this system to moving time windows during general anesthesia would allow patient-specific anaphylaxis risk trajectories to be computed dynamically.
This study presents initial proof of the capability of a machine-learning-based strategy to forecast low-prevalence anesthesia-related anaphylaxis. We believe that this approach offers bedside clinical-data-driven options for optimizing patient care and minimizing the incidence of intraoperative adverse events. As a final aim, an optimized dynamic predictive system would make preoperative, as well as intraoperative real-time, risk prediction available as a tool to support decision-making for anesthesiologists.

## Author summary

A machine-learning-based strategy for forecasting low-prevalence anesthesia-related anaphylaxis is feasible.

## Funding statement

SL was supported by a Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (KAKENHI), grant number 18K08389. YS was supported by a JSPS Grant-in-Aid for Scientific Research (KAKENHI), grant number 18K16541, and by the Japanese Society of Anesthesiologists (JSA) Pitch Contest 2017.

## CRediT authorship contribution statement

Shuang Liu: Conceptualization, Data curation, Software, Writing – original draft. Yasuyuki Suzuki: Conceptualization, Data curation, Software. Toshihiro Yorozuya: Visualization, Writing – review & editing. Masaki Mogi: Visualization, Writing – review & editing.

## Declaration of Competing Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationship that could be construed as a potential conflict of interest.

## References

• Mertes P.M.
• Tajima K.
• Regnier-Kimmoun M.A.
• Lambert M.
• Iohom G.
• Gueant-Rodriguez R.M.
• Malinovsky J.M.
Perioperative anaphylaxis.
Med Clin N Am. 2010; 94 (xi): 761-789
• Horiuchi T.
• Takazawa T.
• Orihara M.
• Sakamoto S.
• Nagumo K.
• Saito S.
Drug-induced anaphylaxis during general anesthesia in 14 tertiary hospitals in Japan: a retrospective, multicenter, observational study.
J Anesth. 2021; 35: 154-160
• Fisher M.M.
• Doig G.S.
Prevention of anaphylactic reactions to anaesthetic drugs.
Drug Saf. 2004; 27: 393-410
• Laguna J.J.
• Archilla J.
• Doña I.
• Corominas M.
• Gastaminza G.
• Mayorga C.
• Berjes-Gimeno P.
• Tornero P.
• Martin S.
• Planas A.
• Moreno E.
• Torres M.J.
Practical guidelines for perioperative hypersensitivity reactions.
J Investig Allergol Clin Immunol. 2018; 28: 216-232
• Aalberse R.C.
• Kleine Budde I.
• Mulder M.
• Stapel S.O.
• Paulij W.
• Hollmann M.W.
Differentiating the cellular and humoral components of neuromuscular blocking agent-induced anaphylactic reactions in patients undergoing anaesthesia.
Br J Anaesth. 2011; 106: 665-674
• Kvedariene V.
• Kamey S.
• Ryckwaert Y.
• Rongier M.
• Bousquet J.
• Demoly P.
• Arnoux B.
Diagnosis of neuromuscular blocking agent hypersensitivity reactions using cytofluorimetric analysis of basophils.
Allergy. 2006; 61: 311-315
• Kalangara J.
• Vanijcharoenkarn K.
• Lynde G.C.
• McIntosh N.
• Kuruvilla M.
Approach to perioperative anaphylaxis in 2020: updates in diagnosis and management.
Curr Allergy Asthma Rep. 2021; 21: 4
World Medical Association Declaration of Helsinki: ethical principles for medical research involving human subjects.
JAMA. 2013; 310: 2191-2194
• Ring J.
• Laubenthal H.
• Messmer K.
Incidence and classification of adverse reactions to plasma substitutes.
Klin Wochenschr. 1982; 60: 997-1002
• Reitter M.
• Petitpain N.
• Latarche C.
• Cottin J.
• Massy N.
• Demoly P.
• Gillet P.
• Mertes P.M.
Fatal anaphylaxis with neuromuscular blocking agents: a risk factor and management analysis.
Allergy. 2014; 69: 954-959
• Liu W.
• Meng X.
• Xu Q.
• Flower D.R.
• Li T.
Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models.
BMC Bioinform. 2006; 7: 182
• Heagertu P.
• Liang K.
• Zeger S.
The e1071 Package: Misc Functions of the Department of Statistics.
TU Wien, 2006
• Romanski P.
• Kotthoff L.
FSelector: selecting attributes.
R package version 0.21 (2016).

• Friedman J.
• Hastie T.
• Tibshirani R.
Regularization paths for generalized linear models via coordinate descent.
J Stat Softw. 2010; 33: 1-22
• Guyon I.
• Weston J.
• Barnhill S.
• Vapnik V.
Gene selection for cancer classification using support vector machines.
Mach Learn. 2002; 46: 389-422
• Liaw A.
• Wiener M.
Classification and regression by randomForest.
R News 2 (2002) 4.

• Liaw A.
• Wiener M.
Classification and regression by randomForest.
R News 2 (2002) 5.

• Venables W.N.
• Ripley B.D.
Modern Applied Statistics with S-Plus.
Springer Cham, 2002
• Byvatov E.
• Schneider G.
Support vector machine applications in bioinformatics.
Appl Bioinform. 2003; 2: 67-77
• Deo R.C.
Machine learning in medicine.
Circulation. 2015; 132: 1920-1930
• Boughorbel S.
• Jarray F.
• El-Anbari M.
Optimal classifier for imbalanced data using Matthews correlation coefficient metric.
PLoS One. 2017; 12: e0177678
• Matthews B.W.
Comparison of the predicted and observed secondary structure of T4 phage lysozyme.
Biochim Biophys Acta. 1975; 405: 442-451
• Keselman A.
• Heller N.
Estrogen signaling modulates allergic inflammation and contributes to sex differences in asthma.
Front Immunol. 2015; 6: 568
• Jones B.G.
• Penkert R.R.
• Surman S.L.
• Sealy R.E.
• Pelletier S.
• Xu B.
• Neale G.
• Maul R.W.
• Gearhart P.J.
• Hurwitz J.L.
Matters of life and death: how estrogen and estrogen receptor binding to the immunoglobulin heavy chain locus may influence outcomes of infection, allergy, and autoimmune disease.
Cell Immunol. 2019; 346: 103996
• Pastorello E.A.
• Borgonovo L.
• Preziosi D.
• Schroeder J.W.
• Pravettoni V.
• Aversano M.G.
• Pastori S.
• Bilò M.B.
• Piantanida M.
• Losappio L.M.
• Nichelatti M.
• Rossi C.M.
• Farioli L.
Basal tryptase high levels associated with a history of arterial hypertension and hypercholesterolemia represent risk factors for severe anaphylaxis in hymenoptera venom-allergic subjects over 50 years old.
Int Arch Allergy Immunol. 2021; 182: 146-152
• Suzuki Y.
• Liu S.
• Takasaki Y.
• Yorozuya T.
• Mogi M.
Association between mutated Mas-related G protein-coupled receptor-X2 and rocuronium-induced intraoperative anaphylaxis.
Br J Anaesth. 2020; 125: e446-e448
• Newton C.L.
• Wood M.D.
• Strange P.G.
Examining the effects of sodium ions on the binding of antagonists to dopamine D2 and D3 receptors.
PLoS One. 2016; 11: e0158808
• Boyer H.G.
• Wils J.
• Renouf S.
• Arabo A.
• Duparc C.
• Boutelet I.
• Lefebvre H.
• Louiset E.
Dysregulation of aldosterone secretion in mast cell-deficient mice.
Hypertension. 2017; 70: 1256-1263
• Park Y.
• Han S.H.
• Byun W.
• Kim J.H.
• Lee H.C.
• Kim S.J.
A real-time depth of anesthesia monitoring system based on deep neural network with large EDO tolerant EEG analog front-end.
IEEE Trans Biomed Circuits Syst. 2020; 14: 825-837
• Khan F.H.