Reproducibility and Explainability of Deep Learning in Mammography: A Systematic Review of Literature

Deeksha Bhalla; Krithika Rangarajan; Tany Chandra; Subhashis Banerjee; Chetan Arora

doi:10.1055/s-0043-1775737

Subscribe to RSS

Please copy the URL and add it into your RSS Feed Reader.

https://www.thieme-connect.de/rss/thieme/en/10.1055-s-00050590.xml

Share / Bookmark

Facebook X Linkedin Weibo

Download PDF

CC BY-NC-ND 4.0 · Indian J Radiol Imaging 2024; 34(03): 469-487
DOI: 10.1055/s-0043-1775737

Systematic Review

Reproducibility and Explainability of Deep Learning in Mammography: A Systematic Review of Literature

Deeksha Bhalla

¹Department of Radiodiagnosis, All India Institute of Medical Sciences, New Delhi, India

,

Krithika Rangarajan

¹Department of Radiodiagnosis, All India Institute of Medical Sciences, New Delhi, India

,

Tany Chandra

¹Department of Radiodiagnosis, All India Institute of Medical Sciences, New Delhi, India

,

Subhashis Banerjee

²Department of Computer Science and Engineering, Indian Institute of Technology, New Delhi, India

,

Chetan Arora

²Department of Computer Science and Engineering, Indian Institute of Technology, New Delhi, India

› Author Affiliations

Funding This work was supported in part by the Department of Biotechnology, Government of India, under grant BT/PR33193/AI/133/5/2019.

› Further Information

Abstract
Full Text
References
Figures
Supplementary Material

PDF Download Permissions and Reprints

Abstract
Introduction
Materials and Methods

Information Sources

Eligibility Criteria

Data Extraction and Quality Assessment

Data Analysis and Summary Measures

Results

Literature Search and Study Selection

Data Extraction (Initial Assessment)

Quality Assessment

Clinical Study Design

Common Practices Used for AI Model Design and Training

Performance of AI

Discussion

Common Practices for Model Design

Common Practices for Clinical Design

Overall Assessment of Position of AI in Breast Imaging

References

Abstract

Background Although abundant literature is currently available on the use of deep learning for breast cancer detection in mammography, the quality of such literature is widely variable.

Purpose To evaluate published literature on breast cancer detection in mammography for reproducibility and to ascertain best practices for model design.

Methods The PubMed and Scopus databases were searched to identify records that described the use of deep learning to detect lesions or classify images into cancer or noncancer. A modification of Quality Assessment of Diagnostic Accuracy Studies (mQUADAS-2) tool was developed for this review and was applied to the included studies. Results of reported studies (area under curve [AUC] of receiver operator curve [ROC] curve, sensitivity, specificity) were recorded.

Results A total of 12,123 records were screened, of which 107 fit the inclusion criteria. Training and test datasets, key idea behind model architecture, and results were recorded for these studies. Based on mQUADAS-2 assessment, 103 studies had high risk of bias due to nonrepresentative patient selection. Four studies were of adequate quality, of which three trained their own model, and one used a commercial network. Ensemble models were used in two of these. Common strategies used for model training included patch classifiers, image classification networks (ResNet in 67%), and object detection networks (RetinaNet in 67%). The highest reported AUC was 0.927 ± 0.008 on a screening dataset, while it reached 0.945 (0.919–0.968) on an enriched subset. Higher values of AUC (0.955) and specificity (98.5%) were reached when combined radiologist and Artificial Intelligence readings were used than either of them alone. None of the studies provided explainability beyond localization accuracy. None of the studies have studied interaction between AI and radiologist in a real world setting.

Conclusion While deep learning holds much promise in mammography interpretation, evaluation in a reproducible clinical setting and explainable networks are the need of the hour.

#

Keywords

artificial intelligence - breast cancer - deep learning - mammography - neural networks - systematic review

Introduction

Computer-aided detection (CAD) techniques in mammography have a controversial history.

Traditional CAD used hand-crafted features to detect cancers on mammograms, and received Food and Drug Administration clearance way back in 1998. There was a lot of initial enthusiasm about the use of CAD in mammography, with many large studies suggesting it can improve detection of cancers.[1] [2] [3] [4] These were however retrospective studies, performed in simulated environments. When deployed for clinical use, it was found that CAD actually reduces the accuracy for cancer detection, and increases biopsy rates.

Deep learning (DL) has made much headway in medical imaging. This is particularly true of breast imaging, where various studies have reported accuracies comparable with radiologists. Many studies have even suggested that DL may be used, not just as a second user, but also to triage mammograms without user intervention, thereby reducing the workload on the radiologist. This may be particularly valuable, given the increasing work-load and may even make way for breast screening in developing countries. However, even today, most studies are in retrospective simulated environments. A systematic review by Freeman et al[5] indicated that the clinical design of most studies is poor, and the level of evidence for conclusions drawn is low.

DL models essentially learn from the data they have trained on, and would carry forward biases in these data in an invisible, difficult-to-detect manner. It has been seen that results reported in the literature are often not reproducible in clinical settings. Wang et al[6] in their study demonstrated how performance varied widely when six different algorithms were tested on four mammography datasets, with a significant fall in accuracy on external validation. Thus, reproducibility is an essential metric when assessing for possible clinical deployment of any algorithm. In recent times, detailed check-lists such as the Medical Image Computing and Computer Assisted Interventions (MICCAI) reproducibility check-list[7] and the Checklist for Artificial Intelligence in Medical Imaging (CLAIM)[8] have been made available as a guide to authors planning and reporting such studies, to protect from lack of reproducibility. Thus, to assess for potential reproducibility of studies included in this systematic review, we checked for their adherence to such check-lists. The importance of explainability can be further understood by understanding the inverse relationship between simplicity of an algorithm and the performance. Unlike simpler algorithms which are inherently easier to understand, on the other hand, more advanced algorithms, especially multilayered DL-based algorithms, are known as a “black box,” since little is known about what made the algorithm come to a particular conclusion. In health care, this explanation is essential for patient-centered counselling and ethical as well legal concerns. For models to be considered credible in the clinical setting, it is essential that it be known whether the predictions made by these models are clinically justifiable. The reader is referred to a review by Li et al for an understanding of various technical methods used for building trust-worthy, interpretable Artificial Intelligence (AI) models[9]

In this review, we attempt to assess currently available literature for reproducibility and explainability of these models, to take a measured view of the position of DL in mammography today. In addition, for studies which do have a robust clinical validation, we describe the best practices in model development and clinical design in detail, as adopted by these investigators. Some technical terms used in this review have been explained in online [Supplementary Table S1] for ease of the reader.

#

Materials and Methods

This systematic review was conducted as per the Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA) guidelines.[10] The protocol was registered with the international prospective registry of systematic reviews (CRD42020222668).

Information Sources

A search of the Pubmed and Scopus databases was made in August 2021 by two independent reviewers. The keywords used were “deep learning” OR “artificial intelligence” AND “mammography” OR “breast” OR “breast cancer.” The titles as well as abstracts of the studies were examined by reviewers for inclusion in the study. The identified articles were retrieved and manual search of bibliography was done to identify other potentially relevant studies.

#

Eligibility Criteria

Studies required to fulfill the following criteria to be considered for inclusion. (1) Studies reporting the development of a new DL model or validation of an existing commercially available model. (2) Application of model to the domain of either lesion detection or classification. (3) Information on training and performance of algorithm available in study; or if version and model of commercially available software have been mentioned. (4) Full text of article available in English language. The exclusion criteria were: (1) studies reporting only breast density assessment by models. (2) Studies reporting only accuracy of segmentation of regions of interest extracted by user on mammograms.

We also excluded review articles, opinions, letters to editors, and conference abstracts. Both reviewers examined the full texts of eligible articles to determine inclusion in the final analysis.

#

Data Extraction and Quality Assessment

A data extraction form was used to obtain relevant data from included studies. The dataset used for training and testing was recorded, along with the number of images in each subset. The task performed by the model along with its features, including the key-idea behind model training, and reported results were also recorded. Since the validation methodology and study design were different in each study, we first performed a quality assessment by assessing the risk of bias and addressing applicability concerns. We devised a modification of the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool,[11] which was applicable to our research question, the “modified QUADAS-2” (mQUADAS-2). This was adapted from the CLAIM[8] as well as the MICCAI reproducibility checklist.[7] The modifications made to particularly fit AI assessment have been highlighted.

To ensure that studies we chose for detailed description had a robust study methodology, making their reported results reproducible and verifiable, we identified studies which reported their results on enriched datasets which were not representative of the distribution of breast cancer in the population, studies which did not have consecutive or random sampling, studies which did not perform external validation, studies where reference standard is not based on histopathology, and studies which had inappropriate exclusions (such as testing on only cancer images, and excluding normal images).

The entire mQUADAS-2 assessment tool is available in online as [Supplementary Table S2]. The quality assessment was performed by two independent reviewers (D.B. and T.C.). Differences in opinion were settled by a third reviewer (K.R.).

#

Data Analysis and Summary Measures

Among the studies that fulfilled the inclusion criteria and were deemed to be of acceptable diagnostic quality based on mQUADAS-2, we performed a detailed analysis of model training strategies and reported results of model training strategies. The analysis was focused on determining (1) whether the clinical study design allows for generalizability of results and (2) whether any attempt at explainability of results has been attempted in the model building and model analysis process. Based on these, the following analysis was performed.

Clinical study design: we studied the suggested use of AI; whether AI was used as a standalone for triage of screening studies, as an aid to reporting radiologists, or direct comparison was made between AI systems and radiology readers of varying experience. We described the data collection process for training and validation of a model.
Common practices in model design: the details of the model, including key concepts, model architecture, and hyperparameters, wherever mentioned were described
Performance of AI: the metrics of reporting data, including accuracy, sensitivity, specificity, or area under the curve for a receiver operating characteristic (ROC) curve with confidence intervals were compared. Common metrics used included sensitivity in relation to number of false positives per image as per the free ROC curve (FROC) for detection tasks while area under the ROC curve for classification tasks. Studies were also analyzed to see if any explanation for results of AI is provided, such as lesion localization, explanation of missed cancers/false positives, or attempt at feature visualizations (or any other form of explainability).

#
#

Results

Literature Search and Study Selection

Using the search criteria, initially 12,123 articles were identified. After removal of duplicates and screening of abstracts, the full text of 179 articles was retrieved. Of these, 72 articles were excluded after screening the full texts. One hundred and seven articles were included in the final analysis.[12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] [77] [78] [79] [80] [81] [82] [83] [84] [85] [86] [87] [88] [89] [90] [91] [92] [93] [94] [95] [96] [97] [98] [99] [100] [101] [102] [103] [104] [105] [106] [107] [108] [109] [110] [111] [112] [113] [114] [115] [116] [117] [118] The study selection process is summarized in [Fig. 1].

Fig. 1 Summary of study inclusion process for our review.

#

Data Extraction (Initial Assessment)

Forty-seven studies tested their results on private datasets, either in isolation or in combination with public datasets. Sixty studies exclusively used publicly available datasets to report their results. Common public datasets used for testing the model included Breast Cancer Digital Repository (BCDR) (5 studies), INbreast (15 studies), Mammographic Image Analysis Society (MIAS) (17 studies), Digital Database for Screening Mammography (DDSM) (34 studies), and OPTIMAM (3 studies). Most private datasets used had only image-level labels; many authors used lesion-level labels provided in public datasets in addition to their private datasets. Initial approaches for network training included a combination of hand-engineered features and DL; several studies also used machine learning approaches such as support vector machine and random forest classifiers at some stage in their pipeline. More recent approaches use DL end-to-end. Common approaches include a standard classification network such as AlexNet, Residual Neural network (ResNet), and Visual Geometry group (VGG) trained on medical images. Many of the studies which reported detection accuracy used a standard object detection network, used for natural images. The most common networks used include Faster Regions with convolutional neural networks (RCNN),[13] [18] You only look once (YOLO),[15] [16] [18] [34] and RetinaNet.[21] To deal with availability of only small datasets with strong (lesion level) labels, authors have attempted patch learning[20] [22] [24] [51] (classifiers trained on patches are used to initialize full image classifiers), and multi-instance learning.[22] [24] [69] To overcome shortage of data, several authors have mentioned performing data augmentation by flipping, rotation, and geometric transformation, while few authors have attempted generative adversarial network-based synthetic image generation for data augmentation. Most authors mention the use of transfer learning from natural images. Networks are commonly initialized with weights from ImageNet training. Common strategies used in improving accuracy included use of opposite view, opposite breast, use of full resolution images for training, multi-scale training, and use of patient metadata. Detection accuracy is commonly presented as an FROC curve which plots true-positive detections against false-positive marks per image.[119] Most common metric of classification accuracy was the area under the ROC curve.

#

Quality Assessment

Studies were assessed for risk of bias and applicability as per our mQUADAS-2 tool. The details of assessment are provided in [Table 1]. Overall, four[62] [76] [113] [117] studies qualified mQUADAS-2, of which one study described the test of a commercially available model and three described details of model training, as well as tested the model with a robust clinical study. Each of these four studies are summarized in the online [Supplementary Table S3]. Below, we have analyzed these studies for their clinical study design, model design and training, and their reported results.

Table 1
Details of mQUADAS-2 assessment of included studies
	Title	Author	Year	Decision toward detailed analysis	Risk of bias	Reason for exclusion: domain	Reason
1.	Deep Learning to improve breast cancer detection on screening mammography	Shen et al[38]	Aug 2019	Exclude	High	Patient selection	Only enriched datasets used
2.	Deep Learning to distinguish recalled but benign mammography images in breast cancer screening	Aboutalib et al[39]	Dec 2018	Exclude	Unclear	Patient selection	Unclear consecutive sample used or not
3.	Deep learning in mammography: diagnostic accuracy of a multipurpose image analysis software in the detection of breast cancer	Becker et al[40]	Jul 2017	Exclude	High	Patient selection	Split dataset
4.	Large scale deep learning for computer aided detection of mammographic lesions	Kooi et al[12]	Jan 2017	Exclude	High	Patient selection	Split dataset
5.	Discrimination of breast cancer with microcalcifications on mammography by deep learning	Wang et al[41]	Jun 2016	Exclude	High	Patient selection	Split dataset
6.	Representation learning for mammography mass lesion classification with convolutional neural networks	Arevalo et al[42]	Apr 2016	Exclude	High	Patient selection	Exclusion of normal breasts
7.	Detecting and classifying lesions in mammograms with Deep Learning	Ribli et al[13]	Mar 2018	Exclude	High	Patient selection	Only enriched datasets used
8.	Predicting breast cancer by applying deep learning to linked health records and mammograms	Akselrod-Ballin et al[114]	Aug 2019	Exclude	High	Patient selection	Split dataset
9.	A deep learning model to triage screening mammograms: a simulation study	Yala et al[43]	Oct 2019	Exclude	High	Patient selection	Split dataset
10.	A deep learning approach for the analysis of masses in mammograms with minimal user intervention	Dhungel et al[14]	Apr 2017	Exclude	High	Patient selection	Only enriched datasets used
11.	Multi-task transfer learning deep convolutional neural network: application to computer-aided diagnosis of breast cancer on mammograms	Samala et al[44]	Nov 2017	Exclude	High	Patient selection	Split dataset
12.	A deep learning-based decision support tool for precision risk assessment of breast cancer	He et al[45]	May 2019	Exclude	High	Patient selection	Test only on BIRADS 4 images
13.	Visually interpretable deep network for diagnosis of breast masses on mammograms	Kim et al[46]	Dec 2018	Exclude	High	Patient selection	Only enriched datasets used
14.	A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification	Al-Antari et al[15]	Sep 2018	Exclude	High	Patient selection	Split dataset
15.	Automated analysis of unregistered multi-view mammograms with deep learning	Carneiro et al[47]	Nov 2017	Exclude	High	Patient selection	Only enriched datasets used
16.	Deep Convolutional Neural Networks for breast cancer screening	Chougrad et al[48]	Apr 2018	Exclude	High	Patient selection	Only enriched datasets used
17.	Few-shot learning with deformable convolution for multiscale lesion detection in mammography	Li et al[17]	Jul 2020	Exclude	High	Patient selection	Only enriched datasets used
18.	Breast microcalcification diagnosis using deep convolutional neural network from digital mammograms	Cai et al[49]	Mar 2019	Exclude	High	Patient selection	Split dataset
19.	Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system	Al-Masni et al[19]	Apr 2018	Exclude	High	Patient selection	Only enriched datasets used
20.	Deep learning for mass detection in Full Field Digital Mammograms	Agarwal et al[18]	Jun 2020	Exclude	High	Patient selection	Split dataset
21.	A novel solution based on scale invariant feature transform descriptors and deep learning for the detection of suspicious regions in mammogram images	Bruno et al[50]	Jul 2020	Exclude	High	Patient selection	Only enriched datasets used
22.	Deep feature-based automatic classification of mammograms	Arora et al[51]	Jun 2020	Exclude	High	Patient selection	Only enriched datasets used
23.	Detection and classification of the breast abnormalities in digital mammograms via regional Convolutional Neural Network	Al-Masni et al[16]	Jul 2017	Exclude	High	Patient selection	Only enriched datasets used
24.	Classification of whole mammogram and tomosynthesis images using deep convolutional neural networks	Zhang et al[52]	Jul 2018	Exclude	High	Patient selection	Cross-validation
25.	Detection of mass regions in mammograms by bilateral analysis adapted to breast density using similarity indexes and convolutional neural networks	Bandeira Diniz et al[20]	Mar 2018	Exclude	High	Patient selection	Only enriched datasets used
26.	Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data	Sun et al[53]	Apr 2017	Exclude	High	Patient selection	Split dataset
27.	Improving breast mass classification by shared data with domain transformation using a generative adversarial network	Muramatsu et al[54]	Apr 2020	Exclude	High	Patient selection	Cross-validation
28.	An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization	Shen et al[55]	Dec 2020	Exclude	High	Patient selection	Split dataset
29.	Breast cancer detection using synthetic mammograms from generative adversarial networks in convolutional neural networks	Guan and Loew[56]	Jul 2019	Exclude	High	Patient selection	Split dataset of DDSM and GAN images generated from DDSM
30.	Detection of masses in mammograms using a one-stage object detector based on a deep convolutional neural network	Jung et al[21]	Sep 2018	Exclude	High	Patient selection	Testing only on enriched dataset (INbreast)
31.	A deep feature fusion methodology for breast cancer diagnosis demonstrated on three imaging modality datasets	Antropova et al[57]	Oct 2017	Exclude	High	Patient selection	Cross-validation
32.	Classification of mammogram images using multiscale all convolutional neural network (MA-CNN)	Agnes et al[58]	Dec 2019	Exclude	High	Patient selection	Testing only on enriched dataset (mini MIAS)
33.	Three-Class mammogram classification based on descriptive CNN features	Jadoon et al[59]	Jan 2017	Exclude	High	Patient selection	Cross-validation
34.	DeepCAT: deep computer-aided triage of screening mammography	Yi et al[32]	Jan 2021	Exclude	High	Patient selection	Only enriched datasets used, exclusion of microcalcification
35.	New convolutional neural network model for screening and diagnosis of mammograms	Zhang et al[60]	Aug 2020	Exclude	High	Patient selection	Testing only on enriched dataset (DDSM)
36.	Deep neural networks with region-based pooling structures for mammographic image classification	Shu et al[61]	Jun 2020	Exclude	High	Patient selection	Testing only on enriched datasets (INbreast, CBIS, DDSM)
37.	Classifying symmetrical differences and temporal change for the detection of malignant masses in mammography using deep neural networks	Kooi et al[63]	Oct 2017	Exclude	High	Patient selection	Split dataset
38.	Risks of feature leakage and sample size dependencies in deep feature extraction for breast mass classification	Samala et al[64]	Dec 2020	Exclude	High	Patient selection	Split dataset
39.	An ad hoc random initialization deep neural network architecture for discriminating malignant breast cancer lesions in mammographic images	Duggento et al[65]	May 2019	Exclude	High	Patient selection	Testing only on public dataset
40.	Comparison of segmentation-free and segmentation-dependent computer-aided diagnosis of breast masses on a public mammography dataset	Sawyer Lee et al[66]	Dec 2020	Exclude	High	Patient selection	Testing only on public dataset
41.	RAMS: Remote and automatic mammogram screening	Cogan et al[67]	Apr 2019	Exclude	High	Patient selection	Testing only on public dataset (INbreast)
42.	A multi-context CNN ensemble for small lesion detection	Savelli et al[22]	Mar 2020	Exclude	High	Patient selection	Only enriched dataset (INbreast), cross-validation
43.	Convolutional neural networks for the segmentation of microcalcification in mammography imaging	Valvano et al[23]	Apr 2019	Exclude	High	Patient selection	Split dataset
44.	Breast cancer detection using deep convolutional neural networks and support vector machines	Ragab et al[68]	Jan 2019	Exclude	High	Patient selection	Only enriched dataset (CBIS, DDSM)
45.	Globally-aware multiple instance classifier for breast cancer screening	Shen et al[69]	Oct 2019	Exclude	High	Patient selection	Split dataset
46.	A new approach to develop computer-aided diagnosis scheme of breast mass classification using deep learning technology	Qiu et al[70]	2017	Exclude	High	Patient selection	Cross-validation
47.	Malignancy detection on mammography using dual deep convolutional neural networks and genetically discovered false color input enhancement	Teare et al[71]	Aug 2017	Exclude	High	Patient selection	Only enriched datasets (DDSM, ZMDS) used
48.	Detecting asymmetric patterns and localizing cancers on mammograms	Guan et al[72]	Oct 2020	Exclude	High	Patient selection	Split dataset (DREAM)
49.	Digital mammographic tumor classification using transfer learning from deep convolutional neural networks	Huynh et al[74]	Jul 2016	Exclude	High	Patient selection	Cross-validation
50.	Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning	Cha et al[26]	Jan 2020	Exclude	High	Patient selection	Only enriched datasets used (CBIS-DDSM)
51.	Stand-alone artificial intelligence for breast cancer detection in mammography: comparison with 101 radiologists	Rodriguez-Ruiz et al[73]	Sep 2019	Exclude	High	Patient selection	Enriched private datasets used
52.	Detection of breast cancer with mammography: effect of an artificial intelligence support system	Rodriguez-Ruiz et al[75]	Feb 2019	Exclude	High	Patient selection	Nonconsecutive sample
53.	Evaluation of combined artificial intelligence and radiologist assessment to interpret screening mammograms	Schaffter et al[76]	Mar 2020	Include	Low
54.	Artificial intelligence for breast cancer detection in mammography: experience of use of the ScreenPoint Medical Transpara system in 310 Japanese women	Sasaki et al[77]	Jul 2020	Exclude	High	Patient selection	Nonconsecutive dataset
55.	Aiding the digital mammogram for detecting the breast cancer using Shearlet transform and neural network	Shenbagavalli and Thangarajan[78]	Sep 2018	Exclude	High	Patient Selection	Only enriched datasets used (DDSM)
56.	Assessing breast cancer risk with an artificial neural network	Sepandi et al[79]	Apr 2018	Exclude	High	Patient Selection	Cross-validation
57.	Can we reduce the workload of mammographic screening by automatic identification of normal exams with artificial intelligence? A feasibility study	Rodriguez-Ruiz et al[80]	Sep 2019	Exclude	High	Patient Selection	Only enriched datasets used
58.	Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study	Kim et al[81]	Mar 2020	Exclude	High	Patient selection	Nonconsecutive sample
59.	Transfer representation learning using inception-v3 for the detection of masses in mammography	Mednikov et al[82]	Jul 2018	Exclude	High	Patient selection	Only enriched datasets (INbreast) used
60.	A two-stage multiple instance learning framework for the detection of breast cancer in mammograms	Sarath et al[24]	Jul 2020	Exclude	High	Patient selection	Only enriched datasets used (INbreast)
61.	A hybridized ELM for automatic micro calcification detection in mammogram images based on multi-scale features	Melekoodappattu and Subbian[83]	May 2019	Exclude	High	Patient selection	Cross-validation, testing only on public dataset (MIAS)
62.	Applying a new quantitative image analysis scheme based on global mammographic features to assist diagnosis of breast cancer	Chen et al[84]	Oct 2019	Exclude	High	Patient selection	Cross-validation
63.	Convolutional neural networks for mammography mass lesion classification	Arevalo et al[85]	Aug 2015	Exclude	High	Patient selection	Only enriched datasets used (BCDR)
64.	Pareto-optimal multi-objective dimensionality reduction deep auto-encoder for mammography classification	Taghanaki et al[86]	Jul 2017	Exclude	High	Patient selection	Only enriched datasets used (IRMA, INbreast)
65.	Breast mass detection in digital mammogram based on Gestalt psychology	Wang et al[25]	May 2018	Exclude	High	Patient selection	Only enriched datasets used (DDSM, MIAS)
66.	A novel cascade classifier for automatic microcalcification detection	Shin et al[28]	Dec 2015	Exclude	High	Patient selection	Only enriched datasets used (MIAS, mini-MIAS)
67.	Ensemble of convolutional neural networks for classification of breast microcalcification from mammograms	Sert et al[87]	Jul 2017	Exclude	High	Patient selection	Only enriched datasets used (DDSM)
68.	A new approach to develop computer-aided detection schemes of digital mammograms	Tan et al[88]	Jun 2015	Exclude	High	Patient selection	Cross-validation
69.	A CAD system to analyze mammogram images using fully complex-valued relaxation neural network ensembled classifier	Saraswathi and Srinivasan[89]	Oct 2014	Exclude	High	Patient selection	Only enriched datasets used (MIAS)
70.	Automated breast cancer detection in digital mammograms of various densities via deep learning	Suh et al[90]	Nov 2020	Exclude	High	Patient selection	Split dataset
71.	A deep feature based framework for breast masses classification	Jiao et al[91]	Feb 2016	Exclude	High	Patient Selection	Only enriched datasets (ImageNet LSRVC, DDSM) used
72.	Discriminating solitary cysts from soft tissue lesions in mammography using a pretrained deep convolutional neural network	Kooi et al[92]	Mar 2017	Exclude	High	Patient selection	Cross-validation
73.	Global detection approach for clustered microcalcifications in mammograms using a deep learning network	Wang et al[27]	Apr 2017	Exclude	High	Patient selection	Split dataset
74.	Computer-aided mammogram diagnosis system using deep learning convolutional fully complex-valued relaxation neural network classifier	Duraisamy and Emperumal[93]	Dec 2017	Exclude	High	Patient selection	Only enriched datasets (MIAS + BCDR) used
75.	Deep learning versus classical neural approach to mammogram recognition	Kurek et al[94]	Dec 2018	Exclude	High	Patient selection	Only enriched datasets (DDSM) used
76.	A parasitic metric learning net for breast mass classification based on mammography	Jiao et al[95]	Mar 2018	Exclude	High	Patient selection	Only enriched datasets used (DDSM)
77.	An automatic computer-aided diagnosis system for breast cancer in digital mammograms via deep belief network	Al-antari et al[96]	Sep 2017	Exclude	High	Patient selection	Only enriched datasets (DDSM) used
78.	A context-sensitive deep learning approach for microcalcification detection in mammograms	Wang and Yangl[29]	June 2018	Exclude	High	Patient selection	Dataset collection method unlikely consecutive
79.	Multi-view feature fusion based four views model for mammogram classification using convolutional neural network	Nasir Khan et al[99]	Nov 2019	Exclude	High	Patient selection	Only enriched datasets (CBIS-DDSM, MIAS) used
80.	Detection of abnormalities in mammograms using deep features	Tavakoli et al[103]	Dec 2019	Exclude	High	Patient selection	Only enriched dataset (MIAS), split dataset
81.	A deep learning approach for breast cancer mass detection	Fathy and Ghoneim[30]	2019	Exclude	High	Patient selection	Only enriched dataset (DDSM), split dataset
82.	A new triplet convolutional neural network for classification of lesions on mammograms	Medjeded et al[100]	Oct 2019	Exclude	High	Patient selection	Only enriched datasets (DDSM and MIAS) used
83.	Multi-view convolutional neural networks for mammographic image classification	Sun et al[101]	Sep 2019	Exclude	High	Patient Selection	Only enriched datasets (MIAS, DDSM) used
84.	Transferring deep neural networks for the differentiation of mammographic breast lesions	Yu et al[102]	Dec 2018	Exclude	High	Patient selection	Only enriched datasets (BCDR) used
85.	Deep learning for breast cancer diagnosis from mammograms—a comparative study	Tsochatzidis et al[118]	Mar 2019	Exclude	High	Patient Selection	Only enriched datasets (CBIS-DDSM, DDSM) used
86.	Application of deep learning in the detection of breast lesions with four different breast densities	Li et al[31]	July 2021	Exclude	High	Patient selection	Enriched private dataset used for testing
87.	Breast mass detection in mammography based on image template matching and CNN	Sun et al[33]	Apr 2021	Exclude	High	Patient selection	Only enriched datasets (DDSM) used
88.	Impact of image compression on deep learning-based mammogram classification	Jo et al[104]	Apr 2021	Exclude	High	Patient Selection	Cross-validation
89.	Improving the prediction of benign or malignant breast masses using a combination of image biomarkers and clinical parameters	Cui et al[105]	Mar 2021	Exclude	High	Patient selection	• Split dataset • Exclusion of benign images
90.	Compare and contrast: detecting mammographic soft-tissue lesions with C 2-Net	Liu et al[37]	Jul 2021	Exclude	High	Patient selection	Split dataset
91.	Deep convolutional neural network and emotional learning based breast cancer detection using digital mammography	Chouhan et al[107]	May 2021	Exclude	High	Patient selection	Only enriched datasets (IRMA) used
92.	Microscopic tumour classification by digital mammography	Yang et al[106]	Feb 2021	Exclude	High	Patient selection	Split dataset
93.	A framework for breast cancer classification using Multi-DCNNs	Ragab et al[108]	Apr 2021	Exclude	High	Patient selection	Only enriched datasets (DDSM, MIAS) used
94.	Integrating segmentation information into CNN for breast cancer diagnosis of mammographic masses	Tsochatzidis et al[109]	Mar 2021	Exclude	High	Patient selection	Only enriched datasets (DDSM, CBIS-DDSM) used
95.	YOLO based breast masses detection and classification in full-field digital mammograms	Aly et al[34]	Mar 2021	Exclude	High	Patient selection	Only enriched datasets (INbreast) used
96.	Computer vision-based microcalcification detection in digital mammograms using fully connected depthwise separable convolutional neural network	Rehman et al[111]	Jul 2021	Exclude	High	Patient Selection	Only enriched datasets (DDSM, Pinum) used
97.	Presentation of novel hybrid algorithm for detection and classification of breast cancer using growth region method and probabilistic neural network	Isfahani et al[35]	Jun 2021	Exclude	Unclear	Patient selection	Only enriched datasets (DDSM, BIRADS) used
98.	Pattern classification for breast lesion on FFDM by integration of radiomics and deep features	Zhang et al[110]	Jun 2021	Exclude	Unclear	Patient selection	• Nonconsecutive sample • Split dataset
99.	Multi-scale attention-based convolutional neural network for classification of breast masses in mammograms	Niu et al[97]	Jul 2021	Exclude	High	Patient selection	Only enriched datasets (DDSM INbreast) used
100.	Mammogram mass segmentation and detection using Legendre neural network-based optimal threshold	Sarangi et al[36]	Apr 2021	Exclude	High	Patient selection	Only enriched datasets (MIAS) used
101.	Optimized radial basis neural network for classification of breast cancer images	Rajathi et al[98]	2021	Exclude	High	Patient selection	Only enriched datasets (MIAS) used
102.	External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms	Salim et al[112]	2020	Exclude	High	Patient selection	Case control design
103.	Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach	Lotter et al[117]	Feb 2021	Include	Low
104.	Identifying normal mammograms in a large screening population using artificial intelligence	Lång et al[62]	2020	Include	Low
105.	Improving breast cancer detection accuracy of mammography with the concurrent use of an artificial intelligence tool	Pacilè et al[115]	2020	Exclude	High	Patient selection	Enriched private dataset used
106.	Improved cancer detection using artificial intelligence: a retrospective evaluation of missed cancers on mammography	Watanabe et al[116]	2019	Exclude	High	Patient selection	Enriched private dataset used
107.	International evaluation of an AI system for breast cancer screening	McKinney et al[113]	Jan 2020	Include	Low

Abbreviations: BCDR, Breast Cancer Digital Repository; MIAS, Mammographic Image Analysis Society; DDSM Digital Database for Screening Mammography.

#

Clinical Study Design

Studies with a robust clinical study design as ascertained by mQUADAS-2 are enlisted in [Table 2]. All of the studies involved retrospective patient recruitment. The study by Lång and colleagues[62] studied the utility of AI in triage of normal mammograms, while the study by Schaffter and colleagues[76] studied the performance of AI as a second reader to a radiologist. Two studies, those by Lotter and colleagues[117] and McKinney and colleagues,[113] compared the performance of AI and radiologists on similar enriched datasets. In three studies, the evaluation of AI as a standalone reader was also reported.

Table 2
Details of studies with adequate clinical design as per mQUADAS-2 tool
Author	Testing dataset	Training dataset	Task performed	Classification level	BB annotation	Model availability	Patient recruitment	Follow-up period for negative studies	Location of cancer; indicated	External validation set (country/continent/race)	Limitation/explainability
Lång et al[62]	Testing: screening exams 9,581 (Malmö Breast Tomosynthesis Screening Trial)	NA	Network provides a continuous score ranging between 1 and 10 representing the level of suspicion of cancer present	Patient level, including both CC and MLO views	NA	Transpara 1.4.0	Retrospective	Nil	No	Urban Swedish population	Failure analysis performed- both small and large cancers missed. 85.7% missed cancers in dense breasts.
McKinney et al[113]	UK test set: 25,856 U.S. test set 3,097	UK set: (OPTIMAM) 13,918 (train) 62,866 (tune) US set: 12,224 (training) 3,334 (tuning)	AI standalone Comparison of AI with radiologist	Patient level	Yes	Available ([Table 3])	Retrospective	≥21 mo	Yes	Trained on UK population, evaluated on U.S. population	Failure analysis performed: yes Explainability: only localization
Lotter et al[117]	Screening data: 2,743 (OMI-DB), 7,951 (private US) Diagnostic data: 1,533 (private China)	Screening data: OMI-DB (23,396), DDSM (2,282), Private US (48,714)	AI standalone AI vs. radiologist	Patient level	Yes	Available ([Table 3])	Retrospective	18 mo: private testing dataset	Yes	Trained on UK and U.S. datasets, tested on U.S. and Chinese dataset	Failure analysis performed: yes Explainability: only localization
Schaffter et al[76]	68,026 Sweden (screening examinations)	Screening examinations Private (59,923) Private: 25,657 (validation)	AI standalone AI and radiologist	Patient level	Yes	Available ([Table 3])	Retrospective	12 mo: KPW 18–24 mo: KI	No	Ext validation set based on Stockholm Sweden, KI	Failure analysis performed: yes Explainability: lower performance when compared with consensus radiologist interpretation, since trained with only single radiologist interpretation

Abbreviations: AI, artificial intelligence; BB, bounding box; KI, Karolinska Institute; KPW, Kaiser Permanente Washington; NA, not applicable.

Training data used included the OPTIMAM dataset from the United Kingdom along with private datasets from U.S. hospitals ranging in size from 12,223 exams to 48,714 exams. The study by Schaffter et al[76] trained their model on a large dataset from the Kaiser Permanente Washington, comprising 85,580 exams which was part of the DREAM mammography challenge. Testing data size ranged from 68,026 exams from the Karolinska Institute, Sweden, which was part of the DREAM challenge, to 3,097 exams from a single institute in the United States used by McKinney et al.[113] The smallest subset used for testing was 1,533 diagnostic exams from a single institute in China used by Lotter et al.[117] In three studies, test data came from a different continent in comparison to training data. Histopathology was used for cancer proof in all studies. Length of follow-up for labeling an exam as normal or benign, ranged from 12 to 27 months. All the networks made comprehensive predictions for the entire examination, including both cranio-caudal and medio-lateral-oblique views of a patient. Image-level predictions were not made by any networks. Localization of cancer for accuracy prediction was described in two studies[113] [117] while the rest of the studies did not provide any location information. Nearly all of the studies were performed in a screening setting, only the study by Lotter et al[117] tested their network on an enriched diagnostic dataset from an institute in China.

#

Common Practices Used for AI Model Design and Training

There were three studies which qualified mQUADAS-2 and described their models[76] [113] [117] (instead of using a commercial software). The three studies described eight models, which are described below.

All three studies described multi-stage pipelines,[113] [117] and two of the three studies used ensembles.[76] [113] All studies used lesion-level labels at some stage in their pipeline. All studies also attempted to use high resolution of images at some stage in their pipeline. The input resolution ranged from 1,100 × 600 to full resolution of 3,328 × 4,096. These were provided either through patches generated at full resolution[76] [117] or as direct input of full-resolution images.[113] Only one of these three studies explicitly described use of medically relevant information from the opposite breast and opposite view.[113] Common data augmentation techniques included resizing, rotations, and vertical flipping. Two of the models[76] [113] also used patient metadata such as age to attempt to improve their performance. A summary of description of the key idea in each model is given in [Table 3]. [Fig. 2] summarizes the workflow among these four studies that were analyzed in detail.

Table 3
Analysis of models employed by studies with adequate clinical design
Author	Model description
Schaffter et al[76]	Ensemble, each model of the ensemble came from the top winners in a grand challenge.
	The first model, developed by Therapixel, was a modification of VGG Net. The network was modified to reduce the number of parameters, so that it could accept a larger input size of image. The team reduced the resolution of DM images to 1152 × 832 pixels. They also reduced the number of pooling layers to detect fine features. To deal with the problem of the image having a weak signal due to presence of very small object in comparison to size of image, they first pretrained with strongly labeled data (with image patches with position information). To deal with class imbalance, they trained this with minibatches containing equal number of negative and positive samples.
	The second model developed by Ribli et al was an object detection network, and predictions were used to generate classification scores. They trained a faster RCNN on public data and some hand-annotated component of the challenge data.
	The third model developed by Guan et al[72] trained multiple segmentation models (four different models) and combined the result of these four models. The models used a combination of high-resolution images with a sliding window approach for calcification detection and low-resolution images for mass detection. They also trained the model using public datasets which contained location information, like the other authors.
	The final model developed by DeepHealth consisted of two patch level classifiers (ResNet) at two different scales for microcalcifications and masses. They used these to initialize the whole image classifier with a scanning window approach.
McKinney et al[113]	Ensemble of three models, each working at a different level of interpretation of mammograms (lesion level, breast level, and case level), each model producing a breast cancer risk score between 0 and 1 for the entire patient.
	First stage of MODEL 1 was a RetinaNet object detector trained on full mammogram images rescaled to 2,048 × 2,048. Rectangular bounding boxes were produced along with a confidence score, and the top 10 boxes among all 4 views were chosen. These patches were rescaled to 409 × 409 and a corresponding patch from the opposite breast was chosen after rough registration of the breasts. Along with this, patient age, laterality, detection coordinates, and view were concatenated. This was passed through a Mobilenet architecture. A cancer score was obtained for each patch which was combined into a case level score. The second stage of MODEL 1 took these fixed size detections and trained them with a classification model that used case level labels. At train time, 5 such crops were used per case, and at test time, 10 such crops were used, and average predictions determined.
	MODEL 2 was a breast level model. Here each image after augmentation was run through a ResNet 50 feature extractor and the final feature vector obtained from all four breasts were concatenated. This concatenated feature vector was run though a few residual blocks, convolutional blocks, and then an average pool was performed to obtain a prediction score per breast.
	MODEL 3 was a case level model, this also involved a ResNet as a feature extractor from each of the four images. Data augmentation was used and input size of 2,048 × 2,048 was used. The four feature vectors were concatenated and a single hidden layer of size 512 was applied to the combined feature vector followed by a binary classification. This ResNet was initialized with trained weights of the backbone of the object detector used by MODEL 1.
Lotter et al[117]	3-stage model. In the first stage a ResNet classifier was trained on patches of 275 × 275 obtained from full mammogram images. In the first stage they performed a 5-class classification into mass, calcification, focal asymmetry, architectural distortion, or no lesion. The same classifier was further trained to give a 3-class classification as normal, benign, or malignant. In the next stage (stage 2), this trained ResNet weights were used to initialize the backbone for a RetinaNet object detector. The images for RetinaNet were resized to 1,750 pixels (other dimension modified to maintain aspect ratio). Stage 3 consisted of a multi-instance learning-based object detector trained with only image-level labels.

Abbreviation: DM, digital mammography.

Fig. 2 Summary of detailed analysis of studies which qualified mQUADAS-2.

#

Performance of AI

Since all studies have reported performance on widely different datasets, they are not directly comparable. However, within the category of studies which were assessed as being high quality, similar methodology was used to curate the data. Therefore, these results are tabulated in [Table 4].

Table 4.
Reported results for various tasks performed by the AI networks
Author	AI standalone AUC	AI+ Radiologist AUC	Radiologist standalone AUC	Others
Mckinney et al[113]		NA	ROC curve encompasses average radiologist performance point 0.625 (SD 0.032)	Model sensitivity: 56.24% Model specificity: 84.29% Non-inferiority compared to radiologist
US dataset	0.757 (0.732-0.780)
Enriched dataset (465)	0.740 (0.696-0.794)
Lotter et al[117]		Nil	0.891 (±0.019) (best reader AUC)	Model sensitivity: 96.2% (91.7-99.2) 14.2% higher than radiologist Model specificity: 90.9% (84.9-96.1) 24% higher than radiologist
US dataset	0.927 ± 0.008
Enriched dataset (285)	0.945 (0.919-0.968)
Schaffter et al[76] Sweden dataset: Ensemble model	0.923	0.955 (consensus radiologist)		Specificity model: 92.5% Radiologist 96.7% (96.6-96.8) Combined model plus radiologist 98.5% (98.4-98.6)
Lang et al[62]	–	–	–	Missed cancers= 10.3% (3.1–17.5)

Abbreviations: AI, artificial intelligence; AUC, area under the curve; ROC, receiver operating curve; SD, standard deviation.

Two studies[113] [117] compared the performance of AI against a radiologist. Although both performed this analysis only on a small enriched subset of their dataset, both reported a slightly higher performance of AI in comparison to the radiologist. One study compared the performance of radiologists with and without AI, and showed that the performance of radiologists with AI is better than either the radiologist or AI alone.[76]

All studies provided localization-based explainability, though only one evaluated localization accuracy by means of mROC curves[113] (another study provided lesion detection accuracy; however, this was restricted to location in terms of laterality and quadrant[117]). This was also only in a small subset of the test population. No other form of interpretability or explainability has been attempted in any study.

#
#

Discussion

In this review we found that although a very large number of studies have been published in scientific literature on DL in mammography, a very miniscule number of these have actually tested their results in a robust clinical study. Importantly, no study offers any explainability beyond identification of lesions (either by bounding box prediction or saliency maps).

We identified four studies which tested their results in a reproducible manner,[62] [76] [113] [117] out of which three described their in-house models. For these we also describe the practices they used for model building.

Common Practices for Model Design

All identified studies had some common features in model design. First, all of them attempted to use images with as high resolution as possible, at some stage in the network. This stresses on the importance of the fact that despite memory constraints, it is important to preserve the resolution of images while giving them as input to neural networks. This is consistent with medical knowledge on the need for exceptionally high spatial resolution for mammograms. Second, all authors stress on the importance of using precise location information on cancers. This is because the malignant lesion tends to occupy a very small portion of the image. Therefore, purely classification networks which work only on image-level labels tend to perform far inferior to studies which use location of cancer. Third, all networks use transfer learning from natural images and use some form of data augmentation. Fourth, medically relevant information such as metadata and information from opposite view and opposite breast adds greatly to network performance. All the above point toward the importance of core radiology knowledge in network design. While all networks provide lesion location as a means of explainability and to check saliency of network predictions, none of the networks have explicitly discussed any other means of studying explainability.

#

Common Practices for Clinical Design

All the identified studies were retrospective and performed in a screening environment. All networks used large datasets for training, and tested on datasets ranging from 3,000 to 68,000 mammograms[76] [113]. While all the studies concluded that AI can be used for triage, or as an assistant to a radiologist as a second reader to improve accuracy, no analysis has been performed to understand the effect of false positives suggested by AI on the recall tendency of the radiologist. All studies mention the number of false-negative (missed) cancers, and some even compare the numbers with the corresponding numbers missed by radiologists in their studies. The characteristics of cancers missed by AI have also been analyzed by authors,[62] [113] [117] to determine patterns based on breast density, tumor size, and histological type, among others, but no consistent patterns emerged that could provide a medically sound reason for the miss. This would be of great importance in the event of potential deployment, where it would be of vital importance to explain to a patient why her cancer may have been missed by AI. In addition, among the four studies that we analyzed, only two studies mentioned confidence intervals of area under the curve for ROC curves in the results,[113] [117] calling into question the possible variability in results described by the other studies. An objective measure of localization accuracy, determined by the mROC curve, was also mentioned in only a single study of these four. This is however understandable, as evaluating localization accuracy would need lesion-level labels for the entire test dataset, which would be very expensive to obtain.

Studies that report detection of interval cancers on pre-index mammograms do not mention the specificity level at which the cancer was caught on the pre-index study. Thus, how this would translate in a real-world setting remains to be seen.

Evaluation on a diagnostic mammography dataset was performed only in a single study,[117] which tested on an enriched dataset that consisted of a consecutive sample of cancers (34.8%) along with a random sample of noncancers (63.2%). Similarly, this was the only dataset from a previously unscreened population. Thus, little is known on how these networks would behave when deployed in such an environment. There were no studies that tested the AI on computed radiography systems, which are still present in many developing countries.

A recently published systematic review by Uzun Ozsahin et al[120] similarly highlights the differences and inhomogeneity in the developmental methodologies of AI algorithms but with a general sense of improvement in the quality of studies with passing time.

#

Overall Assessment of Position of AI in Breast Imaging

As radiology, like every other specialty in medicine and indeed every other industry, gears up for a transformation in the form of introduction of AI within the work-flow, reproducibility and explainability of neural networks form the essential building blocks of such implementation.

We found in our review that both reproducibility and explainability continue to stand in question, and would need significantly more research prior to potential clinical deployment. We thus suggest these to be important check-points for radiologists, when attempting to assess commercially available algorithms for deployment in their department. We also refer the readers to the MICCAI reproducibility checklist[7] and the CLAIM checklist[8] while designing a study to ensure their studies are of adequate quality. In our study, two algorithms performed better than radiologists at classifying mammograms; however, these had relatively small testing datasets. On the other hand, in the study with the largest testing dataset, radiologist reading showed considerably higher specificity. While it is clear that when used in the correct clinical scenario, AI holds great potential, a nuanced view should be taken to how and in what capacity it may be deployed, and where it can provide real clinical benefit.

#
#
#

Conflict of Interest

None declared.

Acknowledgment

We acknowledge the effort of our data entry operator Hema Malhothra, for her meticulous work.

Supplementary Material

Supplementary Material

References
1 Warren Burhenne LJ, Wood SA, D'Orsi CJ. et al. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000; 215 (02) 554-562

Crossref PubMed Google Scholar
2 Birdwell RL, Ikeda DM, O'Shaughnessy KF, Sickles EA. Mammographic characteristics of 115 missed cancers later detected with screening mammography and the potential utility of computer-aided detection. Radiology 2001; 219 (01) 192-202

Crossref PubMed Google Scholar
3 Birdwell RL, Bandodkar P, Ikeda DM. Computer-aided detection with screening mammography in a university hospital setting. Radiology 2005; 236 (02) 451-457

Crossref PubMed Google Scholar
4 Brem RF, Baum J, Lechner M. et al. Improvement in sensitivity of screening mammography with computer-aided detection: a multiinstitutional trial. Am J Roentgenol 2003; 181 (03) 687-693

Crossref PubMed Google Scholar
5 Freeman K, Geppert J, Stinton C. et al. Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy. BMJ 2021; 374: n1872

Crossref PubMed Google Scholar
6 Wang X, Liang G, Zhang Y, Blanton H, Bessinger Z, Jacobs N. Inconsistent performance of deep learning models on mammogram classification. J Am Coll Radiol 2020; 17 (06) 796-803

Crossref PubMed Google Scholar
7 miccai reproducibility - Google Search. Accessed May 16, 2022 at: https://www.google.com/search?q=miccai+reproducibility&oq=miccai+&aqs=chrome.1.69i57j35i39j0i512j0i20i263i512j0i512l6.4510j0j4&sourceid=chrome&ie=UTF-8
8 Mongan J, Moy L, Kahn Jr CE. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): a guide for authors and reviewers. Radiol Artif Intell 2020; 2 (02) e200029

Crossref PubMed Google Scholar
9 Li X, Xiong H, Li X. et al. Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond. Knowl Inf Syst 2022; 64 (12) 3197-3234

Crossref Google Scholar
10 McInnes MDF, Moher D, Thombs BD. et al; and the PRISMA-DTA Group. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: the PRISMA-DTA statement. JAMA 2018; 319 (04) 388-396

Crossref PubMed Google Scholar
11 Whiting PF, Rutjes AWS, Westwood ME. et al; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011; 155 (08) 529-536

Crossref PubMed Google Scholar
12 Kooi T, Litjens G, van Ginneken B. et al. Large scale deep learning for computer aided detection of mammographic lesions. Med Image Anal 2017; 35: 303-312

Crossref PubMed Google Scholar
13 Ribli D, Horváth A, Unger Z, Pollner P, Csabai I. Detecting and classifying lesions in mammograms with Deep Learning. Sci Rep 2018; 8 (01) 4165

Crossref PubMed Google Scholar
14 Dhungel N, Carneiro G, Bradley AP. A deep learning approach for the analysis of masses in mammograms with minimal user intervention. Med Image Anal 2017; 37: 114-128

Crossref PubMed Google Scholar
15 Al-Antari MA, Al-Masni MA, Choi MT, Han SM, Kim TS. A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification. Int J Med Inform 2018; 117: 44-54

Crossref PubMed Google Scholar
16 Al-Masni MA, Al-Antari MA, Park JM. et al. Detection and classification of the breast abnormalities in digital mammograms via regional Convolutional Neural Network. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017: 1230-1233

PubMed Google Scholar
17 Li C, Zhang D, Tian Z, Du S, Qu Y. Few-shot learning with deformable convolution for multiscale lesion detection in mammography. Med Phys 2020; 47 (07) 2970-2985

Crossref PubMed Google Scholar
18 Agarwal R, Díaz O, Yap MH, Lladó X, Martí R. Deep learning for mass detection in full field digital mammograms. Comput Biol Med 2020; 121: 103774

Crossref PubMed Google Scholar
19 Al-Masni MA, Al-Antari MA, Park JM. et al. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system. Comput Methods Programs Biomed 2018; 157: 85-94

Crossref PubMed Google Scholar
20 Bandeira Diniz JO, Bandeira Diniz PH, Azevedo Valente TL, Corrêa Silva A, de Paiva AC, Gattass M. Detection of mass regions in mammograms by bilateral analysis adapted to breast density using similarity indexes and convolutional neural networks. Comput Methods Programs Biomed 2018; 156: 191-207

Crossref PubMed Google Scholar
21 Jung H, Kim B, Lee I. et al. Detection of masses in mammograms using a one-stage object detector based on a deep convolutional neural network. PLoS One 2018; 13 (09) e0203355

Crossref PubMed Google Scholar
22 Savelli B, Bria A, Molinara M, Marrocco C, Tortorella F. A multi-context CNN ensemble for small lesion detection. Artif Intell Med 2020; 103: 101749

Crossref PubMed Google Scholar
23 Valvano G, Santini G, Martini N. et al. Convolutional neural networks for the segmentation of microcalcification in mammography imaging. J Healthc Eng 2019; 2019: 9360941

Crossref PubMed Google Scholar
24 Sarath CK, Chakravarty A, Ghosh N, Sarkar T, Sethuraman R, Sheet D. A two-stage multiple instance learning framework for the detection of breast cancer in mammograms. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020: 1128-1131

PubMed Google Scholar
25 Wang H, Feng J, Bu Q. et al. Breast mass detection in digital mammogram based on gestalt psychology. J Healthc Eng 2018; 2018: 4015613

Crossref PubMed Google Scholar
26 Cha KH, Petrick N, Pezeshk A. et al. Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning. J Med Imaging (Bellingham) 2020; 7 (01) 012703

PubMed Google Scholar
27 Wang J, Nishikawa RM, Yang Y. Global detection approach for clustered microcalcifications in mammograms using a deep learning network. J Med Imaging (Bellingham) 2017; 4 (02) 024501

Crossref PubMed Google Scholar
28 Shin SY, Lee S, Yun ID. et al. A novel cascade classifier for automatic microcalcification detection. PLoS One 2015; 10 (12) e0143725

Crossref PubMed Google Scholar
29 Wang J, Yang Y. A context-sensitive deep learning approach for microcalcification detection in mammograms. Pattern Recognit 2018; 78: 12-22

Crossref PubMed Google Scholar
30 Fathy W, Ghoneim A. A deep learning approach for breast cancer mass detection. Int J Adv Comput Sci Appl 2019; 10: 175-182

Google Scholar
31 Li H, Ye J, Liu H. et al. Application of deep learning in the detection of breast lesions with four different breast densities. Cancer Med 2021; 10 (14) 4994-5000

Crossref PubMed Google Scholar
32 Yi PH, Singh D, Harvey SC, Hager GD, Mullen LA. DeepCAT: deep computer-aided triage of screening mammography. J Digit Imaging 2021; 34 (01) 27-35

Crossref PubMed Google Scholar
33 Sun L, Sun H, Wang J, Wu S, Zhao Y, Xu Y. Breast mass detection in mammography based on image template matching and CNN. Sensors (Basel) 2021; 21 (08) 2855

Crossref PubMed Google Scholar
34 Aly GH, Marey M, El-Sayed SA, Tolba MF. YOLO based breast masses detection and classification in full-field digital mammograms. Comput Methods Programs Biomed 2021; 200: 105823

Crossref PubMed Google Scholar
35 Isfahani ZN, Jannat-Dastjerdi I, Eskandari F, Ghoushchi SJ, Pourasad Y. Presentation of novel hybrid algorithm for detection and classification of breast cancer using growth region method and probabilistic neural network. Comput Intell Neurosci 2021; 2021: 5863496

Crossref PubMed Google Scholar
36 Sarangi S, Rath NP, Sahoo HK. Mammogram mass segmentation and detection using Legendre neural network-based optimal threshold. Med Biol Eng Comput 2021; 59 (04) 947-955

Crossref PubMed Google Scholar
37 Liu Y, Zhou C, Zhang F. et al. Compare and contrast: detecting mammographic soft-tissue lesions with C²-Net. Med Image Anal 2021; 71: 101999

Crossref PubMed Google Scholar
38 Shen L, Margolies LR, Rothstein JH, Fluder E, McBride R, Sieh W. Deep learning to improve breast cancer detection on screening mammography. Sci Rep 2019; 9 (01) 12495

Crossref PubMed Google Scholar
39 Aboutalib SS, Mohamed AA, Berg WA, Zuley ML, Sumkin JH, Wu S. Deep learning to distinguish recalled but benign mammography images in breast cancer screening. Clin Cancer Res 2018; 24 (23) 5902-5909

Crossref PubMed Google Scholar
40 Becker AS, Marcon M, Ghafoor S, Wurnig MC, Frauenfelder T, Boss A. Deep learning in mammography: diagnostic accuracy of a multipurpose image analysis software in the detection of breast cancer. Invest Radiol 2017; 52 (07) 434-440

Crossref PubMed Google Scholar
41 Wang J, Yang X, Cai H, Tan W, Jin C, Li L. Discrimination of breast cancer with microcalcifications on mammography by deep learning. Sci Rep 2016; 6: 27327

Crossref PubMed Google Scholar
42 Arevalo J, González FA, Ramos-Pollán R, Oliveira JL, Guevara Lopez MA. Representation learning for mammography mass lesion classification with convolutional neural networks. Comput Methods Programs Biomed 2016; 127: 248-257

Crossref PubMed Google Scholar
43 Yala A, Schuster T, Miles R, Barzilay R, Lehman C. A deep learning model to triage screening mammograms: a simulation study. Radiology 2019; 293 (01) 38-46

Crossref PubMed Google Scholar
44 Samala RK, Chan HP, Hadjiiski LM, Helvie MA, Cha KH, Richter CD. Multi-task transfer learning deep convolutional neural network: application to computer-aided diagnosis of breast cancer on mammograms. Phys Med Biol 2017; 62 (23) 8894-8908

Crossref PubMed Google Scholar
45 He T, Puppala M, Ezeana CF. et al. A deep learning-based decision support tool for precision risk assessment of breast cancer. JCO Clin Cancer Inform 2019; 3: 1-12

Crossref PubMed Google Scholar
46 Kim ST, Lee JH, Lee H, Ro YM. Visually interpretable deep network for diagnosis of breast masses on mammograms. Phys Med Biol 2018; 63 (23) 235025

Crossref PubMed Google Scholar
47 Carneiro G, Nascimento J, Bradley AP. Automated analysis of unregistered multi-view mammograms with deep learning. IEEE Trans Med Imaging 2017; 36 (11) 2355-2365

Crossref PubMed Google Scholar
48 Chougrad H, Zouaki H, Alheyane O. Deep convolutional neural networks for breast cancer screening. Comput Methods Programs Biomed 2018; 157: 19-30

Crossref PubMed Google Scholar
49 Cai H, Huang Q, Rong W. et al. Breast microcalcification diagnosis using deep convolutional neural network from digital mammograms. Comput Math Methods Med 2019; 2019: 2717454

Google Scholar
50 Bruno A, Ardizzone E, Vitabile S, Midiri M. A novel solution based on scale invariant feature transform descriptors and deep learning for the detection of suspicious regions in mammogram images. J Med Signals Sens 2020; 10 (03) 158-173

Crossref Google Scholar
51 Arora R, Rai PK, Raman B. Deep feature-based automatic classification of mammograms. Med Biol Eng Comput 2020; 58 (06) 1199-1211

Crossref PubMed Google Scholar
52 Zhang X, Zhang Y, Han EY. et al. Classification of whole mammogram and tomosynthesis images using deep convolutional neural networks. IEEE Trans Nanobiosci 2018; 17 (03) 237-242

Crossref PubMed Google Scholar
53 Sun W, Tseng TB, Zhang J, Qian W. Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data. Comput Med Imaging Graph 2017; 57: 4-9

Crossref PubMed Google Scholar
54 Muramatsu C, Nishio M, Goto T. et al. Improving breast mass classification by shared data with domain transformation using a generative adversarial network. Comput Biol Med 2020; 119: 103698

Crossref PubMed Google Scholar
55 Shen Y, Wu N, Phang J. et al. An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization. Med Image Anal 2021; 68: 101908

Crossref PubMed Google Scholar
56 Guan S, Loew M. Breast cancer detection using synthetic mammograms from generative adversarial networks in convolutional neural networks. J Med Imaging (Bellingham) 2019; 6 (03) 031411

PubMed Google Scholar
57 Antropova N, Huynh BQ, Giger ML. A deep feature fusion methodology for breast cancer diagnosis demonstrated on three imaging modality datasets. Med Phys 2017; 44 (10) 5162-5171

Crossref PubMed Google Scholar
58 Agnes SA, Anitha J, Pandian SIA, Peter JD. Classification of mammogram images using multiscale all convolutional neural network (MA-CNN). J Med Syst 2019; 44 (01) 30

Crossref PubMed Google Scholar
59 Jadoon MM, Zhang Q, Haq IU, Butt S, Jadoon A. Three-class mammogram classification based on descriptive CNN features. BioMed Res Int 2017; 2017: 3640901

Crossref PubMed Google Scholar
60 Zhang C, Zhao J, Niu J, Li D. New convolutional neural network model for screening and diagnosis of mammograms. PLoS One 2020; 15 (08) e0237674

Crossref PubMed Google Scholar
61 Shu X, Zhang L, Wang Z, Lv Q, Yi Z. Deep neural networks with region-based pooling structures for mammographic image classification. IEEE Trans Med Imaging 2020; 39 (06) 2246-2255

Crossref PubMed Google Scholar
62 Lång K, Dustler M, Dahlblom V, Åkesson A, Andersson I, Zackrisson S. Identifying normal mammograms in a large screening population using artificial intelligence. Eur Radiol 2021; 31 (03) 1687-1692

Crossref PubMed Google Scholar
63 Kooi T, Karssemeijer N. Classifying symmetrical differences and temporal change for the detection of malignant masses in mammography using deep neural networks. J Med Imaging (Bellingham) 2017; 4 (04) 044501

PubMed Google Scholar
64 Samala RK, Chan HP, Hadjiiski L, Helvie MA. Risks of feature leakage and sample size dependencies in deep feature extraction for breast mass classification. Med Phys 2021; 48 (06) 2827-2837

Crossref PubMed Google Scholar
65 Duggento A, Aiello M, Cavaliere C. et al. An ad hoc random initialization deep neural network architecture for discriminating malignant breast cancer lesions in mammographic images. Contrast Media Mol Imaging 2019; 2019: 5982834

Crossref PubMed Google Scholar
66 Sawyer Lee R, Dunnmon JA, He A, Tang S, Ré C, Rubin DL. Comparison of segmentation-free and segmentation-dependent computer-aided diagnosis of breast masses on a public mammography dataset. J Biomed Inform 2021; 113: 103656

Crossref PubMed Google Scholar
67 Cogan T, Cogan M, Tamil L. RAMS: remote and automatic mammogram screening. Comput Biol Med 2019; 107: 18-29

Crossref PubMed Google Scholar
68 Ragab DA, Sharkas M, Marshall S, Ren J. Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ 2019; 7: e6201

Crossref PubMed Google Scholar
69 Shen Y, Wu N, Phang J. et al. Globally-aware multiple instance classifier for breast cancer screening. Mach Learn Med Imaging 2019; 11861: 18-26

Crossref PubMed Google Scholar
70 Qiu Y, Yan S, Gundreddy RR. et al. A new approach to develop computer-aided diagnosis scheme of breast mass classification using deep learning technology. J XRay Sci Technol 2017; 25 (05) 751-763

Google Scholar
71 Teare P, Fishman M, Benzaquen O, Toledano E, Elnekave E. Malignancy detection on mammography using dual deep convolutional neural networks and genetically discovered false color input enhancement. J Digit Imaging 2017; 30 (04) 499-505

Crossref PubMed Google Scholar
72 Guan Y, Wang X, Li H. et al. Detecting asymmetric patterns and localizing cancers on mammograms. Patterns (N Y) 2020; 1 (07) 100106

Crossref PubMed Google Scholar
73 Rodriguez-Ruiz A, Lång K, Gubern-Merida A. et al. Stand-alone artificial intelligence for breast cancer detection in mammography: comparison with 101 radiologists. J Natl Cancer Inst 2019; 111 (09) 916-922

Crossref PubMed Google Scholar
74 Huynh BQ, Li H, Giger ML. Digital mammographic tumor classification using transfer learning from deep convolutional neural networks. J Med Imaging (Bellingham) 2016; 3 (03) 034501

Crossref PubMed Google Scholar
75 Rodríguez-Ruiz A, Krupinski E, Mordang JJ. et al. Detection of breast cancer with mammography: effect of an artificial intelligence support system. Radiology 2019; 290 (02) 305-314

Crossref PubMed Google Scholar
76 Schaffter T, Buist DSM, Lee CI. et al; and the DM DREAM Consortium. Evaluation of combined artificial intelligence and radiologist assessment to interpret screening mammograms. JAMA Netw Open 2020; 3 (03) e200265

Crossref PubMed Google Scholar
77 Sasaki M, Tozaki M, Rodríguez-Ruiz A. et al. Artificial intelligence for breast cancer detection in mammography: experience of use of the ScreenPoint Medical Transpara system in 310 Japanese women. Breast Cancer 2020; 27 (04) 642-651

Crossref PubMed Google Scholar
78 P S, R T. Aiding the digital mammogram for detecting the breast cancer using shearlet transform and neural network. Asian Pac J Cancer Prev 2018; 19 (09) 2665-2671

PubMed Google Scholar
79 Sepandi M, Taghdir M, Rezaianzadeh A, Rahimikazerooni S. Assessing breast cancer risk with an artificial neural network. Asian Pac J Cancer Prev 2018; 19 (04) 1017-1019

PubMed Google Scholar
80 Rodriguez-Ruiz A, Lång K, Gubern-Merida A. et al. Can we reduce the workload of mammographic screening by automatic identification of normal exams with artificial intelligence? A feasibility study. Eur Radiol 2019; 29 (09) 4825-4832

Crossref PubMed Google Scholar
81 Kim HE, Kim HH, Han BK. et al. Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study. Lancet Digit Health 2020; 2 (03) e138-e148

Crossref PubMed Google Scholar
82 Mednikov Y, Nehemia S, Zheng B, Benzaquen O, Lederman D. Transfer representation learning using inception-V3 for the detection of masses in mammography. Annu Int Conf IEEE Eng Med Biol Soc 2018; 2018: 2587-2590

PubMed Google Scholar
83 Melekoodappattu JG, Subbian PS. A hybridized ELM for automatic micro calcification detection in mammogram images based on multi-scale features. J Med Syst 2019; 43 (07) 183

Crossref PubMed Google Scholar
84 Chen X, Zargari A, Hollingsworth AB, Liu H, Zheng B, Qiu Y. Applying a new quantitative image analysis scheme based on global mammographic features to assist diagnosis of breast cancer. Comput Methods Programs Biomed 2019; 179: 104995

Crossref PubMed Google Scholar
85 Arevalo J, Gonzalez FA, Ramos-Pollan R, Oliveira JL, Guevara Lopez MA. Convolutional neural networks for mammography mass lesion classification. Annu Int Conf IEEE Eng Med Biol Soc 2015; 2015: 797-800

PubMed Google Scholar
86 Taghanaki SA, Kawahara J, Miles B, Hamarneh G. Pareto-optimal multi-objective dimensionality reduction deep auto-encoder for mammography classification. Comput Methods Programs Biomed 2017; 145: 85-93

Crossref PubMed Google Scholar
87 Sert E, Ertekin S, Halici U. Ensemble of convolutional neural networks for classification of breast microcalcification from mammograms. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017: 689-692

PubMed Google Scholar
88 Tan M, Qian W, Pu J, Liu H, Zheng B. A new approach to develop computer-aided detection schemes of digital mammograms. Phys Med Biol 2015; 60 (11) 4413-4427

Crossref PubMed Google Scholar
89 Saraswathi D, Srinivasan E. A CAD system to analyse mammogram images using fully complex-valued relaxation neural network ensembled classifier. J Med Eng Technol 2014; 38 (07) 359-366

Crossref PubMed Google Scholar
90 Suh YJ, Jung J, Cho BJ. Automated breast cancer detection in digital mammograms of various densities via deep learning. J Pers Med 2020; 10 (04) E211

Crossref PubMed Google Scholar
91 Jiao Z, Gao X, Wang Y, Li J. A deep feature based framework for breast masses classification. Neurocomputing 2016; 197: 221-231

Crossref Google Scholar
92 Kooi T, van Ginneken B, Karssemeijer N, den Heeten A. Discriminating solitary cysts from soft tissue lesions in mammography using a pretrained deep convolutional neural network. Med Phys 2017; 44 (03) 1017-1027

Crossref PubMed Google Scholar
93 Duraisamy S, Emperumal S. Computer-aided mammogram diagnosis system using deep learning convolutional fully complex-valued relaxation neural network classifier. IET Comput Vis 2017; 11 (08) 656-662

Crossref Google Scholar
94 Kurek J, Swiderski B, Osowski S, Kruk M, Barhoumi W. Deep learning versus classical neural approach to mammogram recognition. Bull Pol Acad Sci Tech Sci 2018; 66: 831-840

Google Scholar
95 Jiao Z, Gao X, Wang Y, Li J. A parasitic metric learning net for breast mass classification based on mammography. Pattern Recognit 2018; 75: 292-301

Crossref Google Scholar
96 Al-antari MA, Al-masni MA, Park SU. et al. An automatic computer-aided diagnosis system for breast cancer in digital mammograms via deep belief network. J Med Biol Eng 2018; 38 (03) 443-456

Crossref Google Scholar
97 Niu J, Li H, Zhang C, Li D. Multi-scale attention-based convolutional neural network for classification of breast masses in mammograms. Med Phys 2021; 48 (07) 3878-3892

Crossref PubMed Google Scholar
98 Rajathi GM. Optimized radial basis neural network for classification of breast cancer images. Curr Med Imaging 2021; 17 (01) 97-108

Crossref PubMed Google Scholar
99 Khan HN, Shahid AR, Raza B, Dar AH, Alquhayz H. Multi-view feature fusion based four views model for mammogram classification using convolutional neural network. IEEE Access 2019; 7: 165724-165733

Crossref Google Scholar
100 Medjeded M, Saïd M, Chenine A, Chikh M. A new triplet convolutional neural network for classification of lesions on mammograms. Rev Intell Artif 2019; 33: 213-217

Google Scholar
101 Sun L, Wang J, Hu Z, Xu Y, Cui Z. Multi-view convolutional neural networks for mammographic image classification. IEEE Access 2019; 7: 126273-126282

Crossref Google Scholar
102 Yu S, Liu L, Wang Z, Dai G, Xie Y. Transferring deep neural networks for the differentiation of mammographic breast lesions. Sci China Technol Sci 2019; 62 (03) 441-447

Crossref Google Scholar
103 Tavakoli N, Karimi M, Norouzi A, Karimi N, Samavi S. Soroushmehr SMR. Detection of abnormalities in mammograms using deep features. J Ambient Intell Humaniz Comput 2019; 14 (05) 5355-5367

Crossref Google Scholar
104 Jo YY, Choi YS, Park HW. et al. Impact of image compression on deep learning-based mammogram classification. Sci Rep 2021; 11 (01) 7924

Crossref PubMed Google Scholar
105 Cui Y, Li Y, Xing D, Bai T, Dong J, Zhu J. Improving the prediction of benign or malignant breast masses using a combination of image biomarkers and clinical parameters. Front Oncol 2021; 11: 629321

Crossref PubMed Google Scholar
106 Yang J, Li H, Shi N, Zhang Q, Liu Y. Microscopic tumour classification by digital mammography. J Healthc Eng 2021; 2021: 6635947

Crossref PubMed Google Scholar
107 Chouhan N, Khan A, Shah JZ, Hussnain M, Khan MW. Deep convolutional neural network and emotional learning based breast cancer detection using digital mammography. Comput Biol Med 2021; 132: 104318

Crossref PubMed Google Scholar
108 Ragab DA, Attallah O, Sharkas M, Ren J, Marshall S. A framework for breast cancer classification using Multi-DCNNs. Comput Biol Med 2021; 131: 104245

Crossref PubMed Google Scholar
109 Tsochatzidis L, Koutla P, Costaridou L, Pratikakis I. Integrating segmentation information into CNN for breast cancer diagnosis of mammographic masses. Comput Methods Programs Biomed 2021; 200: 105913

Crossref PubMed Google Scholar
110 Zhang X, Liang C, Zeng D. et al. Pattern classification for breast lesion on FFDM by integration of radiomics and deep features. Comput Med Imaging Graph 2021; 90: 101922

Crossref PubMed Google Scholar
111 Rehman KU, Li J, Pei Y, Yasin A, Ali S, Mahmood T. Computer vision-based microcalcification detection in digital mammograms using fully connected depthwise separable convolutional neural network. Sensors (Basel) 2021; 21 (14) 4854

Crossref PubMed Google Scholar
112 Salim M, Wåhlin E, Dembrower K. et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol 2020; 6 (10) 1581-1588

Crossref PubMed Google Scholar
113 McKinney SM, Sieniek M, Godbole V. et al. International evaluation of an AI system for breast cancer screening. Nature 2020; 577 (7788) 89-94

Crossref PubMed Google Scholar
114 Akselrod-Ballin A, Chorev M, Shoshan Y. et al. Predicting breast cancer by applying deep learning to linked health records and mammograms. Radiology 2019; 292 (02) 331-342

Crossref PubMed Google Scholar
115 Pacilè S, Lopez J, Chone P, Bertinotti T, Grouin JM, Fillard P. Improving breast cancer detection accuracy of mammography with the concurrent use of an artificial intelligence tool. Radiol Artif Intell 2020; 2 (06) e190208

Crossref PubMed Google Scholar
116 Watanabe AT, Lim V, Vu HX. et al. Improved cancer detection using artificial intelligence: a retrospective evaluation of missed cancers on mammography. J Digit Imaging 2019; 32 (04) 625-637

Crossref PubMed Google Scholar
117 Lotter W, Diab AR, Haslam B. et al. Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach. Nat Med 2021; 27 (02) 244-249

Crossref PubMed Google Scholar
118 Tsochatzidis L, Costaridou L, Pratikakis I. Deep learning for breast cancer diagnosis from mammograms-a comparative study. J Imaging 2019; 5 (03) 37

Crossref PubMed Google Scholar
119 Chakraborty DP, Breatnach ES, Yester MV, Soto B, Barnes GT, Fraser RG. Digital and conventional chest imaging: a modified ROC study of observer performance using simulated nodules. Radiology 1986; 158 (01) 35-39

Crossref PubMed Google Scholar
120 Uzun Ozsahin D, Ikechukwu Emegano D, Uzun B, Ozsahin I. The systematic review of artificial intelligence applications in breast cancer diagnosis. Diagnostics (Basel) 2022; 13 (01) 45

Crossref PubMed Google Scholar

Address for correspondence

Krithika Rangarajan, MD

Room 47A IRCH, All India Institute of Medical Sciences

New Delhi 10029

India

Email: krithikarangarajan86@gmail.com

Publication History

Article published online:
10 October 2023

© 2023. Indian Radiological Association. This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Thieme Medical and Scientific Publishers Pvt. Ltd.
A-12, 2nd Floor, Sector 2, Noida-201301 UP, India

References
1 Warren Burhenne LJ, Wood SA, D'Orsi CJ. et al. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000; 215 (02) 554-562

Crossref PubMed Google Scholar
2 Birdwell RL, Ikeda DM, O'Shaughnessy KF, Sickles EA. Mammographic characteristics of 115 missed cancers later detected with screening mammography and the potential utility of computer-aided detection. Radiology 2001; 219 (01) 192-202

Crossref PubMed Google Scholar
3 Birdwell RL, Bandodkar P, Ikeda DM. Computer-aided detection with screening mammography in a university hospital setting. Radiology 2005; 236 (02) 451-457

Crossref PubMed Google Scholar
4 Brem RF, Baum J, Lechner M. et al. Improvement in sensitivity of screening mammography with computer-aided detection: a multiinstitutional trial. Am J Roentgenol 2003; 181 (03) 687-693

Crossref PubMed Google Scholar
5 Freeman K, Geppert J, Stinton C. et al. Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy. BMJ 2021; 374: n1872

Crossref PubMed Google Scholar
6 Wang X, Liang G, Zhang Y, Blanton H, Bessinger Z, Jacobs N. Inconsistent performance of deep learning models on mammogram classification. J Am Coll Radiol 2020; 17 (06) 796-803

Crossref PubMed Google Scholar
7 miccai reproducibility - Google Search. Accessed May 16, 2022 at: https://www.google.com/search?q=miccai+reproducibility&oq=miccai+&aqs=chrome.1.69i57j35i39j0i512j0i20i263i512j0i512l6.4510j0j4&sourceid=chrome&ie=UTF-8
8 Mongan J, Moy L, Kahn Jr CE. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): a guide for authors and reviewers. Radiol Artif Intell 2020; 2 (02) e200029

Crossref PubMed Google Scholar
9 Li X, Xiong H, Li X. et al. Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond. Knowl Inf Syst 2022; 64 (12) 3197-3234

Crossref Google Scholar
10 McInnes MDF, Moher D, Thombs BD. et al; and the PRISMA-DTA Group. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: the PRISMA-DTA statement. JAMA 2018; 319 (04) 388-396

Crossref PubMed Google Scholar
11 Whiting PF, Rutjes AWS, Westwood ME. et al; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011; 155 (08) 529-536

Crossref PubMed Google Scholar
12 Kooi T, Litjens G, van Ginneken B. et al. Large scale deep learning for computer aided detection of mammographic lesions. Med Image Anal 2017; 35: 303-312

Crossref PubMed Google Scholar
13 Ribli D, Horváth A, Unger Z, Pollner P, Csabai I. Detecting and classifying lesions in mammograms with Deep Learning. Sci Rep 2018; 8 (01) 4165

Crossref PubMed Google Scholar
14 Dhungel N, Carneiro G, Bradley AP. A deep learning approach for the analysis of masses in mammograms with minimal user intervention. Med Image Anal 2017; 37: 114-128

Crossref PubMed Google Scholar
15 Al-Antari MA, Al-Masni MA, Choi MT, Han SM, Kim TS. A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification. Int J Med Inform 2018; 117: 44-54

Crossref PubMed Google Scholar
16 Al-Masni MA, Al-Antari MA, Park JM. et al. Detection and classification of the breast abnormalities in digital mammograms via regional Convolutional Neural Network. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017: 1230-1233

PubMed Google Scholar
17 Li C, Zhang D, Tian Z, Du S, Qu Y. Few-shot learning with deformable convolution for multiscale lesion detection in mammography. Med Phys 2020; 47 (07) 2970-2985

Crossref PubMed Google Scholar
18 Agarwal R, Díaz O, Yap MH, Lladó X, Martí R. Deep learning for mass detection in full field digital mammograms. Comput Biol Med 2020; 121: 103774

Crossref PubMed Google Scholar
19 Al-Masni MA, Al-Antari MA, Park JM. et al. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system. Comput Methods Programs Biomed 2018; 157: 85-94

Crossref PubMed Google Scholar
20 Bandeira Diniz JO, Bandeira Diniz PH, Azevedo Valente TL, Corrêa Silva A, de Paiva AC, Gattass M. Detection of mass regions in mammograms by bilateral analysis adapted to breast density using similarity indexes and convolutional neural networks. Comput Methods Programs Biomed 2018; 156: 191-207

Crossref PubMed Google Scholar
21 Jung H, Kim B, Lee I. et al. Detection of masses in mammograms using a one-stage object detector based on a deep convolutional neural network. PLoS One 2018; 13 (09) e0203355

Crossref PubMed Google Scholar
22 Savelli B, Bria A, Molinara M, Marrocco C, Tortorella F. A multi-context CNN ensemble for small lesion detection. Artif Intell Med 2020; 103: 101749

Crossref PubMed Google Scholar
23 Valvano G, Santini G, Martini N. et al. Convolutional neural networks for the segmentation of microcalcification in mammography imaging. J Healthc Eng 2019; 2019: 9360941

Crossref PubMed Google Scholar
24 Sarath CK, Chakravarty A, Ghosh N, Sarkar T, Sethuraman R, Sheet D. A two-stage multiple instance learning framework for the detection of breast cancer in mammograms. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020: 1128-1131

PubMed Google Scholar
25 Wang H, Feng J, Bu Q. et al. Breast mass detection in digital mammogram based on gestalt psychology. J Healthc Eng 2018; 2018: 4015613

Crossref PubMed Google Scholar
26 Cha KH, Petrick N, Pezeshk A. et al. Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning. J Med Imaging (Bellingham) 2020; 7 (01) 012703

PubMed Google Scholar
27 Wang J, Nishikawa RM, Yang Y. Global detection approach for clustered microcalcifications in mammograms using a deep learning network. J Med Imaging (Bellingham) 2017; 4 (02) 024501

Crossref PubMed Google Scholar
28 Shin SY, Lee S, Yun ID. et al. A novel cascade classifier for automatic microcalcification detection. PLoS One 2015; 10 (12) e0143725

Crossref PubMed Google Scholar
29 Wang J, Yang Y. A context-sensitive deep learning approach for microcalcification detection in mammograms. Pattern Recognit 2018; 78: 12-22

Crossref PubMed Google Scholar
30 Fathy W, Ghoneim A. A deep learning approach for breast cancer mass detection. Int J Adv Comput Sci Appl 2019; 10: 175-182

Google Scholar
31 Li H, Ye J, Liu H. et al. Application of deep learning in the detection of breast lesions with four different breast densities. Cancer Med 2021; 10 (14) 4994-5000

Crossref PubMed Google Scholar
32 Yi PH, Singh D, Harvey SC, Hager GD, Mullen LA. DeepCAT: deep computer-aided triage of screening mammography. J Digit Imaging 2021; 34 (01) 27-35

Crossref PubMed Google Scholar
33 Sun L, Sun H, Wang J, Wu S, Zhao Y, Xu Y. Breast mass detection in mammography based on image template matching and CNN. Sensors (Basel) 2021; 21 (08) 2855

Crossref PubMed Google Scholar
34 Aly GH, Marey M, El-Sayed SA, Tolba MF. YOLO based breast masses detection and classification in full-field digital mammograms. Comput Methods Programs Biomed 2021; 200: 105823

Crossref PubMed Google Scholar
35 Isfahani ZN, Jannat-Dastjerdi I, Eskandari F, Ghoushchi SJ, Pourasad Y. Presentation of novel hybrid algorithm for detection and classification of breast cancer using growth region method and probabilistic neural network. Comput Intell Neurosci 2021; 2021: 5863496

Crossref PubMed Google Scholar
36 Sarangi S, Rath NP, Sahoo HK. Mammogram mass segmentation and detection using Legendre neural network-based optimal threshold. Med Biol Eng Comput 2021; 59 (04) 947-955

Crossref PubMed Google Scholar
37 Liu Y, Zhou C, Zhang F. et al. Compare and contrast: detecting mammographic soft-tissue lesions with C²-Net. Med Image Anal 2021; 71: 101999

Crossref PubMed Google Scholar
38 Shen L, Margolies LR, Rothstein JH, Fluder E, McBride R, Sieh W. Deep learning to improve breast cancer detection on screening mammography. Sci Rep 2019; 9 (01) 12495

Crossref PubMed Google Scholar
39 Aboutalib SS, Mohamed AA, Berg WA, Zuley ML, Sumkin JH, Wu S. Deep learning to distinguish recalled but benign mammography images in breast cancer screening. Clin Cancer Res 2018; 24 (23) 5902-5909

Crossref PubMed Google Scholar
40 Becker AS, Marcon M, Ghafoor S, Wurnig MC, Frauenfelder T, Boss A. Deep learning in mammography: diagnostic accuracy of a multipurpose image analysis software in the detection of breast cancer. Invest Radiol 2017; 52 (07) 434-440

Crossref PubMed Google Scholar
41 Wang J, Yang X, Cai H, Tan W, Jin C, Li L. Discrimination of breast cancer with microcalcifications on mammography by deep learning. Sci Rep 2016; 6: 27327

Crossref PubMed Google Scholar
42 Arevalo J, González FA, Ramos-Pollán R, Oliveira JL, Guevara Lopez MA. Representation learning for mammography mass lesion classification with convolutional neural networks. Comput Methods Programs Biomed 2016; 127: 248-257

Crossref PubMed Google Scholar
43 Yala A, Schuster T, Miles R, Barzilay R, Lehman C. A deep learning model to triage screening mammograms: a simulation study. Radiology 2019; 293 (01) 38-46

Crossref PubMed Google Scholar
44 Samala RK, Chan HP, Hadjiiski LM, Helvie MA, Cha KH, Richter CD. Multi-task transfer learning deep convolutional neural network: application to computer-aided diagnosis of breast cancer on mammograms. Phys Med Biol 2017; 62 (23) 8894-8908

Crossref PubMed Google Scholar
45 He T, Puppala M, Ezeana CF. et al. A deep learning-based decision support tool for precision risk assessment of breast cancer. JCO Clin Cancer Inform 2019; 3: 1-12

Crossref PubMed Google Scholar
46 Kim ST, Lee JH, Lee H, Ro YM. Visually interpretable deep network for diagnosis of breast masses on mammograms. Phys Med Biol 2018; 63 (23) 235025

Crossref PubMed Google Scholar
47 Carneiro G, Nascimento J, Bradley AP. Automated analysis of unregistered multi-view mammograms with deep learning. IEEE Trans Med Imaging 2017; 36 (11) 2355-2365

Crossref PubMed Google Scholar
48 Chougrad H, Zouaki H, Alheyane O. Deep convolutional neural networks for breast cancer screening. Comput Methods Programs Biomed 2018; 157: 19-30

Crossref PubMed Google Scholar
49 Cai H, Huang Q, Rong W. et al. Breast microcalcification diagnosis using deep convolutional neural network from digital mammograms. Comput Math Methods Med 2019; 2019: 2717454

Google Scholar
50 Bruno A, Ardizzone E, Vitabile S, Midiri M. A novel solution based on scale invariant feature transform descriptors and deep learning for the detection of suspicious regions in mammogram images. J Med Signals Sens 2020; 10 (03) 158-173

Crossref Google Scholar
51 Arora R, Rai PK, Raman B. Deep feature-based automatic classification of mammograms. Med Biol Eng Comput 2020; 58 (06) 1199-1211

Crossref PubMed Google Scholar
52 Zhang X, Zhang Y, Han EY. et al. Classification of whole mammogram and tomosynthesis images using deep convolutional neural networks. IEEE Trans Nanobiosci 2018; 17 (03) 237-242

Crossref PubMed Google Scholar
53 Sun W, Tseng TB, Zhang J, Qian W. Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data. Comput Med Imaging Graph 2017; 57: 4-9

Crossref PubMed Google Scholar
54 Muramatsu C, Nishio M, Goto T. et al. Improving breast mass classification by shared data with domain transformation using a generative adversarial network. Comput Biol Med 2020; 119: 103698

Crossref PubMed Google Scholar
55 Shen Y, Wu N, Phang J. et al. An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization. Med Image Anal 2021; 68: 101908

Crossref PubMed Google Scholar
56 Guan S, Loew M. Breast cancer detection using synthetic mammograms from generative adversarial networks in convolutional neural networks. J Med Imaging (Bellingham) 2019; 6 (03) 031411

PubMed Google Scholar
57 Antropova N, Huynh BQ, Giger ML. A deep feature fusion methodology for breast cancer diagnosis demonstrated on three imaging modality datasets. Med Phys 2017; 44 (10) 5162-5171

Crossref PubMed Google Scholar
58 Agnes SA, Anitha J, Pandian SIA, Peter JD. Classification of mammogram images using multiscale all convolutional neural network (MA-CNN). J Med Syst 2019; 44 (01) 30

Crossref PubMed Google Scholar
59 Jadoon MM, Zhang Q, Haq IU, Butt S, Jadoon A. Three-class mammogram classification based on descriptive CNN features. BioMed Res Int 2017; 2017: 3640901

Crossref PubMed Google Scholar
60 Zhang C, Zhao J, Niu J, Li D. New convolutional neural network model for screening and diagnosis of mammograms. PLoS One 2020; 15 (08) e0237674

Crossref PubMed Google Scholar
61 Shu X, Zhang L, Wang Z, Lv Q, Yi Z. Deep neural networks with region-based pooling structures for mammographic image classification. IEEE Trans Med Imaging 2020; 39 (06) 2246-2255

Crossref PubMed Google Scholar
62 Lång K, Dustler M, Dahlblom V, Åkesson A, Andersson I, Zackrisson S. Identifying normal mammograms in a large screening population using artificial intelligence. Eur Radiol 2021; 31 (03) 1687-1692

Crossref PubMed Google Scholar
63 Kooi T, Karssemeijer N. Classifying symmetrical differences and temporal change for the detection of malignant masses in mammography using deep neural networks. J Med Imaging (Bellingham) 2017; 4 (04) 044501

PubMed Google Scholar
64 Samala RK, Chan HP, Hadjiiski L, Helvie MA. Risks of feature leakage and sample size dependencies in deep feature extraction for breast mass classification. Med Phys 2021; 48 (06) 2827-2837

Crossref PubMed Google Scholar
65 Duggento A, Aiello M, Cavaliere C. et al. An ad hoc random initialization deep neural network architecture for discriminating malignant breast cancer lesions in mammographic images. Contrast Media Mol Imaging 2019; 2019: 5982834

Crossref PubMed Google Scholar
66 Sawyer Lee R, Dunnmon JA, He A, Tang S, Ré C, Rubin DL. Comparison of segmentation-free and segmentation-dependent computer-aided diagnosis of breast masses on a public mammography dataset. J Biomed Inform 2021; 113: 103656

Crossref PubMed Google Scholar
67 Cogan T, Cogan M, Tamil L. RAMS: remote and automatic mammogram screening. Comput Biol Med 2019; 107: 18-29

Crossref PubMed Google Scholar
68 Ragab DA, Sharkas M, Marshall S, Ren J. Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ 2019; 7: e6201

Crossref PubMed Google Scholar
69 Shen Y, Wu N, Phang J. et al. Globally-aware multiple instance classifier for breast cancer screening. Mach Learn Med Imaging 2019; 11861: 18-26

Crossref PubMed Google Scholar
70 Qiu Y, Yan S, Gundreddy RR. et al. A new approach to develop computer-aided diagnosis scheme of breast mass classification using deep learning technology. J XRay Sci Technol 2017; 25 (05) 751-763

Google Scholar
71 Teare P, Fishman M, Benzaquen O, Toledano E, Elnekave E. Malignancy detection on mammography using dual deep convolutional neural networks and genetically discovered false color input enhancement. J Digit Imaging 2017; 30 (04) 499-505

Crossref PubMed Google Scholar
72 Guan Y, Wang X, Li H. et al. Detecting asymmetric patterns and localizing cancers on mammograms. Patterns (N Y) 2020; 1 (07) 100106

Crossref PubMed Google Scholar
73 Rodriguez-Ruiz A, Lång K, Gubern-Merida A. et al. Stand-alone artificial intelligence for breast cancer detection in mammography: comparison with 101 radiologists. J Natl Cancer Inst 2019; 111 (09) 916-922

Crossref PubMed Google Scholar
74 Huynh BQ, Li H, Giger ML. Digital mammographic tumor classification using transfer learning from deep convolutional neural networks. J Med Imaging (Bellingham) 2016; 3 (03) 034501

Crossref PubMed Google Scholar
75 Rodríguez-Ruiz A, Krupinski E, Mordang JJ. et al. Detection of breast cancer with mammography: effect of an artificial intelligence support system. Radiology 2019; 290 (02) 305-314

Crossref PubMed Google Scholar
76 Schaffter T, Buist DSM, Lee CI. et al; and the DM DREAM Consortium. Evaluation of combined artificial intelligence and radiologist assessment to interpret screening mammograms. JAMA Netw Open 2020; 3 (03) e200265

Crossref PubMed Google Scholar
77 Sasaki M, Tozaki M, Rodríguez-Ruiz A. et al. Artificial intelligence for breast cancer detection in mammography: experience of use of the ScreenPoint Medical Transpara system in 310 Japanese women. Breast Cancer 2020; 27 (04) 642-651

Crossref PubMed Google Scholar
78 P S, R T. Aiding the digital mammogram for detecting the breast cancer using shearlet transform and neural network. Asian Pac J Cancer Prev 2018; 19 (09) 2665-2671

PubMed Google Scholar
79 Sepandi M, Taghdir M, Rezaianzadeh A, Rahimikazerooni S. Assessing breast cancer risk with an artificial neural network. Asian Pac J Cancer Prev 2018; 19 (04) 1017-1019

PubMed Google Scholar
80 Rodriguez-Ruiz A, Lång K, Gubern-Merida A. et al. Can we reduce the workload of mammographic screening by automatic identification of normal exams with artificial intelligence? A feasibility study. Eur Radiol 2019; 29 (09) 4825-4832

Crossref PubMed Google Scholar
81 Kim HE, Kim HH, Han BK. et al. Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study. Lancet Digit Health 2020; 2 (03) e138-e148

Crossref PubMed Google Scholar
82 Mednikov Y, Nehemia S, Zheng B, Benzaquen O, Lederman D. Transfer representation learning using inception-V3 for the detection of masses in mammography. Annu Int Conf IEEE Eng Med Biol Soc 2018; 2018: 2587-2590

PubMed Google Scholar
83 Melekoodappattu JG, Subbian PS. A hybridized ELM for automatic micro calcification detection in mammogram images based on multi-scale features. J Med Syst 2019; 43 (07) 183

Crossref PubMed Google Scholar
84 Chen X, Zargari A, Hollingsworth AB, Liu H, Zheng B, Qiu Y. Applying a new quantitative image analysis scheme based on global mammographic features to assist diagnosis of breast cancer. Comput Methods Programs Biomed 2019; 179: 104995

Crossref PubMed Google Scholar
85 Arevalo J, Gonzalez FA, Ramos-Pollan R, Oliveira JL, Guevara Lopez MA. Convolutional neural networks for mammography mass lesion classification. Annu Int Conf IEEE Eng Med Biol Soc 2015; 2015: 797-800

PubMed Google Scholar
86 Taghanaki SA, Kawahara J, Miles B, Hamarneh G. Pareto-optimal multi-objective dimensionality reduction deep auto-encoder for mammography classification. Comput Methods Programs Biomed 2017; 145: 85-93

Crossref PubMed Google Scholar
87 Sert E, Ertekin S, Halici U. Ensemble of convolutional neural networks for classification of breast microcalcification from mammograms. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017: 689-692

PubMed Google Scholar
88 Tan M, Qian W, Pu J, Liu H, Zheng B. A new approach to develop computer-aided detection schemes of digital mammograms. Phys Med Biol 2015; 60 (11) 4413-4427

Crossref PubMed Google Scholar
89 Saraswathi D, Srinivasan E. A CAD system to analyse mammogram images using fully complex-valued relaxation neural network ensembled classifier. J Med Eng Technol 2014; 38 (07) 359-366

Crossref PubMed Google Scholar
90 Suh YJ, Jung J, Cho BJ. Automated breast cancer detection in digital mammograms of various densities via deep learning. J Pers Med 2020; 10 (04) E211

Crossref PubMed Google Scholar
91 Jiao Z, Gao X, Wang Y, Li J. A deep feature based framework for breast masses classification. Neurocomputing 2016; 197: 221-231

Crossref Google Scholar
92 Kooi T, van Ginneken B, Karssemeijer N, den Heeten A. Discriminating solitary cysts from soft tissue lesions in mammography using a pretrained deep convolutional neural network. Med Phys 2017; 44 (03) 1017-1027

Crossref PubMed Google Scholar
93 Duraisamy S, Emperumal S. Computer-aided mammogram diagnosis system using deep learning convolutional fully complex-valued relaxation neural network classifier. IET Comput Vis 2017; 11 (08) 656-662

Crossref Google Scholar
94 Kurek J, Swiderski B, Osowski S, Kruk M, Barhoumi W. Deep learning versus classical neural approach to mammogram recognition. Bull Pol Acad Sci Tech Sci 2018; 66: 831-840

Google Scholar
95 Jiao Z, Gao X, Wang Y, Li J. A parasitic metric learning net for breast mass classification based on mammography. Pattern Recognit 2018; 75: 292-301

Crossref Google Scholar
96 Al-antari MA, Al-masni MA, Park SU. et al. An automatic computer-aided diagnosis system for breast cancer in digital mammograms via deep belief network. J Med Biol Eng 2018; 38 (03) 443-456

Crossref Google Scholar
97 Niu J, Li H, Zhang C, Li D. Multi-scale attention-based convolutional neural network for classification of breast masses in mammograms. Med Phys 2021; 48 (07) 3878-3892

Crossref PubMed Google Scholar
98 Rajathi GM. Optimized radial basis neural network for classification of breast cancer images. Curr Med Imaging 2021; 17 (01) 97-108

Crossref PubMed Google Scholar
99 Khan HN, Shahid AR, Raza B, Dar AH, Alquhayz H. Multi-view feature fusion based four views model for mammogram classification using convolutional neural network. IEEE Access 2019; 7: 165724-165733

Crossref Google Scholar
100 Medjeded M, Saïd M, Chenine A, Chikh M. A new triplet convolutional neural network for classification of lesions on mammograms. Rev Intell Artif 2019; 33: 213-217

Google Scholar
101 Sun L, Wang J, Hu Z, Xu Y, Cui Z. Multi-view convolutional neural networks for mammographic image classification. IEEE Access 2019; 7: 126273-126282

Crossref Google Scholar
102 Yu S, Liu L, Wang Z, Dai G, Xie Y. Transferring deep neural networks for the differentiation of mammographic breast lesions. Sci China Technol Sci 2019; 62 (03) 441-447

Crossref Google Scholar
103 Tavakoli N, Karimi M, Norouzi A, Karimi N, Samavi S. Soroushmehr SMR. Detection of abnormalities in mammograms using deep features. J Ambient Intell Humaniz Comput 2019; 14 (05) 5355-5367

Crossref Google Scholar
104 Jo YY, Choi YS, Park HW. et al. Impact of image compression on deep learning-based mammogram classification. Sci Rep 2021; 11 (01) 7924

Crossref PubMed Google Scholar
105 Cui Y, Li Y, Xing D, Bai T, Dong J, Zhu J. Improving the prediction of benign or malignant breast masses using a combination of image biomarkers and clinical parameters. Front Oncol 2021; 11: 629321

Crossref PubMed Google Scholar
106 Yang J, Li H, Shi N, Zhang Q, Liu Y. Microscopic tumour classification by digital mammography. J Healthc Eng 2021; 2021: 6635947

Crossref PubMed Google Scholar
107 Chouhan N, Khan A, Shah JZ, Hussnain M, Khan MW. Deep convolutional neural network and emotional learning based breast cancer detection using digital mammography. Comput Biol Med 2021; 132: 104318

Crossref PubMed Google Scholar
108 Ragab DA, Attallah O, Sharkas M, Ren J, Marshall S. A framework for breast cancer classification using Multi-DCNNs. Comput Biol Med 2021; 131: 104245

Crossref PubMed Google Scholar
109 Tsochatzidis L, Koutla P, Costaridou L, Pratikakis I. Integrating segmentation information into CNN for breast cancer diagnosis of mammographic masses. Comput Methods Programs Biomed 2021; 200: 105913

Crossref PubMed Google Scholar
110 Zhang X, Liang C, Zeng D. et al. Pattern classification for breast lesion on FFDM by integration of radiomics and deep features. Comput Med Imaging Graph 2021; 90: 101922

Crossref PubMed Google Scholar
111 Rehman KU, Li J, Pei Y, Yasin A, Ali S, Mahmood T. Computer vision-based microcalcification detection in digital mammograms using fully connected depthwise separable convolutional neural network. Sensors (Basel) 2021; 21 (14) 4854

Crossref PubMed Google Scholar
112 Salim M, Wåhlin E, Dembrower K. et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol 2020; 6 (10) 1581-1588

Crossref PubMed Google Scholar
113 McKinney SM, Sieniek M, Godbole V. et al. International evaluation of an AI system for breast cancer screening. Nature 2020; 577 (7788) 89-94

Crossref PubMed Google Scholar
114 Akselrod-Ballin A, Chorev M, Shoshan Y. et al. Predicting breast cancer by applying deep learning to linked health records and mammograms. Radiology 2019; 292 (02) 331-342

Crossref PubMed Google Scholar
115 Pacilè S, Lopez J, Chone P, Bertinotti T, Grouin JM, Fillard P. Improving breast cancer detection accuracy of mammography with the concurrent use of an artificial intelligence tool. Radiol Artif Intell 2020; 2 (06) e190208

Crossref PubMed Google Scholar
116 Watanabe AT, Lim V, Vu HX. et al. Improved cancer detection using artificial intelligence: a retrospective evaluation of missed cancers on mammography. J Digit Imaging 2019; 32 (04) 625-637

Crossref PubMed Google Scholar
117 Lotter W, Diab AR, Haslam B. et al. Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach. Nat Med 2021; 27 (02) 244-249

Crossref PubMed Google Scholar
118 Tsochatzidis L, Costaridou L, Pratikakis I. Deep learning for breast cancer diagnosis from mammograms-a comparative study. J Imaging 2019; 5 (03) 37

Crossref PubMed Google Scholar
119 Chakraborty DP, Breatnach ES, Yester MV, Soto B, Barnes GT, Fraser RG. Digital and conventional chest imaging: a modified ROC study of observer performance using simulated nodules. Radiology 1986; 158 (01) 35-39

Crossref PubMed Google Scholar
120 Uzun Ozsahin D, Ikechukwu Emegano D, Uzun B, Ozsahin I. The systematic review of artificial intelligence applications in breast cancer diagnosis. Diagnostics (Basel) 2022; 13 (01) 45

Crossref PubMed Google Scholar

Permissions and Reprints

Supplementary Material

Supplementary Material

Subscribe to RSS

Share / Bookmark

Reproducibility and Explainability of Deep Learning in Mammography: A Systematic Review of Literature

Abstract

Keywords

Introduction

Materials and Methods

Information Sources

Eligibility Criteria

Data Extraction and Quality Assessment

Data Analysis and Summary Measures

Results

Literature Search and Study Selection

Data Extraction (Initial Assessment)

Quality Assessment

Details of mQUADAS-2 assessment of included studies

Clinical Study Design

Details of studies with adequate clinical design as per mQUADAS-2 tool

Common Practices Used for AI Model Design and Training

Analysis of models employed by studies with adequate clinical design

Performance of AI

Reported results for various tasks performed by the AI networks

Discussion

Common Practices for Model Design

Common Practices for Clinical Design

Overall Assessment of Position of AI in Breast Imaging

Conflict of Interest

Acknowledgment

Supplementary Material

References

Address for correspondence

Publication History

References