Background
Patients with late-onset Pompe disease [glycogen storage disease II, acid-maltase deficiency (MP)], motor neuron disease, muscular dystrophy, or other neuromuscular diseases frequently experience diagnostic delay [1-4]. The rarity of these diseases, together with clinical variability, atypical presentations, and lack of time for a thorough examination and medical history taking, contributes to the delay in diagnosis. In patients with late-onset MP, the diagnostic latency can be more than 20 years [5]. For amyotrophic lateral sclerosis (ALS) patients, the median time from onset of the first symptom to diagnosis has been reported to be 11 months [6, 7].
The past medical history offers important clues for diagnosing neuromuscular diseases. Indeed, medical history taking is one of the oldest arts in medicine, but the introduction of new reimbursement systems has resulted in less time for communication between physicians and patients and relatives [8]. One goal of this study was therefore to integrate the past medical history into a diagnostic tool and to combine it with modern statistical technologies. In addition, to incorporate the patient’s point of view, we explored the past medical history using questions that were created systematically following interviews focusing on the pre-diagnostic time period. Likewise, the practical experiences of the patient should be closely integrated into the diagnostic process.
We aimed to develop a computerized diagnostic support tool for earlier identification of neuromuscular diseases. In our previous work, we exploited useful scenarios for medical diagnostic support and generated a novel diagnostic support tool for the pediatric emergency department [9]. This ‘emergency tool’ used 14 clinical (e.g. body temperature, blood pressure, pain) and 12 laboratory parameters (e.g. blood count, CRP level, blood-gas analysis) to produce a possible diagnosis. In that study, the system had a diagnostic accuracy between 81 and 97 % for 17 diagnoses such as meningitis, appendicitis, and pneumonia. Although successful, this tool excluded important parts of the past medical history. Therefore, we intended to develop a tool focusing on patients’ perceptions and experiences. In the current project on diagnostic support for individuals with selected neuromuscular diseases, we incorporated patients’ pre-diagnostic experiences and observations to collect answer patterns using questionnaires. Data mining methods then proved to be a reliable tool for answer pattern recognition. This novel tool could serve as diagnostic support for general practitioners (GPs) to shorten the time to diagnosis in patients with uncommon neuromuscular diseases.
Methods
Study design and interviews
In this multicenter prospective pilot study, we tested whether the patient experience explored via a questionnaire could provide diagnostic support for selected rare neuromuscular diseases characterized by long diagnostic latency. First, to gain insight into the patient’s viewpoint during the pre-diagnostic phase, interviews with 16 patients with different neuromuscular diseases [MP, ALS, and muscular dystrophy (MD)] were performed across Germany between September 2011 and February 2012 by two authors (US and LG). These semi-structured (narrative) interviews lasted between 45 min and 2.5 h and started with the same initial question (“Please tell us everything that comes to mind before your diagnosis was established. Relay to us everything you consider to be of any importance: your observations and experiences that you would like to share”). At the conclusion of the patient’s narrative, the interviewer could ask additional questions to elucidate more details.
All interviews were digitally recorded, transcribed, and analyzed according to Colaizzi’s techniques [10]. Consequently, an inductive system of categories was developed reflecting the pre-diagnostic phenomena (experiences, symptoms, and/or observations). Examples of pre-diagnostic phenomena are given in Table 1. The process of how the interviews were analyzed to yield a question is illustrated in Supplemental Table 1 for one category.
Table 1
Examples for pre-diagnostic experiences and the process of categorization
Interview excerpt | Category | Resulting question |
“My husband enjoys hiking, but for me, steep trails were extremely difficult to manage. I needed to rest often and he would get impatient and cross with me. But what could I do – there was simply no strength in my legs!” | Gait/gait pattern | Can you easily walk uphill? |
“Sports in school were simply a nightmare for me. Youth sport meets or any competitive sport exasperated me. Especially those activities that required quick movements were a major fail for me” | Sport activities and training | When you were young were you able to keep up in sports? |
“During military service we were forced to pass a fitness course. In addition to other challenges, we had to climb over a six-foot wall. Lifting my body over the barrier was impossible. So I waited until the sergeant was not looking and I would instead run around the barricade.” | Conscious or unconscious compensation of disability | Did you have to “cheat” such as using alternative muscles when performing certain activities? |
Ethical considerations
The ethics committees of Hannover medical university (Ethikkommission der Medizinischen Hochschule Hannover, head: Prof. Dr. H.D. Tröger) and Bochum medical university (Ethik-Kommission der Ruhr Universität Bochum, head: Prof. Dr. M. Zenz) approved the study. All patients gave informed consent for the interviews and all individuals answering the questionnaire gave their informed consent to participate.
Systematic analysis of the interviews and creation of a questionnaire
Two researchers (US and LG) reviewed and analyzed the interviews. Utilizing techniques described by Colaizzi, patients’ observations were then systematically categorized. A stepwise qualitative analysis was undertaken, including extraction of significant phrases, reduction of the phrases into their essential structures, generation of a question from the essential structure, and validation of questions by the interviewees. To organize the observations and create a questionnaire that would reflect the important experiences, we classified the content of the interviews into different categories. We also added a step that is not part of Colaizzi’s stepwise analysis and created a question reflecting the pre-diagnostic experiences (Additional file 1). Based on these categories, questions were generated, resulting in a questionnaire that reflected all categories and thus all the pre-diagnostic phenomena of the interviewees. In close dialogue with patient support groups, the questionnaire was limited to a maximum of two pages and designed to be completed in less than ten minutes. The answers in the questionnaire were scaled from 1 (“absolutely not true”) to 6 (“completely true”). All interviewees as well as patients who were not interviewed evaluated the questions and made suggestions to improve the comprehensibility of the final version of the questionnaire, which consisted of 46 questions. Five questions from the questionnaire are shown in Table 2 and the complete questionnaire is provided in the appendix.
Table 2
Example of questions used for diagnosing selected neuromuscular diseases
No. | Question |
1 | Were you ever diagnosed with an elevated CK level (creatine kinase, a muscle enzyme)? |
2 | Have your liver parameter/enzymes ever been elevated without apparent reason? |
3 | Is it particularly challenging to walk uphill? |
4 | Do you have difficulties standing up from a crouch? |
5 | Do you often stumble when you walk or do your feet feel “sticky”? |
| Do people describe your walk as “funny” or “particular”? |
Collection of answered questionnaires
After formulation of the questionnaire, patients with an established diagnosis (based on standard criteria) of the selected neuromuscular diseases were invited between March 2013 and November 2013 to complete the questionnaire through our neurological outpatient clinic or via local patient group sites. The diagnostic groups were muscular dystrophy and myotonia (MdMy) [including patients with Duchenne and Becker muscular dystrophy, oculopharyngeal muscular dystrophy (OPMD), proximal myotonic myopathy (PROMM), facioscapulohumeral MD, limb-girdle-MD, myotonia congenita Thomsen], MP, spinal muscular atrophy (SMA), ALS, polyneuropathy (PNP), and other neuromuscular diseases [including patients with chronic progressive external ophthalmoplegia (CPEO)-plus, polymyositis, Ullrich congenital muscular dystrophy, Miyoshi myopathy, Friedreich ataxia, primary lateral sclerosis (PLS), and spinal and bulbar muscular atrophy (SBMA)]. To facilitate participation, a web-based platform was created to answer the questionnaire. Individuals without neuromuscular disease were treated as a seventh diagnostic group. During this first period, 210 completed questionnaires were collected and used for cross-validation and later as a training set to predict the correct diagnosis for 64 new patients in a second step.
Prospective study and extension of the system
In the second step, a prospective multicenter study with different neurological clinics was conducted between October 2013 and October 2014. Sixty-four patients with an established diagnosis of MdMy, MP, SMA, ALS, or PNP completed the questionnaire. The questionnaires were answered and collected in different hospitals in Hannover and Bochum, Germany. Only patients with McA disease were contacted via patient groups.
Data mining techniques
Finding the right diagnosis based on the answer patterns in the questionnaires can be seen as a multiclass classification problem. The target attribute was the diagnosis, and the attributes used for the prediction were the answers to the questions, which are given on an ordinal scale. Most classifiers are designed to handle either numerical or categorical attributes. Therefore, the ordinal scale was interpreted as a numerical scale.
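As a minimal sketch of this numerical interpretation, the following Python fragment turns one completed questionnaire into a numeric feature vector. The function name and the validation rules are illustrative assumptions; only the number of questions (46) and the answer scale (1 to 6) are taken from the text.

```python
from typing import List, Sequence

N_QUESTIONS = 46                # length of the final questionnaire (from the text)
SCALE_MIN, SCALE_MAX = 1, 6     # 1 = "absolutely not true", 6 = "completely true"


def encode_answers(answers: Sequence[int]) -> List[float]:
    """Interpret ordinal answers as numerical features (hypothetical helper)."""
    if len(answers) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} answers, got {len(answers)}")
    for answer in answers:
        if not SCALE_MIN <= answer <= SCALE_MAX:
            raise ValueError(f"answer {answer} is outside the 1-6 scale")
    # The ordinal codes are used directly as numbers; no one-hot encoding.
    return [float(answer) for answer in answers]
```

Treating the codes as numbers keeps the order of the answer options, at the cost of assuming equal spacing between adjacent options.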
Classifiers are based on different assumptions of how the classes (the diagnoses) can be identified or separated. For instance, linear discriminant analysis is based on the assumption that each class is represented by a multivariate normal distribution, whereas a decision tree assumes that the classes can be separated by axis-parallel hyperplanes. None of these assumptions really fits the questionnaire data set. Therefore, no single classifier was chosen but rather an ensemble of classifiers.
Classifier ensembles [32] (i.e. combinations of different classification algorithms) often lead to better predictions. The application of classifier ensembles in the context of support for medical diagnosis has been described previously [9]. In the current study, however, we used a combination of eight distinct classifiers (support vector machine, artificial neural network, fuzzy rule-based, random forest, logistic regression, linear discriminant analysis, naive Bayes, and nearest neighbor) to enhance the accuracy of the diagnosis. The selection of the classifiers was based on the authors’ many years of experience in evaluating medical data.
Although various classifiers are available, they fall into a few main groups that share a similar underlying mathematical concept. The selected classifiers implement different mathematical assumptions and a diversity of algorithm structures.
In a first step, each questionnaire was evaluated by six different classifier algorithms. Each classifier returned a vector of probability values, one for each of the seven diagnoses. For a patient showing symptoms specific to one of the seven diagnoses, a majority of the six classifiers returned an identical result.
For most questionnaires, a fusion algorithm was necessary to perform a weighted majority voting. Each classifier delivered a disease number as well as a corresponding probability value for each assumed diagnosis. Summing the probabilities of all classifiers for each diagnosis yielded a score, so that the diagnosis with the maximum sum had the highest relative probability. This diagnosis was chosen if its score exceeded a certain value.
With the probability p(d,c) for the diagnosis d calculated by the classifier c (c = 1, …, 6), the diagnosis of the fusion classifier is given by:
\[ \underset{d}{\mathrm{argmax}} \left\{ \sum_{c=1}^{6} p(d,c) \right\} \]
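The fusion rule can be illustrated with a short Python sketch. It is not the Java/R implementation used in the study. The group labels, the function name `fuse`, and the threshold parameter `min_score` are assumptions introduced for illustration, and the text does not state the actual threshold value.

```python
from typing import Dict, List, Optional

# Hypothetical labels for the seven groups: five named disease groups,
# "other" neuromuscular diseases, and "none" (no neuromuscular disease).
DIAGNOSES = ["MdMy", "MP", "SMA", "ALS", "PNP", "other", "none"]


def fuse(prob_vectors: List[Dict[str, float]], min_score: float) -> Optional[str]:
    """Sum p(d, c) over all classifiers c and return the argmax over d.

    Returns None when the best score does not exceed min_score
    (the threshold is an assumed parameter, not a value from the study).
    """
    scores = {d: 0.0 for d in DIAGNOSES}
    for probs in prob_vectors:              # one probability vector per classifier
        for d in DIAGNOSES:
            scores[d] += probs.get(d, 0.0)  # missing entries count as probability 0
    best = max(DIAGNOSES, key=lambda d: scores[d])
    return best if scores[best] > min_score else None
```

For example, with three toy classifier outputs that assign ALS the probabilities 0.7, 0.4, and 0.8, the summed ALS score is 1.9, so `fuse` returns "ALS" for `min_score=1.5` and `None` for `min_score=2.5`.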
Evaluation of the classifier ensemble was based on a 21-fold stratified cross-validation algorithm and on case studies of patients who entered the hospital without a known final diagnosis. The models were developed and tested in Java, with function calls to libraries of the R statistics software package.
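The idea behind stratified cross-validation can be sketched as follows: the questionnaires of each diagnostic group are distributed evenly over the folds, so that every fold roughly preserves the group proportions. The Python fragment below is an illustrative assumption about how such a split can be built; the function name, the seed, and the dealing scheme are not taken from the study.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[str], k: int, seed: int = 0) -> List[List[int]]:
    """Assign sample indices to k folds while balancing each class across folds."""
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    position = 0  # running counter, so overall fold sizes stay balanced as well
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)                 # random order within each class
        for idx in members:
            folds[position % k].append(idx)  # deal indices out like cards
            position += 1
    return folds
```

In each round, one fold serves as the test set and the remaining k - 1 folds as the training set. With 210 questionnaires and k = 21, every fold contains 10 questionnaires.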
Discussion
The first main finding of this study is that patients with selected neuromuscular diseases could be identified or distinguished using data mining in conjunction with answer pattern analysis from a newly developed questionnaire. Second, the results of the study support the notion that data mining methods show plasticity and expandability, making this approach a promising tool for modern diagnostics. Indeed, the diagnostic accuracy of the tool was nearly 90 %, depending on the diagnostic group. Good results for NPV and PPV could be reached but need confirmation in a larger-scale study. These preliminary results support our hypothesis that medical history taking, which was simulated here using selected questions, together with modern computational methods can effectively assist the physician in generating a diagnosis.
Diagnostic support is needed for neuromuscular diseases due to a lack of experience with these disease entities among GPs and even many subspecialists. Often the diagnosis is delayed. A recent report on patients with oculopharyngeal muscular dystrophy by Scotland et al. demonstrated a prolonged time frame, up to 20 years, before the diagnosis was made [2]. The reasons for the delay were multiple, including patient denial, nonspecific symptoms, clinical variability, and rarity of the disease [13-16]. However, the role of the GP as gatekeeper must be highlighted as well [1, 11, 12]. New systems to remind medical gatekeepers of rare diseases are highly desirable, and multiple reports addressing delays in diagnosis in different disease groups underscored this issue [6, 7, 11].
Computer-aided diagnostic support dates back to the 1980s [17]. Using databases and statistical algorithms, scientists attempted to reduce diagnostic mistakes and enhance diagnostic accuracy [18-21]. Despite some success, daily real-life application was limited and most diagnoses are still made by the practitioner without the assistance of computerized programs. In addition, these initial computer-based diagnostic tools had drawbacks. First, the programming of rules to update any expert system is time-consuming and the number of rules to be incorporated in such a system rises exponentially such that data entry is often impracticable [22]. Moreover, self-assessment by doctors has the potential to inadvertently reinforce false concepts to the exclusion of other plausible ideas [23-27]. These barriers were successfully addressed in our project by utilizing self-learning data mining methods and transferring the data entry to patients who simply answer the questionnaire while waiting to see the doctor. This structure also takes advantage of the patient as being an expert on his/her own health.
Unfortunately, the clues for diagnosis are often lost in the physician-patient communication or the physician simply does not appreciate the patient’s perspective fully [28, 29]. Exploring the past medical history thoroughly is a cornerstone of the medical evaluation, but it is hampered by lack of time and misunderstanding between health professionals and patients [30, 31]. On the other hand, patients with rare chronic diseases are experts in detecting the signs and symptoms of their disease. Careful attention to patients’ experiences as related to their disease gives important hints for additional work-up. These ideas were successfully integrated into our diagnostic support tool using questions developed from patients’ pre-diagnostic experiences [29].
The diagnostic delay in patients with neuromuscular disorders is influenced by the treating physician at first encounter [2]. A neurologist might not need a diagnostic support tool for detecting neuromuscular diseases, but for a GP this could be different. The patient with certain key symptoms (e.g. fatigue, cramps, muscle twitching/fasciculations, tripping, slurred speech, or muscle weakness) could answer the questionnaire in the waiting room. The putative diagnosis would be immediately displayed to the physician who could then consider the suggested diagnosis and explore the past medical history in more detail to help refute or substantiate the diagnosis and request additional laboratory or radiological exams prior to referring the patient to a subspecialist.
Our study has certain limitations, however. First, we conducted interviews and collected questionnaires from a heterogeneous group of individuals, and the number was small. This might have resulted in a selection bias in the final questions. Importantly, some observations are not reflected in the current questionnaire. Although this may reflect the daily work of a GP who cannot ask all possible questions, it also reveals the constraints of a questionnaire-based diagnostic tool. Second, the tool under investigation does not render a definitive diagnosis but rather directs the GP to a diagnostic group. The treating physician can prompt further testing to reach a definitive diagnosis. Of note, we chose only six neuromuscular diseases where diagnostic delay is common, and many other conditions with similar symptoms cannot be diagnosed with this tool at the current time. In addition, one might criticize the system for overfitting and as such being biased toward detecting certain diseases much better than a simple muscle ache. This may be partially remedied by prospective testing and expansion of the system with new diagnoses (e.g. McA, MMN, and IBM). Notably, the pilot evaluation of nine patients without a diagnosis resulted in high quality diagnostic suggestions. Third, the prospective trial included only patients with an established neuromuscular disease but no patients with other diagnoses, e.g. chronic cardiac or pulmonary diseases, that mimic a neuromuscular disorder.
The training data set of 210 questionnaires and the prospective test set of 64 patients were relatively small and did not represent all possible disease manifestations or all possible neuromuscular diseases. Particularly in the group of patients with muscular dystrophies, we collected questionnaires from patients with different diagnoses who were then pooled into one larger group, resulting in more heterogeneity in the group. The next challenge for the system will be to detect individuals with fibromyalgia and pulmonological or psychosomatic disorders, which will be addressed in a future trial. However, as a surprising proof of concept, our data showed that it is possible to generate a diagnostic hint of neuromuscular diseases by computer-based analysis of answer patterns. In contrast, internet symptom searches for self-diagnosis showed disappointing results for motor neuron diseases [33]. The application of data mining techniques improved the diagnostic quality in selected clinical scenarios [34]. Recently, the combination of questionnaires and data mining techniques proved very successful for diagnosing rare pulmonary diseases in children [35]. A randomized study performed by Kostopoulou and co-workers recently demonstrated the beneficial effects of computerized support on the diagnostic accuracy of GPs, indicating the potential value of clinical decision support systems (CDSS) for clinical usage [36]. A similar study is planned with the tool under investigation here to analyze its benefit for clinical use.
Acknowledgments
We are grateful for the support from the patient groups. Special thanks to Mr. Schwagenscheidt and Mr. Schaller (glycogen storage disease self-help group), Mrs. Wolter and Mr. Reher (ALS self-help group, Syke, Germany), and the German Muscle Disease patient group (DGM; Mr. Bösche and Mrs. and Mr. Schulz).
Dr. Hans Hartmann and Dr. Martina Huemer are gratefully acknowledged for valuable discussion of the data. The Elternverein Krebskranke Kinder Hannover e.V. funded parts of WL’s work. Mr. Axel Weiser provided valuable support in designing the questionnaire. Dr. Mwe-Mwe Chao provided excellent support for the revised version of the manuscript.
The Robert Bosch Stiftung (Stuttgart, Germany) is funding further research to improve the diagnosis of rare diseases.
Competing interests
The study was funded by Genzyme Sanofi. WL, LG, and FK are co-founders of Improved Medical Diagnostics IMD GmbH, Hannover, Germany.
Authors’ contributions
LG and US developed the study concept and design. US, LG, and SM acquired the data. WL programmed the data mining algorithm. FK, LG, US, and WL analyzed the data. XK and FK carried out the p-value computation. LG, KK, SP, and US interpreted the data. AG, CSG, TL, CK, and SM collected the prospective data. SM and LG designed the prospective study. LG, FK, WL, and US drafted the manuscript. SP, RD, and KK diagnosed and treated patients in Hannover. All authors were involved in the revision of the manuscript. All authors read and approved the final manuscript.