Journal of Medical Signals & Sensors

Year : 2019  |  Volume : 9  |  Issue : 3  |  Page : 174--180

Diagnosis of common headaches using hybrid expert-based systems

Monire Khayamnia1, Mohammadreza Yazdchi2, Aghile Heidari3, Mohsen Foroughipour4
1 Department of Mathematics, Payame Noor University, Tehran, Iran
2 Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran
3 Department of Mathematics, School of Mathematics, Mashhad Payame Noor University, Mashhad, Iran
4 Department of Neurology, Faculty of Medicine, Neurology School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran

Correspondence Address:
Dr. Monire Khayamnia
Payame Noor University, Tehran


Background: Headache is one of the most common medical complaints, with numerous underlying causes and many patterns of presentation. The first step toward treatment is correct recognition of the headache type. This article considers the diagnosis of primary and secondary headaches and evaluates the use of intelligent techniques and soft computing to predict the diagnosis of common headaches. Methods: A fuzzy expert-based system for the diagnosis of common headaches, trained by the Learning-From-Examples (LFE) algorithm, is presented. The Mamdani model was used in the fuzzy inference engine, with Max–Min as the Or–And operators and the Centroid method for defuzzification. In addition, common headaches were analyzed with two classification techniques: diagnosis methods based on a support vector machine (SVM) and on a multilayer perceptron (MLP) are proposed. The classifiers were used to recognize four types of common headache: migraine, tension headache, headache as a result of infection, and headache as a result of increased intracranial pressure. Results: Using a dataset obtained from 190 patients suffering from primary and secondary headaches, who were enrolled at a medical center in Mashhad, the diagnostic fuzzy system was trained by the LFE algorithm. On average, 123 If-Then rules were produced for the fuzzy system, and the system recognized the headache type correctly at a rate of 85%. With the MLP- and SVM-based decision support systems, the accuracy of classification into the four types reached 88% with the MLP and 90% with the SVM classifier. The performance of all methods was evaluated using classification accuracy, precision, sensitivity, and specificity.
Conclusion: Because linguistic rules may be incomplete when human experts express their knowledge, and given the similarity of common headache symptoms and the importance of early diagnosis, the LFE training algorithm is more effective than a system based on human expert rules. The favorable results obtained by implementing and evaluating the suggested medical decision support systems based on the MLP and SVM show that intelligent techniques can be very useful for recognizing common headaches with similar symptoms.

How to cite this article:
Khayamnia M, Yazdchi M, Heidari A, Foroughipour M. Diagnosis of common headaches using hybrid expert-based systems. J Med Signals Sens 2019;9:174-180

How to cite this URL:
Khayamnia M, Yazdchi M, Heidari A, Foroughipour M. Diagnosis of common headaches using hybrid expert-based systems. J Med Signals Sens [serial online] 2019 [cited 2020 Aug 6 ];9:174-180
Available from:



The need for improved efficiency in the use of diagnostic systems has long been documented, and it is especially clear for illnesses with closely similar symptoms. Headache is one of the most common reasons for neurological consultation and has many causes and symptoms. There are two categories of headache: primary and secondary. A primary headache, such as migraine or tension headache, is not caused by another disease or underlying health problem. A secondary headache is caused by an associated disease, as in headache resulting from infection or from an increase in intracranial pressure (ICP).[1],[2],[3],[4],[5],[6] Headache can be recognized through its signs and symptoms, but because different headache types share similar symptoms and markers, less experienced physicians may make diagnostic mistakes. In addition to characterizing the pain, most physicians perform a clinical examination, which comprises blood pressure and pulse measurement, a complete examination of the central nervous system, and ophthalmoscopy. Sometimes specific diagnostic procedures are also necessary to identify the cause of the pain, such as computed tomography, magnetic resonance imaging, electroencephalography (EEG), visual examination, and blood tests. These tests are prescribed to rule out other causes of the pain.[6],[7],[8] Fuzzy systems can play a valuable role in the recognition of headaches.[9],[10],[11],[12],[13],[14],[15] Fuzzy If-Then rules can be obtained in two ways: they can be elicited from a human expert, or they can be generated by automatic learning methods.
If-Then rules obtained from a human expert may be incomplete, or the expert's experience may not cover every case; for this reason, systems based solely on human expert knowledge may have limited validity. In recent years, many approaches have been proposed for generating fuzzy If-Then rules from training data.[16],[17],[18] One of these approaches is the Learning-From-Examples (LFE) algorithm.[18]

The use of classifier systems in medical decision support systems is increasing gradually and has shown great potential in medical diagnosis. Classification systems can reduce errors in the recognition of disease, shorten the time to diagnosis, and improve diagnostic accuracy. Multilayer perceptron (MLP) neural networks and support vector machines (SVM) are being used in medical diagnosis.[19],[20] Artificial neural networks (ANNs) have been used in the recognition of urological dysfunctions, heart disease, and psychiatric disorders.[21],[22],[23] In urology, prostate cancer serves as a good example of the application of ANNs.[24] SVM has been used successfully as a learning method for classification, for example in the diagnosis of erythemato-squamous diseases and of breast cancer.[25],[26]

The objective of this study was to evaluate the sensitivity and specificity of three different diagnostic strategies: a fuzzy expert system whose If-Then rules were obtained by the LFE algorithm, an MLP neural network classifier, and an SVM classifier. In addition, we compared the ability of these strategies to achieve an absolute diagnostic test accuracy of >88%.

The rest of the article is organized as follows. Section 2 describes the methods, summarizes the related work on common headache diagnosis, and describes the LFE algorithm, basic MLP neural network concepts, and the SVM method. The experimental results are reported in Section 3. The methodology and experiments are discussed in Section 4. Finally, the conclusions are summarized in Section 5.


We reviewed our database of patients suffering from headache who presented to Ghaem Hospital in Mashhad, Iran, from April 2014 to January 2016. The data were collected from the patients' medical records. Patients eligible for inclusion were consecutive adults (>10 years) with headache symptoms. All patients were evaluated and screened for study eligibility by the first author (ENE) prior to study entry. This was a convenience sample of patients with headache; patients were enrolled when the first author was present in the physicians' office. In our preliminary study on a sample of 30 patients, the diagnostic ability of the system was estimated at 76%; in other words, the estimated proportion is p = 0.76. The sample size was then calculated using the Cochran formula for an unknown population size, which gave n = 190. The Cochran formula for an unknown population size is $n = z_{1-\alpha/2}^{2}\, p(1-p)/d^{2}$, in which $z_{1-\alpha/2}$ depends on the confidence level and the error level α. When the error level is 0.05, the confidence level is 0.95, and consequently $z_{1-\alpha/2} = 1.96$; the margin of error was taken as d = 0.05.
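As a minimal sketch of the sample-size calculation, the function below evaluates the Cochran formula for an unknown population size, assuming the standard form n = z² p(1 − p)/d². The function name and the rounding step are illustrative assumptions, not part of the study. With the inputs quoted in the text (p = 0.76, z = 1.96, d = 0.05) the formula returns about 280 rather than the reported 190, so the inputs actually used by the authors may have differed from those quoted.

```python
import math


def cochran_sample_size(p: float, z: float, d: float) -> float:
    """Cochran sample size for an unknown population: z^2 * p * (1 - p) / d^2."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if d <= 0.0:
        raise ValueError("d must be positive")
    return (z ** 2) * p * (1.0 - p) / (d ** 2)


# Values quoted in the text; the result is rounded up to a whole patient.
n_raw = cochran_sample_size(p=0.76, z=1.96, d=0.05)
n_required = math.ceil(n_raw)
print(f"raw n = {n_raw:.2f}, rounded up = {n_required}")
```

The result is very sensitive to the margin of error d, because d enters the formula squared.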

Related work on common headache diagnosis

Ambiguity and uncertainty are inherent in medical knowledge and therefore affect how that knowledge is modeled. Fuzzy logic is well suited to describing vague and imprecise concepts, and for this reason it can be used for system modeling.

In earlier work, a fuzzy expert system with 12 input parameters and 4 output parameters [Table 1] and 237 If-Then rules obtained from a human expert was used for the recognition of common headaches. The Mamdani model was used in the fuzzy inference engine, with Max–Min as the OR–AND operators and the Centroid method for defuzzification. The accuracy of this fuzzy expert system was 80% on data obtained from 190 patients.[12] However, an expert may well be unable to express his or her knowledge explicitly and accurately. Another approach is to generate fuzzy rules through a machine-learning process. In recent years, many approaches have been proposed for generating fuzzy If-Then rules from training data.[16],[17],[18] One of these approaches is the LFE algorithm.[18]{Table 1}
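The Mamdani inference scheme described above can be sketched as follows. This is a toy illustration only: it uses two hypothetical rules, two hypothetical inputs (intensity on a 0 to 10 scale and duration in hours), one output score on [0, 1], and triangular membership functions chosen for convenience. None of these sets or rules are taken from the study. The minimum acts as AND, the maximum aggregates the clipped rule outputs, and the centroid of the aggregated set is the crisp output.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mamdani_score(intensity, duration_h, steps=100):
    """Two-rule Mamdani inference: min as AND, max as aggregation, centroid output."""
    # Rule 1 (hypothetical): IF intensity is severe AND duration is long THEN score is high.
    fire_high = min(tri(intensity, 5, 10, 15), tri(duration_h, 24, 72, 120))
    # Rule 2 (hypothetical): IF intensity is mild AND duration is short THEN score is low.
    fire_low = min(tri(intensity, -5, 0, 5), tri(duration_h, -24, 0, 24))

    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps  # sample point of the output universe [0, 1]
        # Clip each output set at its rule's firing strength, then aggregate with max.
        mu = max(min(fire_high, tri(y, 0.5, 1.0, 1.5)),
                 min(fire_low, tri(y, -0.5, 0.0, 0.5)))
        num += y * mu
        den += mu
    # Centroid defuzzification; None when no rule fires.
    return num / den if den > 0 else None


print(mamdani_score(9, 60))  # severe, long-lasting case
print(mamdani_score(1, 6))   # mild, short case
```

A full system along the lines of the text would have 12 inputs, 4 outputs, and several hundred rules, but each rule would be evaluated in the same way.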

Learning-From-Examples algorithm

Wang and Mendel proposed a five-step algorithm for generating fuzzy rules by learning from examples.[18]

In the LFE algorithm, Step 1 divides the input and output spaces into fuzzy regions; Step 2 generates fuzzy rules from the given desired input–output data pairs; Step 3 assigns a degree to each generated rule; Step 4 forms the combined fuzzy rule base; and Step 5 presents a defuzzifying procedure for obtaining a mapping based on the combined fuzzy rule base.

Nearly 90% of the available data were used as training data and 10% as testing data. The training data are used to teach the system and produce the If-Then rules; the testing data are used to evaluate how well the fuzzy inference generalizes.

Assume that the set A contains the following members:

$$A = \{a_1, a_2, \ldots, a_m\} \qquad (1)$$

Each member $a_i$ of this set comprises 12 input parameters and 4 output parameters, as shown below:

$$a_i = ([x_{i1}, x_{i2}, \ldots, x_{i12}],\; y_i) \qquad (2)$$

First, the membership degree of each parameter $x_{i1}, x_{i2}, \ldots, x_{i12}, y_i$ in the different fuzzy sets is calculated. Each parameter is then assigned to the fuzzy set in which it has the maximum degree, which yields one rule for each input–output data pair. For example, the membership function of headache severity is shown in [Figure 1].{Figure 1}

Next, a degree is assigned to each rule according to the following formula:

$$D(\text{rule}_i) = \mu(x_{i1})\, \mu(x_{i2}) \cdots \mu(x_{i12})\, \mu(y_i) \qquad (3)$$

If more than one rule has the same antecedent (IF) part but a different consequent, only the rule with the highest degree is kept, which yields a compact rule base.
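The rule-generation steps above (assign each value to its best-matching fuzzy set, compute the rule degree as a product of memberships, and resolve conflicts by keeping the highest degree) can be sketched as follows. The fuzzy regions, the two-feature samples, and all names are hypothetical and chosen only for illustration; the study itself uses 12 input features.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy regions for inputs on a 0..10 scale and an output on 0..1.
IN_SETS = {"low": (-5, 0, 5), "medium": (0, 5, 10), "high": (5, 10, 15)}
OUT_SETS = {"no": (-1, 0, 1), "yes": (0, 1, 2)}


def best_label(value, sets):
    """Return (label, degree) of the fuzzy set in which value has maximum membership."""
    return max(((name, tri(value, *abc)) for name, abc in sets.items()),
               key=lambda pair: pair[1])


def learn_rules(samples):
    """Build a rule base {antecedent labels: (consequent label, degree)} from data pairs."""
    rules = {}
    for inputs, output in samples:
        labels, degree = [], 1.0
        for x in inputs:
            name, mu = best_label(x, IN_SETS)
            labels.append(name)
            degree *= mu  # rule degree is the product of memberships, as in Eq. 3
        out_name, out_mu = best_label(output, OUT_SETS)
        degree *= out_mu
        key = tuple(labels)
        # Conflict resolution: same antecedent, keep only the rule with the higher degree.
        if key not in rules or degree > rules[key][1]:
            rules[key] = (out_name, degree)
    return rules


samples = [((9, 1), 1.0), ((8, 2), 0.0), ((1, 9), 0.0)]
for antecedent, (consequent, degree) in learn_rules(samples).items():
    print(antecedent, "->", consequent, round(degree, 2))
```

In this toy data set the first two samples map to the same antecedent with different consequents, so only the one with the higher degree survives in the rule base.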

Multilayer perceptron neural networks

MLP is one of the most commonly used neural network architectures in medical decision support systems because of features such as its ability to learn, ease of implementation, and fast operation. It belongs to the class of supervised neural networks.[27]

An MLP network consists of three or more layers of neurons: an input layer that receives external inputs, one or more hidden layers, and an output layer that generates the classification results [Figure 2]. The jth neuron in the first hidden layer computes the weighted sum of its inputs, adds a “bias” term (θj), transforms this sum through a suitable mathematical “transfer function,” and passes the result to the neurons in the next layer. The whole process is defined as follows:{Figure 2}

$$y_j = f_j\left(\sum_{i=1}^{p} w_{ji}\, x_i + \theta_j\right) \qquad (4)$$


Where $x_1, x_2, \ldots, x_p$ are the inputs, $\theta_j$ is the bias, $w_{ji}$ is the connection weight between the input $x_i$ and the jth neuron, $f_j$ is the transfer function of the jth neuron, and $y_j$ is its output. Various transfer functions are available; the most common choice is the sigmoid, as defined in Eq. 5.

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (5)$$
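A minimal sketch of the neuron computation just described, assuming a sigmoid transfer function for every neuron. The function names and the example weights are illustrative and are not taken from the study.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Output of one neuron: transfer function applied to the weighted sum plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def layer_output(inputs, weight_rows, biases):
    """Outputs of a whole layer; weight_rows[j] holds the weights of neuron j."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Example with 2 inputs and 3 hidden neurons (arbitrary weights).
hidden = layer_output([1.0, 2.0],
                      [[0.5, -0.25], [0.1, 0.1], [-1.0, 0.3]],
                      [0.0, 0.2, 0.1])
print(hidden)
```

Stacking calls to layer_output, with the output of one layer used as the input of the next, gives the forward pass of a multilayer network.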


The network is trained with the back propagation (BP) algorithm with an adaptive learning rate and a momentum constant. BP is one of the simplest methods for the supervised training of an MLP.[27],[28] The basic BP algorithm runs as follows:

1. All the connection weights W are initialized with small random values from a pseudorandom sequence generator.
2. Until the error E is below a preset value or the gradient $\partial E(t)/\partial W$ is smaller than a preset value, the following three basic steps are repeated:

a. The update is computed by $\Delta W(t) = -\eta\, \partial E(t)/\partial W$.
b. The weights are updated using $W(t+1) = W(t) + \Delta W(t)$.
c. The error $E(t+1)$ is computed.

Where t is the iteration number, W is the connection weight, and η is the learning rate. The error E can be chosen as the mean square error function between the actual output yj and the desired output dj:


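To achieve faster learning and avoid oscillation during the search for the minimum on the error surface, an additional momentum term is added to the weight update, which becomes:

$$\Delta W(t) = -\eta\, \frac{\partial E(t)}{\partial W} + \alpha\, \Delta W(t-1) \qquad (7)$$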


Where α is the “momentum” term and 0 < α < 1. The learning rate can be updated using [INSIDE:8], where η0 is a preset learning rate and λ > 0.
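The update loop with a momentum term can be sketched on a deliberately tiny problem: a single linear neuron y = w·x fitted to a few data pairs. This is an illustration under stated assumptions, not the network used in the study: the error is taken as half the sum of squared differences, the learning rate η is kept constant (the adaptive schedule is not reproduced here), and the function name and default values are hypothetical.

```python
def train_with_momentum(samples, eta=0.05, alpha=0.5, epochs=200, tol=1e-12):
    """Gradient descent with momentum for one weight w in the model y = w * x."""
    w = 0.0
    delta_prev = 0.0
    error = 0.5 * sum((d - w * x) ** 2 for x, d in samples)
    for _ in range(epochs):
        # Gradient of E = 1/2 * sum (d - w*x)^2 with respect to w.
        grad = sum(-(d - w * x) * x for x, d in samples)
        # Momentum update: new step = gradient step + alpha * previous step.
        delta = -eta * grad + alpha * delta_prev
        w += delta
        delta_prev = delta
        error = 0.5 * sum((d - w * x) ** 2 for x, d in samples)
        if error < tol:  # stop once the error is below the preset value
            break
    return w, error


# Data generated by d = 2 * x, so the weight should approach 2.
w, err = train_with_momentum([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(w, err)
```

In a full MLP the same update is applied to every connection weight, with the gradients obtained by back propagation through the layers.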

Support vector machine

Vapnik proposed the SVM, which has been studied extensively for classification.[29] The SVM algorithm is a binary classifier that can be generalized to a multiclass classifier. Assume that the training vectors belong to two linearly separable classes,

$$D = \{(x_1, y_1), \ldots, (x_l, y_l)\}, \quad x_i \in \mathbb{R}^n, \; y_i \in \{-1, +1\} \qquad (8)$$


Where $x_i$ is an n-dimensional input vector and $y_i$ is a label that determines the class of $x_i$. The SVM is based on a separating hyperplane, which is represented by Eq. 9.

$$W \cdot X + b = 0 \qquad (9)$$



W is the weight vector, which is orthogonal to the hyperplane

X represents the input vector

b represents a bias or a threshold

The parameters W and b are constrained by

$$\min_i \left| W \cdot x_i + b \right| = 1 \qquad (10)$$


that in canonical form must satisfy the following relations,

$$y_i (W \cdot x_i + b) \geq 1, \quad i = 1, \ldots, l \qquad (11)$$


A separating hyperplane with a large margin is defined by (12)


Where $x_1$ and $x_2$ represent two support vectors. Maximizing the margin is equivalent to minimizing $\|W\|^2$.

Hence, if the data can be separated perfectly, we can solve the following optimization problem:

$$\text{Minimize } \|W\|^2 \qquad (13)$$

subject to $y_i (W \cdot x_i + b) \geq 1$

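To deal with the nonseparable case, one can rewrite the problem as:

$$\text{Minimize } \|W\|^2 + C \sum_{i=1}^{l} \xi_i \qquad (14)$$

subject to $y_i (W \cdot x_i + b) \geq 1 - \xi_i$, $\xi_i \geq 0$,

where the $\xi_i$ are slack variables and C is a user-defined positive finite constant.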
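To make the role of the constant C concrete, the sketch below evaluates a soft-margin objective for a given hyperplane. It assumes the usual soft-margin form, in which each training point gets a slack equal to the amount by which it violates the margin constraint, and the objective is ‖W‖² plus C times the total slack (written without a factor 1/2, to match Eq. 13). The function name and the sample points are hypothetical.

```python
def soft_margin_objective(w, b, samples, C):
    """Return (objective, slacks) for ||w||^2 + C * sum(slack_i).

    samples is a list of (x, y) pairs with x a tuple of features and y in {-1, +1}.
    """
    slacks = []
    for x, y in samples:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        # Slack is zero when the margin constraint y(w.x + b) >= 1 holds.
        slacks.append(max(0.0, 1.0 - margin))
    norm_sq = sum(wi * wi for wi in w)
    return norm_sq + C * sum(slacks), slacks


points = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), 1)]
objective, slacks = soft_margin_objective(w=(1.0, 0.0), b=0.0, samples=points, C=10.0)
print(objective, slacks)
```

A larger C penalizes margin violations more heavily, while a smaller C tolerates more violations in exchange for a wider margin. An actual SVM solver searches for the W and b that minimize this objective.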


Between April 2014 and January 2016, 213 patients were assessed for initial eligibility and invited to participate. [Figure 3] shows the flow of patients through the study. In total, 190 patients completed the study, with a completion rate of 89%. Of the 190 patients with headache, 133 had migraine headache, 19 had tension headache, 12 had headache as a result of infection, and 26 had headache as a result of increased ICP (IICP).{Figure 3}

The median age of the participants was 32 years (range: 17–65), and 112 participants (58.9%) were female.

Baseline demographic and clinical characteristics of the study participants are shown in [Table 2].{Table 2}

The fuzzy system designed by the LFE algorithm was evaluated on 190 medical records collected from patients suffering from the four headache types. The records were used to train and test the system by 10-fold cross-validation; on average, 123 fuzzy If-Then rules were obtained for the system, and it showed good agreement (82%). Two examples of the rules obtained are given below.

If fever is no, diplopia is no, convulsion is no, vomiting is no, aura is yes, worsening with a particular smell is yes, improving with inhalation is no, headache site is both sides, headache quality is not throbbing, headache intensity is severe or very severe, headache duration is from 4 h to 72 h, and headache history is some months, then the type of headache is migraine.

If fever is yes, diplopia is no, convulsion is no, vomiting is no, aura is no, worsening with a particular smell is yes, improving with inhalation is yes, headache site is both sides, headache quality is throbbing, headache intensity is severe or very severe, headache duration is from 72 h to 4 weeks, and headache history is some days, then the type of headache is headache as a result of infection.

The accuracy, precision, sensitivity, and specificity of this system are presented in [Table 3]. Overall, the value of Macro F-Score was 0.88.{Table 3}
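The evaluation measures named above can be computed from a multiclass confusion matrix as in the following sketch. The Macro F-Score is taken here as the unweighted mean of the per-class F1 values, which is one common definition and an assumption about how it was computed in the study. The confusion matrix in the example is made up and is not the study's data.

```python
def per_class_metrics(cm):
    """Per-class precision, sensitivity, specificity, and F1.

    cm[i][j] is the number of cases of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({"precision": precision, "sensitivity": sensitivity,
                        "specificity": specificity, "f1": f1})
    return results


def accuracy(cm):
    """Fraction of all cases that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    metrics = per_class_metrics(cm)
    return sum(m["f1"] for m in metrics) / len(metrics)


example_cm = [[8, 2], [1, 9]]  # made-up two-class example
print(accuracy(example_cm), macro_f1(example_cm))
```

Because the macro average weights every class equally, it can be noticeably lower than the accuracy when small classes are classified poorly.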

The MLP used in this study was designed with the WEKA software (developed at the University of Waikato, New Zealand). It consists of three layers, namely an input layer, a hidden layer, and an output layer, with 12 input variables, 15 hidden neurons, and 4 outputs; the activation function was the sigmoid function. The BP training parameters of the system are shown in [Table 4].{Table 4}

The SVM is a binary classifier that can be extended into a multiclass classifier. For four classes (A–D), four classifiers are necessary; for example, one SVM separates A from B, C, and D, and a second SVM separates B from A, C, and D. The multiclass classifier output codes for A, B, C, and D are (1, −1, −1, −1), (−1, 1, −1, −1), (−1, −1, 1, −1), and (−1, −1, −1, 1), respectively.
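The one-against-rest output codes can be generated and decoded as in the sketch below. The decoding rule, which picks the class whose binary classifier returns the largest decision value, is a common convention and an assumption here; the text does not state which rule was used. The decision values in the example are invented.

```python
CLASSES = ["A", "B", "C", "D"]


def output_code(label):
    """One-against-rest target code: +1 for the class itself, -1 for all others."""
    return tuple(1 if c == label else -1 for c in CLASSES)


def decode(scores):
    """Pick the class whose one-against-rest decision value is largest."""
    best = max(range(len(CLASSES)), key=lambda i: scores[i])
    return CLASSES[best]


for label in CLASSES:
    print(label, output_code(label))

# Hypothetical decision values from the four binary SVMs for one patient.
print(decode([-0.3, 1.2, -0.8, 0.1]))
```

Using the largest decision value also resolves cases in which no classifier, or more than one classifier, returns a positive output.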

The training algorithm of the SVMs is based on quadratic programming. The quadratic programming problem in the SVMs was solved by using the MATLAB optimization toolbox.

In this study, the classifiers proposed for medical decision-making were the MLP and the SVM. To compare the performance of the two classifiers, both were trained on the same training data set and tested on the same evaluation data set. The data set (190 patients) was divided into two separate data sets: the training data set and the testing data set. We used the most common measures to evaluate the effectiveness of our method, namely classification accuracy, precision, and sensitivity. Cross-validation estimated the accuracy of each MLP test run, and the mean accuracy based on the 10-fold cross-validation method is listed in [Table 5] and [Table 6]. The accuracy of the proposed MLP- and SVM-based decision support systems was 0.88 and 0.90, respectively. Overall, the Macro F-Score for the MLP and SVM was 0.76 and 0.81, respectively.{Table 5}{Table 6}
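A 10-fold cross-validation split of 190 records can be sketched as follows. The shuffling seed and the function name are assumptions, and the split is not stratified by headache type, which the study may or may not have done.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for fold_number, (train, test) in enumerate(k_fold_indices(190, k=10), start=1):
    print(fold_number, len(train), len(test))
```

Each record is used for testing exactly once, and the mean of the ten test accuracies is the cross-validated estimate.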


Headache is an almost universal experience: most of us have some kind of headache at some time in our lives. Headache can be a symptom of many illnesses, or it can be a disease in its own right, and for these reasons headaches are varied and widespread. Because different headaches have similar symptoms, less experienced doctors may make diagnostic mistakes. Systems that make use of existing knowledge can therefore provide important and valuable support for doctors. In this article, we have designed a fuzzy system using the LFE algorithm, and we have presented medical decision support systems based on the MLP neural network architecture and a multiclass SVM for common headache diagnosis. Given the importance of timely recognition and of obtaining satisfactory results, the results of these systems were compared. A headache database consisting of 190 cases was used in this study, and 10-fold cross-validation was applied to assess the generalization of these systems. Our two main hypotheses were that all qualitative variables, such as headache intensity, fever, and worsening with a particular smell, can be converted to quantitative variables, and that intelligent systems for the recognition of common headaches can be designed with 12 features.


This study aimed to recognize, predict, and diagnose prevalent headaches using soft-computing approaches and to design intelligent systems. Unlike hard-computing approaches, soft computing solves problems with strategies derived from human reasoning.

In this study, to classify and diagnose the four types of headache (migraine, tension, headache as a result of infection, and headache as a result of IICP) from features obtained from patients' medical records, a fuzzy system was built by the LFE algorithm without using expert knowledge, and its accuracy was 82%. In addition, two types of classifiers (MLP and SVM) were implemented. Cross-validation was applied to assess the generalization of the systems. The accuracy was 88% for the MLP and 90% for the SVM.

The results show that the multiclass SVM and the MLP neural network achieve high diagnostic accuracy, 90% and 88%, respectively. Taking the misclassification rates into consideration, they indicate that the multiclass SVM and the MLP can be used in the diagnosis of common headaches and are useful in medical decision support systems.

Given the importance of timely diagnosis and the deficiencies of linguistic rules obtained from experts, the LFE system can be used as a decision-making aid for diagnosing headache and headache types in hospitals, because better results were obtained with the LFE learning algorithm. Among the classifiers, the SVM model performed better than the MLP.

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.


Monire Khayamnia Ph.D. of Applied Mathematics, Tehran Payame Noor University, Tehran, Iran



Mohammadreza Yazdchi Associate Professor, Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran



Aghile Heidari Professor, Department of Mathematics, School of Mathematics, Mashhad Payame Noor University, Mashhad, Iran



Mohsen Foroughipour Associate Professor, Neurology school of medicine, Mashhad University of Medical Sciences, Mashhad, Iran



1. Ravishankar K. The art of history-taking in a headache patient. Ann Indian Acad Neurol 2012;15:S7-14.
2. Steiner TJ, Birbeck GL, Jensen RH, Katsarava Z, Stovner LJ, Martelletti P, et al. Headache disorders are third cause of disability worldwide. J Headache Pain 2015;16:58.
3. William C, Shiel Jr, MD, FACP, FACR, Headache Symptoms & Signs, 2015.
4. Cafasso J. Migraine Symptoms, Medically Reviewed by Steven Kim. Health Line, MD; 20 May, 2015.
5. Forsyth PA, Posner JB. Headaches in patients with brain tumors: A study of 111 patients. Neurology 1993;43:1678-83.
6. Dalessio DJ. Diagnosing the severe headache. Neurology 1994;44:S6-12.
7. Evans RW. Diagnostic testing for headache. Med Clin North Am 2001;85:865.
8. Godwin A, Villa J. Acute headache in the ED: Evidence-based evaluation and treatment options. Emerg Med Pract 2001;3:1-32.
9. Sikchi SS, Sikchi S, Ali MS. Fuzzy expert systems (FES) for medical diagnosis. Int J Comput Appl (0975-8887) 2013;63.
10. Yao JF, Yao JS. Fuzzy decision making for medical diagnosis based on fuzzy number and compositional rule of inference. Fuzzy Sets Syst 2001;120:351-66.
11. Sanchez E. Truth-qualification and fuzzy relations in natural languages, application to medical diagnosis. Fuzzy Sets Syst 1996;84:155-67.
12. Khayamnia M, Yazdchi M, Heidari A, Foroughipour M. Fuzzy expert system for diagnosis of common headaches with similar symptoms. MJMS 2017;5:1680-4.
13. Kim YH, Kim SK, Oh SY, Ahn JY. A fuzzy differential diagnosis of headache. J Korean Data Inf Sci Soc 2007;18:429-38.
14. Ahn JY, Kim YH, Kim SK. A fuzzy differential diagnosis of headache applying linear regression method and fuzzy classification. IEICE Trans Inf Syst 2003;E86-D:2790-3.
15. Ahn JY, Mun KS, Kim YH, Oh SY, Han BS. A fuzzy method for medical diagnosis of headache. IEICE Trans Inf Syst 2008;E91:1215-7.
16. Nomura H, Hayashi I, Wakami N. A learning method of fuzzy inference rules by descent method. IEEE International Conference on Fuzzy Systems. San Diego: CA, USA; 1992. p. 203-10.
17. Burkhardt DG, Bonissone PP. Automated fuzzy knowledge base generation and tuning. IEEE International Conference on Fuzzy Systems. San Diego: CA, USA; 1992. p. 179-88.
18. Wang LX, Mendel JM. Generating fuzzy rules by learning from examples. IEEE Trans Syst Man Cybern 1992;22:1414-27.
19. Amato F, López A, Peña-Méndez EM, Vaňhara P, Hampl A, Havel J. Artificial neural networks in medical diagnosis. J Appl Biomed 2013;11:47-58.
20. Bo G, Xianwu H. SVM multi-class classification. J Data Acquis Process 2006;3:47-52.
21. Gil D, Johnsson M, Chamizo JM, Paya AS, Fernandez DR. Application of artificial neural networks in the diagnosis of urological dysfunctions. Expert Syst Appl 2009;36:5754-60.
22. Yan H, Jiang Y, Zheng J, Peng C, Li Q. A multilayer perceptron-based medical decision support system for heart disease diagnosis. Expert Syst Appl 2006;30:272-81.
23. Peled A. Plasticity imbalance in mental disorders the neuroscience of psychiatry: Implications for diagnosis and research. Med Hypotheses 2005;65:947-52.
24. Batuello JT, Gamito EJ, Crawford ED, Han M, Partin AW, McLeod DG, et al. Artificial neural network model for the assessment of lymph node spread in patients with clinically localized prostate cancer. Urology 2001;57:481-5.
25. Übeyli ED. Multiclass support vector machines for diagnosis of erythemato-squamous diseases. Expert Syst Appl 2008;35:1733-40.
26. Akay MF. Support vector machines combined with feature selection for breast cancer diagnosis. Expert Syst Appl 2009;36:3240-7.
27. Bishop CM. Neural Networks for Pattern Recognition. Oxford: Oxford University Press; 1995.
28. Duda RO, Hart PE, Stork DG. Pattern Classification. New York: Wiley; 2001.
29. Vapnik V. The Nature of Statistical Learning Theory. New York: Springer-Verlag; 1995.