OBSERVATION ON TRAINING NEURAL NETWORK FOR DIAGNOSING SCHIZOPHRENIA

Artificial neural networks may be deployed to diagnose illnesses, including mental illness. One such mental illness is schizophrenia, which is characterized by persistent delusions, hallucinations, disorganized speech, highly disorganized or catatonic behaviour and negative symptoms. It is commonly believed that one or two hidden layers in a neural network are sufficient to classify data, and that more hidden layers may be avoided because the network then takes longer to converge. However, we demonstrate that beyond a certain size of the hidden layer(s), it is harmful to deploy more than one layer: not only does it take longer for the network parameters to converge, but the classification performance also deteriorates sharply with more than one hidden layer.


I. INTRODUCTION
Schizophrenia is a serious mental disorder [1] that may be diagnosed by administering a psychometric test to the subject. The results of the test are captured on a thirty-item scale known as the Positive and Negative Syndrome Scale (PANSS) [2]. Originally, each item on the scale was assigned a rating in the range of one to seven; but because of certain disadvantages of that system, the rating is now assigned between zero and six [3].
Artificial intelligence is increasingly being used to diagnose diseases such as diabetes, chest disease and urological dysfunction [4] [5] [6] [7]. It may also be used to diagnose mental illnesses such as schizophrenia. At the heart of the diagnostic system is the artificial neural network, which can classify subjects as either schizophrenic or otherwise based on their thirty-dimensional PANSS ratings. A simple perceptron may be deployed to classify data that is linearly separable, but if the classes are not linearly separable, the data must be mapped into a higher-dimensional space [8]. This is where the multi-layer perceptron (MLP) comes in. The MLP has one input layer, one output layer and one or more hidden layers. The number of nodes in the input layer is equal to the dimension of the input data, which in our case is thirty; the number of nodes in the output layer is one; and the number of nodes in each hidden layer, as well as the number of hidden layers, may be varied to get optimum classification performance out of the MLP.
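To make the layer structure concrete, the following is a minimal sketch of a forward pass through an MLP with thirty inputs, one hidden layer and one output. The activation functions (tanh in the hidden layer, logistic sigmoid at the output), the random weights and the names mlp_forward and random_layer are illustrative assumptions and are not taken from the paper; the input ratings are made up.

```python
import math
import random


def mlp_forward(x, layers):
    """Propagate one input vector through a list of (weights, biases) layers."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        # Weighted sum plus bias for every node in this layer.
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        # Assumed activations: tanh in hidden layers, sigmoid at the output.
        activation = [1.0 / (1.0 + math.exp(-v)) if is_output else math.tanh(v)
                      for v in z]
    return activation


def random_layer(n_in, n_out, rng):
    """Create an untrained layer with small random weights (illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


rng = random.Random(0)
N_I, N_H, N_O = 30, 4, 1  # input, hidden and output layer sizes
layers = [random_layer(N_I, N_H, rng), random_layer(N_H, N_O, rng)]

# Made-up PANSS-like ratings in the range 0..6; not real patient data.
panss = [rng.randint(0, 6) for _ in range(N_I)]
score = mlp_forward(panss, layers)[0]
print(f"untrained network output: {score:.3f}")
```

Adding a second hidden layer in this sketch amounts to inserting another random_layer(N_H, N_H, rng) entry between the two existing layers.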
Researchers have worked towards establishing the ideal number of hidden layers as well as the ideal number of nodes in each hidden layer. If N_t is the number of training samples, N_i is the number of input nodes, N_h is the number of neurons in the hidden layer and N_o is the number of output nodes, then values of N_h may be arrived at by the methods of Li, Chow and Yu [9] [10] [11]; Tamura and Tateishi [9] [10] [12]; Xu and Chen [9] [10] [13]; Shibata and Ikeda [9] [10] [14]; Sheela and Deepa [9] [10]; and Trenn [9] [10]. Apart from the above, there are several rules of thumb [15] [16], viz.
• Size of hidden layer must be between sizes of input layer and output layer.
• Size of hidden layer must be two thirds the size of the input layer plus the size of the output layer.
• Size of hidden layer must be less than twice the size of the input layer.
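As a quick worked example, the three rules of thumb can be evaluated for the network considered here, with thirty input nodes and one output node. The sketch below is illustrative only: the function name is hypothetical, and the second rule is read as (2/3) x input size + output size, which is one possible reading of the wording.

```python
def rule_of_thumb_sizes(n_in, n_out):
    """Evaluate the three hidden-layer-size rules of thumb for given layer sizes."""
    return {
        # Rule 1: hidden size lies between the output and input layer sizes.
        "between": (min(n_in, n_out), max(n_in, n_out)),
        # Rule 2 (assumed reading): two thirds of the input size, plus the output size.
        "two_thirds_rule": round(2 * n_in / 3 + n_out),
        # Rule 3: hidden size must stay below twice the input size.
        "strict_upper_bound": 2 * n_in,
    }


sizes = rule_of_thumb_sizes(n_in=30, n_out=1)
print(sizes)
```

Under these assumptions the rules suggest a hidden layer of between 1 and 30 nodes, of 21 nodes, and of fewer than 60 nodes, respectively.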
As for the number of the hidden layers, the common consensus is that one or two layers are adequate for most situations [15].

II. METHOD
We created and trained multi-layer perceptrons with the MATLAB neural network toolbox, using a set of 960 samples. The samples were generated synthetically with the help of a fuzzy expert system and some custom MATLAB code. Different PANSS readings were provided as inputs to the fuzzy expert system for diagnosing schizophrenia, and the output for each reading was noted. Each reading was categorized as typical of schizophrenia or otherwise according to the output, which was assessed by a qualified psychiatrist. The MATLAB software randomly distributed these samples into training, validation and testing sets. The training algorithm used was the Levenberg-Marquardt algorithm, and mean squared error was the error criterion. Several neural network models with varying numbers of hidden nodes were created. First, the training was done on a neural network with a certain number of nodes in a single hidden layer. Then the training was repeated on a neural network with two hidden layers and the same number of nodes in each hidden layer. The same training pairs were used to train all the models.
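The arithmetic behind this setup can be sketched as follows: a random division of the 960 samples into three sets, and the number of trainable weights and biases in the one-hidden-layer and two-hidden-layer variants of a network. The 70/15/15 split ratios, the seed and the helper names are assumptions made for illustration; the split actually used is not stated here. The counts are plain arithmetic and are not offered as the explanation for the reported results.

```python
import random


def split_indices(n_samples, train_pct=70, val_pct=15, seed=0):
    """Randomly divide sample indices into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding in the set sizes.
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]))


train, val, test = split_indices(960)  # assumed 70/15/15 division
for n_h in (4, 40):
    one_layer = count_parameters([30, n_h, 1])
    two_layers = count_parameters([30, n_h, n_h, 1])
    print(f"N_h={n_h}: {one_layer} parameters with one hidden layer, "
          f"{two_layers} with two; {len(train)} training samples")
```

With these assumed ratios the training set holds 672 samples. For 40 hidden nodes the single-hidden-layer network has 1281 parameters and the two-hidden-layer network has 2921; for 4 hidden nodes the counts are 129 and 149.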

III. RESULTS AND DISCUSSION
The validation error obtained in each training instance is tabulated below:
It is observed that if the number of neurons in the hidden layer is greater than 35, the validation performance deteriorates sharply when a second hidden layer is added. It is commonly believed that adding more hidden layers is overkill in the sense that it does not improve performance. Our work demonstrates that, for hidden layers of this size, adding a second hidden layer not only fails to improve performance but causes it to decline sharply.
For example, the training performance graphs and error histograms for N_h = 40 are given below:
In contrast, the neural network's ability to generalize is not impacted significantly if the number of neurons in the hidden layer is relatively low. For example, the training performance curves and the error histograms for N_h = 4 are given below:

IV. CONCLUSION
When deploying an artificial neural network in software, the user has the flexibility of adding as many or as few neurons as desired. However, a situation may arise wherein the user must use a hardware implementation of a neural network [16] [17] [18], which does not offer the same flexibility. Under such a circumstance, the user may continue to get good performance out of the neural network even if the number of neurons in the hidden layer is larger than what is recommended, provided the neural network has just one hidden layer. If multiple hidden layers are present, however, the validation performance will be unacceptably poor. When it comes to classifying PANSS data for schizophrenia, the user should therefore avoid neural networks with more than one hidden layer.

V. ACKNOWLEDGMENT
We would like to thank Dr. Subhadip Bharati, MD, for his role in helping us obtain the training data.
No special funding was received for this project.