6.25.1 Percentage correctly classified
The potential buyers are ranked according to their predicted likelihood of repurchase, from most likely to least likely buyer. When an absolute cutoff value is chosen, all customers with a posterior probability of repurchase higher than the cutoff are classified as buyers, and all customers with a lower probability are labeled as non-buyers. The classifier's labels can then be cross-tabulated against the true status of each customer.
| True status | Predicted status: Buyer | Predicted status: Non-buyer |
|---|---|---|
| Buyer | True Positive (TP) | False Negative (FN) |
| Non-buyer | False Positive (FP) | True Negative (TN) |
Accuracy = (TP + TN) / (TP + TN + FP + FN)
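The cutoff rule and the accuracy formula can be sketched in a few lines of Python. This is only an illustration: the posterior probabilities, the true labels and the cutoff of 0.5 are hypothetical values chosen for the example, not figures from the study, and the function names are likewise assumptions.

```python
def classify(posteriors, cutoff):
    """Label a customer a buyer when the posterior probability exceeds the cutoff."""
    return [p > cutoff for p in posteriors]


def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) for boolean buyer/non-buyer labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)
    fn = sum(a and not p for a, p in pairs)
    fp = sum((not a) and p for a, p in pairs)
    tn = sum((not a) and (not p) for a, p in pairs)
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    """Percentage correctly classified, expressed as a fraction."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical data, already ranked from most to least likely buyer.
posteriors = [0.91, 0.80, 0.65, 0.40, 0.30, 0.10]
actual = [True, True, False, True, False, False]

predicted = classify(posteriors, cutoff=0.5)
tp, fn, fp, tn = confusion_counts(actual, predicted)
acc = accuracy(tp, fn, fp, tn)
print(tp, fn, fp, tn, round(acc, 4))
```

Moving the cutoff changes how many customers fall into each cell, so accuracy should always be reported together with the cutoff that produced it.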
6.26 Results of Quantitative Analysis
6.26.1 Multiple regression
Table 49: Multiple Regression Model, Summaries
Table 50: Multiple Regression, ANOVA
Table 51: Multiple Regression - Significant Variables
Table 52: Multiple Regression, Coefficients
6.26.2 Logistic Regression
Table 53: Logistic Regression - Significant variables

| Variable | B | S.E. | Wald | Df | Sig. |
|---|---|---|---|---|---|
| DOWNPAYM | 0.000131 | 5.3E-06 | 611.1172 | 1 | 6.4E-135 |
| ADVEMI | -0.00029 | 1.44E-05 | 395.8029 | 1 | 4.51E-88 |
| AGE | 0.033034 | 0.002476 | 177.9919 | 1 | 1.33E-40 |
| QUALIFIC | | | 150.7463 | 4 | 1.41E-31 |
| MS | 0.800038 | 0.068714 | 135.558 | 1 | 2.49E-31 |
| WM | -0.56064 | 0.055685 | 101.3668 | 1 | 7.64E-24 |
| Constant | -2.08064 | 0.215209 | 93.46961 | 1 | 4.12E-22 |
| DEPENDEN | 0.066459 | 0.011312 | 34.5172 | 1 | 4.23E-09 |
| RESIDENT | | | 28.1598 | 2 | 7.68E-07 |
| TV | -0.3573 | 0.081688 | 19.13186 | 1 | 1.22E-05 |
| CHILDREN | -0.33143 | 0.10673 | 9.64268 | 1 | 0.001901 |
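For the rows with one degree of freedom, the Wald and Sig. columns can be reproduced from B and S.E. The sketch below assumes the usual definition of the Wald statistic, (B / S.E.)^2, referred to a chi-square distribution with 1 degree of freedom; the source does not state the formula, and the function names are hypothetical. It does not apply to QUALIFIC and RESIDENT, which are multi-category variables with 4 and 2 degrees of freedom.

```python
import math


def wald_statistic(b, se):
    """Wald chi-square for a single coefficient: (B / S.E.) squared."""
    return (b / se) ** 2


def wald_p_value(wald):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(wald / 2.0))


def odds_ratio(b):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(b)


# (B, S.E.) pairs copied from two rows of the table above.
rows = {"MS": (0.800038, 0.068714), "CHILDREN": (-0.33143, 0.10673)}

for name, (b, se) in rows.items():
    w = wald_statistic(b, se)
    print(name, round(w, 3), wald_p_value(w), round(odds_ratio(b), 3))
```

Small differences from the tabulated values are expected, because B and S.E. are printed with limited precision.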
Table 54: Logistic Regression - Non-significant variables

| Variable | B | S.E. | Wald | Df | Sig. |
|---|---|---|---|---|---|
| TW | 0.10305 | 0.056502 | 3.32642 | 1 | 0.068175 |
| OTHINCOM | 9.81E-06 | 7.22E-06 | 1.843272 | 1 | 0.174568 |
| EXPERIEN | -0.0038 | 0.002923 | 1.690376 | 1 | 0.193551 |
| FW | -0.21096 | 0.175445 | 1.445819 | 1 | 0.2292 |
| RENT | -5.7E-05 | 5.31E-05 | 1.163939 | 1 | 0.28065 |
| INCOME | -5.7E-07 | 8.64E-07 | 0.43288 | 1 | 0.510579 |
| FRIDGE | 0.008998 | 0.048723 | 0.034104 | 1 | 0.853487 |
6.26.3 Discriminant analysis
Table 55: Discriminant analysis - Classification Results
6.26.4 Factor Analysis
Table 56: Factor Analysis Rotated Component Matrix
6.26.5 Neural Network
Table 57: Neural Network - Performance Table
| Type | Error | Input | Hidden | Performance |
|---|---|---|---|---|
| RBF | 0.4992672 | 2 | 1 | 0.5429804 |
| RBF | 0.4979886 | 2 | 2 | 0.5532692 |
| RBF | 0.4866627 | 2 | 4 | 0.6120146 |
| Linear | 0.3427409 | 1 | - | 0.9857285 |
| Linear | 0.3361037 | 2 | - | 0.9844009 |
| Linear | 0.3357424 | 4 | - | 0.9751079 |
| MLP | 0.0973995 | 2 | 8 | 0.9897113 |
| MLP | 0.09143 | 2 | 12 | 0.9907069 |
| MLP | 0.09043 | 3 | 10 | 0.9907069 |
| MLP | 0.08835 | 3 | 12 | 0.9907069 |
Fig. 52: Neural network Topology adopted
Fig. 53: Receiver Operating Curve (ROC)
Table 58: Classification Table for NN

| | V1 | V2 | V3 | V4 | V5 | V6 |
|---|---|---|---|---|---|---|
| Total | 2612 | 3415 | 1494 | 1519 | 1439 | 1574 |
| Correct | 2595 | 3382 | 1477 | 1508 | 1424 | 1560 |
| Wrong | 17 | 33 | 17 | 11 | 15 | 14 |
| Unknown | 0 | 0 | 0 | 0 | 0 | 0 |
| V1 | 2595 | 33 | 1477 | 11 | 1424 | 14 |
| V2 | 17 | 3382 | 17 | 1508 | 15 | 1560 |
6.26.5 MLP results
The combination of parameters presented in the table provides an MLP model that is customized for predicting bad loans, while its performance in classifying other classes of loan applications is not overly sacrificed.
6.26.6 Summary of Models
Table 59: Summary of Models

| Model | Variables | Results |
|---|---|---|
| Multiple regression | Advance EMI, down payment, rent, no. of children, no. of dependents, years of experience, household income | R-squared value of 0.071. Very poor for prediction. All variables except rent and years of experience make a significant contribution. |
| Logistic regression | Down payment, advance EMI, dependents, children, rent, music system, washing machine | Prediction accuracy of the model is 65.9%, which is very poor; it cannot be used for prediction or classification. |
| Discriminant analysis | Down payment, advance EMI, age, children, household income, other income, rent, experience | Prediction accuracy of 64% is not adequate. Except income and rent paid, all others are significant. |
| Factor analysis | Age, income, advance EMI, rent, TV, music system, dependents, children, two-wheeler, four-wheeler, experience | Five factors identified and extracted: finance parameters (income and assets), consumer durables, initial payments (down payment and advance EMI), vehicles, dependents. |
| Neural network | Factor scores for finance parameters (income and assets), consumer durables, initial payments (down payment, advance EMI), vehicles, dependents | Best prediction accuracy; good classification. |
6.26.7 Optimal MLP Configuration
Table 60: Optimal MLP Configuration

| Parameter | Value |
|---|---|
| Number of Data in Training Set | 346 |
| Number of Data in Test Set | 344 |
| Number of Hidden Neurons | 15 |
| Seed | 0 |
| Learning Rate | 0.2 |
| Momentum | 0.6 |
| Training epochs | 1000 |
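To illustrate how the learning rate and momentum interact during training, the sketch below applies a classical momentum update to a single weight. It is not the network used in the study: the source does not name the training algorithm, so the update rule is an assumption, and the quadratic error function is a toy stand-in for the real loss. Only the parameter values in `MLP_CONFIG` are taken from the table above.

```python
# Values from the optimal MLP configuration table; the key names are hypothetical.
MLP_CONFIG = {
    "hidden_neurons": 15,
    "seed": 0,
    "learning_rate": 0.2,
    "momentum": 0.6,
    "epochs": 1000,
}


def momentum_step(weight, gradient, velocity, learning_rate, momentum):
    """One classical momentum update: carry over part of the previous step."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


def toy_gradient(weight):
    """Gradient of the toy error (weight - 3)^2, assumed here for illustration."""
    return 2.0 * (weight - 3.0)


weight, velocity = 0.0, 0.0
for _ in range(MLP_CONFIG["epochs"]):
    weight, velocity = momentum_step(
        weight,
        toy_gradient(weight),
        velocity,
        MLP_CONFIG["learning_rate"],
        MLP_CONFIG["momentum"],
    )

print(weight)  # approaches the minimum of the toy error at weight = 3
```

A larger momentum carries over more of the previous step, which can speed up progress along consistent gradient directions but may cause oscillation when combined with a large learning rate.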
7. CONCLUSION
The analytics involved in this context pave the way for evolving robust credit scoring models and automating the lending process. They also help discern the pattern of relationship between the input [borrower characteristics] and the output [loan default status]. If the underlying relationship is not strictly linear in nature, routine use of factor analysis and non-linear neural networks may not improve predictive accuracy, because of the intense level of complexity present in the data.
Multiple regression, logistic regression and discriminant analysis have been shown to perform inadequately as prediction models, whereas the neural network gave an improvement in accuracy. This suggests that the underlying function is nonlinear: the linear models could not generalize from the existing data, whereas the NN, applied to the same data, could.
This model will be useful for deciding the weights to be given to various attributes. For example, net income, down payment and advance EMI could be allotted more marks out of 100. The next priority should be given to education, followed by durables and then dependents.
From the classification of customers and their financial behavior as observed by the collection staff, their various needs can be assessed, and different financial products such as insurance, educational loans, medical insurance and systematic investment plans can be designed and targeted. This would result in accurate campaigns for upselling and cross-selling.
The scoring and profiling would enable adopting different pricing policies and collection mechanisms for different clusters.
Performance and Optimisation of neural network
NN is a powerful technique for predicting the behavior of an output based on a given set of input values. It has very good potential for application in loan default prediction and lending automation through credit scoring models.
Performance of NN can be improved through various methods of configuring the network and giving the proper inputs.
Conventional techniques like factor analysis can reduce the complexity to a great extent by reducing the number of variables to a few dimensions. Instead of subjecting the raw data to the NN, the extracted factor scores for each case were used as its inputs.
This dramatically improved the performance of the NN, which took three hours to converge but reached 98% accuracy, as shown in the classification matrix in the earlier section.
When these factor scores are given as inputs, the NN performs far better than factor analysis alone can: 98 percent prediction accuracy was achieved, as illustrated in this thesis with live data from a financial services company.
The future lies in a combined approach to prediction that uses both linear and non-linear techniques.
Recommendation: The commitments of marriage and educational expenses should have been captured, or at least there should be an ongoing relationship between the borrower and the lender through a local agent to capture information on income pattern and spending behaviour, family happenings, seasonality, the impact of global and local factors, etc.
The scoring should have accounted for the new business by assigning it a lower score.
In such cases there should be a high score for the guarantor, the guarantor should be assessed rigorously, and the family's involvement in the borrowing should be ensured.
In the case of a business, due diligence should be exercised when it is new and not established.
In another case, the customer had defaulted for 4 months, and he had closed one business and started another. This is again a case of unstable income. It should have been captured in scoring, and a guarantor with a high score should have been tied up. These factors should have been reckoned in pricing and collection.
FIs are of the opinion that customer information such as occupation and income could be masked from the FIs to avoid any bias and to facilitate an independent assessment.
It would help the FIs to achieve the time target much better if DSAs obtained information on the customer's location that is as accurate as possible and also did a preliminary screening.
It would also be helpful if the DSAs were given inputs on the fundamentals of marketing and on how to face competition.