%% Classify Robot Execution Failures
% The Robot Execution Failures dataset contains force and torque measurements
% on a robot after failure detection. Each failure is characterized by 15
% force/torque samples collected at regular time intervals starting
% immediately after failure detection. The total observation window for each
% failure instance was 315 ms. All features are numeric and represent a force
% or a torque measured after failure detection. The dataset used in the
% following analysis is a modified version in which the average of the 15
% samples is used to represent each failure. The goal of this analysis is to
% build a model that automatically identifies the failure type given the
% sensor measurements.
%
% The data is stored in an Excel file with the following columns:
% Fx Fy Fz Tx Ty Tz Fault
%
% The original dataset is courtesy of:
% Luis Seabra Lopes and Luis M. Camarinha-Matos
% Universidade Nova de Lisboa,
% Monte da Caparica, Portugal
% Date Donated: April 23, 1999
%
% Copyright 2015 The MathWorks, Inc.

%% Import Existing Data
% In this example, the data is imported from an Excel file. We can make use
% of the interactive Import Tool to import the data and auto-generate the
% code so the import can be automated. The table data type allows us to
% collect mixed-type data and metadata properties (such as variable names,
% row names, descriptions, and variable units) in a single container.
% Tables are suitable for column-oriented or tabular data that is often
% stored as columns in a text file or in a spreadsheet. Since our dataset
% contains experimental data with rows representing different observations
% and columns representing different measured variables, tables are a
% suitable choice.

% Auto-generated code for importing data
faultData = importFaultData('faultData.xlsx');
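
% As an alternative to the generated import function, readtable can read the
% spreadsheet directly. This is only a sketch (it assumes faultData.xlsx is on
% the MATLAB path and that its first row holds the column names); the result
% is stored in a separate, purely illustrative variable so the workflow above
% is unchanged:
faultDataAlt = readtable('faultData.xlsx');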

%% Convert Categorical Data into Categorical Arrays
% Categorical data contains discrete pieces of information, such as the
% different classes of faults in this dataset. A categorical array provides
% efficient storage and convenient manipulation of nonnumeric data while
% also maintaining meaningful names for the values. We can open a variable
% in the Variable Editor and convert categorical attributes into categorical
% arrays interactively. MATLAB will echo the code necessary to accomplish
% these interactive tasks in the Command Window.

% Convert categorical variables into categorical arrays
faultData.Fault = categorical(faultData.Fault);
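
% To confirm the conversion, we can list the fault categories and how often
% each occurs (a quick sketch; it assumes the conversion above succeeded):
disp(categories(faultData.Fault))
disp(countcats(faultData.Fault))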

%% Visualize Data
% By simply visualizing our data, we begin to get insights into our dataset.
% For example, we can see that the force in the Z direction plays a large
% role in determining the obstruction fault class.
% We can open the variable |faultData| in the Variable Editor and
% interactively create various types of plots by selecting one or more
% columns. As we create the plots, MATLAB echoes the corresponding commands
% in the Command Window.

% Display a pie chart to illustrate the numerical distribution of faults
pie(faultData.Fault)

% Fz vs Tz plot, differentiated by fault class
figure
gscatter(faultData.Fz,faultData.Tz,faultData.Fault)
xlabel('Fz')
ylabel('Tz')
title('Fz vs Tz by fault class')

% Visualize data using a box plot
figure
boxplot(faultData.Fz,faultData.Fault)
xlabel('Fault')
ylabel('Fz')
title('Fz per fault class')
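
% A matrix of grouped scatter plots across all six sensor channels can reveal
% which force/torque pairs best separate the fault classes. This is a sketch
% that assumes the first six table variables are the numeric Fx..Tz columns:
figure
gplotmatrix(faultData{:,1:6}, [], faultData.Fault)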

%% Summary of our dataset
% We can gain some quick insights into our data by using the |summary|
% command.
% The summary contains the following information on the variables:
% Name (Size and Data Type)
% Units (if any)
% Description (if any)
% Values:
%   numeric variables - minimum, median, and maximum values
%   logical variables - number of values that are true and false
%   categorical variables - number of elements from each category

summary(faultData)
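
% Group statistics offer another quick view of the data: grpstats computes,
% for each fault class, the number of observations and the mean of each
% numeric variable (a sketch, assuming all non-grouping columns are numeric):
grpstats(faultData, 'Fault')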

%% Filter Data
% From the summary results, notice that the 'lost' category of faults is
% represented by only three samples. In machine learning, data is key: our
% model is only as good as the data we feed it. Since we do not have enough
% data to represent the 'lost' category, we will remove it from our dataset
% to increase the accuracy of our model.

% Remove the 'lost' category
faultData(faultData.Fault == 'lost',:) = [];
faultData.Fault = removecats(faultData.Fault);
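
% Verify that 'lost' no longer appears among the fault categories (a quick
% check; it assumes the two filtering steps above ran without error):
summary(faultData.Fault)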

%% Apply Machine Learning Techniques
% Statistics and Machine Learning Toolbox features a number of supervised
% and unsupervised machine learning techniques. It supports both
% classification and regression algorithms. The supervised learning
% techniques range from nonlinear regression, generalized linear
% regression, and discriminant analysis to SVMs, decision trees, and
% ensemble methods.
%
% Observe that once the data has been prepared, the syntax for the
% different modeling techniques is very similar, and most of these
% techniques can handle categorical predictors directly. The user can
% conveniently supply information about the different parameters associated
% with each algorithm, as sketched below.
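
% To illustrate the shared syntax, a few different classifiers can be trained
% directly on the table. This is a sketch rather than part of the original
% workflow; it assumes the Fault column is used as the response variable:
treeModel  = fitctree(faultData, 'Fault');   % decision tree
knnModel   = fitcknn(faultData, 'Fault');    % k-nearest neighbors
discrModel = fitcdiscr(faultData, 'Fault');  % discriminant analysis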
%
% All of the classification techniques can be explored interactively using
% the Classification Learner App. Once we decide on an algorithm, the App
% can generate code for the desired technique.
%
% For example, below we used the App to generate code for a complex tree.

% Train and view the classifier
[trainedClassifier, validationAccuracy, validationPredictions, validationScores] = ...
    trainClassifierComplexTree(faultData);
view(trainedClassifier.ClassificationTree, 'Mode', 'graph')
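
% The App-generated function also returns the cross-validated accuracy, which
% gives a quick first check of the model (assuming validationAccuracy is a
% scalar fraction, as returned by the generated code):
fprintf('Cross-validated accuracy: %.1f%%\n', 100*validationAccuracy)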

%% Predict responses for new data
% After we create classification models interactively in the Classification
% Learner App, we can export our best model to the workspace. We can then
% use the trained model to make predictions on new data.

% Predict a response for a single observation (here, row 3 of the table
% stands in for new data)
PredictedResponse = trainedClassifier.predictFcn(faultData(3,:))
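
% As a quick sanity check (an illustrative sketch; it assumes predictFcn
% accepts any table containing the same predictor columns), compare the
% predictions against the recorded labels for the first few observations:
checkRows = faultData(1:5,:);
predictedLabels = trainedClassifier.predictFcn(checkRows);
disp(table(checkRows.Fault, predictedLabels, 'VariableNames', {'Actual','Predicted'}))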

%% Evaluate classifier performance
% After a classification algorithm has been trained on data, we may want to
% examine its performance on a specific set of test data. Various
% performance measures (such as mean squared error, classification error, or
% exponential loss) can summarize the predictive power of a classifier in a
% single number. However, a performance curve offers more information, as it
% lets us explore the classifier's performance across a range of thresholds
% on its output. An ROC curve shows the true positive rate versus the false
% positive rate for different thresholds of the classifier output. We can
% use it, for example, to find the threshold that maximizes the
% classification accuracy, or to assess how the classifier performs in
% regions of high sensitivity and high specificity. Let us plot the ROC
% curve for the class 'normal'.

% Create and plot the ROC curve for our classifier
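% Note: column 3 of validationScores is assumed to correspond to the 'normal'
% class; the score columns follow the order of the model's ClassNames property
% (trainedClassifier.ClassificationTree.ClassNames).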
[X,Y,T,AUC] = perfcurve(faultData.Fault,validationScores(:,3),'normal');
disp(AUC)
figure
plot(X,Y)
title('Performance Curve (ROC) for the ''normal'' class');
xlabel('False Positive Rate [ = FP/(TN+FP)]');
ylabel('True Positive Rate [ = TP/(TP+FN)]');
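
% A confusion matrix complements the ROC curve with a per-class breakdown of
% correct and incorrect predictions. This sketch uses the cross-validated
% predictions returned by the App-generated function (an assumption:
% validationPredictions holds one predicted label per row of faultData):
[confMat, classOrder] = confusionmat(faultData.Fault, validationPredictions);
disp(classOrder')   % class order of the rows and columns
disp(confMat)       % rows: true class, columns: predicted class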