This project was created as a self-guided learning experience to gain a better understanding of using Keras and the MNIST dataset to create machine learning models.
MNIST contains 70,000 images of hand-written digits, each 28 x 28 pixels, in greyscale with pixel-values from 0 to 255.
We will build a feedforward neural network and train it to solve a real problem with Keras.
Problem: Given an image, classify it as a digit (0-9)
Solution: We’ll flatten each 28x28 image into a 784-dimensional vector, which we’ll use as input to our neural network. Our output will be one of 10 possible classes: one for each digit.
First we load the data into our program and separate it into training and testing datasets: 60,000 images (shape (60000, 28, 28)) go to the training set and 10,000 images (shape (10000, 28, 28)) go to the testing set.
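A minimal loading sketch (the variable names here are my own; mnist.load_data() returns the train/test split directly):

from tensorflow.keras.datasets import mnist

# Load MNIST: 60,000 training images and 10,000 testing images,
# each a 28x28 greyscale array with pixel values from 0 to 255.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)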
Each handwritten digit comes with a label identifying the number in the image. The computer never sees this label as part of the input; it is only used to adjust the weights during training (to decrease our loss) and to measure the model's accuracy during testing.
To gain an understanding of what the computer sees, we will print the number 4 (lower right-hand corner). This image is a 28x28 matrix with values from 0-255 which indicate pixel intensity.
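As a sketch of how to inspect one example (the index here is arbitrary; pick whichever training image shows the digit of interest), assuming matplotlib is available:

import matplotlib.pyplot as plt

# Show one training image and print its raw 28x28 pixel matrix.
plt.imshow(train_images[0], cmap='gray')
plt.show()

print(train_labels[0])   # the label attached to this image
print(train_images[0])   # 28x28 array of values from 0 to 255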
Now we need to flatten each image before we can pass it into our neural network: instead of a 28x28 matrix, we build our network to accept a 784-length vector. We’ll also normalize the inputs to the range [0, 1] rather than [0, 255] to make our network easier to train.
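A sketch of that preprocessing, reusing the arrays loaded above:

# Normalize pixel values from [0, 255] to [0, 1].
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255

# Flatten each 28x28 image into a 784-length vector.
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))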
Every Keras model is either built using the Sequential class, which represents a linear stack of layers, or the functional Model class, which is more customizable. We’ll be using the simpler Sequential model, since our network is indeed a linear stack of layers.
We start by instantiating a Sequential model:
The Sequential constructor takes an array of Keras layers. Since we’re just building a standard feedforward network, we only need the Dense layer.
The first two layers have 512 nodes each and use the ReLU activation function. The last layer is a Softmax output layer with 10 nodes, one for each class.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(512, activation='relu', input_shape=(784,)),  # first hidden layer, takes the 784-length vector
    Dense(512, activation='relu'),                       # second hidden layer
    Dense(10, activation='softmax'),                     # output layer: one probability per digit
])
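With the layers defined, calling model.summary() prints each layer's output shape and parameter count as a quick sanity check:

model.summary()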
Before we can begin training, we need to configure the training process. We decide 3 key factors during the compilation step: the optimizer, the loss function, and the metrics to track.
model.compile(
    optimizer='adam',                  # adaptive gradient-based optimizer
    loss='categorical_crossentropy',   # loss for multi-class classification with one-hot labels
    metrics=['accuracy'],              # track classification accuracy during training
)
Using Keras model.fit(), we train the model on the training data; how well it then generalizes to similar, unseen data tells us how well it has fit. A well-fitted model produces more accurate predictions; one that is overfitted matches the training data too closely, while one that is underfitted fails to capture the underlying pattern.
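A minimal training sketch, assuming the labels are one-hot encoded with to_categorical (categorical_crossentropy expects one-hot labels) and a batch size of 32:

from tensorflow.keras.utils import to_categorical

# One-hot encode the integer labels (0-9) returned by mnist.load_data().
train_labels_onehot = to_categorical(train_labels, 10)
test_labels_onehot = to_categorical(test_labels, 10)

# Train for 8 epochs, matching the run described below.
model.fit(
    train_images,
    train_labels_onehot,
    epochs=8,
    batch_size=32,
)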
Over the 60,000 training images, we achieved a loss of 0.0178 and an accuracy of 0.9951 after 8 epochs.
To allow the computer to process an image, we will be using OpenCV, a library aimed mainly at real-time computer vision.
After loading in our model, we convert the image from color to greyscale, since no important information is lost. We also reduce the image from its original resolution (600x600 in the photo shown) to a 28x28 pixel image. Next, we normalize the data and run it through our model.
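A rough sketch of that preprocessing step (the file name is hypothetical, and depending on the source photo the colors may need to be inverted so the digit is light on a dark background, as in MNIST):

import cv2
import numpy as np

# Read the image, convert it to greyscale, and shrink it to 28x28 pixels.
img = cv2.imread('digit.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.resize(img, (28, 28))

# Normalize to [0, 1] and flatten into the 784-length vector the model expects.
img = img.astype('float32') / 255
img = img.reshape(1, 784)

prediction = model.predict(img)
print(np.argmax(prediction))  # predicted digit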
Shown below, the model correctly predicts this number to be a 7.