Traffic sign recognition using CNN


In recent years, computer vision has become one of the most active directions of technology development. Its central task is the classification of objects in images from a photo or video camera, and in typical problems it is solved with machine learning methods trained on labeled examples. This work presents an application of computer vision to traffic sign recognition using a machine learning algorithm.

Traffic signs are a fundamental part of our daily lives: they carry critical information that ensures the safety of everyone around us. A road sign is a flat artificial object with a fixed appearance, and there are two applied problems in which road sign recognition algorithms are used. The first is controlling an autonomous vehicle. Self-driving cars are cars that can drive on roads without a driver, and a key component of their control system is object recognition; the objects of interest are primarily pedestrians, other vehicles, traffic lights and road signs. The second is automatic map building based on data from dashboard cameras (DVRs) installed in cars. This task is relevant because compiling and maintaining detailed road maps currently requires either significant financial cost or a large amount of human time.

Basic concepts

Computer vision and image classification

A convolutional neural network (CNN) is the primary tool for classifying objects and faces in photographs and for speech recognition. Using a special operation, the convolution itself, a CNN transforms the input layer by layer while simultaneously reducing the amount of information stored in memory: early layers respond to low-level features such as edges, later layers to textures and parts of objects, and at the final step the network can correctly classify the picture or locate the desired object in it. Convolutional networks are used widely and in various fields; image classification is the most common problem they solve. Currently, CNNs and their variants are considered the leading algorithms for finding objects in a scene in terms of accuracy: since 2012, neural networks have taken first place in the well-known international image recognition competition ImageNet. Convolutional networks are a good middle ground between biologically plausible systems and the regular multilayer perceptron; to date, the best image recognition results are obtained with them, and on average their recognition accuracy exceeds that of conventional artificial intelligence models by 10-15%. The CNN is a core technology of deep learning.
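To make the convolution operation concrete, here is a minimal NumPy sketch, not part of the original code: the 4 × 4 image and the 2 × 2 edge-detecting kernel are purely illustrative.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: no padding, stride 1."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1  # output height
    ow = image.shape[1] - kw + 1  # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # each output value is a weighted sum of one image patch
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
edge_kernel = np.array([[1.0, -1.0],
                        [1.0, -1.0]])  # responds to horizontal intensity changes
print(conv2d(image, edge_kernel).shape)  # (3, 3): the output is smaller than the input
```

Note how a 4 × 4 input shrinks to a 3 × 3 feature map: this is the information reduction mentioned above, which pooling layers then push further.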

Libraries

# data analysis and wrangling
import numpy as np
import pandas as pd
import os
import random
# visualization
import matplotlib.pyplot as plt
from PIL import Image
# machine learning
from keras.models import Sequential
from keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from keras.utils.np_utils import to_categorical
from keras.layers import Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
import cv2
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator

Acquire data

The NumPy and Pandas packages help us work with our data. We start by reading the training images from disk, class by class, into NumPy arrays; Pandas is used later to load the test labels from a CSV file.

# Importing of the Images
# `path` is the folder that contains one subdirectory per class (0, 1, 2, ...)
count = 0
images = []
classNo = []
myList = os.listdir(path)
print("Total Classes Detected:", len(myList))
noOfClasses = len(myList)
print("Importing Classes.....")
for x in range(0, len(myList)):
    myPicList = os.listdir(path + "/" + str(count))
    for y in myPicList:
        curImg = cv2.imread(path + "/" + str(count) + "/" + y)
        curImg = cv2.resize(curImg, (30, 30))
        images.append(curImg)
        classNo.append(count)
    print(count, end=" ")
    count += 1
print(" ")
images = np.array(images)
classNo = np.array(classNo)

For proper training and evaluation of the implemented system, we split the dataset into three sets: 20% of the images go to the test set, 20% of the remainder to the validation set, and the rest to the training set.

# Split Data
testRatio = 0.2        # 20% of all images for testing
validationRatio = 0.2  # 20% of the remainder for validation
X_train, X_test, y_train, y_test = train_test_split(images, classNo, test_size=testRatio)
X_train, X_validation, y_train, y_validation = train_test_split(X_train, y_train, test_size=validationRatio)

The dataset contains 34,799 images in 43 classes of road signs, including basic signs such as speed limits, stop, yield, priority road, "no entry", "pedestrians" and others.

# DISPLAY SOME SAMPLE IMAGES OF ALL THE CLASSES
# `data` is assumed to be a DataFrame with one row per class and a "Name" column
num_of_samples = []
cols = 5
num_classes = noOfClasses
fig, axs = plt.subplots(nrows=num_classes, ncols=cols, figsize=(5, 300))
fig.tight_layout()
for i in range(cols):
    for j, row in data.iterrows():
        x_selected = X_train[y_train == j]
        axs[j][i].imshow(x_selected[random.randint(0, len(x_selected) - 1), :, :], cmap=plt.get_cmap("gray"))
        axs[j][i].axis("off")
        if i == 2:
            axs[j][i].set_title(str(j) + "-" + row["Name"])
            num_of_samples.append(len(x_selected))

# DISPLAY A BAR CHART SHOWING THE NUMBER OF SAMPLES FOR EACH CATEGORY
print(num_of_samples)
plt.figure(figsize=(12, 4))
plt.bar(range(0, num_classes), num_of_samples)
plt.title("Distribution of the training dataset")
plt.xlabel("Class number")
plt.ylabel("Number of images")
plt.show()

There is a significant imbalance between the classes in the dataset: some classes have fewer than 200 images, while others have more than 1000. Such an imbalance can bias the model towards the overrepresented classes, especially when it is not confident in its predictions. To mitigate this, we use image transformation techniques (data augmentation).
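A complementary way to handle class imbalance, not the method used in this work, is to weight the loss per class. A hedged sketch with made-up labels; the resulting dictionary could be passed to `model.fit` via its `class_weight` argument:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: class 0 is rare (50 samples), class 1 common (450)
y = np.array([0] * 50 + [1] * 450)

# "balanced" assigns each class the weight n_samples / (n_classes * class_count)
weights = compute_class_weight(class_weight="balanced", classes=np.unique(y), y=y)
class_weight = dict(zip(np.unique(y), weights))
print(class_weight)  # the rare class receives the larger weight
```

Under this scheme misclassifying a rare-class image costs the model ten times as much here as misclassifying a common one, which counteracts the bias towards overrepresented classes.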

For better classification, all images in the dataset are converted to grayscale, a color mode in which each pixel stores only a brightness value, rendered as a shade of gray.

# PREPROCESSING THE IMAGES
def grayscale(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img

def equalize(img):
    img = cv2.equalizeHist(img)
    return img

def preprocessing(img):
    img = grayscale(img)     # CONVERT TO GRAYSCALE
    img = equalize(img)      # STANDARDIZE THE LIGHTING IN AN IMAGE
    img = img / 255          # NORMALIZE VALUES TO THE RANGE 0-1 INSTEAD OF 0-255
    return img

X_train = np.array(list(map(preprocessing, X_train)))  # ITERATE OVER AND PREPROCESS ALL IMAGES
X_validation = np.array(list(map(preprocessing, X_validation)))
X_test = np.array(list(map(preprocessing, X_test)))

# ADD A DEPTH OF 1 SO THE ARRAYS MATCH THE (30, 30, 1) INPUT EXPECTED BY Conv2D
X_train = X_train.reshape(X_train.shape[0], 30, 30, 1)
X_validation = X_validation.reshape(X_validation.shape[0], 30, 30, 1)
X_test = X_test.reshape(X_test.shape[0], 30, 30, 1)

Data augmentation is a method of expanding the original dataset by generating modified copies of its images. The more (and more varied) data, the better the result: this is a basic rule of machine learning.

# AUGMENTATION OF IMAGES: TO MAKE THE MODEL MORE GENERIC
dataGen = ImageDataGenerator(width_shift_range=0.1,   # 0.1 = 10%; A VALUE ABOVE 1 (E.G. 10) MEANS A NUMBER OF PIXELS
                             height_shift_range=0.1,
                             zoom_range=0.2,   # 0.2 MEANS ZOOM CAN GO FROM 0.8 TO 1.2
                             shear_range=0.1,  # MAGNITUDE OF SHEAR ANGLE
                             rotation_range=10)  # DEGREES
dataGen.fit(X_train)
batches = dataGen.flow(X_train, y_train, batch_size=20)  # BATCH SIZE = NO. OF IMAGES CREATED EACH TIME IT IS CALLED
X_batch, y_batch = next(batches)

One-hot encoding is applied to the categorical labels y_train, y_test and y_validation.

y_train = to_categorical(y_train,noOfClasses)
y_validation = to_categorical(y_validation,noOfClasses)
y_test = to_categorical(y_test,noOfClasses)

To create a neural network, the Keras library will be used. Here is the code for creating the model structure:

def myModel():
    model = Sequential()
    model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
    model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(rate=0.25))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(rate=0.25))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(rate=0.5))
    model.add(Dense(43, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
# TRAIN
batch_size_val = 32  # training hyperparameters; the exact values are typical choices
epochs_val = 10
model = myModel()
print(model.summary())
history = model.fit(X_train, y_train, batch_size=batch_size_val, epochs=epochs_val, validation_data=(X_validation, y_validation))

The code above uses four convolutional layers and one fully connected hidden layer. First, two convolutional layers with 32 filters are added to the model, followed by two convolutional layers with 64 filters. After each pair, a max pooling layer with a 2 × 2 window is added. Dropout layers with rates of 0.25 and 0.5 are also added so that the network does not overfit. In the final lines, a Dense layer performs the classification among the 43 classes using the softmax activation function.

At the end of the last epoch we obtained loss = 0.0523, accuracy = 0.9832, val_loss = 0.0200 and val_accuracy = 0.9943, which is a very good result.

# PLOT
plt.figure(1)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.legend(['training', 'validation'])
plt.title('Loss')
plt.xlabel('epoch')
plt.figure(2)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.legend(['training', 'validation'])
plt.title('Accuracy')
plt.xlabel('epoch')
plt.show()
score = model.evaluate(X_test, y_test, verbose=0)
print('Test Score:', score[0])
print('Test Accuracy:', score[1])

# testing accuracy on the held-out test dataset
from sklearn.metrics import accuracy_score
y_test = pd.read_csv('Test.csv')
labels = y_test["ClassId"].values
imgs = y_test["Path"].values
data = []
for img in imgs:
    image = Image.open(img)
    image = image.resize((30, 30))
    data.append(np.array(image))
X_test = np.array(data)
X_test = np.array(list(map(preprocessing, X_test)))
X_test = X_test.reshape(X_test.shape[0], 30, 30, 1)  # add the channel dimension the model expects
predict_x = model.predict(X_test)
pred = np.argmax(predict_x, axis=1)
print(accuracy_score(labels, pred))

We evaluated the built model on the test dataset and obtained an accuracy of 96 percent.
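Even with a strong overall accuracy, it is worth checking how the errors distribute over classes; a confusion matrix makes this visible. A small illustrative sketch with made-up labels standing in for the `labels` and `pred` arrays above:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions for three classes
labels = np.array([0, 0, 1, 1, 2, 2])
pred   = np.array([0, 0, 1, 2, 2, 2])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(labels, pred)

# Per-class recall: correct predictions divided by the number of true samples
per_class_acc = cm.diagonal() / cm.sum(axis=1)
print(per_class_acc)  # class 1 is only half correct in this toy example
```

Such a breakdown would reveal whether the remaining 4% of errors are spread evenly or concentrated in a few visually similar sign classes.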

Using the built-in model.save() function, we can store the model for later use. It writes the model to a local .h5 (HDF5) file, so we do not have to retrain it from scratch every time, which saves a lot of time.

model.save("CNN_model_3.h5")
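The counterpart to model.save() is load_model(). A self-contained round-trip sketch with a tiny throwaway model; the file name demo_model.h5 and the model itself are purely illustrative:

```python
import numpy as np
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, Input

# Build a minimal model just to demonstrate the save/load round trip
model = Sequential([Input(shape=(3,)), Dense(2, activation="softmax")])
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.save("demo_model.h5")

# Restore the architecture and weights without any retraining
restored = load_model("demo_model.h5")
x = np.zeros((1, 3), dtype="float32")
assert np.allclose(model.predict(x), restored.predict(x))
```

Because the optimizer state is saved too, a restored model can either serve predictions immediately or resume training where it left off.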

Results:

In this work, a CNN model was built in the Python programming language using the Keras and OpenCV libraries, and it classifies traffic signs with 96% accuracy. A traffic sign recognition application was developed with two working modes: recognition from a picture and real-time recognition of road signs using a webcam.
