Multi-Class Image Classification & ResNet 50 From Scratch

In this article, we will classify images of 10 animal species by building ResNet 50 from scratch.

The data set is available on Kaggle. Along the way, you will see how to load and explore the images, build the network block by block, and evaluate its predictions.

Introduction

In this notebook, we will learn how to classify images of animals by developing ResNet 50 from scratch. The steps are:

Load the images.

Visualize the distribution of the data.

Develop ResNet 50 from scratch.

Train the model.

Graph the training loss and validation loss.

Predict the results.

Build the confusion matrix.

Print the classification report.

Compare the predictions.

Install some of the Libraries

                ! pip install split-folders
            

I have installed split-folders. I found this library very useful: it splits your data into the desired ratio and writes each split to its own folder, so you can inspect the result on disk.

Importing Libraries

We import the libraries needed for:

Image Processing.

Data visualization.

Making Model Architecture.

                import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
import pathlib
import cv2
from keras.preprocessing.image import ImageDataGenerator
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from keras.models import Sequential, Model,load_model
from keras.callbacks import EarlyStopping,ModelCheckpoint
from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D,MaxPool2D
from keras.preprocessing import image
from keras.initializers import glorot_uniform
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, cohen_kappa_score, roc_auc_score, confusion_matrix
from sklearn.metrics import classification_report
from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D,MaxPool2D,Dropout
import tensorflow as tf
import splitfolders 
import pandas as pd
import glob
from sklearn.metrics import confusion_matrix
import plotly.graph_objects as go
import itertools
import plotly.express as px
#Suppressing Warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
            

Setting Path

Now we are going to set the path of the data.

                data_dir = "../input/animals10/raw-img"
data_dir = pathlib.Path(data_dir)
            

Importing Images

Import the Images

                Total_Images = glob.glob('../input/animals10/raw-img/*/*.jpeg')
print("Total Number of Images", len(Total_Images))
Total_Images = pd.Series(Total_Images)
            

Now we are going to make a data frame of the image data so we can do some analysis on it.

                Total_Df = pd.DataFrame()

Total_Df['FileName'] = Total_Images.map(lambda ImageName: ImageName.split("/")[-1])

Total_Df['ClassId'] = Total_Images.map(lambda ImageName :ImageName.split("/")[-2])

Total_Df.head()
            

This displays the first five rows, with one file name and class label per row.
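The file name and class label are simply the last two components of each image path. As a minimal sketch of the same idea using pathlib (the path below is a made-up example that only mirrors the root/class/file layout of the data set):

```python
from pathlib import PurePosixPath

# Hypothetical example path; only the <root>/<class>/<file> layout matters here.
example = PurePosixPath("../input/animals10/raw-img/cane/example_image.jpeg")

file_name = example.name          # last path component
class_id = example.parent.name    # name of the folder that contains the file

print(file_name, class_id)
```

Using the path components rather than splitting on a fixed character keeps the parsing independent of how long the directory prefix is.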

Now count the total number of images of each class.

                Class_Id_Dist_Total = Total_Df['ClassId'].value_counts()
Class_Id_Dist_Total.head(10)
            

This prints the number of images in each of the 10 classes, sorted from most to least frequent.

Total Data Distribution

Now let’s visualize the data to get a better understanding of it.

We will use the Plotly library to make a bar chart and a pie chart.

                fig = go.Figure(go.Bar(
            x= Class_Id_Dist_Total.values,
            y=Class_Id_Dist_Total.index,
            orientation='h'))

fig.update_layout(title='Data Distribution in Bars',font_size=15,title_x=0.45)


fig.show()
            

The result is a horizontal bar chart with one bar per class.

                # Take labels and counts from the same Series so they stay aligned
fig=px.pie(values=Class_Id_Dist_Total.values, names=Class_Id_Dist_Total.index, hole=0.425)
fig.update_layout(title='Data Distribution in a Pie Chart',font_size=15,title_x=0.45,annotations=[dict(text='Animals-10',font_size=18, showarrow=False,height=800,width=700)])
fig.update_traces(textfont_size=15,textinfo='percent')
fig.show()
            

Splitting the Data into Train, Test, and Val

                splitfolders.ratio(data_dir, output="output", seed=101, ratio=(.8, .1, .1))
            

Setting Paths

                train_path='./output/train/'
val_path='./output/val'
test_path='./output/test'
class_names=os.listdir(train_path)
class_names_val=os.listdir(val_path)
class_names_test=os.listdir(test_path)
            

Calculate the total number of images in each subset.

                train_image1 = glob.glob('./output/train/*/*.jpeg')

Total_TrainImages = train_image1 
print("Total number of training images: ", len(Total_TrainImages))


test_image1 = glob.glob('./output/test/*/*.jpeg')

Total_TestImages = test_image1
print("Total number of test images: ", len(Total_TestImages))



Val_image1 = glob.glob('./output/val/*/*.jpeg')

Total_ValImages = Val_image1 
print("Total number of val images: ", len(Total_ValImages))
            

The output will be:

                Total number of training images:  19366
Total number of test images:  2447
Total number of val images:  2396
            

Let’s explore the training set.

First, make a data frame of the training set.

                train_image_names = pd.Series(Total_TrainImages)
train_df = pd.DataFrame()

# generate Filename field
train_df['Filename'] = train_image_names.map( lambda img_name: img_name.split("/")[-1])


# generate ClassId field
train_df['ClassId'] = train_image_names.map(lambda img_name: img_name.split("/")[-2])

train_df.head()
            

Second, count the number of images in each class.

                class_id_distribution_Train = train_df['ClassId'].value_counts()
class_id_distribution_Train.head(10)
            

You can visualize this distribution as a bar chart and a pie chart by repeating the process we used previously.

Displaying The Images

                plot_df = train_df.sample(12).reset_index()
plt.figure(figsize=(15, 15))

for i in range(12):
    img_name = plot_df.loc[i, 'Filename']
    label_str = (plot_df.loc[i, 'ClassId'])
    plt.subplot(4,4,i+1)
    plt.imshow(plt.imread(os.path.join(train_path,label_str, img_name)))
    plt.title(label_str)
    plt.xticks([])
    plt.yticks([])
plt.show()
            

This shows a grid of 12 random training images, each titled with its class.

The data analysis part is done; we have completed the first module of the project.

Now prepare the data set for the model.

Image Data Generator

                from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(zoom_range=0.15,width_shift_range=0.2,height_shift_range=0.2,shear_range=0.15)
test_datagen = ImageDataGenerator()
val_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(train_path,target_size=(224, 224),batch_size=32,shuffle=True)
test_generator = test_datagen.flow_from_directory(test_path,target_size=(224,224),batch_size=32,shuffle=False)
val_generator = val_datagen.flow_from_directory(val_path,target_size=(224,224),batch_size=32,shuffle=False)
            

Output:

                Found 20938 images belonging to 10 classes.
Found 2627 images belonging to 10 classes.
Found 2614 images belonging to 10 classes.
            

You can see that it automatically detected the number of classes from the folder structure.
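The generators report more images than the glob counts earlier in the article (for example 20938 versus 19366 for training). A likely explanation, which is an assumption rather than something the article confirms, is that the glob pattern matches only `.jpeg` files while `flow_from_directory` also reads other image formats. A small helper can check this; the name `count_by_extension` is hypothetical and not part of any library:

```python
from collections import Counter
from pathlib import Path

def count_by_extension(root):
    """Count files under root (recursively), grouped by lowercase extension."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )

# Example usage on the split created earlier (assumes the folder exists):
# print(count_by_extension("./output/train"))
```

If the counter shows extensions other than `.jpeg`, widening the glob pattern would bring the two sets of counts into agreement.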

ResNet50

It is important to understand ResNet, because its architecture was a breakthrough that drove much of the later progress in computer vision.

The architecture makes it practical to train very deep networks, with variants that go beyond 150 layers.

It was first introduced by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun in their 2015 computer vision research paper titled ‘Deep Residual Learning for Image Recognition’.

More details can be found in the paper itself, Deep Residual Learning for Image Recognition.

Before ResNet, adding more layers should in theory have reduced the loss and increased accuracy, but in practice that did not happen: beyond a certain depth, accuracy got worse as more layers were added.

Deep convolutional neural networks suffer from the “vanishing gradient problem”: during backpropagation the gradient shrinks as it passes back through the layers, so the weights of the early layers hardly change. To overcome this problem, ResNet introduces skip connections.

Skip connection: adding the original input to the output of the convolutional block.
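To make the idea concrete, here is a toy sketch in NumPy. It is not the Keras block used in this article: two small dense layers stand in for the convolutions, and the function names are made up for illustration. It shows that when the residual branch outputs zero, the block simply passes its input through the final ReLU.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual block on vectors: relu(F(x) + x), with F as two dense layers."""
    fx = relu(x @ w1) @ w2   # residual branch F(x)
    return relu(fx + x)      # skip connection adds the unchanged input

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

# With zero weights the branch contributes nothing, so the block reduces to relu(x).
zeros = np.zeros((8, 8))
out = residual_block(x, zeros, zeros)
print(np.allclose(out, relu(x)))  # True
```

Because the input is added back unchanged, the block only has to learn a correction on top of the identity, which is what makes very deep stacks easier to train.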

ResNet is built from two kinds of blocks: an identity block and a convolutional block.

Identity Block

                def identity_block(X, f, filters, stage, block):
   
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    F1, F2, F3 = filters

    X_shortcut = X
   
    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2a', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)
    X = Add()([X, X_shortcut])# SKIP Connection
    X = Activation('relu')(X)

    return X
            

The input ‘X’ is added directly to the output of the block if and only if the input size equals the output size (Input Size == Output Size).

Convolutional Block

If the input size does not match the output size (Input Size != Output Size), we add a convolutional layer in the shortcut path to make the input size equal to the output size.

                def convolutional_block(X, f, filters, stage, block, s=2):
   
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    F1, F2, F3 = filters

    X_shortcut = X

    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(s, s), padding='valid', name=conv_name_base + '2a', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)
    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)
    X_shortcut = Conv2D(filters=F3, kernel_size=(1, 1), strides=(s, s), padding='valid', name=conv_name_base + '1', kernel_initializer=glorot_uniform(seed=0))(X_shortcut)
    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)

    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)

    return X
            

Let’s combine the identity and convolutional blocks.

                def ResNet50(input_shape=(224, 224, 3)):

    X_input = Input(input_shape)

    X = ZeroPadding2D((3, 3))(X_input)

    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name='bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, block='a', s=1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')


    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='b')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='c')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='d')

    X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2)
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='b')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='c')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='d')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='e')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='f')

    X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2)
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='b')
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='c')

    X = AveragePooling2D(pool_size=(2, 2), padding='same')(X)
    
    model = Model(inputs=X_input, outputs=X, name='ResNet50')

    return model
base_model = ResNet50(input_shape=(224, 224, 3))
headModel = base_model.output
headModel = Flatten()(headModel)
headModel=Dense(256, activation='relu', name='fc1',kernel_initializer=glorot_uniform(seed=0))(headModel)
headModel=Dense(128, activation='relu', name='fc2',kernel_initializer=glorot_uniform(seed=0))(headModel)
headModel = Dense( 10,activation='softmax', name='fc3',kernel_initializer=glorot_uniform(seed=0))(headModel)
            

In the last line, “headModel = Dense(10, activation='softmax', name='fc3', kernel_initializer=glorot_uniform(seed=0))(headModel)”, the first argument is the number of classes.

You can adapt it to your own data set by changing 10 to your number of classes. For example, with 5 classes the line would be headModel = Dense(5, activation='softmax', name='fc3', kernel_initializer=glorot_uniform(seed=0))(headModel).

                model = Model(inputs=base_model.input, outputs=headModel)
model.summary()
            
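The spatial sizes reported by the model summary can be checked by hand. The sketch below is plain arithmetic, not Keras code; it assumes the Keras defaults that the model code relies on implicitly (padding 'valid' when none is given, and a pooling stride equal to the pool size). The helper names are made up for illustration.

```python
import math

def valid_out(n, kernel, stride):
    """Output size along one axis for 'valid' padding."""
    return (n - kernel) // stride + 1

def same_out(n, stride):
    """Output size along one axis for 'same' padding."""
    return math.ceil(n / stride)

n = 224 + 2 * 3              # ZeroPadding2D((3, 3))
n = valid_out(n, 7, 2)       # conv1: 7x7 kernel, stride 2
n = valid_out(n, 3, 2)       # MaxPooling2D: 3x3 window, stride 2
sizes = {"stage2": n}        # stage 2 uses s=1, so the size is unchanged
for stage in ("stage3", "stage4", "stage5"):
    n = valid_out(n, 1, 2)   # 1x1 conv with stride 2 in each convolutional_block
    sizes[stage] = n
n = same_out(n, 2)           # AveragePooling2D: 2x2 window, 'same' padding
flat = n * n * 2048          # Flatten over the 2048 output channels

print(sizes, n, flat)
```

Under these assumptions a 224x224 input shrinks to 55, 28, 14, and 7 pixels per side across stages 2 to 5, and the flattened vector fed to fc1 has 32768 values.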

From here there are two options: either load pre-trained weights and use them for the classification, or train your model from scratch.

To train from scratch, you just compile the model and then train it.

Pre-Trained Weights.

                base_model.load_weights("/File Path/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5")
for layer in base_model.layers:
    layer.trainable = False  # freeze the pre-trained layers
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=["accuracy"])
            

Train the model

A callback is an object that can perform actions at various stages of training (e.g. at the start or end of an epoch, before or after a single batch, etc).

Early Stopping is used to prevent the model from overfitting.
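The patience rule behind early stopping can be sketched in plain Python. This is a simplified model of the idea, not the Keras implementation, and the validation scores are made up. It also shows why the settings used in this article (patience of 20 with 15 epochs) never stop training early.

```python
def epochs_run(val_scores, patience):
    """Return how many epochs run before patience-based stopping (mode='max')."""
    best = float("-inf")
    wait = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            best, wait = score, 0   # improvement: reset the counter
        else:
            wait += 1               # no improvement this epoch
            if wait >= patience:
                return epoch        # stop early
    return len(val_scores)          # ran to the final epoch

# Made-up validation accuracies that peak at epoch 3 and never improve again.
scores = [0.50, 0.60, 0.70] + [0.65] * 12   # 15 epochs in total

print(epochs_run(scores, patience=20))  # 15
print(epochs_run(scores, patience=5))   # 8
```

If you want early stopping to take effect, choose a patience smaller than the number of epochs.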

                es=EarlyStopping(monitor='val_accuracy', mode='max', verbose=1, patience=20)
mc = ModelCheckpoint('.model.h5', monitor='val_accuracy', mode='max', save_best_only=True)
            

Train command

                History = model.fit(train_generator,validation_data=val_generator,epochs=15,verbose=1, callbacks=[mc,es])
            
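The introduction lists graphing the training and validation loss as a step. A minimal sketch of one way to do it with Matplotlib follows; the helper name `plot_loss` is hypothetical, and it assumes the dictionary has the 'loss' and 'val_loss' keys that Keras records when validation data is passed to training.

```python
import matplotlib.pyplot as plt

def plot_loss(history_dict):
    """Plot training and validation loss per epoch and return the figure."""
    epochs = range(1, len(history_dict["loss"]) + 1)
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(epochs, history_dict["loss"], label="training loss")
    ax.plot(epochs, history_dict["val_loss"], label="validation loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.legend()
    return fig

# In the notebook, after training:
# plot_loss(History.history)
# plt.show()
```

A validation curve that starts rising while the training curve keeps falling is the usual sign of overfitting.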

Model Evaluation

                test_loss, test_acc = model.evaluate(test_generator, steps=len(test_generator), verbose=1)
print('Loss: %.3f' % test_loss)
print('Accuracy: %.3f%%' % (test_acc * 100.0))
            

Classification Report

                y_val = test_generator.classes
y_pred = model.predict(test_generator)
y_pred = np.argmax(y_pred,axis=1)
print(classification_report(y_val,y_pred))
            
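The introduction also lists a confusion matrix. In the notebook, `confusion_matrix(y_val, y_pred)` from scikit-learn, which is already imported, should produce it directly. To show what that matrix contains, here is a small NumPy sketch on made-up labels; the function name `confusion_counts` is hypothetical.

```python
import numpy as np

def confusion_counts(y_true, y_pred, num_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # Add 1 at every (actual, predicted) pair, including repeated pairs.
    np.add.at(matrix, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return matrix

# Tiny made-up example with 3 classes.
cm = confusion_counts([0, 0, 1, 2, 2], [0, 1, 1, 2, 0], num_classes=3)
print(cm)
```

The diagonal holds the correctly classified images, and every off-diagonal cell counts one specific kind of mistake.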

Prediction Comparison

                class_indices = test_generator.class_indices
indices = {v:k for k,v in class_indices.items()}
filenames = test_generator.filenames
            
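The inverted dictionary turns predicted class indices back into class names. A minimal sketch with made-up stand-ins for the generator's mapping and the model's softmax outputs:

```python
import numpy as np

# Made-up stand-ins: a 3-class mapping and softmax outputs for two images.
class_indices = {"cat": 0, "dog": 1, "horse": 2}
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1]])

indices = {v: k for k, v in class_indices.items()}   # index -> class name
pred_ids = np.argmax(probs, axis=1)                  # most probable class per row
pred_labels = [indices[int(i)] for i in pred_ids]

print(pred_labels)  # ['dog', 'cat']
```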

Making Data Frame of the Prediction

                val_df = pd.DataFrame()
val_df['filename'] = filenames
val_df['actual'] = y_val
val_df['predicted'] = y_pred
val_df['actual'] = val_df['actual'].apply(lambda x: indices[x])
val_df['predicted'] = val_df['predicted'].apply(lambda x: indices[x])
val_df['Same'] = val_df['actual'] == val_df['predicted']
val_df.head(10)
            

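This shows the first 10 test images with their actual and predicted labels. Next, shuffle the rows so that the displayed examples are a random sample.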

                val_df = val_df.sample(frac=1).reset_index(drop=True)
            

Helpers for loading and displaying images

                from tensorflow.keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
img_size = 224
def readImage(path):
    img = load_img(path,color_mode='rgb',target_size=(img_size,img_size))
    img = img_to_array(img)
    img = img/255.
    
    return img

def display_images(temp_df):
    temp_df = temp_df.reset_index(drop=True)
    plt.figure(figsize = (20 , 20))
    n = 0
    for i in range(min(15, len(temp_df))):  # avoid indexing past the last row
        n+=1
        plt.subplot(5 , 5, n)
        plt.subplots_adjust(hspace = 0.5 , wspace = 0.3)
        image = readImage(f"../input/animals10/raw-img/{temp_df.filename[i]}")
        plt.imshow(image)
        plt.title(f'A: {temp_df.actual[i]} P: {temp_df.predicted[i]}')
            

Correctly classified.

                display_images(val_df[val_df['Same']==True])
            

Misclassified.

                display_images(val_df[val_df['Same']!=True])
            

For the detailed working code, see the Kaggle notebook, which you can copy and edit.

GitHub Repo

https://github.com/106AbdulBasit/ResNet50-Weights

Abdul Basit

@ab_niazi
An artificial intelligence developer, passionate about cutting edge technology and solving real-world problems, with highly qualified in AI