Image processing with CNN | Beginner's guide to image processing (2023)

This article was published as part of the Data Science Blogathon.

Introduction

Deep learning methods use data to train neural network algorithms for a variety of machine learning tasks, such as classifying different classes of objects. Convolutional neural networks (CNNs) are very powerful deep learning algorithms for analyzing images. This article explains how to build, train, and evaluate convolutional neural networks.

You will also learn how to improve a model's ability to learn from data and how to interpret training results. Deep learning has various applications such as image processing and natural language processing, and it is also used in medicine, media and entertainment, self-driving cars, and more.


What is CNN?

CNN is a powerful image processing algorithm; these algorithms are currently the best we have for automated image processing. Many companies use them, for example, to identify objects in an image.

Images contain RGB data. Matplotlib can be used to import an image from a file into memory. The computer doesn't see an image; it sees only an array of numbers. Color images are stored in three-dimensional arrays: the first two dimensions correspond to the height and width of the image (the number of pixels), and the final dimension corresponds to the red, green, and blue channels present in each pixel.
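
As a quick illustration, here is a minimal sketch of loading an image with Matplotlib and inspecting the underlying array ('example.png' is a placeholder path, not a file from this article):

import matplotlib.image as mpimg
import matplotlib.pyplot as plt

# read the image file into a NumPy array
img = mpimg.imread('example.png')

print(img.shape)  # e.g. (height, width, 3) for an RGB image
print(img[0, 0])  # the color values of the top-left pixel
plt.imshow(img)
plt.show()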

The three types of CNN layers

Convolutional neural networks specialize in image and video recognition applications. CNNs are mainly used in image analysis tasks such as image recognition, object recognition, and segmentation.

There are three types of layers in convolutional neural networks:

1) Convolutional layer: In a typical neural network, each input neuron is connected to the next hidden layer. In a CNN, only a small region of the input neurons is connected to each neuron in the hidden layer.

2) Pooling layer: The pooling layer is used to reduce the dimensionality of the feature map. Within the hidden layers of a CNN, there will be several activation and pooling layers.

3) Fully connected layer: Fully connected layers form the last layers in the network. The input to the fully connected layer is the output of the final pooling or convolutional layer, which is flattened and then fed into the fully connected layer.
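
These three layer types map directly onto Keras layers. A minimal sketch of how they stack (the layer sizes here are illustrative, not the model built later in this article):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))  # convolutional layer
model.add(MaxPooling2D(pool_size=(2, 2)))  # pooling layer
model.add(Flatten())  # flatten the 2D feature maps into a vector
model.add(Dense(10, activation='softmax'))  # fully connected (output) layer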


The MNIST dataset

In this article, we examine image classification using the MNIST handwritten digit recognition dataset.

The MNIST dataset consists of digit images from a variety of scanned documents. Each image is a 28x28 pixel square. In this data set, 60,000 images are used to train the model and 10,000 images are used to test the model. There are 10 digits (0 to 9) or 10 classes to predict.



Loading the MNIST dataset

Install the TensorFlow library and import the dataset as a training and test dataset.

Then plot a sample image:

!pip install tensorflow

from keras.datasets import mnist
import matplotlib.pyplot as plt

# load the training and test splits, then display one training image
(X_train, y_train), (X_test, y_test) = mnist.load_data()
plt.subplot()
plt.imshow(X_train[9], cmap=plt.get_cmap('gray'))
plt.show()

Output:

[Grayscale plot of the training image at index 9]
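
It can also help to confirm the dataset dimensions described above:

print(X_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(X_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)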

Deep learning model with multilayer perceptrons using MNIST

In this model, we will create a simple neural network model with a single hidden layer for the MNIST handwritten digit recognition dataset.

A perceptron is a model of a single neuron that is the basic building block for larger neural networks. The multilayer perceptron consists of three layers, namely, input layer, hidden layer, and output layer. The hidden layer is not visible to the outside world. Only the input layer and the output layer are visible. For all DL models, the data must be numeric in nature.

Step 1: Import key libraries

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

Step 2: Reshape the data

Each image has a size of 28 x 28, that is, 784 pixels. Thus, the input layer has 784 inputs, the hidden layer has 784 neurons, and the output layer has 10 outputs. The dataset is then cast to the float data type.

number_pix = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], number_pix).astype('float32')
X_test = X_test.reshape(X_test.shape[0], number_pix).astype('float32')

Step 3: Normalize the data

NN models typically require scaled data. In this code snippet, the data is normalized from the range (0-255) to (0-1), and the target variable is one-hot encoded for further analysis. The target variable has a total of 10 classes (0-9).

X_train = X_train / 255
X_test = X_test / 255
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_train.shape[1]
print(num_classes)

Output:

10

Now we will create the function nn_model and compile the model.

Step 4: Define the model function

def nn_model():
    model = Sequential()
    model.add(Dense(number_pix, input_dim=number_pix, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

There are two layers: a hidden layer with the ReLU activation function and an output layer with the softmax function.

Step 5: Run the model

model = nn_model()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
score = model.evaluate(X_test, y_test, verbose=0)
print('The error is: %.2f%%' % (100 - score[1] * 100))

Output:

Epoch 1/10
300/300 - 11s - loss: 0.2778 - accuracy: 0.9216 - val_loss: 0.1397 - val_accuracy: 0.9604
...
Epoch 10/10
300/300 - 2s - loss: 0.0082 - accuracy: 0.9985 - val_loss: 0.0609 - val_accuracy: 0.9828

The error is: 1.72%

From the model results, it can be seen that accuracy improves as the number of epochs increases. The error is 1.72%; the smaller the error, the higher the accuracy of the model.
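
To go beyond the single error figure when interpreting training results, one option is to plot the History object that Keras returns from fit. A minimal sketch, assuming Matplotlib is available (on older Keras versions the history keys are 'acc' and 'val_acc'):

import matplotlib.pyplot as plt

# model.fit returns a History object whose .history dict holds per-epoch metrics;
# note: calling fit again continues training the already-trained model
history = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                    epochs=10, batch_size=200, verbose=2)

plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()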

Convolutional neural network model using MNIST

In this section, we build a simple CNN model for MNIST that demonstrates convolutional layers, pooling layers, and dropout layers.

Step 1: Import all required libraries

import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D

Step 2: Set the seed for reproducibility and load the MNIST data

seed = 10
np.random.seed(seed)
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Step 3: Reshape the data and convert it to float values

X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
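
Note that (1, 28, 28) is a channels-first layout (channels, height, width). If your Keras installation defaults to channels-last, one way to make the layout match is the following sketch (adjust to your setup):

from keras import backend as K

# tell Keras to interpret image arrays as (channels, height, width),
# matching the reshape above and the input_shape used in the model below
K.set_image_data_format('channels_first')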

Step 4: Normalize the data

X_train = X_train / 255
X_test = X_test / 255
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_train.shape[1]
print(num_classes)

A classic CNN architecture, from input to output, looks like this:

- Visible (input) layer: 1x28x28
- Convolutional layer: 32 feature maps, 5x5
- Max pooling layer: 2x2
- Dropout layer: 20%
- Flatten layer
- Fully connected hidden layer: 128 neurons
- Output layer: 10 outputs

The first hidden layer is a convolutional layer called Convolution2D. It has 32 feature maps of size 5x5 with a rectifier (ReLU) activation function. This is the input layer. Next comes the pooling layer, which takes the maximum value, called MaxPooling2D. In this model it is configured with a 2x2 pool size.

Regularization takes place in the dropout layer, which is set to randomly exclude 20% of the neurons in the layer to avoid overfitting. The fifth layer is the Flatten layer, which converts the 2D matrix data into a vector. This allows the output to be processed by a standard fully connected layer.

A fully connected layer with 128 neurons and the rectifier activation function is then used. Finally, the output layer has 10 neurons for the 10 classes, with a softmax activation function to produce probability predictions for each class.

Step 5: Define and run the model

def cnn_model():
    model = Sequential()
    model.add(Conv2D(32, (5, 5), padding='same', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = cnn_model()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
score = model.evaluate(X_test, y_test, verbose=0)
print('The error is: %.2f%%' % (100 - score[1] * 100))

Output:

Epoch 1/10
300/300 - 2s - loss: 0.7825 - accuracy: 0.7637 - val_loss: 0.3071 - val_accuracy: 0.9069
...
Epoch 10/10
300/300 - 1s - accuracy: 0.9493 - val_loss: 0.1053 - val_accuracy: 0.9659

The error is: 3.41%

From the model results, it can be seen that accuracy improves as the number of epochs increases. The error is 3.41%; the smaller the error, the higher the accuracy of the model.
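
Once trained, the model can be used to classify individual images, which is the kind of object-identification task described at the start. A minimal sketch using the model and test data from above:

import numpy as np

# predict class probabilities for the first test image
probs = model.predict(X_test[:1])  # shape (1, 10): one probability per digit class
print('Predicted digit:', np.argmax(probs))
print('True digit:', np.argmax(y_test[0]))  # y_test is one-hot encoded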

I hope you enjoyed reading this article, and feel free to use my code to test it for your purposes. If you have any comments on the code or the blog post, please contact me at [Email protected].

Media featured in this CNN Image Processing article is not owned by Analytics Vidhya and is used at the author's discretion.
