Improvement in Accuracy and Speed of Image Semantic Segmentation via Convolution Neural Network Encoder-Decoder

Authors: Hassan Farsi, Sajad Mohamadzadeh
Journal: Journal of Information Systems and Telecommunication
Page number: 128-135
Serial number: 3
Volume number: 6
Paper Type: Full Paper
Published At: 2019
Journal Grade: Scientific - research
Journal Type: Typographic
Journal Country: Iran, Islamic Republic Of
Journal Index: ISC, Scopus

Abstract

Recent research on pixel-wise semantic segmentation uses deep neural networks to improve accuracy and speed, with the aim of making segmentation efficient enough for practical applications such as autonomous driving. These approaches use deep architectures to predict pixel labels, but the results are often unsatisfactory, mainly because max-pooling operators reduce the resolution of the feature maps. In this paper, we present a convolutional neural network composed of encoder and decoder sections, based on the successful SegNet network. The encoder has a depth of 2; its first part has 5 convolutional layers, each with 64 filters of size 3×3. In the decoder, the filter dimensions are matched to the convolutions used at the corresponding encoding step, so each step uses 64 filters of size 3×3 whose weights are learned during network training and adapted to the training data. Because of its low depth of 2 and its small number of parameters, the proposed network improves on both the speed and the accuracy of popular networks such as SegNet and DeepLab. On the CamVid dataset, after a total of 60,000 iterations, we obtain a global accuracy of 91%, which indicates the improved efficiency of the proposed method.
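
To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows how a stack of 3×3 convolutional layers with 64 filters contributes parameters, how max pooling shrinks a feature map, and how global accuracy (the fraction of correctly labelled pixels) can be computed. The RGB input, the bias terms, the 360×480 input size, the reading of "depth of 2" as two 2×2 pooling stages, and all function names are assumptions made for illustration; the abstract does not state them.

```python
def conv_params(in_channels, out_channels=64, kernel=3, bias=True):
    """Number of learnable values in one 2-D convolutional layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def pooled_size(height, width, stages, window=2):
    """Spatial size after `stages` non-overlapping max-pooling steps."""
    for _ in range(stages):
        height, width = height // window, width // window
    return height, width


def global_accuracy(predicted, target):
    """Fraction of pixels whose predicted label equals the ground truth."""
    pairs = [(p, t)
             for p_row, t_row in zip(predicted, target)
             for p, t in zip(p_row, t_row)]
    correct = sum(1 for p, t in pairs if p == t)
    return correct / len(pairs)


# Assumed: RGB input (3 channels), bias terms, five 3x3 layers of 64 filters.
block_params = conv_params(3) + 4 * conv_params(64)

# Assumed: a 360x480 input and two 2x2 pooling stages.
reduced = pooled_size(360, 480, stages=2)

# Toy 2x3 label maps in which 5 of the 6 pixels agree.
pred = [[0, 1, 1], [2, 2, 0]]
true = [[0, 1, 1], [2, 0, 0]]
acc = global_accuracy(pred, true)

print(block_params, reduced, round(acc, 3))
```

Under these assumptions, each pooling stage halves both spatial dimensions, which is the loss of resolution the abstract attributes to max pooling, and a five-layer block of 64 filters stays well under a million parameters.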

tags: Semantic Segmentation; Convolutional Neural Networks; Encoder-Decoder; Pixelwise Semantic Interpretation