CV


HASSAN FARSI

Professor

Faculty: Electrical and Computer Engineering

Degree: Ph.D

A Deep Learning-Based Approach for Accurate Semantic Segmentation with Attention Modules

Authors: Hassan Farsi, Sajad Mohamadzadeh
Journal: Iranian Journal of Energy and Environment
Page number: 692-705
Serial number: 16
Volume number: 4
Paper Type: Full Paper
Published At: 2025
Journal Type: Typographic
Journal Country: Iran, Islamic Republic Of
Journal Index: ISC

Abstract

Semantic segmentation plays a crucial role in many computer vision applications and requires accurate delineation of objects within an image. This study proposes a segmentation network built on the U-Net architecture, with a ResNet-50 backbone as the encoder for efficient hierarchical feature extraction. After the encoder, an Efficient Channel Attention Atrous Spatial Pyramid Pooling (ECA-ASPP) module improves multi-scale feature representation by combining dilated convolutions with adaptive channel attention, which captures long-range dependencies and improves contextual awareness. In the decoder, a Point-wise Spatial Attention (PSA) module refines the feature maps by dynamically aggregating global contextual information while preserving the fine-grained detail needed to reconstruct spatial structure. An extensive ablation study on the Stanford Background Dataset demonstrates the effectiveness of the proposed method, with consistent improvement across all segmentation categories. The best-performing variant achieves a mean Intersection over Union (mIoU) of 78.65%, outperforming the baseline models. On the Cityscapes dataset, the proposed method reaches an mIoU of 80.46%, higher than established methods such as DeepLabV3 and DANet. These results indicate that adaptive attention mechanisms at both the encoder and decoder levels improve segmentation accuracy. The proposed network balances performance and computational efficiency, making it suitable for practical tasks that require precise segmentation.
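The mIoU figures reported in the abstract can be read against the standard definition of the metric: for each class, the number of pixels where prediction and ground truth agree, divided by the number of pixels assigned to that class in either map, averaged over classes. The following is a minimal sketch of that definition, not the authors' evaluation code. The function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both maps are assumptions made for illustration; the paper's exact protocol (for example, how ignored labels are handled or whether counts are accumulated over the whole dataset) is not stated here.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels
    (one label per pixel) of equal length. Hypothetical helper
    for illustration only.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where both maps agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy 4-pixel example with two classes.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Evaluation on a full dataset would typically accumulate the three counters over all images before dividing.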
