CV


Hamid Saadatfar

Associate Professor

Faculty: Electrical and Computer Engineering

Department: Computer

Degree: Ph.D

Dr. Hamid Saadatfar is currently an assistant professor in the Computer Engineering Department at the University of Birjand. He received his B.Sc., M.Sc., and Ph.D. degrees from Ferdowsi University of Mashhad in 2007, 2009, and 2014, respectively. His research interests include:

  • Parallel and Distributed Processing (Cluster, Grid, and Cloud Computing)
  • Data Mining and Machine Learning
  • Big Data Analysis (Data Mining Methods for Big Data)
  • Power-aware Computing


Job failure prediction in Hadoop based on log file analysis

Authors: Hamid Saadatfar
Journal: International Journal of Computers and Applications
Page number: 260-269
Serial number: 44
Volume number: 3
Paper Type: Full Paper
Published At: 2022
Journal Grade: ISI
Journal Type: Electronic
Journal Country: Iran, Islamic Republic Of
Journal Index: Scopus

Abstract

Hadoop is a popular framework based on the MapReduce programming model that enables distributed processing of large datasets across clusters with varying numbers of computer nodes. Like any dynamic computational environment, Hadoop has its problems, one of which is the unsuccessful execution of MapReduce jobs. Job failures can cause significant resource waste, performance deterioration, and user dissatisfaction. Therefore, a proactive, predictive management approach could be very useful in Hadoop systems. In this paper, we try to predict the outcome of MapReduce jobs in the OpenCloud Hadoop cluster by using its log files. OpenCloud is a research cluster managed by CMU’s Parallel Data Lab that uses Hadoop to process big data. We first studied the log files and analyzed the relationship between job, resource, and workload characteristics and the failures in order to discover effective features for the prediction process. After recognizing the job failure patterns, we applied several popular machine learning algorithms to predict the success/failure status of jobs before they start to execute. Finally, we compared the learning methods and showed that the C5.0 algorithm had the best results, with an accuracy of 91.37%, a recall of 74.43%, and a precision of 80.31%.
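
For readers less familiar with the three figures reported above, the following minimal sketch shows how accuracy, recall, and precision are conventionally derived from the counts of a binary confusion matrix. It is not the paper's code: the `ConfusionCounts` type, the `classification_metrics` function, the choice of "failed job" as the positive class, and the example counts are all assumptions made for illustration, and the numbers do not reproduce the paper's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical; 'failed job' is assumed positive)."""
    tp: int  # failed jobs predicted as failed
    fp: int  # successful jobs predicted as failed
    fn: int  # failed jobs predicted as successful
    tn: int  # successful jobs predicted as successful


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, and recall; a ratio with a zero denominator is 0.0."""
    total = c.tp + c.fp + c.fn + c.tn
    predicted_positive = c.tp + c.fp
    actual_positive = c.tp + c.fn
    return {
        # Share of all jobs whose status was predicted correctly.
        "accuracy": (c.tp + c.tn) / total if total else 0.0,
        # Share of predicted failures that really failed.
        "precision": c.tp / predicted_positive if predicted_positive else 0.0,
        # Share of real failures that were predicted.
        "recall": c.tp / actual_positive if actual_positive else 0.0,
    }


# Made-up counts for illustration only (not taken from the OpenCloud logs).
example = ConfusionCounts(tp=80, fp=20, fn=30, tn=870)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

When failures are much rarer than successes, accuracy alone can look high even if many failures are missed, which is one reason recall and precision are usually reported alongside it.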
