Hybrid Optimized Algorithm-Based Frequent Pattern Mining for GA-ANN Prediction Model on Genomes

E. Sheeba Sugantharani and Dr. A. Subramani

Abstract

In genomic research, establishing relationships among variables is usually of interest. DNA microarray gene expression data raise new problems for molecular biology and medical data analysis. Frequent pattern mining has been applied successfully to scientific data, including the discovery of association patterns in microarray gene expression analysis. Discretization and association rule mining play an important role in bioinformatics. Traditional pattern-finding approaches require many scans of the DNA sequences. The difficulty lies in recognizing interesting patterns, a task that otherwise involves time-consuming and tedious judgments. The paper proposes a novel technique for identifying frequent patterns in DNA sequences and analyzing microarray gene expression profiling data. It first discovers frequent patterns in DNA sequences, then performs association rule mining and clustering-based discretization on the gene expression data using Fuzzy C-Means and the PBMF index. A hybrid Diff-Eclat algorithm is employed to generate strong association rules and to improve prediction performance on microarray gene expression data. Finally, a GA-ANN (Genetic Algorithm-Artificial Neural Network) model is developed to predict biological knowledge from the discriminant rules. The method is implemented on a gene expression dataset and shows improved performance compared with Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Random Forest (RF) models across various metrics. The frequent patterns discovered in DNA sequences by this approach have significant implications for medical data analyses such as disease etiology, treatment analysis, mutation analysis, and genetic analysis.
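
The abstract does not spell out the hybrid Diff-Eclat variant, so the following is only a minimal sketch of the general diffset idea behind Eclat-style mining, not the authors' algorithm. It assumes each transaction is a set of discretized gene states; the item names (A, B, C), the function names declat and _extend, and the min_support value are hypothetical. Each itemset stores the transaction ids it loses relative to its prefix, so supports are obtained by subtraction instead of rescanning the data.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# (itemset as an ordered tuple, diffset of transaction ids, support count)
Entry = Tuple[Tuple[str, ...], Set[int], int]


def declat(transactions: List[Set[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Mine frequent itemsets using diffsets (illustrative sketch only)."""
    all_tids = set(range(len(transactions)))
    items = sorted({item for t in transactions for item in t})

    # Diffset of a single item relative to the empty prefix:
    # the ids of transactions that do NOT contain the item.
    level: List[Entry] = []
    for item in items:
        diffset = {tid for tid in all_tids if item not in transactions[tid]}
        support = len(all_tids) - len(diffset)
        if support >= min_support:
            level.append(((item,), diffset, support))

    frequent: Dict[FrozenSet[str], int] = {}
    _extend(level, min_support, frequent)
    return frequent


def _extend(level: List[Entry], min_support: int,
            frequent: Dict[FrozenSet[str], int]) -> None:
    """Depth-first extension of one equivalence class (shared prefix)."""
    for i, (itemset_x, diff_x, sup_x) in enumerate(level):
        frequent[frozenset(itemset_x)] = sup_x
        next_level: List[Entry] = []
        for itemset_y, diff_y, _ in level[i + 1:]:
            # d(PXY) = d(PY) - d(PX); support(PXY) = support(PX) - |d(PXY)|
            diff_xy = diff_y - diff_x
            sup_xy = sup_x - len(diff_xy)
            if sup_xy >= min_support:
                next_level.append((itemset_x + (itemset_y[-1],), diff_xy, sup_xy))
        if next_level:
            _extend(next_level, min_support, frequent)


# Hypothetical toy data: five samples, three discretized gene states.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
result = declat(transactions, min_support=3)
```

On this toy input every single item and every pair reaches the support threshold, while the triple {A, B, C} occurs in only two transactions and is pruned. Association rules would then be derived from the frequent itemsets and their supports; that step, the Fuzzy C-Means discretization, and the GA-ANN model are outside this sketch.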
