COMPRESSION OF GENOMIC SEQUENCES BASED ON BURROWS-WHEELER TRANSFORM: A SURVEY
Abstract: With an increasing number of complete genomes becoming available and the completion of the human genome on the horizon, fundamental questions have arisen regarding the characteristics of these sequences. This paper discusses one such basic question, the compressibility of DNA or genomic sequences, since compressing DNA sequences is a very complex process. The primary objective of this study is to investigate approaches to genomic sequence data compression based on the Burrows-Wheeler Transform (BWT) algorithm. The BWT is an essential data structure for genome indexing and has several fundamental applications, but constructing the BWT for a huge collection of genomes is still non-trivial. Therefore, this paper highlights and briefly discusses research on the compression of genomic or DNA sequences using the BWT algorithm. In addition, a comparative analysis is carried out to identify the issues in those methods, and a modification to the BWT is proposed to further increase the compression ratio when compressing genomic sequence data.