Clustering of giant virus-DNA based on variations in local entropy

Ranjan Bose, Gerhard Thiel, Kay Hamacher

Research output: Contribution to journal, Article, peer-reviewed

2 Scopus citations

Abstract

We present a method for clustering genomic sequences based on variations in local entropy. We analyzed the distributions of block entropies in virus and plant genomes and observed distinct patterns for the two groups. These distributions, which describe the local entropic variability of each genome, are used to cluster the genomes by their pairwise Jensen-Shannon (JS) distances. An analysis of the JS distances between the genomes of all viruses that infect Chlorella algae reveals the host specificity of these viruses. We illustrate the efficacy of this entropy-based clustering technique by segregating plant and virus genomes into separate bins.
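The pipeline described in the abstract (block entropies, their distribution per genome, JS distances between distributions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the block length, the binning, or the exact definition of block entropy, so the function names, the `block_size=8` and `bins=10` defaults, and the use of per-block nucleotide frequencies over non-overlapping blocks are all assumptions made here.

```python
import math
from collections import Counter


def block_entropy(block):
    """Shannon entropy (in bits) of the symbol frequencies within one block."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_distribution(seq, block_size=8, bins=10, max_entropy=2.0):
    """Normalised histogram of block entropies over non-overlapping blocks.

    block_size, bins and max_entropy are illustrative assumptions;
    max_entropy=2.0 bits corresponds to a four-letter DNA alphabet.
    """
    n_blocks = len(seq) // block_size
    if n_blocks == 0:
        raise ValueError("sequence is shorter than one block")
    hist = [0] * bins
    for i in range(n_blocks):
        h = block_entropy(seq[i * block_size:(i + 1) * block_size])
        # Map the entropy onto a bin index; the maximum value goes in the last bin.
        idx = min(int(h / max_entropy * bins), bins - 1)
        hist[idx] += 1
    return [c / n_blocks for c in hist]


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-probability terms of x.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))


def distance_matrix(genomes, **kwargs):
    """Pairwise JS distances between the entropy distributions of named sequences."""
    dists = {name: entropy_distribution(seq, **kwargs) for name, seq in genomes.items()}
    names = list(dists)
    return {(a, b): js_distance(dists[a], dists[b]) for a in names for b in names}
```

The resulting distance matrix could then be passed to any standard clustering routine, for example hierarchical clustering. Which clustering method the paper uses is not stated in the abstract.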

Original language: English
Pages (from-to): 2259-2267
Number of pages: 9
Journal: Viruses
Volume: 6
Issue number: 6
State: Published - May 30, 2014
Externally published: Yes

Keywords

  • Evolution
  • Genomic sequences
  • Information theory
  • Phylogeny
  • Virus
