ANDES: Statistical tools for the ANalyses of DEep Sequencing
Advances in DNA sequencing technologies have allowed researchers to
progress from the analysis of a single organism towards the deep sequencing of
a sample of organisms. With sufficient
sequencing depth, it is now possible to detect subtle variations between
members of the same species, or between mixed species with shared biomarkers,
such as the 16S rRNA gene. However, traditional sequencing analyses of
samples from largely homogeneous populations are often still based on multiple
sequence alignments (MSA), where each sequence is placed along a separate row
and similarities between aligned bases can be followed down each column. While
this visual format is intuitive for a small set of aligned sequences, the
representation quickly becomes cumbersome as sequencing depth reaches hundreds
or thousands of reads per locus.
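To make the column-wise view concrete, here is a minimal Python sketch that collapses an MSA into per-column residue counts. It is an illustration only and is not taken from ANDES: the function name `column_counts` and the toy alignment are hypothetical. The point it shows is that a per-column summary grows with the alignment length rather than with the number of reads.

```python
from collections import Counter


def column_counts(alignment):
    """Return one Counter of residues per alignment column.

    `alignment` is a list of equal-length strings, one per aligned read.
    (Hypothetical helper for illustration; not part of ANDES.)
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment column by column.
    return [Counter(column) for column in zip(*alignment)]


# Toy alignment (made-up data): one row per read, '-' marks a gap.
msa = [
    "ACGT-A",
    "ACGTTA",
    "ACCTTA",
    "ACGT-A",
]

for position, counts in enumerate(column_counts(msa), start=1):
    print(position, dict(sorted(counts.items())))
```

Adding more reads to `msa` changes the counts but not the number of summary rows, which is why a column-wise summary stays readable at depths where the row-per-sequence view does not.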
We have developed ANDES, a software library and a suite of applications, written
in Perl and R, for the statistical ANalyses of DEep Sequencing. The
fundamental data structure underlying
As new sequencing technologies evolve, deep sequencing will become increasingly
cost-efficient, and inter- and intra-sample comparisons of largely homogeneous
sequences will become more common. We have provided a software package and
demonstrated its application on various empirically derived datasets.
Where to download ANDES.
How to install ANDES and the other programs it uses, such as R.
A reference for all the applications in the ANDES suite of tools.
A quick walk-through of some of the more commonly used scripts, using the included sample input data.
Answers to questions from users that are likely to be asked again.