DECA: Distributed Exome CNV Analyzer

Introduction

DECA is a distributed re-implementation of the XHMM exome CNV caller using ADAM and Apache Spark.

Getting Started

Installation

Note: These instructions are shared with other tools that build on ADAM.

Building from Source

You will need Maven installed to build DECA.

Note: The default configuration is for Hadoop 2.7.3. If building against a different version of Hadoop, please edit the build configuration in the <properties> section of the pom.xml file.

$ git clone https://github.com/.../deca.git
$ cd deca
$ export MAVEN_OPTS="-Xmx512m"
$ mvn clean package
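Before building against a different Hadoop version, it can help to see which values the build currently pins. The sketch below prints the <properties> section of pom.xml so the relevant entries can be located and edited. The helper name show_pom_properties is hypothetical and not part of the repository; the property names themselves are whatever the POM defines.

```shell
# Hypothetical helper (not part of DECA): print the <properties> block of a
# Maven POM. The first argument is the POM path; it defaults to pom.xml.
show_pom_properties() {
  sed -n '/<properties>/,/<\/properties>/p' "${1:-pom.xml}"
}

# Run from the repository root after cloning.
if [ -f pom.xml ]; then
  show_pom_properties pom.xml
fi
```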

Installing Spark

You'll need a Spark release on your system, with the $SPARK_HOME environment variable pointing at it; prebuilt binaries can be downloaded from the Spark website. Currently ADAM, and thus DECA, defaults to Spark 2.1.0 built against Hadoop 2.7 with Scala 2.11, but any more recent Spark distribution should also work.
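A quick way to confirm the Spark setup is to check that $SPARK_HOME is set and contains an executable bin/spark-submit. The following is a minimal sketch; the helper check_spark_home is hypothetical, and the commented-out path is only an example of where a prebuilt release might be unpacked.

```shell
# Hypothetical helper (not part of DECA): verify that a directory looks like
# a Spark installation by checking for an executable bin/spark-submit.
check_spark_home() {
  dir="${1:-}"
  if [ -z "$dir" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$dir/bin/spark-submit" ]; then
    echo "no executable bin/spark-submit under $dir" >&2
    return 1
  fi
  echo "using Spark at $dir"
}

# Example location only (an assumption); use wherever the release was unpacked.
# export SPARK_HOME="$HOME/spark-2.1.0-bin-hadoop2.7"
check_spark_home "${SPARK_HOME:-}" || echo "set SPARK_HOME before running deca-submit"
```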

Helpful Scripts

The bin/deca-submit script wraps spark-submit to set up and launch DECA.

Commands

$ deca-submit

Usage: deca-submit [<spark-args> --] <deca-args> [-version]

Choose one of the following commands:

           normalize : Normalize XHMM read-depth matrix
            coverage : Generate XHMM read depth matrix from read data
            discover : Call CNVs from normalized read matrix
                 cnv : Discover CNVs from raw read data
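The usage line above places Spark options before a "--" separator and DECA arguments after it. The sketch below only illustrates that convention by reporting how a command line would divide. It is not taken from bin/deca-submit, and the function name split_args is hypothetical.

```shell
# Illustration only (not the real wrapper): report how a command line divides
# at the first "--" into Spark arguments and DECA arguments.
split_args() {
  spark_args=""
  deca_args=""
  seen_separator=0
  for arg in "$@"; do
    if [ "$seen_separator" -eq 0 ] && [ "$arg" = "--" ]; then
      seen_separator=1
    elif [ "$seen_separator" -eq 0 ]; then
      spark_args="$spark_args $arg"
    else
      deca_args="$deca_args $arg"
    fi
  done
  # With no separator present, everything is treated as a DECA argument.
  if [ "$seen_separator" -eq 0 ]; then
    deca_args="$spark_args"
    spark_args=""
  fi
  echo "spark:${spark_args}"
  echo "deca:${deca_args}"
}

split_args --master 'local[4]' -- cnv --help
# prints:
#   spark: --master local[4]
#   deca: cnv --help
```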

You can learn more about a command by calling it without arguments or with --help, for example:

$ deca-submit cnv --help

 -I STRING[]              : One or more BAM, Parquet or other alignment files
 -L VAL                   : Targets for XHMM analysis as interval_list, BED or other feature file
 -cnv_rate N              : CNV rate (p). Defaults to 1e-8.
 -h (-help, --help, -?)   : Print help
 -max_sample_mean_RD N    : Maximum sample mean read depth prior to normalization. Defaults to 200.
 -max_sample_sd_RD N      : Maximum sample standard deviation of the read depth prior to normalization. Defaults to 150.
 -max_target_mean_RD N    : Maximum target mean read depth prior to normalization. Defaults to 500.
 -max_target_sd_RD_star N : Maximum target standard deviation of the read depth after normalization. Defaults to 30.
 -mean_target_distance N  : Mean within-CNV target distance (D). Defaults to 70000.
 -mean_targets_cnv N      : Mean targets per CNV (T). Defaults to 6.
 -min_mapping_quality N   : Minimum mapping quality for read to count towards coverage. Defaults to 20.
 -min_sample_mean_RD N    : Minimum sample mean read depth prior to normalization. Defaults to 25.
 -min_target_mean_RD N    : Minimum target mean read depth prior to normalization. Defaults to 10.
 -o VAL                   : Path to write discovered CNVs as GFF3 file
 -print_metrics           : Print metrics to the log on completion
 -save_rd VAL             : Path to write XHMM read depth matrix
 -save_zscores VAL        : Path to write XHMM normalized, filtered, Z score matrix
 -zscore_threshold N      : Depth Z score threshold (M). Defaults to 3.
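Putting the options above together, a cnv run might look like the following sketch. All file names are placeholders, and passing several files after a single -I is an assumption based on the STRING[] type shown in the help. --master and --driver-memory are standard spark-submit options. The run_or_print helper is hypothetical: it prints the command instead of running it when deca-submit is not on the PATH.

```shell
# Hypothetical helper: run the command if its executable is on the PATH,
# otherwise print it as a dry run.
run_or_print() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "dry run: $*"
  fi
}

# Placeholder inputs and outputs; Spark options come before "--",
# DECA's cnv options after it.
run_or_print deca-submit --master 'local[4]' --driver-memory 8g -- \
  cnv -I sample1.bam sample2.bam -L exome_targets.bed -o cnvs.gff3 \
  -min_mapping_quality 30 -save_zscores zscores.txt
```

Options left out, such as -cnv_rate or -zscore_threshold, keep the defaults listed in the help text.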

License

DECA is released under an Apache 2.0 license.
