dentist 1.0.0-beta.1

Close assembly gaps using long-reads with focus on correctness.



Today, many genome sequencing projects have been conducted using second-generation sequencers, which produce short reads. Such assemblies have many gaps. dentist closes these gaps using a (small) set of long reads. Furthermore, it can be used to scaffold contigs freely using a set of long reads. This can be used to fix known scaffolding errors or to further scaffold the output of a long-read assembly pipeline.

Install

Use Pre-Built Binaries

Download the latest pre-built binaries from the releases section and extract the contents. The tarball contains a dentist binary as well as the snakemake workflow, example config files and this README. In short, everything you need to run DENTIST.

Build from Source

Be sure to install the D package manager DUB. Install using either

dub install dentist

or

git clone https://github.com/a-ludi/dentist.git
cd dentist
dub build

Runtime Dependencies

The following software packages are required to run dentist:

Please see their own documentation for installation instructions. Note that the packages available on Bioconda are outdated and should not be used at the moment.

Usage

Suppose we have the genome assembly reference.fasta that is to be updated and a set of reads reads.fasta with 25× coverage.

Quick execution with snakemake

Install snakemake version >=5.10.0 and copy these files into your working directory:

  • ./snakemake/Snakefile
  • ./snakemake/workflow_helper.py
  • ./snakemake/snakemake.example.yml → ./snakemake/snakemake.yml
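
The copy step above can be sketched as a small shell function. This is only an illustration: the function name copy_dentist_workflow and its arguments are hypothetical and not part of DENTIST. It assumes the first argument points at the extracted release tarball or the cloned repository, and that the config should end up as snakemake.yml in the working directory, which is where the snakemake commands in this README look for it.

```shell
# Hypothetical helper (not shipped with DENTIST): copy the workflow files
# from an extracted release or cloned repository ($1) into a working
# directory ($2, defaults to the current directory).
copy_dentist_workflow() {
    src="$1"
    dest="${2:-.}"

    mkdir -p "$dest" || return 1
    cp "$src/snakemake/Snakefile" "$dest/Snakefile" || return 1
    cp "$src/snakemake/workflow_helper.py" "$dest/workflow_helper.py" || return 1

    # Assumption: keep an existing snakemake.yml so that local edits
    # are not overwritten by the example config.
    if [ ! -e "$dest/snakemake.yml" ]; then
        cp "$src/snakemake/snakemake.example.yml" "$dest/snakemake.yml" || return 1
    fi
}
```

A call such as copy_dentist_workflow /path/to/dentist . would populate the current directory; the source path is a placeholder.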

Next edit snakemake.yml to fit your needs and test your configuration with

snakemake --configfile=snakemake.yml extend_dentist_config

If no errors occurred the whole workflow can be executed using

snakemake --configfile=snakemake.yml

For small genomes of a few hundred Mbp, this should run on a regular workstation. Larger data sets may require a cluster, in which case you can use Snakemake's cloud or cluster facilities.
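
The two steps above (configuration check first, full run second) could be chained so that the workflow only starts when the check succeeds. The wrapper name run_dentist_workflow is a hypothetical illustration; the two snakemake invocations are the ones given in this README.

```shell
# Hypothetical wrapper: validate the configuration, then run the workflow.
# $1 is the config file (defaults to snakemake.yml).
run_dentist_workflow() {
    config="${1:-snakemake.yml}"

    # Step 1: test the configuration; stop here if it reports an error.
    snakemake --configfile="$config" extend_dentist_config || {
        echo "configuration check failed; fix $config and retry" >&2
        return 1
    }

    # Step 2: execute the whole workflow.
    snakemake --configfile="$config"
}
```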

Executing on a Cluster

To make execution on a cluster easy, DENTIST comes with example files that make Snakemake use SLURM via DRMAA. Please read the documentation of Snakemake if this does not suit your needs. Another good starting point is the Snakemake-Profiles project.

Start by copying these files to your working directory:

  • ./snakemake/profile-slurm.yml → ~/.config/snakemake/<profile>/config.yaml
  • ./snakemake/cluster.example.yml → ./snakemake/cluster.yml
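
As a rough sketch, copying the two files above might look like the following. The function name install_slurm_profile and its arguments are hypothetical; the profile name is whatever you choose for <profile>. Placing cluster.yml directly in the working directory is an assumption made here, so adjust the destination if your profile expects it elsewhere.

```shell
# Hypothetical helper (not shipped with DENTIST).
# $1: extracted release or cloned repository
# $2: profile name, i.e. the <profile> placeholder
# $3: working directory (defaults to the current directory)
install_slurm_profile() {
    src="$1"
    profile="$2"
    workdir="${3:-.}"
    profile_dir="$HOME/.config/snakemake/$profile"

    mkdir -p "$profile_dir" "$workdir" || return 1

    # The SLURM profile becomes the profile's config.yaml.
    cp "$src/snakemake/profile-slurm.yml" "$profile_dir/config.yaml" || return 1

    # Assumption: cluster.yml is placed in the working directory.
    cp "$src/snakemake/cluster.example.yml" "$workdir/cluster.yml" || return 1
}
```

For example, install_slurm_profile /path/to/dentist slurm would create a profile that is later selected with --profile=slurm; both the path and the name slurm are placeholders.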

Next adjust the profile according to your cluster. This should enable Snakemake to submit and track jobs on your cluster. You may use the configuration values specified in cluster.yml to configure job names and resource allocation for each step of the pipeline. Now, submit the workflow to your cluster by

snakemake --configfile=snakemake.yml --profile=<profile>

Note that parameters specified in the profile provide default values and can be overridden by specifying different values on the CLI.

Manual execution

Please inspect the Snakemake workflow to get all the details. It might be useful to execute Snakemake with the -p switch which causes Snakemake to print the shell commands. If you plan to write your own workflow management for DENTIST please feel free to contact the maintainer!
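
One way to inspect the commands without running anything is to combine -p with Snakemake's dry-run switch -n. The use of -n is a suggestion added here, not something this README prescribes, and the function name print_dentist_commands is hypothetical.

```shell
# Hypothetical helper: list the shell commands the workflow would run.
# -n performs a dry run (nothing is executed), -p prints the shell commands.
# $1 is the config file (defaults to snakemake.yml).
print_dentist_commands() {
    config="${1:-snakemake.yml}"
    snakemake --configfile="$config" -n -p
}
```

Redirecting the output to a file, for example print_dentist_commands > commands.log, gives a starting point for a hand-written pipeline.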

Maintainer

Arne Ludwig <arne.ludwig@posteo.de>

Contributing

Contributions are warmly welcome. Just create an issue or pull request on GitHub. If you submit a pull request please make sure that:

  • the code compiles on Linux using the current release of dmd,
  • your code is covered with unit tests (if feasible) and
  • dub test runs successfully.

It is recommended to install the Git hooks included in the repository to avoid premature pull requests. You can enable all shipped hooks with this command:

git config --local core.hooksPath .githooks/

If you want to enable just a subset, use ln -s ../../.githooks/{hook} .git/hooks/ (a relative link target is resolved from the directory that contains the link, here .git/hooks). If you want to audit code changes before they get executed on your machine, you can use cp .githooks/{hook} .git/hooks instead.
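
Enabling a subset of hooks could be scripted roughly as follows. The function name link_git_hooks is hypothetical, and the hook names passed to it depend on what .githooks/ actually contains. The sketch assumes it is run from the repository root and that .git is a regular directory. Because a relative symlink target is resolved from the directory holding the link, the target is written as ../../.githooks/<hook>.

```shell
# Hypothetical helper: symlink selected hooks from .githooks/ into .git/hooks/.
# Run from the repository root; pass the hook file names as arguments.
link_git_hooks() {
    mkdir -p .git/hooks || return 1
    for hook in "$@"; do
        if [ ! -f ".githooks/$hook" ]; then
            echo "no such hook: .githooks/$hook" >&2
            return 1
        fi
        # The link lives in .git/hooks, so its relative target must climb
        # two levels to reach the repository root.
        ln -sf "../../.githooks/$hook" ".git/hooks/$hook" || return 1
    done
}
```

For instance, link_git_hooks pre-commit would link only a pre-commit hook, provided a file of that name exists in .githooks/.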

License

This project is licensed under the MIT License (see LICENSE).

Authors:
Dependencies:
darg, vibe-d:data, string-transform-d
Versions:
4.0.0 2022-Sep-14
3.0.0 2021-Dec-09
2.0.0 2021-Jun-21
1.0.2 2021-Apr-26
1.0.1 2021-Feb-22
Short URL:
dentist.dub.pm