Command-line tool to cut VCF (Variant Call Format) files into smaller batches, intended for multiprocessing or distributed computing.

📑️ VCF Batcher

This is a Rust crate to cut VCF (Variant Call Format) files into smaller batches, intended for multiprocessing or distributed computing.

🧰️ Installation

Depending on your goals, you can use this tool as a CLI or as a library in 🦀️ Rust or 🐍️ Python.

Installing the CLI

To install the program as a CLI, you will need to have cargo installed (see the instructions to install cargo).

Once you have it, run the following command in your terminal to install the VCF batcher.

cargo install vcf_batcher

Installing the Rust crate

To use the tool as a Rust crate, add it to your Cargo.toml dependencies or run:

cargo add vcf_batcher

You can find the crate documentation on docs.rs.

Installing the Python bindings

We provide Python bindings for the VCF batcher, which can be installed via pip.

pip install vcf-batcher

🪄️ Usage

CLI

After installation, the CLI is available as the vcf_batcher_cli command.

vcf_batcher_cli path/to/your_file.vcf path/to/output/directory

By default, this will create batches with 25,000 samples each. To override this default, provide a custom --batch-size or -b argument:

vcf_batcher_cli -b 1000 path/to/your_file.vcf path/to/output/directory
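To make the effect of the batch size concrete, here is a minimal pure-Python sketch of the batching idea. It is not the vcf_batcher implementation: the function name `split_vcf`, the `batch_N.vcf` naming scheme, the choice to repeat the header lines in every batch, and the reading of the batch size as a count of data records are all assumptions made for illustration.

```python
from pathlib import Path


def split_vcf(input_path, output_dir, batch_size=25_000):
    """Split a plain-text VCF into files of at most `batch_size` records.

    Illustrative only; the real tool's output layout may differ.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    header = []   # meta-information and column header lines (start with '#')
    batch = []    # records collected for the batch currently being built
    written = []  # paths of the batch files written so far

    def flush():
        # Hypothetical naming scheme: batch_0.vcf, batch_1.vcf, ...
        path = out_dir / f"batch_{len(written)}.vcf"
        # Assumption: every batch repeats the full header so it is a valid VCF.
        path.write_text("".join(header + batch), encoding="utf-8")
        written.append(path)
        batch.clear()

    with open(input_path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            if line.startswith("#"):
                header.append(line)
                continue
            batch.append(line)
            if len(batch) == batch_size:
                flush()
    if batch:
        flush()  # write the final, possibly smaller, batch
    return written
```

With a batch size of 1000, an input of 2500 records would yield three files under this sketch: two with 1000 records and one with 500.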

Library

After installing either the Rust crate or the Python module, you can use the provided function.

🦀️ Rust

pub fn extract_variants_to_batches(
    file_path: &str,
    batch_size: usize,
    output_path: &Path,
    compression_level: Option<Compression>
)

🐍️ Python

vcf_batcher.py_extract_variants_to_batches(
    input_file,
    batches_folder,
    batch_size,
)
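Since the batches are intended for multiprocessing, the following sketch shows one way to fan work out over a folder of batch files once the batcher has run. The helper names `count_records` and `process_batches` are hypothetical, and the sketch assumes each batch is a VCF text file, treated as gzip-compressed when its name ends in `.gz`; check the actual output of the tool before relying on that.

```python
import gzip
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def count_records(batch_path):
    """Hypothetical worker: count the data records in one batch file."""
    # Assumption: files ending in .gz are gzip-compressed, others are plain text.
    opener = gzip.open if str(batch_path).endswith(".gz") else open
    with opener(batch_path, "rt", encoding="utf-8") as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )


def process_batches(batch_dir, worker=count_records, executor=None):
    """Apply `worker` to every file in `batch_dir` and map each path to its result.

    Uses a process pool by default; pass any concurrent.futures executor to
    control how the work is scheduled.
    """
    batch_paths = sorted(str(p) for p in Path(batch_dir).iterdir() if p.is_file())
    if executor is not None:
        results = list(executor.map(worker, batch_paths))
    else:
        with ProcessPoolExecutor() as pool:
            results = list(pool.map(worker, batch_paths))
    return dict(zip(batch_paths, results))
```

After calling `py_extract_variants_to_batches` with a `batches_folder`, `process_batches(batches_folder)` would return a dictionary from batch path to record count. When using the default process pool from a script, call it under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.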

License

The software is licensed under the MIT License.
