This page contains information about the software developed by the Bioinformatics Core. For information on initiating a project with the core, please see the Collaborating with BSPC page.
The Bioinformatics Core maintains open-source software packages used by researchers worldwide. Some examples include:
Pybedtools makes it possible to use the BEDTools "genome algebra" suite of programs with the Python programming language and offers feature-level access and automatic handling of temporary files. This allows much more complex manipulation of BED-like files than is possible on the command line.
Dale RK, Pedersen BS, Quinlan AR (2011). Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics 27, 3423-4. https://doi.org/10.1093/bioinformatics/btr539
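To make "genome algebra" concrete, here is a minimal plain-Python sketch of an interval intersection, the kind of operation the BEDTools suite provides. It deliberately does not use pybedtools; the `Interval` type and the `intersect` function are hypothetical names chosen for illustration, and the coordinates are made up.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A BED-style interval: 0-based start, end-exclusive."""
    chrom: str
    start: int
    end: int


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b."""
    overlaps = []
    for x in a:
        for y in b:
            if x.chrom != y.chrom:
                continue
            start = max(x.start, y.start)
            end = min(x.end, y.end)
            if start < end:  # half-open intervals overlap only if start < end
                overlaps.append(Interval(x.chrom, start, end))
    return overlaps


# Made-up example data: two "genes" and two "peaks".
genes = [Interval("chr1", 100, 200), Interval("chr1", 500, 600)]
peaks = [Interval("chr1", 150, 250), Interval("chr2", 500, 600)]
print(intersect(genes, peaks))
```

This sketch compares every pair of intervals, which is fine for a handful of features but far too slow for real datasets. It also ignores strand, names, scores, and file handling, which is where a dedicated library earns its keep.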
Gffutils is a package for working with gene annotations in GFF and GTF formats. It has a very robust parser and can work with features in a hierarchical fashion.
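As an illustration of what working with features hierarchically means, the following plain-Python sketch follows the `Parent` attributes of a tiny GFF3 snippet to list all exons that belong to a gene. It does not use gffutils or reflect its implementation; the function names and the example annotation are assumptions made for this sketch.

```python
from collections import defaultdict

# Made-up GFF3 annotation: one gene, one transcript, two exons.
GFF3 = """\
chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1
chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1
chr1\texample\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=tx1
chr1\texample\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=tx1
"""


def parse_attributes(field):
    """Parse a GFF3 attribute column such as 'ID=tx1;Parent=gene1' into a dict."""
    return dict(item.split("=", 1) for item in field.split(";") if item)


def build_hierarchy(text):
    """Return (feature type by ID, child IDs by parent ID) for GFF3 text."""
    featuretypes = {}
    children = defaultdict(list)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        attrs = parse_attributes(cols[8])
        featuretypes[attrs["ID"]] = cols[2]
        # A feature may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                children[parent].append(attrs["ID"])
    return featuretypes, children


def descendants(feature_id, featuretypes, children, featuretype=None):
    """Yield IDs of all features below feature_id, optionally filtered by type."""
    for child in children.get(feature_id, []):
        if featuretype is None or featuretypes[child] == featuretype:
            yield child
        yield from descendants(child, featuretypes, children, featuretype)


featuretypes, children = build_hierarchy(GFF3)
print(list(descendants("gene1", featuretypes, children, featuretype="exon")))
```

The sketch assumes every feature carries an `ID` attribute and that the file is well-formed GFF3. Real annotation files often break both assumptions (GTF, for example, encodes relationships differently), which is why a robust parser matters.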
Trackhub streamlines the sharing of genomic data by making it easy to create track hubs for the UCSC Genome Browser.
The Bioconda project maintains thousands of bioinformatics software packages, built so that they can be installed into isolated environments on macOS or Linux. This greatly enhances the reproducibility of computational analyses and eases the installation of complex packages. Bioconda has been adopted by the Galaxy team as their package manager of choice, and packages have been downloaded over 6 million times by researchers worldwide.
Grüning B*, Dale RK*, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J, Bioconda Team (2018). Bioconda: sustainable and comprehensive software distribution for the life sciences. Nature Methods 15, 475-476
A well-tested, best-practices suite of workflows for analyzing high-throughput sequencing data (primarily ChIP-seq and RNA-seq), downloading and managing external published data, and tying everything together into manuscript figures, using the Snakemake workflow language.
A novel peak-calling algorithm for identifying the 3’ ends of transcripts in bacteria that handles replicates in a statistically robust manner. Used in Adams et al. 2021: https://doi.org/10.7554/eLife.62438
HTtools maps and quantifies retrotransposon integrations, particularly in S. pombe. It is a complete re-implementation of legacy Perl code in Python and adds a config system, testing, documentation, and several bug fixes.
GEO Submission Prepper
A tool that partially automates submitting high-throughput sequencing data to NCBI’s GEO. Given a samplesheet and a config file that point to data at arbitrary locations on the filesystem, the tool generates files that can be used to populate the GEO sample submission spreadsheet.
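The general idea of combining a samplesheet with a config to produce spreadsheet-ready rows can be sketched as follows. This is not the tool's actual code or file format: the samplesheet columns, the config keys, the output column names, and the functions `build_rows` and `to_tsv` are all hypothetical, and the paths are invented.

```python
import csv
import io
import os

# Hypothetical samplesheet: one row per sample, pointing at files anywhere on disk.
SAMPLESHEET = """\
sample,condition,fastq
wt_1,wildtype,/data/run1/wt_1_R1.fastq.gz
ko_1,knockout,/scratch/run2/ko_1_R1.fastq.gz
"""

# Hypothetical config: values shared by every sample in the submission.
CONFIG = {"organism": "Drosophila melanogaster", "molecule": "total RNA"}


def build_rows(samplesheet_text, config):
    """Merge per-sample fields with shared config values into one dict per sample."""
    rows = []
    for record in csv.DictReader(io.StringIO(samplesheet_text)):
        rows.append({
            "sample": record["sample"],
            "title": f"{record['condition']}, {record['sample']}",
            "organism": config["organism"],
            "molecule": config["molecule"],
            # Only the file name is reported; the directory is local detail.
            "raw_file": os.path.basename(record["fastq"]),
        })
    return rows


def to_tsv(rows):
    """Render rows as tab-separated text that can be pasted into a spreadsheet."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = build_rows(SAMPLESHEET, CONFIG)
print(to_tsv(rows))
```

Keeping per-sample information in the samplesheet and shared metadata in the config means each value is entered once, so the generated rows stay consistent across samples.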