Bioinformatics

BioJulia

The BioJulia organization hosts many Julia packages for bioinformatics.

File Parsers

An important task in bioinformatics is parsing files in various standard formats. Here we list some file formats and packages with parsers:

| Format | Extensions | Description | Packages |
|---|---|---|---|
| FASTA | .fas, .fasta, .fa | DNA or protein sequences without annotations | FASTX |
| FASTQ | .fq, .fastq | DNA sequences with quality information | FASTX |
| GENBANK | .gb, .gbk | DNA or protein sequences with annotations | GenomicAnnotations.jl |
| EMBL | .embl | DNA or protein sequences with annotations | GenomicAnnotations.jl |
| SAM | .sam | Aligned DNA sequences (typically from read mapping). Text based. | XAM.jl |
| BAM | .bam | Aligned DNA sequences (typically from read mapping). Binary. | XAM.jl |
| PDB | .pdb | Protein 3D structure. | BioStructures.jl, MIToS |
| mmCIF | | Macromolecular Crystallographic Information File (mmCIF) also known as PDBx/mmCIF is a standard text file format for representing macromolecular structure data | BioStructures.jl, MIToS |
| MMTF | | MacroMolecular Transmission Format (MMTF) is a binary encoding of biological structures. | BioStructures.jl |
| DSSP | | Protein Secondary Structure | ProteinSecondaryStructures.jl |
| STRIDE | | Protein Secondary Structure | ProteinSecondaryStructures.jl |
| PAF | .paf | Pairwise mApping Format. | PairwiseMappingFormat.jl |
| Stockholm | .sto, .stk, .stockholm | Stockholm format is a multiple sequence alignment format used by Pfam, Rfam and Dfam | MIToS.jl |
| A3M | .fas | A2M/A3M are a family of FASTA-derived formats used for sequence alignments | MIToS.jl |
| PIR | .pir | Multiple sequence alignment format | MIToS.jl |
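
To give a feel for what such a parser does, here is a minimal sketch of a FASTA reader written with Base Julia only. It is an illustration of the format, not the API of FASTX or any other listed package; the function name `read_fasta` and the example data are made up for this sketch, and real files are better handled by the packages in the table.

```julia
# Minimal FASTA reader using only Base Julia (illustrative sketch).
# Returns a vector of `header => sequence` pairs.
function read_fasta(io::IO)
    records = Pair{String,String}[]
    header = nothing
    buf = IOBuffer()
    for line in eachline(io)
        line = strip(line)
        isempty(line) && continue
        if startswith(line, ">")
            # A new header closes the previous record, if there is one.
            header === nothing || push!(records, header => String(take!(buf)))
            header = String(line[2:end])
        else
            header === nothing && error("sequence data before the first header")
            # Sequences may span several lines; concatenate them.
            print(buf, line)
        end
    end
    header === nothing || push!(records, header => String(take!(buf)))
    return records
end

# Hypothetical in-memory example with two records.
records = read_fasta(IOBuffer(">seq1 example\nACGT\nTTGA\n>seq2\nMKV\n"))
for (name, seq) in records
    println(name, "\t", length(seq))
end
```

The sketch ignores everything a production parser has to deal with, such as validation of the alphabet, compressed input, and very large files.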

Data Structures

The basic data structures for representing DNA, RNA and protein sequences are provided by BioSequences.jl.

Pairwise Sequence Alignments

A core task in bioinformatics is aligning sequences. This can be done with BioAlignments.jl which includes algorithms for the following pairwise alignment types:

  • GlobalAlignment: global-to-global alignment

  • SemiGlobalAlignment: local-to-global alignment

  • LocalAlignment: local-to-local alignment

  • OverlapAlignment: end-free alignment
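
The difference between global and local alignment can be sketched with a small score-only dynamic program in Base Julia. This is an illustration under simplifying assumptions (a linear gap penalty and made-up default scores), not the BioAlignments.jl implementation; the function name `alignment_score` and its keyword arguments are hypothetical, and the semi-global and overlap variants are left out.

```julia
# Score-only dynamic programming with a linear gap penalty (illustrative sketch).
# `local_mode = false` gives a global (Needleman-Wunsch style) score,
# `local_mode = true` a local (Smith-Waterman style) score.
function alignment_score(a::AbstractString, b::AbstractString;
                         match = 1, mismatch = -1, gap = -2, local_mode = false)
    x, y = collect(a), collect(b)
    m, n = length(x), length(y)
    H = zeros(Int, m + 1, n + 1)
    if !local_mode
        # A global alignment has to pay for leading gaps.
        H[:, 1] .= gap .* (0:m)
        H[1, :] .= gap .* (0:n)
    end
    best = 0
    for i in 1:m, j in 1:n
        s = x[i] == y[j] ? match : mismatch
        h = max(H[i, j] + s, H[i, j+1] + gap, H[i+1, j] + gap)
        # A local alignment may restart anywhere, so scores never drop below 0.
        local_mode && (h = max(h, 0))
        H[i+1, j+1] = h
        best = max(best, h)
    end
    # Global: score of aligning both sequences end to end.
    # Local: best score of any pair of substrings.
    return local_mode ? best : H[m+1, n+1]
end

global_score = alignment_score("ACGT", "TTACGTTT")
local_score = alignment_score("ACGT", "TTACGTTT"; local_mode = true)
println("global: ", global_score, "  local: ", local_score)
```

With these scores the short sequence matches perfectly inside the longer one, so the local score is high, while the global score is pulled down by the gaps needed to cover the flanking bases.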

Multiple Sequence Alignment (MSA)

I'm not aware of tools in Julia to compute multiple sequence alignments, but MIToS.jl can read the most common MSA formats: Stockholm, FASTA, A3M, A2M, PIR, and Raw.

Package descriptions

BioSequences.jl

Biological sequences for the Julia language

BioSequences provides data types and methods for common operations with biological sequences, including DNA, RNA, and amino acid sequences.

It can do sequence search and pattern matching in sequences, and compute simple sequence statistics.
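
As a rough idea of the kind of operations meant here, the following Base-only sketch computes a reverse complement and the GC content of a DNA string. It uses plain strings as stand-ins and is not the BioSequences.jl API; the names `revcomp`, `gc_content` and `COMPLEMENT` are hypothetical, and ambiguity codes other than `N` are not handled.

```julia
# Plain-string stand-ins for two common DNA operations (illustrative sketch).
const COMPLEMENT = Dict('A' => 'T', 'C' => 'G', 'G' => 'C', 'T' => 'A', 'N' => 'N')

# Reverse the sequence, then complement every base.
revcomp(seq::AbstractString) = join(COMPLEMENT[c] for c in reverse(uppercase(seq)))

# Fraction of bases that are G or C.
gc_content(seq::AbstractString) = count(c -> c in ('G', 'C'), uppercase(seq)) / length(seq)

println(revcomp("AACG"))
println(gc_content("ACGT"))
```

Dedicated sequence types avoid the overhead and the error-proneness of raw strings, which is the main reason to use a package for this.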

BioAlignments.jl

Sequence alignment tools

BioAlignments.jl provides sequence alignment algorithms and data structures. It includes algorithms for the following pairwise alignment types:

  • GlobalAlignment: global-to-global alignment

  • SemiGlobalAlignment: local-to-global alignment

  • LocalAlignment: local-to-local alignment

  • OverlapAlignment: end-free alignment

GenomicFeatures.jl

Tools for genomic features in Julia.

GenomicAnnotations.jl

GenomicAnnotations is a package for reading, modifying, and writing genomic annotations in the GenBank, GFF3, GFF2/GTF, and EMBL file formats.

BioStructures.jl

A Julia package to read, write and manipulate macromolecular structures

From the package README:

BioStructures provides functionality to read, write and manipulate macromolecular structures, in particular proteins. Protein Data Bank (PDB), mmCIF and MMTF format files can be read in to a hierarchical data structure. Spatial calculations and functions to access the PDB are also provided. It compares favourably in terms of performance to other PDB parsers - see some benchmarks online - and should be lightweight enough to build other packages on top of.

GeneticVariation.jl

Data structures and algorithms for working with genetic variation

From the package README:

GeneticVariation provides types and methods for working with datasets of genetic variation. It provides a VCF and BCF parser, as well as methods for working with variation in sequences such as evolutionary distance computation, and counting different mutation types.

Phylogenies.jl

The BioJulia package for working with phylogenetic trees and genealogies.

This package looks stale.

From the package README:

A julia package providing an abstract type and interface for phylogenies, a concrete phylogeny type implementation, and higher-level methods for working with phylogenies.

In development.

GenomeGraphs.jl

A modern genomics framework for Julia

From the package README:

GenomeGraphs provides a representation of sequence graphs. Such graphs represent genome assemblies and population graphs of genotypes/haplotypes and variation.

BioServices.jl

Julia interface to APIs for various bio-related web services

NCBIBlast.jl

Thin wrapper around NCBI's BLAST+ CLI https://www.ncbi.nlm.nih.gov/books/NBK569856/

From the package README:

This package is a thin wrapper around the Basic Local Alignment Search Tool CLI, better known as BLAST, developed by the National Center for Biotechnology Information (NCBI).

For now, this uses CondaPkg.jl to install BLAST+.

FASTX.jl

Parse and process FASTA and FASTQ formatted files of biological sequences.

FASTX provides I/O and utilities for manipulating FASTA and FASTQ formatted sequence data files.

XAM.jl

Parse and process SAM and BAM formatted files.

XAM provides I/O and utilities for manipulating SAM and BAM formatted alignment map files.

PairwiseMappingFormat.jl

Parser for the PAF format in bioinformatics

PairwiseMappingFormat.jl provides a parser for Pairwise Mapping Format (PAF) files. PAF is a simple, tab-delimited format created by programs such as minimap2.
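
Because PAF is tab-delimited, its structure is easy to sketch in Base Julia. The example below reads the 12 mandatory columns of one PAF line into a struct. It is an illustration of the format only, not the PairwiseMappingFormat.jl API: the type `PafLine`, its field names, the function `parse_paf_line` and the example record are all hypothetical, and optional tag fields after column 12 are ignored.

```julia
# The 12 mandatory PAF columns, in file order (field names are illustrative).
struct PafLine
    qname::String; qlen::Int; qstart::Int; qend::Int
    strand::Char
    tname::String; tlen::Int; tstart::Int; tend::Int
    matches::Int; alnlen::Int; mapq::Int
end

function parse_paf_line(line::AbstractString)
    f = split(line, '\t')
    length(f) >= 12 || error("a PAF line needs at least 12 tab-separated fields")
    int(i) = parse(Int, f[i])
    # Column 5 is the strand, a single '+' or '-' character.
    return PafLine(f[1], int(2), int(3), int(4), only(f[5]),
                   f[6], int(7), int(8), int(9), int(10), int(11), int(12))
end

# Made-up record: query name, length, start, end, strand, target name, length,
# start, end, matching residues, alignment block length, mapping quality.
rec = parse_paf_line("read1\t1000\t10\t990\t+\tchr1\t50000\t2000\t2980\t950\t980\t60")
println(rec.qname, " -> ", rec.tname, " identity ~ ", rec.matches / rec.alnlen)
```

Dividing the number of matching residues by the alignment block length, as in the last line, gives a quick approximate identity for each mapping.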

ProteinSecondaryStructures.jl

Wrapper to protein secondary structure calculation packages

From the package README:

This package parses STRIDE and DSSP secondary structure prediction outputs, to make them convenient to use from Julia, particularly for the analysis of MD simulations.

BioMakie.jl

Plotting and interface tools for biology.

BioMakie.jl has functions to visualize

  • Protein 3D structures

  • Multiple Sequence Alignments

MIToS.jl

A Julia package to analyze protein sequences, structures, and evolutionary information

From the package README:

MIToS provides a comprehensive suite of tools for the analysis of protein sequences and structures. It allows working with Multiple Sequence Alignments (MSAs) to obtain evolutionary information in the Julia language [1]. In particular, it eases the analysis of coevolving positions in an MSA using Mutual Information (MI), a measure of covariation. MI-derived scores are good predictors of inter-residue contacts in a protein structure and functional sites in proteins [2,3]. To allow such analysis, MIToS also implements several useful tools for working with protein structures, such as those available in the Protein Data Bank (PDB) or predicted by AlphaFold 2.
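
To make the notion of MI between alignment positions concrete, here is a minimal Base-only sketch that computes the plain mutual information, in bits, between two MSA columns. It is not the MIToS implementation: the function name `mutual_information` is hypothetical, the columns are passed as strings with one residue per sequence, and the sketch applies none of the corrections that practical covariation scores typically need.

```julia
# Mutual information (in bits) between two alignment columns (illustrative sketch).
# Each column holds one residue per sequence, e.g. as a String.
function mutual_information(col_i, col_j)
    n = length(col_i)
    n == length(col_j) || error("columns must cover the same number of sequences")
    count_i = Dict{Char,Int}()
    count_j = Dict{Char,Int}()
    count_ij = Dict{Tuple{Char,Char},Int}()
    for (a, b) in zip(col_i, col_j)
        count_i[a] = get(count_i, a, 0) + 1
        count_j[b] = get(count_j, b, 0) + 1
        count_ij[(a, b)] = get(count_ij, (a, b), 0) + 1
    end
    mi = 0.0
    for ((a, b), c) in count_ij
        p_ab = c / n
        # p(a,b) / (p(a) * p(b)), written with counts to avoid extra divisions.
        mi += p_ab * log2(c * n / (count_i[a] * count_j[b]))
    end
    return mi
end

# Perfectly covarying columns versus independent ones (made-up data).
println(mutual_information("AACC", "DDEE"))
println(mutual_information("AACC", "DEDE"))
```

In the first pair of columns each residue in one column determines the residue in the other, so the MI is maximal for two equally frequent states; in the second pair the columns carry no information about each other.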

XSim.jl

Simulate sequence data and complicated pedigree structures

From the package README:

XSim is a fast and user-friendly tool to simulate sequence data and complicated pedigree structures.

Features

  • An efficient CPOS algorithm

  • Using founders that are characterized by real genome sequence data

  • Complicated pedigree structures among descendants

GeneFinder.jl

A Gene Finder framework for Julia.

GeneFinder.jl is a species-agnostic, algorithm extensible, sequence-anonymous (genome, metagenomes) gene finder library framework for the Julia Language.

From the package README:

The GeneFinder package aims to be a versatile module that enables the application of different gene finding algorithms to the BioSequence type, by providing a common interface and a flexible data structure to store the predicted ORFI or genes. The package is designed to be easily extensible, allowing users to implement their own algorithms and integrate them into the framework.

This package is currently under development and is not yet ready for production use. The API is subject to change.

Bio.jl is Deprecated

Note that the Bio.jl package is deprecated. In this blog post, the main developer of Bio.jl describes where its functionality has gone.

This website is a community effort covering a lot of ever-changing information. It will therefore never be complete or without error. If you see something wrong, or have something to contribute, please see the "Contributing" section in the GitHub repository.

Last modified: August 24, 2025. Built with Franklin.jl