Solving Multiple Sequence Alignment (MSA) problems with multi-objective metaheuristics

Solving Multiple Sequence Alignments with Python


Sequoya is an open-source software tool for solving Multiple Sequence Alignment (MSA) problems with multi-objective metaheuristics.

This tool implements a distributed, asynchronous version of the M2Align algorithm described in:

"M2Align: parallel multiple sequence alignment with a multi-objective metaheuristic". Cristian Zambrano-Vega, Antonio J. Nebro, José García-Nieto, José F. Aldana-Montes. Bioinformatics, Volume 33, Issue 19, 1 October 2017, Pages 3011–3017 (DOI).
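
"Multi-objective" means that candidate alignments are compared on several scores at once, so the result is a set of trade-off solutions rather than a single best one. The following is a minimal sketch of that idea, assuming every objective is maximized. The helper names `dominates` and `non_dominated` and the score values are illustrative and are not taken from Sequoya.

```python
def dominates(a, b):
    """Return True if score vector a Pareto-dominates b (all objectives maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (objective 1, objective 2) scores for four candidate alignments.
scores = [(10.0, 85.0), (12.0, 75.0), (9.0, 70.0), (11.0, 80.0)]
front = non_dominated(scores)
print(front)
```

Here `(9.0, 70.0)` is dropped because `(10.0, 85.0)` is better on both objectives; the other three each win on at least one objective and stay in the front.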

Features

  • Score functions:
    • Sum of pairs,
    • Star,
    • Minimum entropy,
    • Percentage of non-gaps,
    • Percentage of totally conserved columns,
    • STRIKE.
  • Algorithms:
    • NSGA-II,
    • Distributed NSGA-II.
  • Crossover operator:
    • Single-point crossover (GapSequenceSolutionSinglePoint).
  • Mutation operators:
    • Shift closest gap group (ShiftClosedGapGroups),
    • Shift gap group (ShiftGapGroup),
    • Random gap insertion (OneRandomGapInsertion),
    • Merge two random adjacent gap groups (TwoRandomAdjacentGapGroup),
    • Multiple mutation (MultipleMSAMutation).
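
To give a feel for what the score functions listed above measure, here is a rough, self-contained sketch of three of them on a toy alignment. It assumes a simple match/mismatch/gap scoring scheme and plain strings with `-` as the gap symbol. The function names and scoring values are illustrative assumptions, not Sequoya's API, and they may not match how the library defines each score.

```python
from itertools import combinations

GAP = "-"


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Add a toy pairwise score over every pair of rows in every column."""
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # gap-gap pairs contribute nothing in this sketch
            if a == GAP or b == GAP:
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def percentage_of_non_gaps(alignment):
    """Share of symbols in the alignment that are not gaps, in percent."""
    symbols = "".join(alignment)
    return 100.0 * sum(ch != GAP for ch in symbols) / len(symbols)


def percentage_of_totally_conserved_columns(alignment):
    """Share of columns holding one identical non-gap symbol, in percent."""
    columns = list(zip(*alignment))
    conserved = sum(1 for col in columns
                    if col[0] != GAP and len(set(col)) == 1)
    return 100.0 * conserved / len(columns)


# Three aligned sequences of equal length (hypothetical data).
alignment = ["AC-GT", "ACAGT", "A--GA"]
print(sum_of_pairs(alignment))
print(percentage_of_non_gaps(alignment))
print(percentage_of_totally_conserved_columns(alignment))
```

Scores like these tend to pull in different directions (fewer gaps versus more conserved columns, for instance), which is why they are optimized together rather than collapsed into one number.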

Install

To download and install Sequoya, clone the Git repository hosted on GitHub:

git clone https://github.com/benhid/Sequoya.git
cd Sequoya
python setup.py install

Or via pip:

pip install Sequoya

Usage

Examples of running Sequoya are located in the examples folder.

Dask distributed

To run Sequoya on a cluster of machines, first set up a network with at least one dask-scheduler node and several dask-worker nodes:

conda create --name dask-cluster
conda activate dask-cluster

pip install git+https://github.com/benhid/Sequoya.git@develop

Then, on the master node run:

dask-scheduler

On each worker node run:

dask-worker <master-ip>:8786 --nprocs <total-cores> --nthreads 1
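
Once the scheduler and workers are up, a Python process can attach to the cluster through `dask.distributed.Client`. The sketch below only checks how many workers have registered. It assumes the scheduler listens on the default port 8786 used above. The helper names and the example IP address are placeholders, and this is not Sequoya's own launcher.

```python
def scheduler_address(host, port=8786):
    """Build the TCP address that dask-worker and Client both connect to."""
    return f"tcp://{host}:{port}"


def count_workers(host, port=8786):
    """Connect to a running scheduler and return the number of workers."""
    # Imported lazily so the address helper works without dask installed.
    from dask.distributed import Client

    with Client(scheduler_address(host, port)) as client:
        # nthreads() maps each worker address to its thread count.
        return len(client.nthreads())


# Placeholder address; replace with the IP of the master node.
print(scheduler_address("10.0.0.1"))
```

Calling `count_workers("10.0.0.1")` against a live cluster should report one entry per worker process started with `dask-worker`.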

Authors

Active development team

License

This project is licensed under the terms of the MIT license; see the LICENSE file for details.
