Sequoya is an open-source software tool for solving Multiple Sequence Alignment problems with multi-objective metaheuristics.
This tool implements a distributed, asynchronous version of the M2Align algorithm described in:
"M2Align: parallel multiple sequence alignment with a multi-objective metaheuristic". Cristian Zambrano-Vega, Antonio J. Nebro, José García-Nieto, José F. Aldana-Montes. Bioinformatics, Volume 33, Issue 19, 1 October 2017, Pages 3011–3017 (DOI).
- Score functions:
  - Sum of pairs,
  - Star,
  - Minimum entropy,
  - Percentage of non-gaps,
  - Percentage of totally conserved columns,
  - STRIKE.
- Algorithm:
  - NSGA-II,
  - Distributed NSGA-II.
- Crossover operator:
  - Single-point crossover (`GapSequenceSolutionSinglePoint`).
- Mutation operators:
  - Shift closest gap group (`ShiftClosedGapGroups`),
  - Shift gap group (`ShiftGapGroup`),
  - Random gap insertion (`OneRandomGapInsertion`),
  - Merge two random adjacent gap groups (`TwoRandomAdjacentGapGroup`),
  - Multiple mutation (`MultipleMSAMutation`).
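To make some of the score functions listed above more concrete, here is a minimal, self-contained sketch of three of them. It is not Sequoya's implementation: the function names and the toy match/mismatch/gap values are assumptions made for illustration, and a real sum-of-pairs score would normally use a substitution matrix instead of fixed values.

```python
from itertools import combinations

GAP = "-"


def sum_of_pairs(alignment, match=1, mismatch=-1, gap_penalty=-2):
    """Add a toy pairwise score for every pair of symbols in every column.

    The scoring values are illustrative assumptions, not Sequoya defaults.
    """
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # gap-gap pairs contribute nothing in this sketch
            if a == GAP or b == GAP:
                total += gap_penalty
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def percentage_of_non_gaps(alignment):
    """Percentage of symbols in the whole alignment that are not gaps."""
    symbols = "".join(alignment)
    return 100.0 * sum(s != GAP for s in symbols) / len(symbols)


def percentage_of_totally_conserved_columns(alignment):
    """Percentage of columns holding the same non-gap symbol in every row."""
    columns = list(zip(*alignment))
    conserved = sum(
        1 for col in columns if col[0] != GAP and len(set(col)) == 1
    )
    return 100.0 * conserved / len(columns)


# Three aligned sequences of equal length (made-up data).
alignment = ["AC-GT", "ACAGT", "A--GT"]

print(sum_of_pairs(alignment))
print(percentage_of_non_gaps(alignment))
print(percentage_of_totally_conserved_columns(alignment))
```

In a multi-objective setting, two or more such scores are optimised at the same time, so a solution is judged by a vector of values rather than a single number.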
To download and install Sequoya, clone the Git repository hosted on GitHub:
git clone https://github.com/benhid/Sequoya.git
cd Sequoya
python setup.py install
Or via pip:
pip install Sequoya
Examples of running Sequoya are located in the `examples` folder.
To run Sequoya on a cluster of machines, first set up a network with at least one `dask-scheduler` node and several `dask-worker` nodes:
conda create --name dask-cluster
conda activate dask-cluster
pip install git+https://github.com/benhid/Sequoya.git@develop
Then, on the master node run:
dask-scheduler
On each slave node run:
dask-worker <master-ip>:8786 --nprocs <total-cores> --nthreads 1
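Once the scheduler and workers are running, candidate alignments can be evaluated asynchronously: each evaluation is submitted as a task and its result is collected as soon as it finishes. The sketch below illustrates that pattern with Python's standard `concurrent.futures` module as a local stand-in for a Dask cluster; in `dask.distributed`, `Client.submit` and `as_completed` play the analogous roles. The `evaluate` objective, the `evaluate_async` helper, and the sample population are hypothetical and are not part of Sequoya.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def evaluate(alignment):
    """Hypothetical objective: fraction of non-gap symbols in the alignment."""
    symbols = "".join(alignment)
    return sum(s != "-" for s in symbols) / len(symbols)


def evaluate_async(population, max_workers=4):
    """Submit every individual for evaluation and gather results as they finish.

    Results may complete in any order, so each future is mapped back to the
    index of the individual it belongs to.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(evaluate, individual): index
            for index, individual in enumerate(population)
        }
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    # Return the scores in the original population order.
    return [results[i] for i in range(len(population))]


# A made-up population of three candidate alignments.
population = [
    ["AC-GT", "ACAGT"],
    ["A--GT", "ACAGT"],
    ["ACGT-", "ACAGT"],
]

scores = evaluate_async(population)
print(scores)
```

Collecting results as they complete, instead of waiting for a whole batch, keeps workers busy when evaluation times vary between individuals.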
This project is licensed under the terms of the MIT License; see the LICENSE file for details.