ReHap - Reconstruct Haplotypes
ReHap (Reconstruct Haplotypes) is a web application that gives users a common
interface to five algorithms for the parental haplotype reconstruction problem
(also known as the Haplotype Assembly Problem).
|
ReHap allows you to:
- generate a matrix of fragments that simulates shotgun sequencing of real
  human haplotypes from the HapMap project;
- run up to five state-of-the-art algorithms for parental haplotype
  reconstruction;
- view the SNP matrix with errors highlighted, together with the original
  haplotype strands and the reconstructed solutions;
- compare the algorithms' solutions with the ground truth;
- upload user data.
|
Please send comments, bug reports, and opinions to:
|
How to use ReHap
|
The ReHap interface is divided into four tabs. The first tab sets all the
parameters of the SNP matrix generator. The second tab provides a simple
interface for inspecting the generated haplotypes and the SNP matrix; implanted
errors are highlighted so that they are easy to identify. The third tab lets
you select the algorithms and set their parameters. The last tab provides a
simple way to compare the results of all the selected algorithms.
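
As an illustration of what comparing a result with the ground truth can involve,
here is a minimal Python sketch. It is not ReHap's own code: the function names
are hypothetical, and the measure (number of mismatching SNP sites, taken over
the better of the two possible strand pairings) is only one common choice. Both
pairings are tried because an algorithm may return the two strands in either
order.

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def reconstruction_errors(true_pair, found_pair):
    """Mismatching sites between the true and the reconstructed haplotype pair.

    Hypothetical helper: the strand order of the reconstructed pair is
    arbitrary, so both pairings are scored and the smaller total is kept.
    """
    (t1, t2), (f1, f2) = true_pair, found_pair
    direct = hamming(t1, f1) + hamming(t2, f2)
    swapped = hamming(t1, f2) + hamming(t2, f1)
    return min(direct, swapped)


if __name__ == "__main__":
    truth = ("0101", "1010")
    solution = ("1010", "0111")   # strands returned in swapped order, one wrong site
    print(reconstruction_errors(truth, solution))
```

Dividing the result by twice the haplotype length gives an error rate that can
be compared across algorithms and haplotype lengths.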
|
The SNP matrix is generated according to several parameters, including the
haplotype length, the error rate, the coverage, and whether to insert gaps in
the matrix. You can select one of the chromosomes of an individual from one of
four HapMap Project populations, or generate the SNP matrix from your own
haplotype data. The generation process simulates shotgun sequencing as
described in CelSim. The default parameters reflect current technology.
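
To make the roles of these parameters concrete, the following Python sketch
builds a small SNP matrix from two haplotype strings. It is a deliberately
simplified model, not the CelSim procedure and not ReHap's generator: the
function name, the fixed fragment length, the uniform choice of start position
and the `gap_rate` parameter are all assumptions made for illustration.

```python
import random


def simulate_fragments(hap_a, hap_b, coverage=5, frag_len=4,
                       error_rate=0.05, gap_rate=0.0, seed=0):
    """Return the SNP matrix as a list of strings, one per fragment.

    Each string has one character per SNP site: '0' or '1' for a read
    allele, '-' where the fragment does not cover the site.
    """
    n = len(hap_a)
    if len(hap_b) != n or not 0 < frag_len <= n:
        raise ValueError("haplotypes must have equal length >= frag_len")
    rng = random.Random(seed)
    # Enough fragments for each site to be covered about `coverage` times.
    n_frags = max(1, (coverage * n) // frag_len)
    rows = []
    for _ in range(n_frags):
        source = rng.choice((hap_a, hap_b))   # each fragment comes from one strand
        start = rng.randrange(0, n - frag_len + 1)
        row = ["-"] * n
        for i in range(start, start + frag_len):
            allele = source[i]
            if rng.random() < error_rate:     # implant a read error
                allele = "1" if allele == "0" else "0"
            if rng.random() < gap_rate:       # optionally drop the site (gap)
                allele = "-"
            row[i] = allele
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    for fragment in simulate_fragments("0101100110", "1010011001"):
        print(fragment)
```

With `error_rate=0` and `gap_rate=0`, every row agrees with one of the two input
haplotypes at all covered sites, which is a convenient sanity check before
adding noise.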
|
ReHap can run five algorithms:
- SpeedHap is an algorithm developed in our labs.
- Fast Hare is a well-known heuristic algorithm for the parental haplotyping
  problem.
- MLF is a randomized algorithm that attempts to minimize the MEC (Minimum
  Error Correction) objective function.
- 2 distance MAC is a clustering algorithm that combines two distance
  definitions to split the fragments into two clusters.
- SHR-3 uses a probabilistic approach to the MFR (Minimum Fragment Removal)
  problem in order to reconstruct the haplotypes.
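
For readers unfamiliar with the MEC objective mentioned above, the Python
sketch below computes it for a given pair of candidate haplotypes: each
fragment is assigned to the strand it disagrees with least, and the
disagreements are summed. The function names and the use of '-' for uncovered
sites are assumptions made for illustration; this only evaluates the objective
and is not an implementation of MLF or of any other algorithm listed here.

```python
def mismatches(fragment, haplotype):
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(matrix, hap_a, hap_b):
    """Minimum Error Correction cost of a haplotype pair for a SNP matrix.

    Every fragment is charged the number of entries that would have to be
    corrected for it to match the closer of the two haplotypes.
    """
    return sum(min(mismatches(row, hap_a), mismatches(row, hap_b))
               for row in matrix)


if __name__ == "__main__":
    matrix = ["01--", "-10-", "10--", "--11"]
    print(mec_score(matrix, "0101", "1010"))
```

In this small example only the last fragment conflicts with both strands, at a
single site, so the cost is 1.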
|
References:
|
Speedhap: An accurate
heuristic for the single individual SNP haplotyping problem with many
gaps, high reading error rate and low coverage
L.M. Genovese, F. Geraci, and M. Pellegrini
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2008
|
Linear Time Probabilistic Algorithms for the Singular Haplotype Reconstruction Problem from SNP Fragments
Zhixiang Chen and Bin Fu and Robert T. Schweller and Boting Yang and Zhiyu Zhao and Binhai Zhu
APBC - Imperial College Press, 2008
|
A clustering algorithm based on two distance functions for MEC model
Y. Wang, E. Feng, and R. Wang
Computational Biology and Chemistry, 2007
|
Haplotype assembly from aligned weighted SNP fragments
Y.Y. Zhao, L.Y. Wu, J.H. Zhang, R.S. Wang, and X.S. Zhang
Computational Biology and Chemistry, 2005
|
Fast hare: A fast heuristic for single individual SNP haplotype reconstruction
A. Panconesi and M. Sozio
Proc. WABI, 2004
|
A dataset generator for whole genome shotgun sequencing
G. Myers
International Conference on Intelligent Systems for Molecular Biology, 1999
|
|
Styled and maintained by:
M. Elena Renda
|