Symbiose seminars

  • Fast and Accurate RNA-Seq read alignments with PALMapper

    Géraldine Jean (LINA, Université de Nantes)
    Thursday, May 10, 2012 - 10:30
    Room Minquiers
    Talk abstract: 

    High-throughput sequencing of mRNA enhances transcriptome analysis and offers great opportunities for the discovery of new genes and the identification of alternative transcripts. However, the sheer amount of high-throughput sequencing data requires efficient methods for accurate spliced alignment of reads against the reference genome, a task made harder by the limited length and quality of the sequence reads. In this talk, I will present an original RNA-Seq read mapper, called PALMapper, that combines a faster extension of the highly accurate alignment method QPALMA with the fast short-read aligner GenomeMapper. PALMapper quickly carries out an initial read mapping, which then guides a banded semi-global alignment algorithm that allows for long gaps corresponding to introns. It computes both spliced and unspliced alignments with high accuracy by taking advantage of base quality information and computational splice site predictions, brought together in an extended alignment scoring model.
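
    As a rough illustration of the banded semi-global step mentioned in the abstract, the sketch below aligns a whole read against a reference window while only filling cells near a seed diagonal. It is not PALMapper's implementation: the function name, the scoring values and the `diag`/`band` parameters are assumptions made for this example, and it deliberately leaves out intron-length gaps, base qualities and splice site scores.

```python
def banded_semiglobal(read, ref, diag, band, match=1, mismatch=-1, gap=-2):
    """Best score for aligning all of `read` to some substring of `ref`.

    Only cells (i, j) with |(j - i) - diag| <= band are filled, where `diag`
    is the offset suggested by an initial seed hit (hypothetical parameter).
    Leading and trailing reference bases are free (semi-global).
    """
    neg = float("-inf")
    n, m = len(read), len(ref)
    # Row 0: the read may start anywhere inside the band at no cost.
    prev = [0 if abs(j - diag) <= band else neg for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [neg] * (m + 1)
        lo = max(0, i + diag - band)
        hi = min(m, i + diag + band)
        for j in range(lo, hi + 1):
            best = prev[j] + gap  # read base aligned to a gap
            if j > 0:
                sub = match if read[i - 1] == ref[j - 1] else mismatch
                best = max(best, prev[j - 1] + sub, cur[j - 1] + gap)
            cur[j] = best
        prev = cur
    # The read may end anywhere: take the best cell of the last row.
    return max(prev)


exact = banded_semiglobal("ACGT", "TTACGTTT", diag=2, band=1)
one_mismatch = banded_semiglobal("ACCT", "TTACGTTT", diag=2, band=1)
print(exact, one_mismatch)
```

    Restricting the fill to a band of width 2 * band + 1 keeps the cost roughly linear in the read length, which is the usual motivation for seeding with a fast mapper before running a more careful alignment.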

  • LNA: Fast Protein Classification Using A Laplacian Characterization of Tertiary Structure

    Nicolas Bonnel (Université Bretagne Sud)
    Thursday, May 3, 2012 - 10:30
    Room Aurigny
    Talk abstract: 

    Over the last two decades, many protein 3D structures have been discovered, characterized and made available through the Protein Data Bank (PDB), which continues to grow very quickly. New scalable methods are thus urgently required to search through the PDB efficiently. We present an approach called LNA (Laplacian Norm Alignment) that performs structural comparison of two proteins with dynamic programming algorithms. This is achieved by characterizing each residue in the protein with scalar features. The feature values are calculated using a Laplacian operator applied to the graph corresponding to the adjacency matrix of the residues. The weighted Laplacian operator we use estimates, at various scales, the local deformations of the topology where each residue is located. On several benchmarks widely used by the community, we obtain results qualitatively similar to those of competing approaches, but with an algorithm one to two orders of magnitude faster. 180,000 protein comparisons can be done within 1 second on a single recent GPU, which makes our algorithm very scalable and suitable for real-time database querying across the Web.
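
    The general idea of "one scalar per residue, then dynamic programming" can be sketched as follows. This is only an illustration under stated assumptions, not the LNA method itself: the contact cutoff, the use of a degree-normalized graph Laplacian applied to the coordinates, the similarity function and both function names are choices made for this example, and the multi-scale weighting described in the abstract is omitted.

```python
import math


def laplacian_norm_features(coords, cutoff=7.0):
    """One scalar per residue: norm of the normalized graph Laplacian
    applied to the coordinates, i.e. the distance between a residue and
    the centroid of its neighbours within `cutoff` (assumed value)."""
    feats = []
    for i, xi in enumerate(coords):
        nbrs = [xj for j, xj in enumerate(coords)
                if j != i and math.dist(xi, xj) <= cutoff]
        if not nbrs:
            feats.append(0.0)  # isolated residue: no local deformation signal
            continue
        centroid = [sum(x[k] for x in nbrs) / len(nbrs) for k in range(3)]
        feats.append(math.dist(xi, centroid))
    return feats


def align_score(fa, fb, gap=-1.0):
    """Global alignment score of two feature sequences (Needleman-Wunsch),
    with similarity 1 - |difference| as an assumed scoring function."""
    prev = [j * gap for j in range(len(fb) + 1)]
    for i in range(1, len(fa) + 1):
        cur = [i * gap]
        for j in range(1, len(fb) + 1):
            sim = 1.0 - abs(fa[i - 1] - fb[j - 1])
            cur.append(max(prev[j - 1] + sim, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Three residues on a straight line, 3.8 apart (hypothetical coordinates).
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
features = laplacian_norm_features(chain, cutoff=4.0)
self_score = align_score(features, features)
print(features, self_score)
```

    Because each protein is reduced to a one-dimensional sequence of numbers, comparing two structures costs one small dynamic programming table, which is the kind of workload that parallelizes well across many pairs.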

  • Human metagenomics: clinical impacts

    Nicolas Pons (INRA Jouy en Josas)
    Thursday, April 26, 2012 - 10:30
    Room Aurigny
    Talk abstract: 

    Human metagenomics aims to characterize the associations between microbial species and genes and human phenotypes, in order to develop diagnostic and prognostic tools as well as approaches for modulating microbial populations, with the goal of optimizing each individual's health. Metagenomic studies have been made easier in recent years by the development of very-high-throughput sequencing and screening technologies. This seminar will present the four main branches of metagenomics: functional metagenomics, phylogenetic metagenomics, so-called "whole sequencing" metagenomics, and quantitative metagenomics. Particular attention will be given to the last two, with a detailed illustration of the latest results obtained in the MicroObese and MetaHIT projects, which aim in particular to identify associations between microbial populations and obesity.