Symbiose seminars

  • The usefulness of Fibonacci codes

    Tomi Klein (Bar Ilan University)
    Thursday, December 8, 2016 - 10:30
    Room Aurigny
    Talk abstract: 

    The talk reviews several properties and applications of Fibonacci codes. These are fixed codeword sets that use binary representations of integers based on the Fibonacci sequence rather than on powers of 2. Applications range from robust data compression and faster modular exponentiation to boosting the compression performance of rewriting codes for flash memory. No previous knowledge is assumed.
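
    To make the idea of a Fibonacci-based binary representation concrete, here is a minimal Python sketch. It assumes the standard Fibonacci code built on the Zeckendorf representation, in which a positive integer is written as a sum of non-consecutive Fibonacci numbers (1, 2, 3, 5, 8, ...), least significant position first, followed by a terminating 1. The function names fib_encode and fib_decode are illustrative and do not come from the talk.

```python
def fib_encode(n):
    """Return the Fibonacci codeword (a string of 0s and 1s) of a positive integer."""
    if n < 1:
        raise ValueError("Fibonacci codes are defined for positive integers")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number that exceeds n
    bits = []
    for f in reversed(fibs):  # greedy choice, largest Fibonacci number first
        if f <= n:
            bits.append("1")
            n -= f
        else:
            bits.append("0")
    # Emit the least significant position first, then the terminating 1.
    return "".join(reversed(bits)) + "1"


def fib_decode(stream):
    """Decode a concatenation of Fibonacci codewords into a list of integers."""
    values = []
    total, prev = 0, "0"
    a, b = 1, 2  # Fibonacci weight of the current position and of the next one
    for bit in stream:
        if bit == "1" and prev == "1":
            # Two adjacent 1s mark the end of a codeword.
            values.append(total)
            total, prev, a, b = 0, "0", 1, 2
            continue
        if bit == "1":
            total += a
        a, b = b, a + b
        prev = bit
    return values


print(fib_encode(11))                      # → 001011  (11 = 3 + 8)
print(fib_decode("11" + "011" + "001011"))  # → [1, 2, 11]
```

    Because the pattern 11 occurs only at the end of a codeword, a decoder can find the next codeword boundary again after a corrupted bit, which is one commonly cited reason why such codes are regarded as robust.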

  • De novo assembly of bacterial genomes from the seed microbiome.

    Matthieu Barret (INRA Angers)
    Thursday, November 24, 2016 - 10:30
    Room Aurigny
    Talk abstract: 

    Seeds are involved in the vertical transmission of microorganisms from one plant generation to another and consequently act as reservoirs for the plant microbiome. However, little is known about the structure of seed-associated microbial assemblages and the regulators of assemblage structure. In this talk, I will present recent data obtained by our research group on the taxonomic and functional composition of the seed microbiome. 

  • Inventory and diversity of the potential genomic targets of SETMAR isoforms

    Yves Bigot (INRA-CNRS Tours)
    Thursday, November 17, 2016 - 10:30
    Room Aurigny
    Talk abstract: 

    The Setmar gene is a neogene derived from the fusion of two genes, the first encoding a SET domain and the second an Hsmar1 transposase. Depending on the cancer cell line, 11 protein isoforms ranging in size from 40 to 78 kDa are expressed from this gene. Various studies have focused on the functions of the largest isoform (78 kDa), in particular those carried by the SET domain. These studies show its potential involvement in DNA repair, chromatin decondensation, and the methylation of certain lysine residues of histone H3 and of the splicing factor SnRNP70.

    Apart from its cleavage and strand-transfer activities, little or no work has addressed the functions of the HSMAR1 part of SETMAR in vivo. Yet the DNA-binding domain of HSMAR1 is fully functional and should, in vivo, at least be able to bind the ITRs of Hsmar1 elements and of its MITE MADE1 in cells that express SETMAR isoforms.

    Before addressing the question of how SETMAR binds in vivo, we were confronted with the poor quality of the RepeatMasker (RM) annotation of Hsmar1 and MADE1 elements in the human genome. A new inventory was produced using three tools (RM, logol, and the latest version of BLAST) and a validation process that combines a bioinformatics approach with ChIP-seq data obtained from cells expressing different isoforms of the Setmar gene. Our study concludes that the Setmar isoforms containing the HSMAR1 domain bind DNA specifically, with binding modes that partly differ from those observed in vitro.

  • High sensitivity sequence identification in metagenomics. Application to sequence space exploration of viruses in extreme environment.

    Clovis Galiez
    Thursday, November 10, 2016 - 10:30
    Room Métivier
    Talk abstract: 

    It is now common to deal with millions of protein sequences coming from metagenomic projects. A sizable fraction of them is usually left unannotated, up to 70% in the case of viruses. High-sensitivity sequence search methods for remote homologs (PSI-BLAST, HHblits) cannot cope with the amount of data generated by metagenomics.
    We will present, in the context of building a catalog of viral protein sequences, new tools that overcome this limitation. Specifically, we clustered UniProt down to 30% sequence identity, which allows us to generate rich sequence profiles for identifying previously unknown sequences. Moreover, the tool used for this clustering, MMseqs2, can search and cluster large sequence databases with high sensitivity at very high speed (a 270x speedup over PSI-BLAST at the same sensitivity). We will introduce the current method as well as future improvements, namely profile-profile comparison and linear-time clustering, which will improve sensitivity and speed, respectively.

  • Sequence similarity networks, a pragmatic tool to explore processes acting on the microbial dark matter

    Lucie Bittner (Université Pierre et Marie Curie - Paris)
    Thursday, October 6, 2016 - 10:30
    Room Minquiers
    Talk abstract: 

    Sequence similarity networks (SSN) are extremely useful for biologists because, in addition to allowing a user-friendly visualization of the genetic diversity in huge high-throughput sequencing data sets, they can be studied analytically and statistically using graph topology metrics. SSN have recently been adapted to address an increasing number of biological questions investigating both patterns and processes, e.g. population structuring, genome heterogeneity, microbial complexity and evolution, microbiome adaptation, or the exploration of the microbial dark matter. In metagenomic microbial studies, SSN indeed offer an alternative to classical and potentially biased methods, and thus facilitate large-scale analyses and hypothesis generation, while notably including unknown/dark matter sequences in the global analysis. During this seminar, I will present concrete examples of microbial dark matter mining using SSN developed in my research team: (i) population structure of uncultivated microbes in the global ocean, (ii) search for ecological/life-trait biomarkers in massive functionally unannotated datasets.
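
    The general construction of such a network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: real SSN are usually built from alignment-based similarity scores, whereas this sketch uses the Jaccard index of shared 3-mers as a stand-in, and the sequences, names, and threshold are made up for the example.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Set of all k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(seqs, threshold, k=3):
    """Adjacency sets: two sequences are linked if their similarity >= threshold."""
    profiles = {name: kmer_set(s, k) for name, s in seqs.items()}
    adj = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if jaccard(profiles[u], profiles[v]) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def connected_components(adj):
    """Connected components of the network, found by depth-first search."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(comp)
    return components


# Hypothetical toy sequences; s5 shares no 3-mer with the others.
seqs = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTTC",
    "s3": "GGGCCCGGGC",
    "s4": "GGGCCCGGGA",
    "s5": "TTTTAAAATT",
}
network = build_network(seqs, threshold=0.5)
components = sorted(sorted(c) for c in connected_components(network))
print(components)  # → [['s1', 's2'], ['s3', 's4'], ['s5']]
```

    Sequences without any annotation (here s5) stay in the graph as isolated nodes or as members of components, which is how unknown sequences remain part of the global analysis. Topology metrics such as node degree or component size can then be computed directly on the adjacency sets.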


  • The Common Workflow Language v1.0: ready for production, ready for the future

    Michael R. Crusoe (Community Engineer and co-founder, the Common Workflow Language project)
    Thursday, September 29, 2016 - 10:30
    Room Markov
    Talk abstract: 

    The Common Workflow Language project produces standards that describe POSIX command line tools and workflows made from them. The project also has a reference implementation and a growing community repository of CWL descriptions with an emphasis on bioinformatics. This talk will review the capabilities of the 1.0 release of the CWL standards and sketch out where the speaker sees these standards and their community going over the next few years. Topics will include using and propagating metadata, models for community development and maintenance of workflows, and prospects for workflow conversion. The speaker is very interested in learning about related research efforts; time will be allotted for discussion.



  • A free-boundary problem for protrusion formation at the cell scale

    Clair Poignard (Inria Bordeaux)
    Thursday, September 22, 2016 - 10:30
    Room Aurigny
    Talk abstract: 

    We present a two-phase free-boundary model for the formation of protrusions at the cell scale. The cell membrane is described by a level-set function whose motion is driven by the gradient of a chemical signal. The model couples, across the membrane, an internal phase (the chemical signal) and an external phase (the degradation of the extracellular matrix). The velocity of the free boundary is proportional to the gradient of the internal signal, which raises both theoretical and numerical difficulties. Although simplistic, our model has advantages over existing ones, notably because the speed of protrusion formation is not prescribed a priori but results from the model. A long-term goal would be to enrich this model with the cascade of intracellular chemical reactions that generates the signal.

  • TBA

    Marc Cuggia and Guillaume Bouzillé (Université Rennes 1)
    Thursday, September 8, 2016 - 10:30
    Room Aurigny
    Talk abstract: 


  • BWT-based indexing structure for metagenomic classification

    Karel Brinda (Université de Marne-la-Vallée)
    Thursday, July 7, 2016 - 10:30
    Room Aurigny
    Talk abstract: 

    Metagenomics is a powerful approach to studying the genetic content of environmental samples, and it has been strongly promoted by NGS technologies. One of its main tasks is the assignment of the reads of a metagenome to taxonomic units, and the subsequent abundance estimation. Most recently developed programs for this task (such as LMAT, KRAKEN, KALLISTO) perform the assignment based on k-mers shared between reads and references. In such an approach, two major algorithmic subproblems can be distinguished: designing a k-mer index for a huge database of reference genomes and a given taxonomic tree, and designing an algorithm for assigning reads to taxonomic units from information on shared k-mers. In this talk, we consider the problem of index design and present a novel data structure that provides the full list of genomes containing a queried k-mer. The structure is based on a BWT index applied to sequences encoding the k-mers specific to each node of the taxonomic tree. We analyse the usefulness of this index and evaluate it in terms of speed and memory requirements.
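
    The query supported by such an index (which genomes contain a given k-mer?) and its use for read assignment can be illustrated with a small Python sketch. Note that this is not the BWT-based structure of the talk: it swaps in a plain dictionary from k-mers to genome names, ignores the taxonomic tree, and uses made-up genome names and sequences, so it only shows the interface, not the memory behaviour.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every k-mer of a sequence, in order."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(genomes, k):
    """Map each k-mer to the set of genome names that contain it."""
    index = defaultdict(set)
    for name, seq in genomes.items():
        for km in kmers(seq, k):
            index[km].add(name)
    return index


def genomes_with(index, kmer):
    """Full (sorted) list of genomes containing the queried k-mer."""
    return sorted(index.get(kmer, ()))


def count_hits(index, read, k):
    """For each genome, count how many k-mers of the read it shares."""
    hits = Counter()
    for km in kmers(read, k):
        for name in index.get(km, ()):
            hits[name] += 1
    return hits


# Hypothetical toy reference genomes.
genomes = {
    "genome_A": "ACGTACGTGA",
    "genome_B": "TTGACGTGAC",
    "genome_C": "GGGCCCAAAT",
}
K = 4
index = build_index(genomes, K)

print(genomes_with(index, "CGTG"))  # → ['genome_A', 'genome_B']

hits = count_hits(index, "GACGTGAC", K)
print(hits.most_common(1)[0][0])    # → genome_B
```

    A dictionary like this stores every k-mer explicitly and therefore grows quickly with the size of the reference database, which is the kind of cost that a compressed, BWT-based index is meant to reduce.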

  • Parallel Shortest-Path Queries in Planar Graphs

    Hristo N. Djidjev (Los Alamos National Laboratory)
    Thursday, June 23, 2016 - 10:30
    Room Aurigny
    Talk abstract: 

    The query version of the shortest path problem allows the user to precompute information in the form of a data structure that subsequently makes it possible, given any pair of vertices, to compute the shortest path between them very quickly. It has been used in route-planning services such as Google Maps. We develop several parallel algorithms for shortest path queries in planar graphs that use graph partitioning in the preprocessing phase to precompute and store distances between selected pairs of vertices. In the query phase, given a pair of arbitrary vertices v and w, the stored information is used to find the distance between them. The algorithms are implemented and tested on a high-performance cluster with up to 256 16-core CPUs, and their performance is analyzed and compared.
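
    The preprocessing/query split described above can be sketched as follows. This is a simplified, sequential illustration and not the speaker's algorithms: it assumes an undirected graph with non-negative weights and a given partition into regions, stores the distances from every boundary vertex to all vertices (more than a practical scheme would keep), and all function names are hypothetical.

```python
import heapq
from collections import defaultdict


def dijkstra(graph, source, allowed=None):
    """Distances from source; if `allowed` is given, never leave that vertex set."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u]:
            if allowed is not None and v not in allowed:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def preprocess(graph, region_of):
    """Find the boundary vertices of each region and store distances from them."""
    regions = defaultdict(set)
    for v, r in region_of.items():
        regions[r].add(v)
    # A boundary vertex has at least one neighbour in another region.
    boundary = {
        r: {u for u in members if any(region_of[v] != r for v, _ in graph[u])}
        for r, members in regions.items()
    }
    from_boundary = {b: dijkstra(graph, b) for bs in boundary.values() for b in bs}
    return regions, boundary, from_boundary


def query(graph, region_of, index, s, t):
    """Distance between s and t using the stored boundary distances."""
    regions, boundary, from_boundary = index
    inf = float("inf")
    best = inf
    if region_of[s] == region_of[t]:
        # The shortest path may stay inside the common region.
        best = dijkstra(graph, s, allowed=regions[region_of[s]]).get(t, inf)
    # Any path that leaves the region of s passes through one of its boundary vertices.
    for b in boundary[region_of[s]]:
        d = from_boundary[b]
        best = min(best, d.get(s, inf) + d.get(t, inf))
    return best


def grid_graph(n):
    """n x n grid with unit edge weights, a simple planar test graph."""
    graph = defaultdict(list)
    for x in range(n):
        for y in range(n):
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < n and ny < n:
                    graph[(x, y)].append(((nx, ny), 1.0))
                    graph[(nx, ny)].append(((x, y), 1.0))
    return graph


g = grid_graph(4)
region_of = {v: (0 if v[0] < 2 else 1) for v in g}  # split into left and right halves
index = preprocess(g, region_of)
print(query(g, region_of, index, (0, 0), (3, 3)))  # → 6.0
```

    In this sketch a query only runs a search restricted to one region plus a scan over that region's boundary vertices. The per-boundary-vertex searches in the preprocessing step are independent of each other, which is the kind of work that lends itself to parallel execution.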