OK, more updates 2017

Well, the last post was about updates, but it was more about 2016 than 2017. Here are a couple of graphs for your delight, one before filtering and the other after, showing the counts of prokaryotic genomes in each NCBI category as of July 2017.

The four categories are: Complete, Chromosome, Scaffold, and Contig. My filtering used to remove genomes with redundant TaxIDs, but I learned that relying on TaxIDs wasn’t a good idea. Now I filter only by strain, substrain, etc., as provided by the NCBI list of features. Not perfect, but I seem to keep most genomes.
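A minimal sketch of this kind of strain-level filtering, not the actual script behind the graphs. The record fields (`organism`, `strain`, `substrain`, `level`) and the toy entries are hypothetical, and keeping the most complete assembly per strain is an assumption made here for illustration.

```python
from collections import Counter

# Hypothetical toy records; the field names are assumptions, not NCBI's own.
GENOMES = [
    {"organism": "Escherichia coli", "strain": "K-12", "substrain": "MG1655", "level": "Complete"},
    {"organism": "Escherichia coli", "strain": "K-12", "substrain": "MG1655", "level": "Contig"},
    {"organism": "Escherichia coli", "strain": "K-12", "substrain": "W3110", "level": "Complete"},
    {"organism": "Bacillus subtilis", "strain": "168", "substrain": "", "level": "Scaffold"},
    {"organism": "Bacillus subtilis", "strain": "168", "substrain": "", "level": "Chromosome"},
]

# Assumed order of completeness: lower rank means more complete.
LEVEL_RANK = {"Complete": 0, "Chromosome": 1, "Scaffold": 2, "Contig": 3}


def strain_key(rec):
    """Identify a genome by organism, strain, and substrain (case-insensitive)."""
    return (
        rec["organism"].lower(),
        rec.get("strain", "").lower(),
        rec.get("substrain", "").lower(),
    )


def filter_by_strain(records):
    """Keep one genome per strain key, preferring the most complete assembly."""
    best = {}
    for rec in records:
        key = strain_key(rec)
        kept = best.get(key)
        if kept is None or LEVEL_RANK[rec["level"]] < LEVEL_RANK[kept["level"]]:
            best[key] = rec
    return list(best.values())


def count_by_level(records):
    """Count genomes in each category."""
    return Counter(rec["level"] for rec in records)


if __name__ == "__main__":
    print("before:", dict(count_by_level(GENOMES)))
    print("after: ", dict(count_by_level(filter_by_strain(GENOMES))))
```

Counting once on the full list and once on the filtered list gives the two sets of numbers that a before/after pair of graphs would show.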



Updates 2017

You might already know, but if you didn’t, NCBI changed the organization of its genome database. They used to have a BACTERIA directory containing all the complete genomes (with a few caveats), and a DRAFT_BACTERIA directory containing, well, draft genomes. Today, the genomes are scattered and organized somewhat taxonomically, so you have to look at some files to figure out if the genomes are drafty or not so drafty. Now they have four categories: Complete, Chromosome, Scaffold, and Contig. I think that’s the order of completeness. I’m still not sure how Chromosome differs from Complete, but I suspect that’s what used to be the caveats (maybe only one replicon, of many, was sequenced). Anyway, last December I finished some BLASTP comparisons of a Complete genomes dataset that I downloaded in August 2016. The dataset contains 4085 complete prokaryotic genomes (I eliminated genomes from the same strain or with the same TaxID). Updates are thus starting to appear in the data I offer through this web site and my server at Laurier. Check frequently if you need newer data than what you found previously.

Growth of genome data at NCBI

Happy new year!

Half Sabbatical 2015!

I spent four great months working with Milton Saier at UCSD. Milton built a very useful database on transporter proteins, the Transporter Classification Database (TCDB), and his lab has developed several pieces of software to explore and analyze the database, looking for such things as homologs that have diverged beyond the limits of detection by common sequence comparison tools. It was my privilege and honor to help Milton’s lab update and improve some of these tools, and develop a couple of new ones. The tasks also gave me a lesson about sharing software, no matter how complex or simple.

In any event, I’m still working on some specific projects that we started during my visit, and I feel full of new ideas, for example about detection of protein domains. I expect that these ideas will complement work that’s been going on in my lab on assignment of functions to homologs with highly divergent sequences.

In short, this was a sabbatical as they should be. I learned a lot and got inspiration for new projects that I would have never thought about before this visit.


The Latin American bioinformatics force


The Latin-American conSequences force

Since Julie was leaving on Saturday, those present in the lab last Thursday had lunch together.

Julie is a PhD student co-supervised by me and Dr. Santoyo. She came from Mexico for a few months to learn some bioinformatics that she will apply to her PhD project on the rhizospheric microbiome associated with a few crops.

See ya later Julie!

Three-minute thesis 2014

Today, Scott Dobson-Mitchell was the runner-up at the Three-Minute Thesis competition (3MT) at Laurier.

Marc is done with the M.Sc.!

Marc defended his thesis last Wednesday (Oct 30). All is well. Some corrections to make, but that’s that. Anyway, the photo shows the undergrad force of the lab of Computational conSequences (Brigitte, Erum, and Thomas), plus Marc. Taken that very day.

Congrats Marc!

Visitors this summer

We have two visitors this summer to the lab of Computational conSequences:

  1. Karla Valenzuela, originally from Chile, working on her master’s at Dalhousie in Halifax (NS, Canada). Karla is doing some analyses I was always curious to do: evolutionary trace analyses, plus a few other, related, thingies. This means going back to my structural biology roots.
  2. Ismael Hernández, originally from Mexico, working on his PhD at CINVESTAV-Irapuato. Ismael is analyzing several strains of Bacillus isolated from Cuatro Ciénegas, Coahuila, Mexico.

We are talking a lot in Spanish, which is inspiring the Canadian students in the lab to keep learning the language. Of course, we have had to explain differences between Chilean Spanish and Mexican Spanish, and it’s been fun.

