Genomics

High-throughput pyro-sequencers for the price of a PCR machine in three years

I knew that sequencers are getting cheaper all the time, but in Genome Technology this week they're talking with the inventors of the technology that 454 is licensing, discussing future pyrosequencing updates and how those should lead to very cheap machines. The prospect is that any lab could sequence its own genome in three years. The technology seems almost ready: basically a cheaper and smaller version of 454's current machines. If you believe that sequence databases are exploding at the moment, better prepare for a new wave.


A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes, from PN

Based on a comparison of different sequencing strategies for six small marine microbial genomes, the paper evaluates the utility and cost-effectiveness of a hybrid approach that combines 3730xl Sanger sequencing with 454 runs to generate higher-quality, lower-cost assemblies than current Sanger-only strategies. For genomes larger than 3 Mb with many sequencing gaps and hard stops, 5.3X Sanger sequencing plus two 454 runs is the best choice.
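To get a feel for what "5.3X Sanger plus two 454 runs" adds up to, here is a rough back-of-the-envelope sketch. Fold coverage is simply total sequenced bases divided by genome size. The genome size and, in particular, the yield per 454 run (`ASSUMED_BASES_PER_RUN`) are placeholder assumptions of mine, not figures taken from the paper.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average fold coverage: total sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def hybrid_coverage(genome_size: float, sanger_fold: float,
                    runs_454: int, bases_per_454_run: float) -> float:
    """Combined fold coverage from Sanger reads plus a number of 454 runs."""
    pyro_fold = fold_coverage(runs_454 * bases_per_454_run, genome_size)
    return sanger_fold + pyro_fold


# Hypothetical inputs: a 3 Mb genome and an ASSUMED 20 Mb of usable
# bases per 454 run (placeholder value, not from the paper).
GENOME_SIZE = 3_000_000
ASSUMED_BASES_PER_RUN = 20_000_000

total = hybrid_coverage(GENOME_SIZE, sanger_fold=5.3, runs_454=2,
                        bases_per_454_run=ASSUMED_BASES_PER_RUN)
print(f"Combined coverage: {total:.1f}X")
```

Raw coverage says nothing about gaps or hard stops, of course, which is where the paper argues the two read types complement each other.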

Proc Natl Acad Sci U S A. 2006 Jul 13; [Epub ahead of print]

A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes.


De Novo Identification of Repeat Families in Large Genomes

Alkes Price

The talk is "De Novo Identification of Repeat Families in Large Genomes", presented by Alkes Price. The slides are available here. A repeat family is a collection of similar sequences that appear many times in the genome, e.g. Alu repeats. The obvious recipe: pull out the Alu sequences, align them, build a consensus. But we don't know the regions, we don't know the boundaries, and repeats don't always appear as full copies, often only as partial ones. Eddy concludes that the problem is messy.

Why do this? Repeats are biologically meaningful: genome rearrangements, drivers of evolution, etc. For pragmatic reasons we need repeat masking. Why? To do comparative genomics. You need to mask repeats before alignment, and RepeatMasker is effective only if you know the library of repeats. So how do you identify the repeat families in large genomes?
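To make the masking idea concrete, here is a toy sketch. It is not the method from the talk. It just counts exact k-mers and soft-masks (lower-cases) every base covered by a k-mer that occurs at least `min_count` times. The function names, the toy sequence and the thresholds are all made up for illustration. Real repeat copies are diverged and partial, which is exactly why the real problem is messy.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def soft_mask_repeats(sequence: str, k: int, min_count: int) -> str:
    """Lower-case every base covered by a k-mer seen at least min_count times."""
    sequence = sequence.upper()
    counts = count_kmers(sequence, k)
    masked = list(sequence)
    for i in range(len(sequence) - k + 1):
        if counts[sequence[i:i + k]] >= min_count:
            for j in range(i, i + k):
                masked[j] = masked[j].lower()
    return "".join(masked)


# Toy input: the hypothetical element "GATTACA" occurs three times,
# separated by short stretches of unique sequence.
genome = "CCGATTACATTGGATTACAGCAGATTACATG"
print(soft_mask_repeats(genome, k=7, min_count=3))
```

Exact matching only works here because the three copies are identical. One mismatch in a copy and it would slip through unmasked.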


Key note - Pavel Pevzner

I'm now in the main hall for the final keynote presentation. The hall is packed with people, and currently there is a promotional video for Brazil, which is where ISMB2006 will be held. I'm definitely requesting travel funds for that, but then again, I'll have to travel with the boss... We have the governor of Michigan here apparently, Jennifer Granholm. Will she appear? She's here, she talked, we discovered that Michigan is shaped like a hand.

Pavel Pevzner's keynote has just started. He's talking on the topic of genome rearrangements.


Neil's bioinformatics paper of the month

As promised, it's April 1st and here's my publication pick. Published in February, so not exactly paper of the month, but not to worry. "Serendipitous discovery of Wolbachia genomes in multiple Drosophila species" by Salzberg et al. is in the open access journal Genome Biology, abstract here and full access here. Hit "read more" for the details.


MILANO - Microarray Literature-based Annotation

An article describing MILANO is now published in BMC Bioinformatics: http://www.biomedcentral.com/1471-2105/6/12/
MILANO (http://milano.md.huji.ac.il) is a web-based tool for automatic literature searches on lists of genes. It helps identify the significant genes in a list, e.g. the upregulated genes from a microarray experiment, by cross-searching the genes with user-provided terms.
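The cross-searching idea is easy to sketch, with the caveat that this is not MILANO's implementation: it is a minimal stand-in that counts, for every gene/term pair, how many texts in a small in-memory list mention both. The function name `cross_search`, the gene names, the terms and the "abstracts" are all hypothetical, and the matching is a naive case-insensitive substring test.

```python
def cross_search(genes, terms, abstracts):
    """Count, for each (gene, term) pair, the abstracts that mention both."""
    lowered = [text.lower() for text in abstracts]
    hits = {}
    for gene in genes:
        for term in terms:
            hits[(gene, term)] = sum(
                1 for text in lowered
                if gene.lower() in text and term.lower() in text
            )
    return hits


# Hypothetical gene list, search terms and abstracts, for illustration only.
genes = ["geneA", "geneB", "geneC"]
terms = ["apoptosis", "hypoxia"]
abstracts = [
    "Expression of geneA was measured during apoptosis.",
    "geneB was profiled in apoptosis under hypoxia.",
    "Both geneA and geneB appear in this apoptosis screen.",
]

for (gene, term), count in sorted(cross_search(genes, terms, abstracts).items()):
    print(f"{gene:6s} {term:10s} {count}")
```

Genes with no co-mentions at all (geneC here) are the ones a tool like this would let you deprioritise when working through a long microarray hit list.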


Human genome hits halfway mark

Confused? Well, there was the "first draft genome" (2000), then there was the "almost finished genome" (2003) and now they are filling in the gaps. It's all explained admirably here.


RIP Francis Crick

I guess we can't let the news of Francis Crick's passing go unmentioned. OK, it's mentioned.


Rat genome sequenced

Rat genome sequenced.

More correctly, around 90% has been assembled into a final draft; the remaining 10% is considered "insignificant". This seems to be the pattern with eukaryotic genomes - I wonder why? I've heard various explanations, from "not worth the effort/expense to get the last bit" to "some regions can't be sequenced". All those nasty introns and telomeres - give me a nice microbial genome any day.


GeneSweep

GenomeWeb: Birney Announces Winners of Wager on Number of Human Genes - "GeneSweep, a friendly wager in which participants bet on the number of genes on the human genome, is officially over, and there are three winners -- although genome scientists have not definitively agreed on a number of genes in the human genome." "Birney said that there were 24,500 genes in the latest Ensembl build, human build 33. But he stopped short of saying that this number represents any final tally on the number of human genes. 'We are confident we have about 21,000 genes in the human [Snowdeal: Bioinformatics]

