Transcriptome sequencing in E. necator:
In collaboration with Lance Cadle-Davidson, USDA-ARS, Grape Genetics Research Unit, Geneva, NY, we have begun a project to sequence the transcriptome of E. necator. The aim of this project is to develop tools for expression profiling and to discover single-nucleotide polymorphisms (SNPs) for genetic studies.
We chose to sequence the transcriptome rather than the whole genome because powdery mildew genomes are large and complex. The genome of Blumeria graminis f. sp. hordei, the barley powdery mildew fungus, was sequenced recently (http://www.blugen.org/) and is approximately 118 Mb in size, almost 3 times the genome size of most other ascomycetes. Approximately 70% of the B. graminis genome consists of repetitive DNA, particularly retroelements. Given our limited resources, we needed a strategy for reducing genome complexity and decided to sequence only the transcriptome of E. necator. Furthermore, transcriptome sequences can be used in the expression profiling projects planned for the future.
A first draft of the transcriptome sequence is now in hand. To date, we have sequenced the transcriptomes of two E. necator isolates at the Cornell Core Laboratories Center (CLC). We sequenced a normalized cDNA library of a New York isolate (G14) using 454 GS FLX technology and obtained a total of 82 Mbp of sequence, in more than 32,000 contigs. Nearly 100% of the 454 reads were placed into contigs, meaning that we obtained multiple reads per cDNA transcript. More than 99% of these contigs comprised 100 or fewer reads, suggesting that, as a result of the normalization, highly expressed genes were not over-represented in the sequencing.
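The contig read-depth figure above can be illustrated with a minimal sketch. It assumes the number of reads per contig has already been extracted from the assembly output into a dictionary; the function name, the contig names, and the counts are hypothetical and are not taken from the actual G14 assembly.

```python
from typing import Dict


def fraction_at_or_below(read_counts: Dict[str, int], threshold: int = 100) -> float:
    """Return the fraction of contigs supported by `threshold` or fewer reads."""
    if not read_counts:
        raise ValueError("no contigs supplied")
    # Count the contigs whose read depth does not exceed the threshold.
    low_depth = sum(1 for n_reads in read_counts.values() if n_reads <= threshold)
    return low_depth / len(read_counts)


# Made-up read counts for illustration only (not real data).
example_counts = {
    "contig0001": 12,
    "contig0002": 87,
    "contig0003": 450,
    "contig0004": 3,
}

print(fraction_at_or_below(example_counts))  # 0.75
```

In a normalized library, a value close to 1.0 would be consistent with highly expressed genes not dominating the reads.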
We have also resequenced the non-normalized transcriptome of another New York isolate (G19) using Illumina Genome Analyzer sequencing. Illumina reads are much shorter than 454 GS FLX reads, but they can be aligned to the 454 reference sequences. Approximately 99% of the transcripts in the 454 sequence of isolate G14 were also found in the Illumina sequences from isolate G19. This close match underscores the utility of the 454 reference sequences for resequencing.
cDNA from six additional isolates will be resequenced by Illumina sequencing in the near future. These isolates include: i) an isolate from V. rotundifolia (muscadine grapes), ii) two isolates from V. vinifera in North Carolina, iii) an isolate from V. aestivalis in Georgia (southeastern USA), iv) an isolate from V. riparia in New York, and v) an isolate from V. vinifera sampled before and after sporulation. We expect these sequences to be available in late 2010. Comparison of the transcriptome sequences will enable us to find a large set of SNPs for use in further genetic studies.
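As a rough illustration of how comparing transcriptome sequences can reveal SNPs, the sketch below scans sequences from several isolates that are assumed to be already aligned to the same reference contig, and reports positions where the called bases differ. This is a deliberately naive approach, not the project's pipeline: real SNP discovery would also weigh read depth and base quality. The function name, the label "isolate3", and all sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def find_snps(aligned: Dict[str, str]) -> List[Tuple[int, Dict[str, str]]]:
    """Return (1-based position, base per isolate) for each variable column.

    `aligned` maps an isolate name to its sequence; all sequences must
    already be aligned to the same coordinates and have equal length.
    """
    names = list(aligned)
    if not names:
        return []
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("aligned sequences must have equal length")

    snps = []
    for pos in range(length):
        column = {name: aligned[name][pos].upper() for name in names}
        # Ignore ambiguous calls (N) and gaps (-) when deciding variability.
        called = {base for base in column.values() if base in "ACGT"}
        if len(called) > 1:
            snps.append((pos + 1, column))
    return snps


# Made-up sequences for illustration only (not real E. necator data).
toy_alignment = {
    "G14": "ATGCCGTA",
    "G19": "ATGACGTA",
    "isolate3": "ATGCCGTN",
}

for position, bases in find_snps(toy_alignment):
    print(position, bases)
```

In this toy alignment only position 4 is reported; the ambiguous base at position 8 is not counted as a polymorphism.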