Polyploidy and Fiber Evolution
The cotton genus presents its own evolutionary mysteries associated with domestication, some of which have begun to be clarified. Phylogenetic analyses by Wendel and Albert (Wendel and Albert 1992), Seelanan et al. (Seelanan, Schnabel et al. 1997), and Cronn et al. (Cronn, Small et al. 2002) have clearly outlined a sequence-based phylogeny for members of Gossypium as well as its tribe, Gossypieae. These studies have led to a good understanding of the relationships among the diploid genome groups, one of which (the A genome) contains two domesticated species, and of how a single merger of an A genome and a D genome species resulted in the formation of a polyploid clade, which also contains two domesticated species (Wendel and Cronn 2003).
Capitalizing on the wealth of phylogenetic data for the genus and the current accessibility of genomic technologies, this project will describe patterns of gene expression in cotton trichomes that are coincident with domestication at both the diploid and allotetraploid levels, and present them in a comparative, phylogenetic framework. Expression variation has played an important role in both tomato and corn domestication, and is emerging as an important source of heritable variability for selection to act upon in natural systems as well (a la epigenetics) (Kalisz 2004). The phylogenetic perspective, combined with the intrinsic role of gene expression in development, makes a study of this nature a prudent and likely asset in exploring the nature of trichome morphology variation. Here, developmental timecourses compared between wild and domesticated cotton will likely illustrate pathways, modules, and individual genes whose expression was up- or down-regulated in the course of domestication for a novel morphology. Such information will provide, in turn, a better understanding of the genetic reprogramming required for macroevolution.
Roughly 4,000 years ago, four separate species of cotton were domesticated from wild antecedents, two at the diploid level and two allotetraploids (Wendel, Olson et al. 1989; Wendel, Small et al. 1999; Wendel and Cronn 2003). Early cytogenetic typing and observations of chromosome pairing at meiosis by Beasley (Beasley 1941) determined that the domesticated species belonged to an entanglement of three genome groups: the A genome from the heart of Africa, the D genome, whose distribution is restricted to Central and South America, and a novel group present only in Central America, the AD tetraploids (Beasley 1940). Both Old World diploid species, Gossypium arboreum L. and G. herbaceum L., are members of the A genome and have spinnable fiber, but modern cotton commerce is dominated by fiber from the New World allotetraploids: G. hirsutum (Upland cotton), which produces the bulk of all fiber, and G. barbadense L. (Egyptian or Pima cotton), usually reserved for finer linen (Wendel 1995).
Construction of Cotton Oligo-based DNA Microarrays
The cotton microarrays will be based on a collection of ESTs generated by the Wendel lab and collaborators under an NSF Plant Genome Grant, together with an international collection of G. hirsutum ESTs. Preliminary analysis and tentative clustering of the ESTs show roughly 4,000 contigs for G. arboreum, 12,000 contigs for G. raimondii, and 11,000 for G. hirsutum. The total number of singleton ESTs for all three species exceeds 40,000.
Table 1. ESTs available for the genus Gossypium
The table below represents our current understanding of Gossypium ESTs, contigs and singletons.

| Species | Libraries | No. ESTs | No. Contigs | No. Singletons |
|---|---|---|---|---|
| G. arboreum (A2) | GA__Ea | 31,242 | 4,838 | 17,658 |
| G. raimondii (D5) | GR__Ea | 33,671 | 12,322 | 17,325 |
| | GR__Eb | 35,061 | | |
| | Total | 68,732 | | |
| G. hirsutum (AD1) | * | 6,421 | 6,015 | 14,187 |
| | GH_BNL | 8,022 | | |
| | * | 2,018 | | |
| | GH_FOX | 6,897 | | |
| | * | 3,171 | | |
| | GH_pAR | 1,036 | | |
| | GH_SUO | 1,240 | | |
| | GH_STEM | 6,588 | | |
| | * | 5,084 | | |
| | GH_SCW | 8,912 | | |
| | GH_CBAZ | 1,306 | | |
| | * | 17,096 | | |
| | Total | 67,971 | | |
Further exploration of the EST collection using motif annotation should help identify contigs of interest to include on the microarrays. For example, analysis of the EST contigs using Pfam, a curated database of functional motif alignments, has found nearly 500 transcription factors (Bateman 2004). Below is a table showing the most common motifs and their frequencies.
Table 2. Pfam Search for Transcription Factors
An analysis of all EST contigs using hidden Markov models has found numerous classes of transcription factors. Determinations were made by searching the Pfam database, a search that required 24 days. The functional breakdown of abundance by class is currently being compiled. Below are some of the most abundant transcription factor motifs found in the EST collection.
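The tallying step behind such a summary can be sketched roughly as follows. This is only an illustration under stated assumptions: the contig names, family names, E-values and the 1e-5 cutoff are hypothetical placeholders, not results or settings from this project, and the search output is assumed to have already been reduced to (contig, family, E-value) records.

```python
from collections import Counter

def tally_families(hits, max_evalue=1e-5):
    """Count how many distinct contigs carry each motif family.

    hits: iterable of (contig_id, family_name, evalue) records.
    max_evalue: hypothetical significance cutoff (an assumption, not the
    project's actual setting). A contig is counted once per family even if
    the motif matches it several times.
    """
    seen = set()
    for contig, family, evalue in hits:
        if evalue <= max_evalue:
            seen.add((contig, family))
    return Counter(family for _, family in seen)

# Made-up records for illustration only; none of these are real search hits.
hits = [
    ("contig_001", "TF_family_A", 1e-30),
    ("contig_001", "TF_family_A", 1e-12),  # second match in the same contig
    ("contig_002", "TF_family_A", 1e-8),
    ("contig_003", "TF_family_B", 1e-20),
    ("contig_004", "TF_family_B", 0.5),    # weak match, dropped by the cutoff
]

counts = tally_families(hits)
for family, n in counts.most_common():
    print(family, n)
```

Counting contigs rather than raw matches keeps a contig with repeated copies of one motif from inflating that family's frequency.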

Fiber Isolation
To study gene expression in fibers, it is necessary to create a homogeneous pool of fiber tissue at each time point during the developmental timecourse. Such a collection of tissue is not easily acquired, as on the day of anthesis the fiber initials, which are modified epidermal cells on the seed coat, appear as small half-domes barely differentiated from the rest of the epidermis (Applequist, Cronn et al. 2001).
Figure 4. Electron Microscopy Images of Developing Fibers.
Images of developing cotton fibers. The top image shows fiber initials beginning the process of differentiation. The bottom row, from left to right: close-up of fiber initials on the day of anthesis, developing fibers, and fibers at maturity, showing the characteristic twist that imparts cotton’s spinnable quality.

As development progresses from 0 to 30 DPA, these single cells become some of the largest in the plant kingdom, reaching a final length of 6 cm (Kim and Triplett 2001). At later stages, it is no longer difficult to isolate fibers, but the abundance of cellulose and the reduced volume of cytoplasm make extracting RNA difficult (Wan and Wilkins 1994). To overcome these two obstacles, this study will make use of laser capture microdissection. Specifically, the Zeiss PALM system (pressure ablation laser microdissection) will be used to isolate individual fiber cells from the surface of fixed ovules. This technology is available on campus in the laboratory of Dr. Patrick Schnable, and will soon be available through the Plant Science Institute on a user-fee basis. There are many technical issues in applying this technology to plants, and to Gossypium seed coat trichomes in particular, but these techniques are currently being explored. The figure below illustrates preliminary work done at the College of Veterinary Medicine using a similar system by Arcturus.
RNA Amplification and Microarray Analysis
After fiber isolation, enough high-quality RNA must be extracted from the fiber for labeling with fluorophores and subsequent hybridization to microarrays. With a microdissection system, this translates into the capture of roughly 10,000 cells. Capturing that many cells is not feasible, but protocols exist to linearly amplify mRNA in a fashion suitable for microarray applications. Currently, RNA extraction and amplification are being performed according to standard protocols (Wang 2000) and with kits available from RNA reagent suppliers, such as Ambion.
After sample generation, RNA will be indirectly labeled with Cy-3 and Cy-5 via amino-allyl modified dUTP (DeRisi 2003). To capture the greatest amount of data, two types of contrasts will be made: one between time points within an accession and the other within time points between accessions. The figure below illustrates this design.
Figure 5. Setup of Fiber Microarray Comparisons
The above figure shows two sets of contrasts. In the first, each time point is compared to the subsequent one, with the last time point contrasted with the first. This closed-loop design keeps the experiment balanced while keeping the number of dye swaps to a minimum. The second set of contrasts, between species, will provide additional sensitivity in the analysis. For example, both wild and domesticated individuals may experience a two-fold induction of a gene between 0 and 2 DPA, but there may be only half as much transcript for that gene in the wild accession. The between-species contrast will be more sensitive to this type of difference. This framework was constructed with Dr. Dan Nettleton, in the Dept. of Statistics, and has been used in other studies (barley expression; Dr. Roger Wise, Dept. of Plant Pathology).
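The two sets of contrasts can be illustrated with a minimal sketch. The time points, the dye orientation (earlier sample in Cy-3, later sample in Cy-5), and the transcript abundances below are hypothetical values chosen only to mirror the two-fold example above; they are not the project's actual sampling scheme or data.

```python
import math

def loop_contrasts(time_points):
    """Pair each time point with the next one, closing the loop by pairing
    the last with the first. Each pair is (cy3_sample, cy5_sample); this dye
    orientation is an assumption made for illustration."""
    n = len(time_points)
    return [(time_points[i], time_points[(i + 1) % n]) for i in range(n)]

def is_dye_balanced(contrasts):
    """True if every sample is labeled exactly once with each dye."""
    cy3 = sorted(a for a, _ in contrasts)
    cy5 = sorted(b for _, b in contrasts)
    return cy3 == cy5 and len(set(cy3)) == len(cy3)

# Hypothetical sampling days (DPA); the real time points may differ.
time_points = [0, 2, 5, 10, 20]
slides = loop_contrasts(time_points)
print(slides)                   # within-accession loop
print(is_dye_balanced(slides))  # each time point appears once per dye

# Hypothetical abundances for one gene: both accessions show a two-fold
# induction from 0 to 2 DPA, but the wild accession has half the transcript.
abundance = {
    "wild":         {0: 1.0, 2: 2.0},
    "domesticated": {0: 2.0, 2: 4.0},
}

# Within-accession contrast: log2(2 DPA / 0 DPA) is identical in both.
within = {acc: math.log2(v[2] / v[0]) for acc, v in abundance.items()}

# Between-accession contrast at each time point: log2(domesticated / wild).
between = {t: math.log2(abundance["domesticated"][t] / abundance["wild"][t])
           for t in (0, 2)}

print(within)
print(between)
```

In this toy case the within-accession log ratios are the same for both accessions, so only the between-accession contrast reveals the difference in transcript level.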