Utility of the dataset
The dataset we
assembled can be used in several ways. First, it provides a rich source of
candidate genes for further in-depth study. Researchers interested in a
particular developmental process, for example morphogenesis of the salivary gland,
can simply search our annotations and retrieve an all-inclusive list of genes
that were annotated as having expression in that structure. Such a gene set can
be further subdivided by manual curation by experts in the particular field,
using our primary image data. Second, the clustering classification presented
in this report allows one to address more abstract questions about the dataset
such as: Which genes are expressed in a regulated manner at the cellular blastoderm
stage? Which genes are involved in organogenesis during late embryogenesis? Finally,
the dataset represents a starting point for a systematic analysis of the
sequence determinants of gene expression patterns. Clustering provides gene
groupings based on spatio-temporal gene expression specificity ranging from
unique patterns, through small tightly co-regulated gene sets, to large gene
expression classes. These data groupings can be tested against cis-regulatory
prediction pipelines to identify significant associations between gene
expression specificity and genomic sequence motifs.
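As an illustration of what testing a grouping against a motif prediction could look like, the following minimal sketch computes a one-sided hypergeometric enrichment p-value for the overlap between one expression cluster and the set of genes carrying a given motif. The function name, its parameters, and the counts in the usage lines are hypothetical; this is one common choice of test under stated assumptions, not necessarily the procedure used by any particular cis-regulatory prediction pipeline.

```python
from math import comb


def hypergeom_enrichment_p(n_total, n_with_motif, n_cluster, n_overlap):
    """One-sided hypergeometric p-value for motif enrichment in a cluster.

    Returns P(X >= n_overlap), where X is the number of motif-bearing genes
    seen when n_cluster genes are drawn without replacement from n_total
    genes, of which n_with_motif carry the motif.
    All argument names are illustrative assumptions.
    """
    upper = min(n_with_motif, n_cluster)
    denom = comb(n_total, n_cluster)
    tail = sum(
        comb(n_with_motif, k) * comb(n_total - n_with_motif, n_cluster - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Hypothetical counts, for illustration only (not taken from the dataset).
p_value = hypergeom_enrichment_p(
    n_total=3000, n_with_motif=150, n_cluster=40, n_overlap=8
)
print(f"enrichment p-value: {p_value:.3g}")
```

When many clusters and motifs are tested in this way, the resulting p-values would need a multiple-testing correction before any association is called significant.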
Determining expression
patterns is only a first step towards further understanding gene function and
therefore it is important to intersect our spatial expression data with other
genomic datasets. The tools that we developed allow anyone with an independent
result that can be summarized as a list of genes, for example a cluster derived
from targeted microarray analysis, to quickly determine the spatio-temporal
expression patterns of these genes in the Drosophila embryo.
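The intersection described above can be sketched as a simple lookup of an independent gene list against an annotation table. The table layout (gene mapped to a set of stage and anatomical-term pairs), the gene identifiers, and the function name below are all hypothetical placeholders; the actual tools and annotation format may differ.

```python
from collections import Counter

# Hypothetical toy annotation table: gene -> set of (stage range, term).
# The layout and the entries are assumptions made for illustration.
ANNOTATIONS = {
    "geneA": {("stage 4-6", "cellular blastoderm"),
              ("stage 13-16", "salivary gland")},
    "geneB": {("stage 13-16", "salivary gland"),
              ("stage 13-16", "midgut")},
    "geneC": {("stage 13-16", "midgut")},
}


def summarize_expression(gene_list, annotations):
    """Count how many genes in gene_list carry each (stage, term) annotation.

    Returns the counts and the list of genes without any annotation.
    """
    counts = Counter()
    missing = []
    for gene in gene_list:
        terms = annotations.get(gene)
        if terms is None:
            missing.append(gene)
            continue
        counts.update(terms)
    return counts, missing


# Example: a gene list from an independent experiment (placeholder names).
counts, missing = summarize_expression(["geneA", "geneB", "geneX"], ANNOTATIONS)
for (stage, term), n in counts.most_common():
    print(f"{stage}\t{term}\t{n}")
print("not annotated:", missing)
```

Reporting the unannotated genes separately keeps the summary honest about how much of the input list the expression data actually cover.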
To address the
difficulty of summarizing the gene expression patterns of a group of genes, we
developed a new visual aid, the anatogram. Anatograms show the “position” of
a given gene set in the complex space of spatio-temporal gene expression
patterns and represent a convenient way to summarize such data for groups of
genes. Anatograms also provide an intuitive comparison of differences among
groups of genes, which can supplement more rigorous statistical comparisons. In
principle, any list of genes can serve to generate an anatogram; for example,
the list of Drosophila genes homologous to a gene group in another organism, or
the genes that contain a particular sequence motif. In this way, anatograms can
be used to compare results from gene expression studies among different
species. The color code is based on organ systems shared by metazoan organisms
and can be adapted to spatio-temporal gene expression data in other animals,
providing an organism-independent way to express this newly emerging class of
gene expression data.