Environmental Genomics: A Revolution for Ecology
On 24 November 2015 Professor Chris Austin, a specialist in tropical genomics at Monash University in Malaysia, delivered an interactive session on environmental genomics. His talk focused on the need for DNA sequencing and its power to address the challenges of biodiversity conservation: it can answer almost any question about genetic variation (mutation coupled with evolution) within and among populations, including many that would otherwise be unanswerable. Evolution, genetics and ecology are intimately related, like the sides of a single coin; they are parts of the same concept.
Looking back through history, Charles Darwin is the father of evolutionary ecology, and his theory of natural selection is a milestone in the fields of genetics and evolution. The theory holds that inherited variation makes certain individuals in a population more likely to survive, breed and pass on their genes. Gregor Johann Mendel, the father of modern genetics, formulated the laws of heredity. He produced highly influential work on inheritance through his studies of the pea plant; his laws were later linked to chromosomes and to natural selection. Watson and Crick made a significant contribution by discovering the double helix structure of DNA, which is built from four bases (adenine, guanine, cytosine and thymine) that are read in triplets called codons. How codons specify amino acids was soon worked out, but despite these advances the analysis of DNA through sequencing remained complex and expensive for a long time. Later Frederick Sanger solved this puzzle and won two Nobel Prizes. He is the pioneer of DNA sequencing and produced the first complete genome sequence (>5,000 base pairs) of a virus. His sequencing method was read out by autoradiography. Kary Banks Mullis, a biochemist, also won a Nobel Prize for inventing the Polymerase Chain Reaction (PCR), a technique for amplifying the short lengths of DNA needed for sequencing.
DNA sequencing is a modern tool not only for gene-based research but also for obtaining information about many species at once. Primers are short DNA sequences of known composition (usually around 20 bases) that match the regions immediately upstream and downstream of the target gene fragment and allow it to be amplified by PCR for sequencing. This means that some prior knowledge is required in order to study unknown DNA; it can be obtained from regions of DNA that are conserved among closely related species. Initially, DNA sequencing was expensive because many machines had to be kept running for long periods to produce and process the data. For instance, the Human Genome Project (HGP) took more than 10 years to sequence the 3 billion base pairs of the human genome. During the last 5 years Next Generation Sequencing (NGS) has emerged, generating huge numbers of short DNA reads from shotgun sequencing. This method is very fast and is getting cheaper (the average cost is about $1 for 10 million DNA base pairs), so it is becoming feasible for more people. It still has complications, such as the need to identify the anonymous DNA fragments it produces. Moore’s law states that computing power doubles roughly every two years, but the field of molecular genetics is advancing at a much faster rate. Consequently there is now so much genomic data that the new challenge is managing this large volume of information, which is done with bioinformatics.
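The role of primers described above can be illustrated with a small "in silico PCR" sketch. It is only a toy model under stated assumptions: the template and primer sequences are invented for illustration, the primers are 8 bases long rather than the usual 20 or so to keep the example readable, and primers must match the template exactly (real primers tolerate some mismatches). The function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the fragment bounded by the two primers, or None if either is missing.

    The forward primer matches the template strand directly. The reverse
    primer anneals to the opposite strand, so on the template strand we
    search for its reverse complement downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Invented toy sequences (assumptions, not real gene data).
forward_primer = "ATGCGTAC"
target = "GGGTTTAAACCC"
reverse_primer = "TCGGTCAA"
template = "CCCC" + forward_primer + target + reverse_complement(reverse_primer) + "AAAA"

amplicon = in_silico_pcr(template, forward_primer, reverse_primer)
print(amplicon)
```

If either primer fails to match, the function returns None, which mirrors the point made above: without some prior knowledge of the flanking sequence, the target cannot be amplified.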
Earth’s biodiversity is decreasing rapidly, and the condition of freshwater biodiversity is an important management target. Freshwater ecosystems are valuable for their diverse species composition: freshwater habitats make up only about 0.01% of the Earth’s water yet harbour about 12% of all known species. This richness results from the high level of fragmentation between freshwater habitats, which encourages speciation and diversification. These ecosystems are under severe human pressure because they contain extractable resources (e.g. drinking water, fish and other vertebrates). Consequently it is important to assess species at the genetic level to provide more specific information for conservation and management, such as the identification of separate populations or rare genotypes. A range of genetic techniques is available, each with its own costs and benefits, but an important requirement of such studies is that the markers conform to Mendelian inheritance and that the data can be tested against Hardy-Weinberg equilibrium.
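As a rough illustration of what a Hardy-Weinberg check involves, the sketch below takes observed genotype counts at a single locus with two alleles, estimates the allele frequencies p and q, derives the expected genotype counts (p², 2pq and q² times the sample size) and computes a chi-square statistic. The genotype counts are invented, the function name is hypothetical, and the comparison with 3.841 (the 5% critical value for one degree of freedom) is the standard textbook test rather than anything specific to the talk.

```python
def hardy_weinberg(n_AA, n_Aa, n_aa):
    """Compare observed genotype counts with Hardy-Weinberg expectations.

    Returns (p, q, expected_counts, chi_square) for one locus with two alleles.
    """
    n = n_AA + n_Aa + n_aa
    # Each individual carries two alleles, so there are 2n alleles in total.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi_square = sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(observed, expected) if exp > 0
    )
    return p, q, expected, chi_square


# Invented example counts (assumptions for illustration only).
p, q, expected, chi_square = hardy_weinberg(50, 20, 30)
print(f"p = {p:.2f}, q = {q:.2f}, chi-square = {chi_square:.2f}")
if chi_square > 3.841:
    print("Genotype counts deviate from Hardy-Weinberg expectations at the 5% level.")
```

A large deviation, as in this invented sample with too few heterozygotes, can hint at population substructure, inbreeding or a genotyping problem, which is why the check is usually run before further analysis.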
Genetic information is becoming available for almost all known species on Earth, provided that DNA can be extracted and sequenced. There are different ways of presenting such genomic data: phylogenies infer how closely species are related, phylogeographic methods show the number of mutations that separate groups, and GIS can map the spatial distribution of groups. Functional genomics is also possible, which here refers to grouping organisms that have a similar function within their environment. The Tree of Life web project (ToL) is an example of a collective approach, consisting of more than 10,000 web pages with phylogenetic information on particular groups of organisms (e.g. salamanders, phlox flowers, Heliconius butterflies). These pages are linked together through the common root of genetic information, and organisms are arranged hierarchically according to their evolutionary history, indicating their evolutionary relatedness (visit: www.tolweb.org). Previously, experienced taxonomists classified species by distinguishable morphological characteristics (e.g. size, shape, skin colour or other body parts), but many difficulties (e.g. damaged specimens, the life stage of the specimen, missing body parts, discolouration, cryptic species and a shortage of skilled personnel) complicated the identification process.
DNA barcoding, which uses a short DNA sequence from a standard region of a species’ genome, is a recent approach to species identification. A barcoding project comprises four consecutive steps: collecting the specimen, producing the DNA barcode sequence, placing the sequence in a public barcode database (e.g. BOLD) and analysing the sequence to identify the specimen by comparison with reference DNA sequences (visit: www.barcodeoflife.org). The Consortium for the Barcode of Life (CBOL) provides an up-to-date platform for highly accurate species identification that can be generated, validated and used by non-taxonomists and non-specialists at any time. In this way, the barcode of life serves as an effective tool for identifying cryptic species (i.e. species that are phenotypically similar but genetically distinct), because it holds sufficient taxonomic information, and it can validate the results of new taxonomic studies.
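The final step, comparing a query barcode with reference sequences, can be sketched as follows. This is a simplified illustration and not how BOLD itself works: the reference "library" is a small dictionary of invented 40-base sequences with placeholder species names, the sequences are assumed to be already aligned and of equal length, and the 3% distance threshold is an arbitrary assumption chosen for the example.

```python
def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def identify(query, references, max_distance=0.03):
    """Return (best_matching_name, distance), with name None if no match is close enough."""
    best = min(references, key=lambda name: p_distance(query, references[name]))
    distance = p_distance(query, references[best])
    if distance > max_distance:
        return None, distance
    return best, distance


# Invented reference barcodes with placeholder names (assumptions, not real data).
references = {
    "species_A": "ATGGCATTCCTAGGAACCGTTAGCCTAGGATCCAAGTTGA",
    "species_B": "ATGGCGTTTCTAGGTACCGTAAGCCTTGGATCTAAGCTGA",
}

# A query that differs from species_A at a single site.
query = references["species_A"][:-1] + "G"
name, distance = identify(query, references)
print(name, distance)
```

A query that is far from every reference returns None instead of a forced match, which is the behaviour one would want when a specimen belongs to a species not yet in the library.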
Besides identification, DNA barcoding is widely applicable in biodiversity conservation, because it allows ecosystem health and integrity to be quantified from the composition of the soil and water microbial communities of an area, determined by genetic analysis of the DNA present in an environmental sample. Metagenomics (the assessment of genetic diversity within environmental samples) can also detect endangered species living in or around a water body, and microbial diversity itself serves as a measure of ecosystem health. A drop of sea water can contain up to 10 million viruses, 100,000 bacteria, 1,000 phytoplankton, 10 multicellular organisms and shed remains (e.g. scales, hair, feathers) of other species that use the resource or move through the area.
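One common way to turn community composition into a single diversity figure is the Shannon index, H = -Σ p_i ln(p_i), where p_i is the proportion of the sample assigned to taxon i. The sketch below applies it to read counts per taxon; the taxon labels and counts are invented, the function name is hypothetical, and the choice of the Shannon index is an assumption, since the talk did not name a specific measure.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts.values())
    h = 0.0
    for count in counts.values():
        if count > 0:
            proportion = count / total
            h -= proportion * math.log(proportion)
    return h


# Invented read counts per taxon for two hypothetical water samples.
even_sample = {"taxon_1": 25, "taxon_2": 25, "taxon_3": 25, "taxon_4": 25}
skewed_sample = {"taxon_1": 97, "taxon_2": 1, "taxon_3": 1, "taxon_4": 1}

print(f"even sample:   H = {shannon_index(even_sample):.3f}")
print(f"skewed sample: H = {shannon_index(skewed_sample):.3f}")
```

The evenly mixed sample scores higher than the one dominated by a single taxon, even though both contain the same four taxa, so the index reflects evenness as well as richness.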
With all of the genetic information available within just a single drop of water, can you imagine the potential advances in knowledge and technology that genetic studies can provide?
Sonia Rashid; Emma Bradley