Review of gene structure

A gene, in the most fundamental sense, is a linear sequence of nucleotides. Nucleotides are joined to one another in sequence by phosphodiester bonds between the 5' and 3' carbons of the deoxyribose moieties of adjacent nucleotides. Deoxyribonucleic acid (DNA) is a double-stranded molecule whose strands run anti-parallel. The coding strand (the strand whose sequence matches the RNA transcript, with T in place of U) is referred to as the sense strand and is conventionally depicted in the 5' to 3' orientation. The anti-sense strand, which serves as the template for transcription, is complementary (A-T, G-C) in sequence to the sense strand and is depicted by convention in the 3' to 5' orientation.

The linear sequence of a gene includes a minimum of four structural domains. The most 5' domain is the regulatory region, which contains the nucleotide sequences required for transcriptional control of gene expression, including cell-type or tissue-specific regulation and the regulatory response to intra- or inter-cellular signals. Induction signals can include a therapeutic compound or another endogenous or foreign compound. The regulatory domain terminates at the nucleotide where RNA polymerase initiates transcription. This nucleotide is the +1 position of the gene. Nucleotides 5' to the start site of transcription are numbered with sequential negative numbers; conversely, nucleotides 3' to the start site are numbered with positive numbers.

Typically the gene contains multiple exons separated by introns. The exons are the regions retained in the mature mRNA, while the introns are removed from the RNA transcript by splicing. The DNA is transcribed into RNA, and the RNA is then translated into protein. Both of these steps can follow alternate paths, yielding different gene products from the same gene, but this is the exception rather than the rule.
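The strand-pairing and numbering conventions described above can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical, chosen only for illustration. The sketch also assumes the common convention that there is no position 0, so the nucleotide immediately 5' of +1 is -1; the text above does not state this explicitly.

```python
# Watson-Crick pairing for DNA (A-T, G-C), as described in the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def antisense_3_to_5(sense_5_to_3: str) -> str:
    """Return the anti-sense strand written 3' to 5'.

    Each output base sits directly beneath its partner on the sense strand,
    so the result is the complement, not the reverse complement.
    """
    return "".join(COMPLEMENT[base] for base in sense_5_to_3.upper())


def transcript_position(index: int, start_index: int) -> int:
    """Map a 0-based index on the sense strand to a gene position.

    start_index is the 0-based index of the transcription start site (+1).
    Assumption: there is no position 0, so numbering runs ..., -2, -1, +1, +2, ...
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical 10-nucleotide sense strand with transcription starting at index 6.
sense = "TATAAAGCAT"
print("5' " + sense + " 3'")
print("3' " + antisense_3_to_5(sense) + " 5'")
print([transcript_position(i, 6) for i in range(len(sense))])
```

Writing the anti-sense strand as the plain complement keeps the two strands aligned when printed one above the other, which matches the 5' to 3' over 3' to 5' depiction used by convention.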

Introduction to genetic variation

With the general structure of genes in mind, we can now turn to the mechanisms of variation in gene structure, referred to as genetic polymorphism. Genetic polymorphism can take the form of gross structural changes, including complete gene deletion, gene duplication, and genetic translocation, in which portions of similar genes are combined to create a new hybrid gene. However, by far the most common form of genetic polymorphism is the single nucleotide polymorphism (SNP), in which the nucleotide at one specific position is changed, inserted, or deleted. Every type of polymorphism introduces a variant form of the gene into the population gene pool, and each variant is designated an allele of the original gene.

An allele is an inherited form of a gene and is present in every nucleated cell of the body. Because the human genome is diploid, each cell carries two copies of each gene. Two copies of the same allele yield a homozygous genotype, and any combination of two different alleles yields a heterozygous genotype. The various types of genetic polymorphism can be broadly classified by their influence on protein expression or phenotype. Polymorphisms resulting in gene deletion invariably lead to loss of function, with no production of the protein, enzyme, or receptor. In contrast, gene duplication and multi-duplication most commonly lead to increased expression of the gene product and a hyperactivity phenotype. An exception is duplication of an allele that carries additional structural variation leading to loss of function. Genetic translocation typically yields a non-functional gene. SNPs can produce a variety of changes in the function of the expressed protein, depending on where the polymorphism occurs in the overall gene structure. SNPs in the 5' regulatory domain may influence gene regulation (5). SNPs in the coding exons influence function only if they produce an amino acid change that alters the protein's function. SNPs within the intron regions are typically silent unless the SNP alters a nucleotide critical for splicing of the RNA during maturation, in which case the result is typically a loss or decrease in protein function.
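As a rough illustration, the genotype and SNP rules in the preceding paragraph can be written as a small Python sketch. The function names, region labels, and allele labels ("A1", "A2") are hypothetical, and the returned strings simply paraphrase the general tendencies stated in the text; this is not a predictive tool.

```python
def zygosity(allele_1: str, allele_2: str) -> str:
    """Classify a diploid genotype from its two allele labels."""
    return "homozygous" if allele_1 == allele_2 else "heterozygous"


def expected_snp_effect(region: str,
                        alters_amino_acid: bool = False,
                        disrupts_splicing: bool = False) -> str:
    """Summarise the typical effect of a SNP by its location in the gene.

    region is one of "regulatory", "exon", or "intron" (hypothetical labels).
    The flags describe the SNP; each is only consulted for the region it applies to.
    """
    region = region.lower()
    if region == "regulatory":
        return "may influence gene regulation"
    if region == "exon":
        if alters_amino_acid:
            return "may alter protein function"
        return "no expected effect on function"
    if region == "intron":
        if disrupts_splicing:
            return "typically loss or decrease of protein function"
        return "typically silent"
    raise ValueError(f"unknown region: {region!r}")


print(zygosity("A1", "A1"))
print(zygosity("A1", "A2"))
print(expected_snp_effect("intron", disrupts_splicing=True))
```

The words "may" and "typically" in the return values are deliberate: they mirror the hedged wording of the text, since the actual effect of any given SNP has to be established experimentally.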

Vocabulary

Pharmacogenomics: Moving Away from "One-Size-Fits-All" Therapeutics

Within the next decade, researchers will begin to correlate DNA variants with individual responses to medical treatments, identify particular subgroups of patients, and develop drugs customized for those populations. The discipline that blends pharmacology with genomic capabilities is called pharmacogenomics.

More than 100,000 people die each year from adverse responses to medications that are beneficial to others. Another 2.2 million experience serious reactions, while others fail to respond at all. DNA variants in genes involved in drug metabolism, particularly the cytochrome P450 multigene family, are the focus of much current research in this area. Enzymes encoded by these genes are responsible for metabolizing most drugs used today, including many for treating psychiatric, neurological, and cardiovascular diseases. Enzyme function affects patients' responses to both the drug and the dose. Future advances will enable rapid testing to determine the patient's genotype and drastically reduce hospitalization resulting from adverse reactions.

Genomic data and technologies also are expected to make drug development faster, cheaper, and more effective. Most drugs today are based on about 500 molecular targets; genomic knowledge of the genes involved in diseases, disease pathways, and drug-response sites will lead to the discovery of thousands of new targets. New drugs, aimed at specific sites in the body and at particular biochemical events leading to disease, probably will cause fewer side effects than many current medicines. Ideally, the new genomic drugs could be given earlier in the disease process. As knowledge becomes available to select patients most likely to benefit from a potential drug, pharmacogenomics will speed the design of clinical trials to bring the drugs to market sooner.