The Evolution of Gene Nomenclature: From white to Defb24

Author:  Moorhouse Anna
Institution:  English and Cell and Molecular Biology
Date:  September 2005

Names are not always what they seem. The common Welsh name BZJXXLLWCP is pronounced Jackson.

- Mark Twain

In the beginning, naming was simple. Thomas Hunt Morgan, a geneticist who spent the earlier part of his career researching fruit flies at Columbia University, found a single mutant fly with white eyes and named it white. At the time of Morgan's discovery in 1910, no one knew how genetic information was passed on from one generation to the next. The term "gene" had only come into use the year before and the idea that chromosomes had anything to do with the process was generally not accepted. A year later, in 1911, Morgan's student, Alfred Sturtevant, was able to construct the first gene map in which he suggested that chromosomes really were the carriers of genes and that each gene had a specific, linear location within a chromosome. The genes he used to make his map were named as simply as had been white: flies with a dysfunctional copy of the yellow gene had yellow bodies, just as flies with a mutant copy of the rudimentary wing gene had wings that were usually blistered, wrinkled and very short.

Figure 1. T. H. Morgan with fly drawings. Courtesy of The Archives, California Institute of Technology.

Since then, the world of genes and gene nomenclature has grown beyond the wildest dreams of Morgan and his first protégés. Gene databases such as FlyBase, the Zebrafish Information Network (ZFIN), and The Arabidopsis Information Resource (TAIR) have sprung up to organize the massive amounts of information accumulated in gene laboratories around the world. One organization, the HUGO Gene Nomenclature Committee (HGNC), has taken on the challenge of sorting through the 20,000 human gene names currently in use and assigning each gene a unique symbol. Its mission is straightforward, if ambitious: "Giving unique and meaningful names to every human gene."

The committee chair, Dr. Sue Povey, a professor of human genetics at University College London, relishes the task.

"The major effort of our group is now an attempt to make the human genome data more human-friendly," says Povey. "Our plan is to persuade the whole scientific community to agree on a single unique name and abbreviation...for each human gene."

Despite the commitment of the HGNC, critics of human gene naming standards still abound.

"Although some of our correspondents describe in no uncertain terms our unsuitability for the job, the attempt to ensure that for each human gene there is one name and one standard abbreviation (usually known as a symbol) has occupied the HUGO gene nomenclature committee since 1979," Povey said in response to one disgruntled geneticist. "You won't like every symbol (neither do we) but they are at least all unique, and wherever humanly possible they have been settled by negotiation."

The autonomy of research groups makes this task particularly difficult. Researchers in different labs tend to keep their work confidential, and as a result the same gene may be given different names in different labs. Which name wins out usually depends on who publishes first, and even then, duplicates can persist. For example, the gene cheap date produces mutants that are especially sensitive to alcohol. The same gene is also known as amnesiac, because its mutants also have poor memories.

Confusion in gene nomenclature can also arise when the same gene appears as an ortholog across more than one species. When two genes are considered orthologs, they are said to have evolved from a common ancestor and to have retained the same function throughout the course of evolution. However, because researchers who work with different model organisms tend not to frequent the same social circles, ortholog names rarely match up. "...Recently a paper in PNAS describing many defensin genes referred to Defb19I [a mouse gene] as the ortholog of DEFB17 [human] and DEFB19 [human] as the ortholog of Defb24 [mouse]," Povey said.

Creativity in gene naming is also under scrutiny. Dr. Gregory A. Petsko, a biochemist at the Rosenstiel Basic Medical Sciences Research Center, Brandeis University, Massachusetts, was noted for his scathing commentary in Genome Biology: "[n]othing, however, seems to engender more passion and provoke more quarrels than the matter of assigning names to things." Petsko also said, "Scientists defend their choices with the tenacity of a mother tiger protecting her cubs, with the result that the scientific literature is awash with names that range from the cute to the stupid."

These names, which have both researchers and students scratching their heads, seem to come in two basic forms: those that provide an overabundance of information in their title, and those that provide none. One particularly gruesome example of the former is tyrosine kinase with immunoglobulin and epidermal growth factor homology domains, a name whose abbreviation would require some whittling down.

Of the latter form, Petsko had his own opinion: "My personal favorite, p53," he stated, "It isn't even a good uninformative name; didn't it occur to this person that there might just be a few other proteins around with a molecular weight of about 53,000?"

At the other end of the scale, some researchers have gone out of their way to spice up the naming process. "Jack Peisach, a biophysicist at Albert Einstein College of Medicine in New York, named the copper-containing electron-transfer protein he discovered in 1967 stellacyanin after his wife," Petsko remarked. "[This caused] generations of biochemists to be grateful that he hadn't married someone named Gertrude."

More recently, in the January edition of the Proceedings of the National Academy of Sciences (PNAS), the KiSS-1 gene was unveiled. This human gene is involved in the production of a protein that stimulates the onset of puberty in both sexes. Then, in the April edition of Science, biologists at the University of California, San Diego, dubbed their fruit fly gene grainyhead. Flies that lack a functional copy of this gene cannot activate wound repair genes in the cells surrounding an injury to their cuticle, resulting in an unfortunate, yet not surprising, grainy head.

From Morgan to the HGNC, we can see how rapidly the field of gene nomenclature has expanded. The HGNC, which began in 1979 on the force of a single scientist, now comprises a board of eight members. Similarly, Morgan's work, which began in relative obscurity on the edge of bioscience, has grown into a multibillion-dollar industry set to map the tens of thousands of genes that belong to a wide cross-section of species. As Povey wrote, "It is excellent that the need for a common currency in the language of genes and gene products is now recognized... We may soon have a vacancy for another post-doctoral scientist in our group. Would you like to apply?"

References and Suggested Reading

HGNC Homepage
ZFIN Homepage
FlyBase Homepage
The Arabidopsis Information Resource
"What's in a name?" by Gregory Petsko, Genome Biol. 2002; 3(4).
"Smelling of roses?" by Sue Povey and Hester Wain, Genome Biol. 2002; 3(6).
"Thomas Hunt Morgan and His Legacy" by Edward B. Lewis. Retrieved 18 July 2005.
Time, Love, Memory: A Great Biologist and His Quest for the Origins of Behavior, by Jonathan Weiner, Vintage Publishing, 2000.