Thursday, April 26, 2007

(Still)Birth of a Neologism?

A proposal has been published in the open access journal Molecular Systems Biology for a new term, or really family of terms, for various elements of genetic information.

First they propose a whole host of types of genes. For example, a protein coding gene is a P-gene and these are further subdivided into structural protein genes (sP-genes) and regulatory protein genes (rP-genes). A cynic might point out there are already proteins refusing to choose sides, such as transcription factors with enzymatic activities (I know I've come across them, but alas memory is failing to return their names). Actually, they already lump the proteasome subunits into the P-gene class (subclass 4). Are indirect regulators of transcription (e.g. IkB and IKK, which regulate the transcription factor NFkB) in here too?

RNAs in this taxonomy come in two basic flavors: structural (sR-genes) and regulatory (cR-genes; c for control?). cR-genes are subdivided into discriminating (regulating specific genetic subprograms; e.g. miRNAs or XIST) and non-discriminating (broadly acting; e.g. tRNAs and snoRNAs).

There's more. All of the cis-acting elements controlling a gene are its 'genon', and the trans-acting factors are the 'transgenon'. We also have pre-genons, holo-genons, proto-genons, and holo-transgenons.

In any complicated endeavor jargon is inevitable, as complex topics can't be explained in detail every time you go to talk about them -- rather, the jargon serves as a shorthand to enable actually getting something done. Attempting to generate such taxonomies is useful, but it's hard to think of much success in that department. This exercise is reminiscent of Brosius & Gould's attempt to create a nomenclature for pseudogenes. Very clever, but it never caught on.

A cynical pedant might be inclined to ask, "What's the point of inventing new jargon when nobody can be bothered to properly use the old jargon?" For example, periodically the popular press (and sometimes the new iMedia of blogs) trumpets the discovery of a new human gene, which might be a bit disconcerting to various taxpayers who thought they had paid to have them all found already. While there are almost certainly some new genes to be found, in most cases what is new is an association of alleles with disease, and in these SNP-saturated times even the alleles can't claim to be new. Too common also is the use of 'gene' when the more specific 'locus' would do, or of 'gene' where 'gene product' or 'protein' would be a better fit. And other times, not only was there no discovery of a new gene, but the phenotypic association was linked only to a large stretch of DNA.

One question this all leads to is what drives the acceptance or non-acceptance of new terms, particularly ones intended to be pronounceable (nobody, other than the once-extant company, tries to read siRNA as two syllables!). 'omes and 'somes seem to be a better bet than some terms, but I'm sure there have been duds there too (and of course, words that didn't enter the language by that route -- do I really live in the collection of all things beginning with the letter 'h'?). Some good terms rise & fall, or only survive through their derivatives. Virtually nobody talks about a cistron, but polycistronic survives -- a pity, since cistron is a perfectly wieldy word -- luckily I can stay gruntled without solving the mystery.

1 comment:

Neil said...

Most of us would agree that the simple word "gene" is not enough, means different things to different people and needs new definitions. However, I think we need to approach this as a community using ontologies and controlled vocabularies.

The problem with this kind of paper is that it's easy to come across as crazy mavericks inventing their own crazy language, rather than an honest attempt to address the problem. That's why it seems like more meaningless jargon. There's a real difference between useful, defined vocabularies and jargon.

It's a fun read though.