RefMet Naming Conventions

The names used in RefMet are generally based on common, officially accepted terms and incorporate notations appropriate to the analytical technique used. In general, high-throughput untargeted MS experiments cannot deduce stereochemistry, double bond position/geometry, or sn position (for glycerolipids/glycerophospholipids). In addition, the type of MS technique employed, as well as the mass accuracy of the instrument, will produce identifications at different levels of detail. For example, MS/MS methods are capable of identifying acyl chain substituents in lipids (e.g. PC 16:0/20:4), whereas MS methods that use only precursor ion information might report these ions as "sum-composition" species (e.g. PC 36:4). RefMet covers both types of notation in an effort to enable data sharing and comparative analysis of metabolomics data, using an analytical chemistry-centric approach.
The "sum-composition" lipid species indicate the number of carbons and the number of "double bond equivalents" (DBE), but not chain positions or double bond regiochemistry and geometry. The concept of a double bond equivalent unites a range of chemical functionality which gives rise to isobaric features by mass spectrometry. For example, a chain containing a ring results in the loss of 2 hydrogen atoms (compared to a linear structure) and thus has 1 DBE, since the mass and molecular formula are identical to those of a linear structure with one double bond. Similarly, conversion of a hydroxyl group to a ketone results in the loss of 2 hydrogen atoms, so the ketone is assigned 1 DBE. Where applicable, the number of oxygen atoms is added to the abbreviation, separated by a semicolon. Oxygen atoms in the class-specific functional group (e.g. the carboxylic acid group for fatty acids or the phosphocholine group for PC) are excluded. In the case of sphingolipids, all oxygen atoms apart from the amide oxygen are included, in order to discriminate, for example, between 1-deoxyceramides (;O), ceramides (;O2) and phytoceramides (;O3).
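The relationship between chain-level and "sum-composition" names can be illustrated with a short sketch. This is not RefMet's own implementation: the function name sum_composition and the regular expression are assumptions made for illustration, and only simple chains of the form C:DBE with an optional ";O" or ";On" suffix are handled.

```python
import re

# Hypothetical pattern for one chain such as "16:0", "18:1;O" or "18:1;O2".
CHAIN = re.compile(r"(\d+):(\d+)(?:;O(\d*))?")


def sum_composition(name: str) -> str:
    """Collapse a chain-level name (e.g. 'PC 16:0/20:4') to a sum-composition name."""
    lipid_class, chains = name.split(" ", 1)
    carbons = dbe = oxygens = 0
    # Chains may be separated by "/" or "_".
    for chain in re.split(r"[/_]", chains):
        match = CHAIN.fullmatch(chain)
        if match is None:
            raise ValueError(f"unsupported chain notation: {chain!r}")
        carbons += int(match.group(1))
        dbe += int(match.group(2))
        if match.group(3) is not None:
            # A bare ";O" means one oxygen; ";O2" means two, and so on.
            oxygens += int(match.group(3) or 1)
    if oxygens == 0:
        suffix = ""
    elif oxygens == 1:
        suffix = ";O"
    else:
        suffix = f";O{oxygens}"
    return f"{lipid_class} {carbons}:{dbe}{suffix}"


print(sum_composition("PC 16:0/20:4"))
print(sum_composition("Cer 18:1;O2/16:0"))
```

Summing carbons, DBE and oxygen counts over the chains reproduces the examples in the text, such as PC 16:0/20:4 reported as PC 36:4 at the precursor ion level.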
Some notes pertaining to different metabolite classes are outlined below.

RefMet classification:

Classification table

Amino acids:

The amino acids are listed as Glycine, Arginine, Tyrosine, etc. without an "L-" prefix and are linked to the structures of the predominant "L-" forms. D-amino acids, on the other hand, are explicitly listed as such (D-Arginine, D-Asparagine, etc.).

Dipeptides:

Dipeptides are listed by their 3-letter amino acid abbreviations, such as Lys-Arg and Asn-Leu, and are linked to the L-structures where applicable.

Sugars:

Monosaccharide sugars are generally listed without the "D/L-" prefix and linked to the structure of the most abundant enantiomer in nature. Thus, Glucose, Galactose and Fucose are linked to the structures of D-Glucose, D-Galactose and L-Fucose, respectively. Other enantiomers such as D-Fucose or L-Galactose are explicitly listed as such.

Sphingolipids:

In cases where N-acyl chain-containing sphingolipids such as ceramides and sphingomyelins are identified at the MS precursor ion level, the species-level ("sum-composition") abbreviation is used, such as Cer 34:1;O2 and SM 42:1;O2, where the number of hydroxyl groups in the entire molecule is represented by ';Ox'. The traditional nomenclature, in which a lower-case letter prefix (m(mono), d(di), t(tri)) denotes the number of hydroxyl groups (e.g. Cer d34:1), is retained as an alternative notation. Where MS/MS methods or other techniques were used to identify the nature of the N-acyl chain and/or sphingoid base, a nomenclature such as Cer 18:1;O2/16:0 and SM 18:0;O2/24:1 is used. In this case the ';Ox' portion refers to the number of hydroxyl groups in the sphingoid base only. An identified hydroxyl group in the N-acyl chain is represented as Cer 18:0/24:0;OH.
Cer: Ceramide
CerP: Ceramide-1-phosphate
SM: Sphingomyelin
GlcCer: Glucosyl ceramide
GalCer: Galactosyl ceramide
HexCer: Hexosyl ceramide (the nature of the hexose sugar could not be determined)
LacCer: Lactosyl ceramide
SHexCer: 3-sulfo-galactosyl ceramide
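
The correspondence between the traditional prefix notation and the ';Ox' notation described above can be sketched as a simple lookup. The function name to_oxygen_notation and the mapping table are hypothetical helpers, not part of RefMet, and the sketch covers species-level names only.

```python
import re

# Traditional prefixes and the hydroxyl counts they denote (m = mono, d = di, t = tri).
PREFIX_TO_OXYGENS = {"m": 1, "d": 2, "t": 3}


def to_oxygen_notation(name: str) -> str:
    """Convert a traditional name such as 'Cer d34:1' to 'Cer 34:1;O2'."""
    match = re.fullmatch(r"(\w+) ([mdt])(\d+:\d+)", name)
    if match is None:
        raise ValueError(f"not a traditional species-level name: {name!r}")
    lipid_class, prefix, chain = match.groups()
    count = PREFIX_TO_OXYGENS[prefix]
    # A single oxygen is written ";O"; higher counts carry the number.
    suffix = ";O" if count == 1 else f";O{count}"
    return f"{lipid_class} {chain}{suffix}"


print(to_oxygen_notation("Cer d34:1"))
```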

Glycerolipids:

Mono-, di- and tri-radylglycerols are designated by the MG, DG and TG prefixes, respectively. In cases where the nature of the chains has not been determined, "bulk" abbreviations such as TG 54:2 and DG 36:0 are used. Species containing an alkyl ether chain in place of an acyl chain are designated with an "O-" prefix, e.g. DG O-36:0. Glycerolipids whose chain constituents have been identified by MS/MS methods or other techniques are designated as TG 16:0_18:1_20:4 or DG 18:1_18:2, where the underscore "_" indicates that the sn position on the glycerol backbone is unknown. In cases where the chains are the same (e.g. TG 16:0/16:0/16:0), a forward slash "/" is used because there is no sn position ambiguity. In the case of diradylglycerols, the sn location of the chains (1,2-, 1,3- or 2,3-) is not assumed unless explicitly specified. Similarly, for the monoradylglycerols, an abbreviation such as MG 16:0 represents a general species covering sn1, sn2 and sn3 substitution. Triradylglycerols containing one alkyl chain (MonoEther-DIacylGlycerols) are listed as MeDAG 54:2, etc.
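
The choice between the "_" and "/" separators described above can be expressed as a small rule. The following is a minimal sketch: the helper chain_level_name is a hypothetical name, and it assumes the chains are supplied as already formatted strings in the order they should appear.

```python
def chain_level_name(lipid_class: str, chains: list[str]) -> str:
    """Join identified chains with '/' when all are identical, otherwise with '_'."""
    if not chains:
        raise ValueError("at least one chain is required")
    # Identical chains leave no sn position ambiguity, so "/" is used.
    separator = "/" if len(set(chains)) == 1 else "_"
    return f"{lipid_class} {separator.join(chains)}"


print(chain_level_name("TG", ["16:0", "18:1", "20:4"]))
print(chain_level_name("TG", ["16:0", "16:0", "16:0"]))
```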

Glycerophospholipids:

In cases where the nature of the chains has not been determined, "bulk" abbreviations such as PC 34:2 and PE 36:0 are used. Species containing an alkyl ether chain in place of an acyl chain are designated with an "O-" prefix, and species containing a (1Z) vinyl ether chain (i.e. plasmalogens) are designated with a "P-" prefix. Since a phospholipid with a plasmenyl group (e.g. PC P-32:0) is isobaric with an alkyl ether species containing a double bond at a chain position other than C1 (e.g. PC O-32:1), MS methods generally cannot distinguish between these isomers, and they are listed as PC P-32:0/PC O-32:1. Glycerophospholipids whose chain constituents have been identified by MS/MS methods or other techniques are designated as PC 16:0_20:4, where the underscore "_" indicates that the sn position on the glycerol backbone is unknown. In cases where the chains are the same (e.g. PE 18:0/18:0), a forward slash "/" is used because there is no sn position ambiguity. Lysophospholipids are preceded with an "L".
(L)PC: (Lyso)Glycerophosphocholines
(L)PE: (Lyso)Glycerophosphoethanolamines
(L)PS: (Lyso)Glycerophosphoserines
(L)PG: (Lyso)Glycerophosphoglycerols
(L)PI: (Lyso)Glycerophosphoinositols
(L)PA: (Lyso)Glycerophosphates
CL: Cardiolipins (Glycerophosphoglycerophosphoglycerols)
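
The isobaric relationship between "P-" and "O-" species described above can be illustrated as follows. This is a sketch under stated assumptions rather than RefMet's own logic: the function name ambiguous_ether_name is hypothetical, and only bulk names of the form "CLASS P-C:DB" or "CLASS O-C:DB" are handled.

```python
import re


def ambiguous_ether_name(name: str) -> str:
    """Return the combined listing for an isobaric P-/O- pair, e.g. 'PC P-32:0/PC O-32:1'."""
    match = re.fullmatch(r"(\w+) ([PO])-(\d+):(\d+)", name)
    if match is None:
        raise ValueError(f"not a bulk ether lipid name: {name!r}")
    lipid_class, prefix = match.group(1), match.group(2)
    carbons, double_bonds = int(match.group(3)), int(match.group(4))
    if prefix == "P":
        # The vinyl ether bond counts as one extra double bond in O- notation.
        p_bonds, o_bonds = double_bonds, double_bonds + 1
    elif double_bonds == 0:
        # A fully saturated alkyl ether has no plasmenyl counterpart.
        return name
    else:
        p_bonds, o_bonds = double_bonds - 1, double_bonds
    return f"{lipid_class} P-{carbons}:{p_bonds}/{lipid_class} O-{carbons}:{o_bonds}"


print(ambiguous_ether_name("PC P-32:0"))
```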

Fatty acids:

Fatty acids and esters are generally listed by their common names (Palmitic acid, Arachidonic acid, Lauryl oleate, Linoleyl arachidonate, etc.).
Acyl carnitines are listed with the "CAR" prefix, e.g. CAR 18:1.

Common names of saturated, straight-chain fatty acids up to C40

Common name              Systematic name          Abbreviation
Formic acid              Methanoic acid           FA 1:0
Acetic acid              Ethanoic acid            FA 2:0
Propionic acid           Propanoic acid           FA 3:0
Butyric acid             Butanoic acid            FA 4:0
Valeric acid             Pentanoic acid           FA 5:0
Caproic acid             Hexanoic acid            FA 6:0
Heptylic acid            Heptanoic acid           FA 7:0
Caprylic acid            Octanoic acid            FA 8:0
Pelargonic acid          Nonanoic acid            FA 9:0
Capric acid              Decanoic acid            FA 10:0
Undecylic acid           Undecanoic acid          FA 11:0
Lauric acid              Dodecanoic acid          FA 12:0
Tridecylic acid          Tridecanoic acid         FA 13:0
Myristic acid            Tetradecanoic acid       FA 14:0
Pentadecylic acid        Pentadecanoic acid       FA 15:0
Palmitic acid            Hexadecanoic acid        FA 16:0
Margaric acid            Heptadecanoic acid       FA 17:0
Stearic acid             Octadecanoic acid        FA 18:0
Nonadecylic acid         Nonadecanoic acid        FA 19:0
Arachidic acid           Eicosanoic acid          FA 20:0
Heneicosylic acid        Heneicosanoic acid       FA 21:0
Behenic acid             Docosanoic acid          FA 22:0
Tricosylic acid          Tricosanoic acid         FA 23:0
Lignoceric acid          Tetracosanoic acid       FA 24:0
Hyenic acid              Pentacosanoic acid       FA 25:0
Cerotic acid             Hexacosanoic acid        FA 26:0
Carboceric acid          Heptacosanoic acid       FA 27:0
Montanic acid            Octacosanoic acid        FA 28:0
Nonacosylic acid         Nonacosanoic acid        FA 29:0
Melissic acid            Triacontanoic acid       FA 30:0
Hentriacontylic acid     Hentriacontanoic acid    FA 31:0
Lacceroic acid           Dotriacontanoic acid     FA 32:0
Psyllic acid             Tritriacontanoic acid    FA 33:0
Gheddic acid             Tetratriacontanoic acid  FA 34:0
Ceroplastic acid         Pentatriacontanoic acid  FA 35:0
Hexatriacontylic acid    Hexatriacontanoic acid   FA 36:0
Heptatriacontanoic acid  Heptatriacontanoic acid  FA 37:0
Octatriacontanoic acid   Octatriacontanoic acid   FA 38:0
Nonatriacontanoic acid   Nonatriacontanoic acid   FA 39:0
Tetracontanoic acid      Tetracontanoic acid      FA 40:0