To facilitate collaboration and data sharing, it is essential that standard vocabularies and common data elements be used across studies. A nutritional ontology has been developed that is integrated into the Thesaurus of the National Cancer Institute (NCI). The project is part of the NCI Cancer Biomedical Informatics Grid (caBIG™) initiative to build an informatics infrastructure that connects data, research tools, scientists, and organizations. Several agencies provide useful information about dietary components. In particular, the US Department of Agriculture (USDA) has developed databases with information on specific dietary components and their composition in foods. The International Network of Food Data Systems (InFoods) of the United Nations Food and Agriculture Organization provides an administrative framework for the development of standards and guidelines for the collection, compilation, and reporting of food component data. Other resources include the NIH Office of Dietary Supplements, the NCI Nutritional Science Research Group, the International Union of Pure and Applied Chemistry, and general nutrition books. However, there was no unified set of definitions for dietary components with links to these sources. Our project developed a vocabulary that includes a list of commonly analyzed nutrients and other dietary components, with definitions, units, and interrelationships between items. In addition, properties such as the USDA nutrient number, the InFoods TAGNAME, and whether the component has a dietary recommendation were added. The ontology focuses on dietary components found in food and is structured under a hierarchical tree called “Bioactive Food Component”.
This tree includes macronutrients, such as dietary alcohol, carbohydrates, lipids, proteins, and all subcomponents; micronutrients, such as dietary minerals and vitamins; and other components, such as dietary fiber, lignans, non-starch polysaccharides, caffeine, flavonoids, heterocyclic aromatic amines, nitrosamine, isoflavonoids, energy, and ash. Several challenges were encountered. Definitions had to be developed for superconcepts, such as macronutrient, that would hold for all of their subconcepts. Additionally, dietary fiber is considered a carbohydrate by some but not all nutritionists, and there is no general consensus on which dietary components have antioxidant properties. Advice from the NCI Enterprise Vocabulary System (EVS) group and our advisory committee of nutritional researchers was essential in developing the nutritional ontology. The final vocabulary will serve as a resource in which researchers can unambiguously identify the dietary components under study and can search for dietary components that are included in a specific class, such as short-chain fatty acids. In addition, new research studies will be able to use the concepts (or NCI concept codes) to declare the definitions in use and thus help make the resulting study data more interoperable.
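The class-based search described above can be pictured with a minimal Python sketch. It is only an illustration, not the EVS implementation: the `Concept` and `Vocabulary` classes, the `build_example` helper, the concept names as spelled here, and the placement of short-chain fatty acids (with butyric acid as a member) under lipids are all assumptions made for the example. The property slots (for instance a USDA nutrient number or an InFoods TAGNAME) are left empty rather than filled with guessed values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """One dietary component in a hypothetical in-memory vocabulary."""
    name: str
    parent: Optional[str] = None
    # Hypothetical property slots (e.g. USDA nutrient number, InFoods
    # TAGNAME); deliberately left empty in this sketch.
    properties: Dict[str, str] = field(default_factory=dict)


class Vocabulary:
    """A tree of concepts with lookups by class (assumed structure)."""

    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self.children: Dict[str, List[str]] = {}

    def add(self, name: str, parent: Optional[str] = None) -> None:
        # A subconcept can only be attached to a concept that already exists.
        if parent is not None and parent not in self.concepts:
            raise KeyError(f"unknown parent: {parent}")
        self.concepts[name] = Concept(name, parent)
        self.children[name] = []
        if parent is not None:
            self.children[parent].append(name)

    def descendants(self, name: str) -> List[str]:
        """Return every subconcept of `name`, depth first."""
        found: List[str] = []
        for child in self.children[name]:
            found.append(child)
            found.extend(self.descendants(child))
        return found

    def ancestors(self, name: str) -> List[str]:
        """Return the superconcepts of `name`, nearest first."""
        path: List[str] = []
        parent = self.concepts[name].parent
        while parent is not None:
            path.append(parent)
            parent = self.concepts[parent].parent
        return path


def build_example() -> Vocabulary:
    """Build a small fragment of the tree; the layout is illustrative."""
    vocab = Vocabulary()
    vocab.add("Bioactive Food Component")
    vocab.add("Macronutrient", "Bioactive Food Component")
    vocab.add("Micronutrient", "Bioactive Food Component")
    vocab.add("Lipid", "Macronutrient")
    vocab.add("Protein", "Macronutrient")
    vocab.add("Short Chain Fatty Acid", "Lipid")  # assumed placement
    vocab.add("Butyric Acid", "Short Chain Fatty Acid")
    vocab.add("Dietary Mineral", "Micronutrient")
    vocab.add("Vitamin", "Micronutrient")
    return vocab


if __name__ == "__main__":
    vocab = build_example()
    print(vocab.descendants("Short Chain Fatty Acid"))
    print(vocab.ancestors("Butyric Acid"))
```

In this sketch, a definition attached to a superconcept such as "Macronutrient" has to hold for everything returned by `descendants("Macronutrient")`, which is one way to see why the superconcept definitions were difficult to write.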

[Proc Amer Assoc Cancer Res, Volume 47, 2006]