Chemical Senses Advance Access originally published online on July 19, 2006
Chemical Senses 2006 31(8):713-724; doi:10.1093/chemse/bjl013
Identification of Latent Variables in a Semantic Odor Profile Database Using Principal Component Analysis
Corporate Research, Modeling, and Simulations Department, Procter & Gamble Co., Miami Valley Innovation Center, 11810 East Miami River Road, Cincinnati, OH 45252, USA
Correspondence to be sent to: Manuel Zarzo, Corporate Research, Modeling and Simulations Department, Procter & Gamble Co., Miami Valley Innovation Center, 11810 East Miami River Road, Cincinnati, OH 45252, USA. e-mail: zarzo.mz{at}pg.com
Abstract
Many classifications of odors have been proposed, but none of them have yet gained wide acceptance. Odor sensation is usually described by means of odor character descriptors. If these semantic profiles are obtained for a large diversity of compounds, the resulting database can be considered representative of odor perception space. Few of these comprehensive databases are publicly available, being a valuable source of information for fragrance research. Their statistical analysis has revealed that the underlying structure of odor space is high dimensional and not governed by a few primary odors. In a new effort to study the underlying sensory dimensions of the multivariate olfactory perception space, we have applied principal component analysis to a database of 881 perfume materials with semantic profiles comprising 82 odor descriptors. The relationships identified between the descriptors are consistent with those reported in similar studies and have allowed their classification into 17 odor classes.
Key words: cluster, dimension, odor classification, odor descriptor, semantic profile
Introduction
Since the discovery of the large family of genes encoding putative olfactory receptors (ORs) (Buck and Axel 1991
Odor description
Smell is a sensation that is difficult to describe, measure, and predict, and hence, perfume research is still rather empirical. Attempting to provide certain standards in fragrance technology, perfumers have tried for many decades to develop an accurate description of odors. To characterize odor profiles, one option is to rate the smell similarity by direct comparison with a series of reference odorants (Schutz 1964; Yoshida 1975). This is an objective approach but becomes time consuming and impractical with a high number of references. In contrast, semantic methods allow for the rapid generation of data and consequently are the most commonly used procedures. They consist of assigning the words that come to mind when smelling a substance. Our odor memory "compares" the perceived sensation with those of other substances previously smelled. If there is a good match, one word can be enough, but usually several are necessary to describe how the smell resembles other common odors. These words are called odor character descriptors or notes. The most useful ones in fragrance chemistry are those generally understood and are usually associated with the source of that smell, making it easy for any observer to use after some training. An open or unrestricted description of a smell usually produces subjective characterizations such as "dry," "fresh," "powerful," "rich," "feminine," "natural," "tender," or "warm." Their use should be avoided since they reflect one person's opinion and are open to discussion. In order to generate certain consensus, panelists are usually requested to assign for a given odorant those objective descriptors that best apply from a fixed list (Harper et al. 1968; Moskowitz and Barbe 1977). Because the use of verbal odor descriptors requires observers to assign the same words in the same way, training and some experience are required. Although semantic methods have been considered significantly "noisy" because of interindividual differences in the interpretation of descriptors, the use of a panel provides an average odor profile that tends to stabilize if a large number of panelists are used (Dravnieks 1982). Moreover, although reference-odorant methods seem a priori more accurate, an experiment conducted with 49 panelists revealed that semantic methods were almost as reproducible as direct comparisons (Dravnieks et al. 1978).
Odor classification
Although for an inexperienced observer the large list of descriptors for odor profiling seems to reflect a high dimensionality of odor space, some of the descriptors are related, and the basic smell attributes are easily identified after some training. Understanding the different relationships, associations, or similarities between these notes is the basis to define more accurately the olfactory universe of perfumers. Additionally, it may provide some insight into the basis of olfaction.
The first scientific approach for odor classification was proposed by Linnaeus based on his deep experience in botany (Linnaeus 1756). This 7-category system was revised later, and 2 new classes were added (Zwaardemaker 1925). For many decades, researchers have proposed a relatively small number of odor classes or dimensions in odor space, ranging from 4 to 9 (Henning 1916; Lovell 1923; Crocker and Henderson 1927; Klein 1947; Amoore 1962; Schutz 1964). Conversely, other authors consider that there are likely to be >20 descriptive terms that are essential to cover the complete range of odor stimuli (Harper et al. 1968). A classification employing 45 groups has also been proposed (Cerbelaud 1951). Many other efforts have been conducted to develop a consensus for odor classification (Woskow 1968; Kastner 1973; Yoshida 1975; Schiffman 1981; Jaubert et al. 1986; Lawless 1988, 1993; Higuchi et al. 2004), but none of them have yet gained wide acceptance.
Odor profile databases
Over time, fragrance chemical companies have developed databases of thousands of odorants with their corresponding odor profiles. Unfortunately, most of these data are not available to the scientific community, seriously limiting efforts to develop an accurate characterization of odor space. Such information would be beneficial for providing a means for the development of new odorants. But despite many efforts in obtaining structure-odor relationships (SORs) that may guide a rational approach for odorant discovery (Rossiter 1996), this goal is still mainly achieved by trial and error.
The most comprehensive published databases of odor profiles are Arctander's handbook (Arctander 1969) and Fenaroli's handbook (Burdock 2004). From both sources, a set of 1396 pure substances was compiled and analyzed, leading to a descriptive model of olfactory perception space (Jaubert et al. 1986, 1987). Arctander's handbook contains the odor description of 3102 perfume and flavor chemicals. Considering that the perfumer's palette now consists of approximately 4000 raw materials, this is a valuable reference for perfumers and flavorists. But odors have been basically characterized by only one person (S Arctander), resulting in an arguable degree of personal subjectivity. Moreover, many chemical structure drawings are not accurate. In total, about 270 odor descriptors are used, many of them subjective. In a first attempt to identify associations among these descriptors, a reported study (Chastrette et al. 1986) selected 24 notes and analyzed them using principal component analysis (PCA) and ascending hierarchical taxonomy (AHT). In a further effort to analyze this database, 74 notes were selected for 2467 pure substances (Chastrette et al. 1988). After calculating the similarity for every pair of descriptors, 2 hierarchical agglomerative classification methods were applied to identify statistically significant associations. As a result, 60 notes were regrouped in 27 clusters, each containing 2–4 notes, and 14 remained as isolated notes. In another study of Arctander's handbook, 126 odor descriptors were selected for 1573 compounds, and a cluster analysis resulted in 19 clusters (Abe et al. 1990).
In order to assess the reproducibility of these results, a similar analysis was performed in a later study (Chastrette et al. 1991) of another database of 628 pure compounds compiled by SA Firmenich, La Plaine, Switzerland. Each product was described by a team of 7 perfumers, who assigned 2–4 notes chosen among 32 possible descriptors, and the 3 most frequent ones were considered as the odor profile. A similarity matrix was calculated as in the previous case (Chastrette et al. 1988) and was analyzed using 4 multivariate methods: nonlinear mapping, AHT, minimal spanning trees, and PCA. Several clusters of descriptors were obtained, consistent with the perfumers' point of view. Although this odor profile database is more representative of the olfactive universe of perfumery than that of Arctander, similar results were obtained. These studies confirmed that odor descriptors used in perfumery are generally rather independent, with no strict hierarchy among them, ruling out the existence of a small number of primary odors.
Another detailed database is the Atlas of odor character profiles (Dravnieks 1985) that contains the odor profile of 138 pure odorant chemicals. Data were collected from 120–140 panelists at 12 participant laboratories. A list of 146 commonly used descriptors was provided to the panelists, who smelled the sample and described its odor by rating the applicability of each descriptor on a numeric scale from 0 to 5. In a previous publication (Dravnieks 1982), these average profiles exhibited an impressive reliability. Because the number of chemicals in this database is not large enough for a proper characterization of odor space, additional compounds were profiled using the same descriptors with a panel of about 20 individuals (Jeltema and Southwick 1986), resulting in a compilation of 415 odorants. Further experiments indicated that the results from this reduced panel correlated well with those from the Dravnieks panel. This database was analyzed using factor analysis. The identification of descriptors that contributed the most to each factor allowed their classification into 17 groups of terms.
The "Sigma-Aldrich Fine Chemicals (SAFC) flavors and fragrances" catalog is another large database of semantic odor profiles. A recent work (Madany-Mamlouk et al. 2003; Madany-Mamlouk and Martinetz 2004) has compiled 278 descriptors from the 1996 edition of this catalog that comprised 851 perfume raw materials (PRMs). The application of multidimensional scaling (MDS) to this database revealed approximately 32 dimensions in the olfactory perception space, which agrees with the long-held belief that olfactory space is high dimensional. Afterward, a 2-dimensional self-organizing mapping was used to visualize the MDS results on a low-dimensional map. This map provides some sort of clustering for odor descriptors, but those that appear as neighbors might actually be very distant in the high-dimensional space. Given this drawback, a new statistical effort is described in this paper that was conducted to determine if a clearer classification of odor descriptors could be achieved and allow for a better understanding of the underlying structure in human odor perception.
Materials
The SAFC flavors and fragrances catalog 2003–2004 (Sigma-Aldrich 2003
The information was organized by collecting the descriptors assigned to each PRM. As reported in the analysis of similar databases (Chastrette et al. 1988
Methods
A descriptive analysis was conducted with 2 variables: number of descriptors assigned to a given PRM and number of PRMs assigned to a given odor descriptor. In order to identify associations between descriptors, the correlation coefficient for all possible pairs of descriptors was computed and those with higher values were identified. Each odor descriptor of the SAFC database is a dichotomic variable with 2 values (0 and 1), so that the sum corresponds to the number of PRMs labeled with that particular descriptor. If the entries of a given descriptor are randomly scrambled, it results in a new random descriptor with the same sum but uncorrelated with the original one. This procedure was applied to the 82 descriptors, resulting in a matrix of independent (orthogonal) variables. This new matrix of random descriptors presents the same size as the original one and was referred to as the "random matrix."
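The scrambling step can be illustrated with a minimal Python sketch. The toy columns and the function names (`scramble_descriptor`, `random_matrix`) are assumptions made for illustration and are not taken from the original analysis. Shuffling preserves each column sum, but it only makes the columns uncorrelated in expectation, so small sample correlations can remain in any single random matrix.

```python
import random

def scramble_descriptor(column, rng):
    """Return a shuffled copy of a 0/1 descriptor column (same sum, new order)."""
    shuffled = list(column)
    rng.shuffle(shuffled)
    return shuffled

def random_matrix(columns, seed=0):
    """Scramble every descriptor column independently of the others."""
    rng = random.Random(seed)
    return [scramble_descriptor(col, rng) for col in columns]

# Toy data (hypothetical): 3 descriptors over 8 PRMs, stored column-wise.
toy_columns = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
scrambled = random_matrix(toy_columns, seed=42)

# Occurrence counts are unchanged by the scrambling.
print([sum(col) for col in scrambled])
```

Fixing the seed makes the random matrix reproducible; a different seed gives another, equally valid random reference.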
Principal components (PCs) are directions of maximum data variance obtained as linear combinations of the original variables. The projections of observations (PRMs in this case) over these directions are called "scores," and the contributions of the variables (odor descriptors) in the formation of a given component are called "loadings." A scatter plot of the loadings corresponding to 2 different components is referred to as the "loading plot." PCA was used to evaluate both the randomized and original matrices. The comparison of these analyses allowed for the identification of those descriptors to be discarded because they represent too few PRMs to be useful. A new PCA was fitted using the remaining descriptors. An examination was made of the loading plots for the different components in order to explore the odor space of this database and to identify clusters of similar notes. Once a set of descriptors was identified that clearly defined a component or dimension of odor space, these variables were set aside and a new PCA was calculated in order to identify the next dominant latent structure. Using this cascaded PCA approach, a classification of descriptors was finally obtained. All PCAs were carried out using the software SIMCA-P 10.0 (http://www.umetrics.com). The data were centered and scaled to unit variance prior to analysis.
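As a rough illustration of scores and loadings, the sketch below extracts only the first component of a small autoscaled matrix. It uses power iteration on the correlation matrix, a technique swapped in here to keep the example dependency-free; the analyses in this paper were run in SIMCA-P, whose algorithm is not reproduced. The toy data and all function names are hypothetical.

```python
import math

def autoscale(columns):
    """Center each variable and scale it to unit variance (sample SD)."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled.append([(v - mean) / sd for v in col])
    return scaled

def first_component(columns, iterations=500):
    """Return (loadings, scores, eigenvalue) of PC1 via power iteration."""
    z = autoscale(columns)
    p, n = len(z), len(z[0])
    # Correlation matrix of the autoscaled variables.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w_new = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w_new))
        w = [v / norm for v in w_new]
    # Rayleigh quotient: variance captured along the direction w.
    eigenvalue = sum(w[i] * sum(corr[i][j] * w[j] for j in range(p))
                     for i in range(p))
    # Scores: projection of each observation (PRM) onto the loading vector.
    scores = [sum(w[k] * z[k][obs] for k in range(p)) for obs in range(n)]
    return w, scores, eigenvalue

# Toy data (hypothetical): 3 descriptors over 8 PRMs, stored column-wise.
toy_columns = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 0],
]
loadings, scores, eigenvalue = first_component(toy_columns)
```

In this toy case the two overlapping descriptors receive loadings of the same sign and the third one the opposite sign, which is the kind of pattern a loading plot makes visible.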
Results and discussion
Descriptive analysis of the SAFC database
The organoleptic properties section of the SAFC catalog presents the compounds under the 82 odor character categories. Another section lists the compounds alphabetically, and most of them contain an unrestricted odor description with a few words chosen from a larger list; many of them (like the unpleasant notes) are not included in the set of 82 descriptors. This additional information has been used in a reported analysis of the 1996 edition of this catalog (Madany-Mamlouk et al. 2003; Madany-Mamlouk and Martinetz 2004), but because this odor profile is not available for all compounds, we have used exclusively the information under the organoleptic properties section.
The number of odor descriptors assigned for a given PRM ranges from 1 to 9, with an average value of 2.2. The occurrences of a given descriptor (number of PRMs labeled with that descriptor) range from 1 to 141, with an average of 24 (Figure 1). In comparison, the Arctander database contains 233 descriptors that provide the relevant olfactory information. Each note was cited an average of 29 times, and the average number of words used to describe the odor of a particular compound was 2.7 (Chastrette et al. 1988). Thus, these characteristics are similar in the SAFC database.
Identification of associated odor descriptors
The linear correlation coefficient (r) was calculated for all possible pairs of descriptors, except those with an occurrence of <6 PRMs (too few to provide relevant information). This coefficient is widely used to study the correlation between 2 continuous variables but not often for dichotomic ones, as in this case. It is easy to interpret: r = 1 if 2 descriptors are identical, and r approaches 0 if there is no similarity or correlation. Thus, r can be used as a measure of similarity. The highest 60 values out of the 3321 possible pairs are shown in Table 1. In all cases, the correlation is statistically significant (P value < 0.002). The similarity detected between most of these odor character descriptors was intuitively appealing, and this information will be used later to discuss the PCA results.
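A minimal sketch of this pairwise calculation follows, assuming the data are held as a dictionary of 0/1 lists. For dichotomic variables the Pearson r reduces to the phi coefficient, which is the formula used here. The descriptor columns are toy data, the function names are hypothetical, and the occurrence cutoff is lowered to 2 so that the tiny example keeps some descriptors. The count of 3321 is simply the number of unordered pairs among 82 descriptors.

```python
import math
from itertools import combinations

def pearson_binary(x, y):
    """Pearson r between two 0/1 vectors (equivalent to the phi coefficient)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    # Assumes neither vector is constant; otherwise the denominator is zero.
    denom = math.sqrt(sx * (n - sx) * sy * (n - sy))
    return (n * sxy - sx * sy) / denom

def ranked_pairs(descriptors, min_occurrences=6):
    """Rank descriptor pairs by r, skipping descriptors with too few PRMs."""
    kept = {name: col for name, col in descriptors.items()
            if sum(col) >= min_occurrences}
    pairs = [(pearson_binary(kept[a], kept[b]), a, b)
             for a, b in combinations(sorted(kept), 2)]
    return sorted(pairs, reverse=True)

# Number of unordered pairs among 82 descriptors.
n_pairs_all = math.comb(82, 2)

# Toy data (hypothetical): 4 descriptors over 8 PRMs.
toy = {
    "pineapple": [1, 1, 1, 0, 0, 0, 0, 0],
    "banana":    [1, 1, 0, 0, 0, 0, 0, 0],
    "woody":     [0, 0, 0, 0, 0, 1, 1, 1],
    "rare":      [0, 0, 0, 1, 0, 0, 0, 0],
}
top_pairs = ranked_pairs(toy, min_occurrences=2)
print(n_pairs_all, top_pairs[0])
```

Sorting the tuples in descending order puts the most similar pair first, which mirrors how a table of the highest correlations would be assembled.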
Other workers have reported using the product of the original dichotomic matrix and its transpose (XXᵀ) to generate an occurrence/co-occurrence matrix where the diagonal terms are the occurrences of notes and nondiagonal terms represent the co-occurrences between notes (Chastrette et al. 1986
Identification of descriptors that do not provide relevant information
In the analysis of the Arctander database (Chastrette et al. 1988), 2467 compounds were selected and combined descriptors were used to replace pairs of very similar notes (amber/ambergris, citrus/lemon, and raspberry/berry). Next, 37 descriptors were eliminated for being considered either rather subjective or related with intensity, such as "dry," "fresh," "strong," "weak," "warm," or "deep." Lastly, odor descriptors with fewer than 12 occurrences were discarded (156 in total). In our case, we skipped both of these steps in order to let the analysis identify notes with high similarity. If we were to perform the same analysis for the SAFC database of 881 PRMs, we would need to discard descriptors yielding 4.3 or fewer occurrences (maintaining the proportion: 4.3 = 12 × 881/2467).
To check if this criterion is adequate, a PCA was conducted with the random matrix, obtained by randomly scrambling the entries of the original matrix, as described above. This analysis showed that the descriptors "soapy," "mossy," "pepper," "lime," and "gardenia" (with 4, 4, 2, 1, and 1 occurrences, respectively) were forcing the first 2 PCs. Discarding the 20 descriptors with <5 occurrences and repeating the PCA, those with 5 or 6 occurrences did not force components. This criterion coincides with the one previously observed.
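The proportional cutoff and the resulting filter can be checked with a short sketch. The occurrence counts below are the few quoted in the text ("sour" 5, "winelike" 25, "fruity-other" 131, plus the five rare notes); the function names are hypothetical and the dictionary is only a small excerpt, not the full list of 82 descriptors.

```python
def scaled_threshold(ref_threshold, ref_size, new_size):
    """Scale an occurrence cutoff in proportion to database size."""
    return ref_threshold * new_size / ref_size

def split_by_occurrence(occurrences, min_count):
    """Separate descriptors kept (>= min_count) from those discarded."""
    kept = {n: c for n, c in occurrences.items() if c >= min_count}
    dropped = {n: c for n, c in occurrences.items() if c < min_count}
    return kept, dropped

# 12 occurrences in 2467 compounds, rescaled to 881 PRMs.
threshold = scaled_threshold(12, 2467, 881)

# Excerpt of occurrence counts quoted in the text.
occurrences = {
    "soapy": 4, "mossy": 4, "pepper": 2, "lime": 1, "gardenia": 1,
    "sour": 5, "winelike": 25, "fruity-other": 131,
}
kept, dropped = split_by_occurrence(occurrences, min_count=5)
print(round(threshold, 1), sorted(kept), sorted(dropped))
```

Because occurrences are integers, a cutoff of about 4.3 and a rule of "at least 5 occurrences" select the same descriptors.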
For this PCA with the random matrix and for an equivalent PCA with the original matrix (62 descriptors), the eigenvalues and percentage of data variance explained by each component (goodness of fit) have been compared (Figure 2). A weak correlation structure is clearly observed in the dichotomic matrix, with each component explaining just slightly more of the variance than the random case. This suggests that odor descriptors are quite independent and the correlation between most descriptors is too weak to define a PC.
Regarding the data pretreatment most convenient for PCA, 3 choices are possible: centered variables, scaled to unit variance, or both. In this case, the average and variance of an odor descriptor increase according to the number of occurrences. If a PCA is fitted with the original data (no pretreatment), the components are forced by the descriptors appearing most frequently. However, this observation is not related to the similarities between descriptors. For this reason, all PCA models have been fitted with data centered and scaled to unit variance.
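The dependence of mean and variance on occurrence count follows from the 0/1 coding: a descriptor assigned to a fraction p of the PRMs has mean p and (population) variance p(1 − p), which grows with p as long as p < 0.5. The sketch below evaluates this for a few occurrence counts within the range reported for the SAFC database; the helper name is hypothetical.

```python
def binary_mean_variance(occurrences, n_prms):
    """Mean and population variance of a 0/1 descriptor with given occurrences."""
    p = occurrences / n_prms
    return p, p * (1.0 - p)

N_PRMS = 881
counts = [1, 5, 24, 141]  # minimum, cutoff, average, and maximum occurrences
variances = [binary_mean_variance(c, N_PRMS)[1] for c in counts]

for count, var in zip(counts, variances):
    print(f"{count:4d} occurrences -> variance {var:.5f}")
```

Without scaling, the most frequent descriptor carries over 100 times the variance of the rarest one, which is why unscaled components would be driven by frequency rather than by similarity.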
Identification of clusters: noncitrus fruity
Conducting a PCA with the 62 descriptors assigned to at least 5 PRMs and checking the score plot for PC1 and PC2 (projection of PRMs over the 2 PCs), 2 orthogonal directions of variability appear (Figure 3). The corresponding loading plot reveals that PC1 corresponds to the noncitrus fruity descriptors. Thus, the strongest dimension of the SAFC database is defined by the fruity odorants. The highest loadings in absolute value along this direction correspond to the descriptors that best characterize the fruity odor, and the most representative is "apricot." This is the fruity attribute most frequent in Table 1. Highest loadings and proximity in the loading plot correspond to correlated descriptors that are used by panelists interchangeably. This is the case for "pineapple–banana," the third pair with the highest correlation (Table 1). The fact that the notes "raspberry," "strawberry," "grape," and "melon" are closer to the center might indicate that these notes are less characteristically fruity. However, this conclusion is uncertain, given that the number of PRMs labeled with these descriptors is lower compared with the rest of descriptors in the fruity cluster (Table 2). "Coconut" appears in the SAFC catalog under the fruity category, but according to Figure 3, it is the only noncitrus fruit excluded from the cluster. Other authors have considered that coconut odor is related with nuts (Jeltema and Southwick 1986; Chastrette et al. 1988; Madany-Mamlouk et al. 2003). However, the highest correlation of this descriptor corresponds to creamy and peach (Table 1). Thus, it was classified as intermediate between fruity and butter (Table 2).
Strikingly, the descriptor "fruity-other" is far separated from the fruity cluster. This note is assigned to 131 PRMs, and only 4 of them are also labeled with another noncitrus fruity note. Thus, this descriptor was used only when the PRM smells fruity, but not like any one in particular, which appears as a negative correlation in the PCA: if "fruity-other" = 1, then the rest of fruity notes are more likely to be 0. As a consequence, the PCA reflects no similarity between "fruity-other" and the rest of fruity descriptors, but obviously, this note should be included in the noncitrus fruity cluster. "Ethereal" is correlated with "fruity-other" (Table 1) and appears close to the fruity cluster. Actually, ethereal and fruity are close odors, according to the experience of perfumers (Chastrette et al. 1988
Different studies have reported that when a wide range of odors are sampled, a common result is a configuration of odor space with one underlying hedonic dimension, related with the pleasant–unpleasant degree of odor perception (Woskow 1968; Davis 1979). In this case, the hedonic dimension does not appear clearly, probably because the database is not representative of odor perception space, being biased toward those odors most frequent in perfumery that are in general rather pleasant. However, the loading plot PC1–2 (Figure 3) suggests that the second component might be related with the hedonic dimension. So, if the descriptors are orthogonally projected over the dashed line, the most pleasant notes tend to be located in one extreme, whereas the most unpleasant seem to appear on the other extreme (dashed cluster).
Identification of clusters: butter and alliaceous
Because noncitrus fruity descriptors are clearly grouped, these notes used to form the cluster are set aside in order to identify which other descriptors define a clear direction of variability. The analysis was conducted using 48 descriptors remaining after the removal of the noncitrus fruity descriptors: "apricot," "pineapple," "apple," "plum," "cherry," "banana," "pear," "peach," "berry," "strawberry," "raspberry," "grape," "melon," and "fruity-other."
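The bookkeeping of this first cascade step can be sketched as follows. The 14 cluster names are those listed above; the other 48 descriptor names are placeholders, and the helper `set_aside` is hypothetical. Only the removal of variables is shown; refitting the PCA on the remaining columns is a separate step.

```python
# Noncitrus fruity cluster identified in the first PCA (names from the text).
NONCITRUS_FRUITY = [
    "apricot", "pineapple", "apple", "plum", "cherry", "banana", "pear",
    "peach", "berry", "strawberry", "raspberry", "grape", "melon",
    "fruity-other",
]

def set_aside(descriptors, cluster):
    """Return the descriptors left after removing an identified cluster."""
    removed = set(cluster)
    return [d for d in descriptors if d not in removed]

# Placeholder names stand in for the 48 descriptors outside the cluster.
placeholders = [f"descriptor_{i}" for i in range(48)]
all_62 = NONCITRUS_FRUITY + placeholders

remaining = set_aside(all_62, NONCITRUS_FRUITY)
print(len(all_62), "->", len(remaining))
```

Keeping the result as an ordered list preserves the column order, so the same names can be used to subset the data matrix before the next PCA.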
The loading plot PC1–2 of the previous model (Figure 3) reveals that the components are rotated. Moreover, the loadings of PC2 are scattered, with no clear distinction of clusters. This situation is common in PCA. Consequently, instead of using automatic procedures for cluster identification, it is preferable to rely on visual inspection methods, checking loading plots with different combinations of components in order to find which plot reveals some descriptors clustered together and clearly separated from the rest. The loading plots PC1–2 and PC3–4 usually provide the most relevant information, given that the first components account for the highest data variability. But this is not necessarily the case in this PCA with 48 descriptors because there are no clear dominant components (the different PCs explain a similar amount of the total data variance). The loading plot for PC3–5 (Figure 4) reveals that PC3 is dominated by the notes "butter," "cheese," and "creamy." A significant similarity between "buttery" and "creamy" was also identified in the Arctander database (Chastrette et al. 1988). In the other reported study of the SAFC catalog (Madany-Mamlouk et al. 2003), a cluster was formed with the notes "butter," "creamy," and "milk." "Oily" has been classified as an intermediate odor between fatty and butter. "Coconut" is also close to this cluster, revealing the buttery smell of this note. The presence of "vanilla" and "caramel" not far from this cluster reveals that butter, creamy, and cheese are pleasant smells with a certain similarity to balsamic notes.
The descriptors "alliaceous" (garlic, onion smell) and "sulfurous" form an independent dimension, revealed by PC5. Other authors have also proposed this cluster as an odor class (Zwaardemaker 1925
Identification of clusters: balsamic, nutty, and camphoraceous
As before, the next analysis was conducted using the odor descriptors not already accounted for in previous clusters. Thus, a new PCA was fitted using the 43 descriptors remaining after discarding from the previous model the descriptors "butter," "cheese," "creamy," "alliaceous," and "sulfurous." The loading plot PC1–2 (Figure 5) reveals that the second component is related with balsamic descriptors. The SAFC catalog considers as balsamic notes: "vanilla," "sweet," "honey," "cinnamon," "chocolate," "caramel," "balsam," and "anise." Most of them correspond to the highest loadings in PC2, suggesting that they are related odors, but they do not form a compact cluster. Thus, their classification is discussed in the next model.
In this PCA with 43 descriptors, the first component is related to notes with a rather different smell: "woody," "meaty," "coffee," "smoky," and "nutty." Checking different loading plots (PC1–2, PC1–3, PC1–4, PC2–3, etc.), the one for PC1–4 (Figure 5) reveals that the nutty descriptors are located close to each other, separated from "meaty," "coffee," and "smoky." This cluster appears close to "cinnamon," "earthy," and "woody," 3 notes with a certain similarity to nutty (Table 1). The descriptor "nutty-other" is close to "hazelnut," "walnut," and "almond" in the loading plot PC1–2, but this proximity is not reflected in the loading plot PC1–4. The reason for this is likely to be similar to the observation regarding the "fruity-other" descriptor in the case of the noncitrus fruity cluster. Other authors have also proposed an independent nutty category (Jeltema and Southwick 1986
In the loading plot for PC3–5, a cluster can be clearly observed that comprises "minty," "camphoraceous," and "medicinal," and the score plot indicates a set of PRMs following this direction. This cluster has also been proposed in other studies (Jeltema and Southwick 1986). In the analysis of the Firmenich database (Chastrette et al. 1991), "minty" appeared related to "hay," whereas "camphoraceous" was similar to "piney" and not distant from "woody." A close look at this loading plot reveals that although the 3 clustered descriptors are separated from the rest, "minty–herbaceous" are located not too far away, and the same occurs for "camphoraceous–woody" and "medicinal–chemical." Other works have also found a similarity between "camphor" and "minty" (Chastrette et al. 1988; Madany-Mamlouk et al. 2003) but not so with "medicinal," reported to be found similar to "chemical," "etherish" (Jeltema and Southwick 1986), and "phenolic" (Chastrette et al. 1988; Abe et al. 1990).
From the previous model with 43 variables, a new PCA was conducted by first eliminating the notes "minty," "camphoraceous," "almond," "walnut," "hazelnut," and "nutty-other." "Medicinal" was also included to check other possible similarities. The loading plot PC1–2 (Figure 6) reveals that the notes in the upper part of the plot correspond to "spicy" and the 8 descriptors classified as balsamic in the SAFC catalog except "honey" and "anise." These results complement the previous PCA (Figure 5), highlighting that balsamic notes form an independent dimension in odor space.
"Honey" and "rose" are close in the loading plots, and they present the highest correlation of all pairs of descriptors (Table 1). This high similarity has also been pointed out in other studies (Chastrette et al. 1988
The "spicy" descriptor seems controversial. A significant similarity between "spicy," "herbaceous," and "aromatic" was identified in the Arctander database (Chastrette et al. 1988). In a reported study using the Dravnieks descriptors, "spicy" was included with the cinnamon group but not with "vanilla," "chocolate," "caramel," or "honey" (Jeltema and Southwick 1986). In the SAFC catalog, "spicy" is not included within the balsamic category. But according to our results, the correlation coefficient for "spicy–cinnamon" is the second highest of all pairs of descriptors (Table 1). Figures 5 and 6 show that "spicy" is strongly associated with the balsamic notes. This result agrees with the reported analysis of the Firmenich database (Chastrette et al. 1991), where a significant similarity between "spicy" and "balsamic" was found.
Another descriptor with a troublesome classification is "sweet." According to our results, "sweet" can clearly be considered within the balsamic cluster but remains close to the "floral-other" note (Figures 5 and 6) because the highest correlation of "sweet" corresponds to "floral-other" and "almond" (Table 1). A reported analysis of the SAFC catalog (Madany-Mamlouk et al. 2003) also showed the "sweet" classifier to be related to the descriptors "pleasant" and "spicy." But other authors (Jeltema and Southwick 1986) classify this note with the noncitrus fruits, not with other balsamics. In this analysis, we have decided to classify it as intermediate between balsamic and floral.
"Anise" is listed under the balsamic category in the SAFC catalog, but in Figures 5 and 6, it is the balsamic note closest to the center of the loading plot and hence can be considered as the least balsamic. Other authors have also classified "anise" in a group different from other balsamic notes (Zwaardemaker 1925; Jeltema and Southwick 1986).
Identification of cluster: cooked
Discarding from the previous model "spicy," "coconut," and the 8 balsamic descriptors and conducting a new PCA with the remaining 27 variables, a cluster was identified in the loading plot PC1–2. The new cluster comprises the notes "woody," "meaty," "coffee," and "smoky" (figure not shown). This cluster appears more clearly with PC1–5 (Figure 7) and is also apparent in Figure 6.
The relationship "coffee–smoky" is intuitively appealing. Checking the correlation with the rest of descriptors (Table 1), it appears that this cluster is related with the nutty and the alliaceous odors. Different works have found similarities between "woody" and other descriptors: "cognac" (Jeltema and Southwick 1986
Identification of clusters: floral, citrus, and green
As before, a new analysis was conducted by discarding these 4 descriptors ("woody," "meaty," "coffee," and "smoky") and fitting a new PCA using the remaining 23 variables. The first component of the new model is dominated by "waxy" and "fatty-other." The descriptors shown in the loading plot PC2–3 (Figure 8) above the dashed line correspond to the categories floral, citrus, and green. The proximity between "floral," "rose," "citrus," "herbaceous," and "green" was also found in other studies (Chastrette et al. 1991). "Rose" presents the highest negative loadings in PC2 and hence is the most representative of floral notes. Regarding citrus fruits ("orange" and "lemon"), the results (Figures 3 and 8) show that their smell resembles more the floral notes than the rest of noncitrus fruits. Consequently, they have been classified in an independent cluster (Table 2), as reported in other cases (Jeltema and Southwick 1986). In the analysis of the Firmenich database, citrus was found to be more similar to herbaceous and floral than to fruity. The descriptor "citrus-other" appears separated for the same reason as "fruity-other" and "nutty-other."
"Green" refers to the smell of fresh-cut grass, whereas "vegetable" refers to fresh vegetables like green pepper, cucumber, or green beans. Although both appear as independent odor categories in the SAFC catalog, our results (Table 1, Figure 8) reveal that they are related odors, and we grouped them as a single odor class (Table 2). Because vegetables can be described botanically as herbaceous plants (nonwoody annual), one would expect to observe more similarity between the descriptors "green," "vegetable," and "herbaceous." But the term "herbs" is commonly assigned to plants or plant parts used for medicinal, flavoring, or aromatic purposes. In the loading plot PC2–3 (Figure 8), "herbaceous" is clearly separated from "green" and "vegetable" but closer to floral, and a similar result has been reported in other studies (Madany-Mamlouk et al. 2003
Identification of the remaining clusters
At this point, discarding the floral-citrus-green notes would leave only 11 descriptors. This number is too small, and the results do not highlight clear similarities between descriptors. So, the remaining notes have been classified according to information from the literature. Some authors have considered that "musty," "earthy," and "moldy" can be included within the green-vegetable category (Jeltema and Southwick 1986). But according to Figure 8, these odors appear clearly different, and a cluster referred to as "wet" has been formed with these descriptors. Indeed, one of the proposed odor classification systems (Klein 1947) considers earthy-fungoid as one of its 8 odor classes.
In a reported study, solvent-related descriptors were considered as an independent class, comprising notes like "chemical," "etherish," and "medicinal" (Jeltema and Southwick 1986). In a similar way, we have created a cluster with the "chemical" note; "medicinal" has been considered intermediate between camphoraceous and chemical, and "ethereal" between fruity and chemical. The classification of "winelike" is uncertain. There are 25 PRMs labeled with this descriptor and also with other notes like "fruity" (14), "sweet" (5), "green" (5), "fatty" (5), or "ethereal" (4). The most reasonable classification is fruity-alcoholic, and consequently, we have regarded it as intermediate between fruity and chemical (Table 2).
"Fatty-other" has been considered as part of a fatty cluster, classifying "oily" as intermediate between fatty and butter. The "sour" note appears in the SAFC catalog under the fatty category, and some authors have classified "sour" with other descriptors like "oily," "fatty," and "cheese" (Jeltema and Southwick 1986). But there are only 5 PRMs labeled as sour, and none of them is additionally described with any other fatty-related descriptor. On the contrary, 2 of them are also labeled as fruity and 1 as orange, which makes sense given that green fruits are perceived as sour. Thus, we decided to classify "sour" as intermediate between fatty and fruity. A reported study of the Arctander database classified "waxy" in the fatty cluster (Abe et al. 1990). In the SAFC catalog, there are 28 PRMs described as waxy, and most of them are also labeled with different descriptors: "fatty-other" (7), "fruity" (8), "floral" (7), "sweet" (6), "citrus-other" (6), etc. These are rather different odors, and consequently, we decided to classify "waxy" as an independent odor class, following the criteria of the SAFC catalog.
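Co-occurrence counts such as those quoted for "winelike," "sour," and "waxy" amount to tallying, for every material carrying a target descriptor, which other descriptors it also carries. The following is a minimal sketch of that tally; the materials and their labels are hypothetical and are not taken from the SAFC catalog.

```python
from collections import Counter

def co_occurrence(profiles, target):
    """Count how often each other descriptor appears on materials labeled `target`.

    profiles maps a material name to its set of odor descriptors.
    Returns (number of materials carrying target, Counter of companion descriptors).
    """
    counts = Counter()
    n_target = 0
    for labels in profiles.values():
        if target in labels:
            n_target += 1
            counts.update(label for label in set(labels) if label != target)
    return n_target, counts

# Hypothetical semantic profiles (illustrative only)
profiles = {
    "material_A": {"winelike", "fruity", "sweet"},
    "material_B": {"winelike", "fruity"},
    "material_C": {"winelike", "ethereal"},
    "material_D": {"fruity", "green"},
}

n, counts = co_occurrence(profiles, "winelike")
for descriptor, k in counts.most_common():
    print(f'"{descriptor}" ({k})')
```

When the target descriptor occurs on only a handful of materials, as with "sour," these counts are a more transparent basis for classification than a multivariate model.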
Although some works have classified the "animal" note with other very different odors like "putrid," "fecal," or "oily" (Jeltema and Southwick 1986), it is usually considered associated with "musk" (Chastrette et al. 1988, 1991). In perfumery, musk defines an independent odor category, and amber and musk have long been considered as "ambrosiac" (Zwaardemaker 1925). So, because "musk" does not appear explicitly in the SAFC database, we have considered "animal" as an isolated note (Table 2). This is also justified by the fact that "animal" has a low correlation with the rest of the descriptors, appearing nearly in the last place in Table 1.
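One way to make the "low correlation with the rest of the descriptors" argument concrete is to rank descriptors by their strongest absolute correlation with any other descriptor, so that isolated notes fall to the bottom of the list. The sketch below assumes that ranking criterion for illustration only; the criterion actually used to order Table 1 is not restated here, and the small 0/1 matrix is invented.

```python
import numpy as np

def rank_by_max_correlation(X, names):
    """Rank descriptors by their largest absolute Pearson correlation with any other.

    X is a (materials x descriptors) 0/1 matrix. Descriptors near the end of the
    returned list are only weakly related to everything else.
    """
    R = np.corrcoef(X, rowvar=False)   # descriptor-by-descriptor correlations
    np.fill_diagonal(R, 0.0)           # ignore each descriptor's self-correlation
    strongest = np.abs(R).max(axis=1)
    order = np.argsort(-strongest)     # descending by strongest association
    return [(names[i], float(strongest[i])) for i in order]

# Invented profiles: "fatty" and "oily" mostly co-occur, "animal" does not follow them
names = ["fatty", "oily", "animal"]
X = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
])

ranking = rank_by_max_correlation(X, names)
for name, r in ranking:
    print(f"{name:8s} max |r| = {r:.3f}")
```

In this toy example "animal" lands last, mirroring the reasoning used to treat it as an isolated note.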
Classification of odor descriptors
Regarding the 20 descriptors with an occurrence <5 that were not included in the multivariate analysis, the classification of 5 of them remains uncertain ("jam," "grapefruit," "clove," "pepper," and "soapy"), and the rest were classified with those descriptors that share a similar odor source, according to the criteria of the SAFC catalog: floral ("lily," "gardenia," "blossom," "carnation," "lilac," "narcissus," "marigold," "jonquil," and "iris"), herbaceous ("sage" and "caraway"), fruity ("quince"), citrus ("lime"), and nutty ("peanut").
With the information gathered from all PCAs, 74 odor descriptors have been regrouped in 14 odor classes (some of them as intermediate between 2 categories), and 3 descriptors were considered as independent odors (waxy, woody, and animal), as shown in Table 2. This analysis of odor perception space is restricted by the available descriptors in the SAFC database. Obviously, other descriptors not explicitly included may also form additional odor dimensions.
| Conclusions |
|---|
|
|
|---|
The results of our statistical analysis of the SAFC database appear to be consistent with the long-held theory that odor space is highly multidimensional. The results suggest that it is reasonable to classify odor descriptors in >9 classes, contrary to many odor classification systems proposed several decades ago and in accordance with more recent statistical analyses of odor profile databases. Understanding the similarities between descriptors and their classification will be helpful in training sensory panels for odor profiling and in providing a standard means of communication among perfumers. Moreover, this information is also of interest in SOR studies, which can be of considerable value in elucidating the mechanisms of olfaction. These studies are usually focused on particular descriptors, but a different approach would be to use the latent variables of odor space. Specific SORs for "apricot" or "apple" might be difficult to obtain, but given that all fruity descriptors are related odors, as revealed in this study, it is more reasonable to start with SORs for the whole fruity category and, once the molecular features responsible for this odor class are identified, proceed afterward with a particular fruity odor. Similarly, it seems reasonable to derive SORs for the combined sulfurous-alliaceous category and then try to discriminate between the two odors.
| Acknowledgements |
|---|
|
|
|---|
M.Z. is grateful for a postdoctoral grant jointly sponsored by the Fulbright Program and the Spanish Ministry of Education and Science-State Secretariat of Universities and Research. We thank S. Teremi for the data assembly and B. Murch for valuable discussion and comments.
| References |
|---|
|
|
|---|
Abe H, Kanaya S, Komukai T, Takahashi Y, Sasaki S. (1990) Systematization of semantic descriptions of odors. Anal Chim Acta 239:73-85.
Amoore JE. (1962) The stereochemical theory of olfaction. I. Identification of seven primary odors. Proc Sci Sect Toilet Goods Assoc 37:(Suppl) pp. 1-13.
Arctander S. (1969) Perfume and flavor chemicals (aroma chemicals). (S. Arctander publisher, Montclair, NJ) Volumes 1 and 2.
Buck L and Axel R. (1991) A novel multigene family may encode odorant receptors: a molecular basis for odor recognition. Cell 65:175-87.
Burdock GA (Ed.). (2004) Fenaroli's handbook of flavor ingredients. 5th ed (CRC Press, Boca Raton, FL).
Cerbelaud R. (1951) Formulaire de Parfumerie. (Opera, Paris).
Chastrette M, de Saint Laumer JY, Sauvegrain P. (1991) Analysis of a system of description of odors by means of four different multivariate statistical methods. Chem Senses 16:81-93.
Chastrette M, Elmouaffek A, Sauvegrain P. (1988) A multidimensional statistical study of similarities between 74 notes used in perfumery. Chem Senses 13:295-305.
Chastrette M, Elmouaffek A, Zakarya D. (1986) Etude statistique multidimensionnelle des similarités entre 24 notes utilisées en parfumerie. C R Acad Sci Ser II Paris 303:1209-14.
Crocker EC and Henderson LF. (1927) Analysis and classification of odors: an effort to develop a workable method. Am Perf Essent Oil Rev 22:325-56.
Davis RG. (1979) Olfactory perceptual space models compared by quantitative methods. Chem Senses 4:21-33.
Dravnieks A. (1982) Odor quality: semantically generated multidimensional profiles are stable. Science 218:799-801.
Dravnieks A. (1985) Atlas of odor character profiles, data series DS 61. (American Society for Testing and Materials, Philadelphia PA).
Dravnieks A, Bock FC, Powers JJ, Tibbetts M, Ford M. (1978) Comparison of odors directly and through profiling. Chem Senses 3:191-225.
Harper R, Bate Smith EC, Land DG. (1968) Odor description and odor classification: a multidisciplinary examination. (Elsevier, New York).
Henning H. (1916) Der Geruch. (Barth, Leipzig, Germany).
Higuchi T, Shoji K, Hatayama T. (2004) Multidimensional scaling of fragrances: a comparison between the verbal and non-verbal methods of classifying fragrances. Jpn Psychol Res 46:10-9.
Jaubert JN, Gordon G, Doré JC. (1986) Classification of odors and their sensorial perception. Quintessenza 5:27-42.
Jaubert JN, Gordon G, Doré JC. (1987) Une organisation du champ des odeurs. II. Modèle descriptif de l'organisation de l'espace odorant. Parfum Cosmét Arômes 78:71-82.
Jeltema MA and Southwick EW. (1986) Evaluations and application of odor profiling. J Sens Stud 1:123-36.
Kastner D. (1973) Die Beschreibung und klassifizierung von gerüchen. Parfüm Kosmet 54:97-106.
Klein S. (1947) Primary odor element classification. Am Perf Essent Oil Rev 50:453-4.
Lawless HT. (1988) Odor description and odor classification revisited. In Thomson DMH (Ed.). Food acceptability (Elsevier, London) pp. 27-40.
Lawless HT. (1993) Characterization of odor quality through sorting and multidimensional scaling. In Manley CH and Ho CT (Eds.). Flavor measurement (Dekker, New York) pp. 159-83.
Linnaeus C. (1756) Odores medicamentorum. Amoenitates Academicae (Lars Salvius, Stockholm, Sweden) 3: pp. 183-201.
Lovell JH. (1923) Classification of flower odors. Am Bee J 63:392-4.
Madany-Mamlouk A, Chee-Ruiter C, Hofmann UG, Bower JM. (2003) Quantifying olfactory perception: mapping olfactory perception space by using multidimensional scaling and self-organizing maps. Neurocomputing 52-54:591-7.
Madany-Mamlouk A and Martinetz T. (2004) On the dimensions of the olfactory perception space. Neurocomputing 58-60:1019-25.
Malnic B, Hirono J, Sato T, Buck LB. (1999) Combinatorial receptor codes for odors. Cell 96:713-23.
Moskowitz HR and Barbe CD. (1977) Profiling of odor components and their mixtures. Sens Processes 1:212-26.
Rossiter KJ. (1996) Structure-odor relationships. Chem Rev 96:3201-40.
Schiffman SS. (1981) Characterization of odor quality utilizing multidimensional scaling techniques. In Moskowitz HR and Warren CB (Eds.). Odor quality and chemical structure (American Chemical Society, Washington, DC) pp. 1-19.
Schutz HG. (1964) A matching-standards method for characterizing odor qualities. Ann N Y Acad Sci 116:517-26.
Sigma-Aldrich. (2003) Flavors and fragrances 2003-2004 catalog. (Sigma-Aldrich Fine Chemicals Company, Milwaukee, WI).
Turin L and Yoshii F. (2003) Structure-odor relations: a modern perspective. In Doty RL (Ed.). Handbook of olfaction and gustation 2nd ed (Marcel Dekker, New York).
Woskow MH. (1968) Multidimensional scaling of odors. In Tanyolac N (Ed.). Theories of odors and odor measurement (Robert College Research Center, Bebek, Turkey) pp. 147-88.
Yoshida M. (1975) Psychometric classification of odors. Chem Senses 1:443-64.
Zwaardemaker H. (1925) L'Odorant. (Doin, Paris).
Accepted 22 June 2006
This article has been cited by other articles:
M. Zarzo and D. T. Stanton. Understanding the underlying dimensions in perfumers' odor perception space as a basis for developing meaningful odor maps. Atten Percept Psychophys, February 1, 2009; 71(2):225-247.








