Chemical Senses Advance Access originally published online on April 4, 2008
Chemical Senses 2008 33(5):461-467; doi:10.1093/chemse/bjn013
Test–Retest Reliability of the Olfactory Detection Threshold Test of the Sniffin' Sticks
1 Department of Neuroradiology, Ludwig-Maximilians-University Munich, Marchioninistrasse 15, 81377 Munich, Germany 2 Department of Psychology, Ludwig-Maximilians-University Munich, Marchioninistrasse 15, 81377 Munich, Germany 3 Authors contributed equally
Correspondence to be sent to: Jessica Albrecht, Department of Neuroradiology, Ludwig-Maximilians-University Munich, Marchioninistrasse 15, 81377 Munich, Germany. e-mail: jessica.albrecht{at}med.uni-muenchen.de
| Abstract |
|---|
The aim of the present study was to investigate the test–retest reliability of the olfactory detection threshold subtest of the Sniffin' Sticks test battery when administered repeatedly at 4 time points. The detection threshold test was conducted repeatedly in 64 healthy subjects. In the first testing session, the threshold test was performed 3 times (T1 = 0 min, T2 = 35 min, and T3 = 105 min), representing short-term testing. A fourth threshold test was conducted in a second testing session (T4 = on average 35.1 days after the first testing session), representing long-term testing. The average olfactory detection threshold scores for n-butanol did not differ significantly across the 4 time points. The test–retest reliability coefficients (Pearson's r) between the 4 time points of threshold testing ranged from 0.43 to 0.85 (P < 0.01). These results support the notion that the olfactory detection threshold test is a highly reliable method for repeated olfactory testing, even if the test is repeated more than once per day and over a long-term period. It is concluded that the olfactory detection threshold test of the Sniffin' Sticks is suitable for repeated testing during experimental or clinical studies.
Key words: n-butanol, odor, olfaction, repetition, sensitivity, smell
| Introduction |
|---|
Olfactory testing is very important in clinical testing for otorhinolaryngologic and neurological disorders as well as in olfactory research. Although a number of olfactory tests have been described in the literature (for review, see Doty et al. 1995; Doty 2007), only a few are currently commercially available. Among the available tests are the University of Pennsylvania Smell Identification Test (Doty, Shaman, and Dann 1984
The validity of an olfactory test depends upon its reliability, which is commonly measured by correlating scores of a test administered to the same subjects on 2 occasions. Doty et al. (1995) compared 10 tests of olfactory function, each administered at 2 points in time, and found correlation coefficients in a range of 0.43–0.90. In particular, the reliability coefficient for the n-butanol detection threshold in 57 subjects reported by Doty et al. (1995) was 0.49. This is similar to the results of another study in which the correlation between 1-butanol thresholds determined for the left and right nostril (which was used as a reliability estimate) was in a range of 0.30 < r32 < 0.68 (Cain and Gent 1991). In a different study (Punter 1983), a correlation coefficient for repeated n-butanol threshold testing of r28 = 0.73 was found. However, only limited data are available on the test–retest reliability of the Sniffin' Sticks test battery. Kobal et al. (1996) performed only one repetition of the odor identification test of the Sniffin' Sticks in a relatively small group of subjects (24 subjects). Hummel et al. (1997) performed one repetition of all 3 subtests of the Sniffin' Sticks in 104 volunteers. The mean interval between test and retest was 10 days (standard deviation [SD] 11 days). They found that the correlation coefficient of the threshold test was 0.61. Additionally, they demonstrated that there was no significant difference in the olfactory detection threshold across 7 occasions (over a period of 4 months), but this was observed in only 6 subjects. The correlation coefficients of these measurements were not reported.
In practice, the Sniffin' Sticks olfactory detection threshold test is conducted repeatedly (Damm et al. 2003; Hummel et al. 2005), even within a day (Kirchner et al. 2004; Muttray et al. 2004; Pollatos, Kopietz, et al. 2007), although extensive data about its test–retest reliability are lacking. Consequently, the aims of the present study were to investigate the short-term and long-term reproducibility of the olfactory detection threshold subtest of the Sniffin' Sticks test battery and in doing so to expand knowledge about its test–retest reliability. Results of a previous study point to a correlation between olfactory sensitivity and depressive symptoms (Pollatos, Albrecht, et al. 2007). Therefore, it was ascertained that the subjects did not suffer from depressive symptoms. Although in a previous study (Albrecht J, Schreder T, Kleemann AM, Schöpf V, Kopietz R, Anzinger A, Demmel M, Linn J, Kettenmann B, Wiesmann M, unpublished data) we were not able to show a relation between the state of satiety and olfactory sensitivity to n-butanol, a possible relationship has been suggested (Koelega 1994). Thus, it was ensured that subjects did not differ in their state of satiety on the 2 days of threshold measurement.
| Materials and methods |
|---|
Subjects
Sixty-four healthy subjects (32 males, 32 females; mean age 27.9 years, SD 4.6 years) participated in the study. Age did not differ significantly between male (mean age 28.3 years, SD 5.0 years) and female (mean age 27.4 years, SD 4.3 years) subjects (F1,62 = 0.66, P = not significant [NS]). All subjects were nonsmokers and were not taking any medication known to interfere with sensory perception (Frye et al. 1990; Schiffman 1994; Doty and Bromley 2004). They provided their written informed consent. The study protocol was approved by the Medical Ethics Review Committee (internal review board) of the Ludwig-Maximilians-University, Munich.
Stimulus material
When subjects arrived for testing, they had to rate their current state of hunger (0 = not hungry at all, 100 = very hungry), their desire for food (0 = very weak, 100 = very strong), and the fullness of their stomach (0 = not full at all, 100 = very full) on a visual analogue scale (Aitken 1969).
Depressive symptoms were assessed using the Beck Depression Inventory (BDI, Beck et al. 1961). The BDI is a self-administered 4-point rating scale (0 = not at all, 3 = always) designed to measure how often a patient has experienced depressive symptoms in the past week.
Olfactory function was assessed by means of the olfactory detection threshold test, a subtest of the Sniffin' Sticks (Burghart Instruments, Wedel, Germany) (Kobal et al. 1996; Hummel et al. 1997). The standard procedure for olfactory detection threshold tests using the Sniffin' Sticks as described by Hummel et al. (1997) and Kobal et al. (1996) was used. The Sniffin' Sticks have been thoroughly validated; normative data are based on investigations in more than 3000 subjects (Kobal et al. 2000; Hummel et al. 2007).
After each olfactory detection threshold test, subjects used a visual analogue scale (Aitken 1969) to rate their emotional valence (0 = negative, 100 = positive), arousal (0 = calm, 100 = aroused), and alertness (0 = inattentive, 100 = very attentive) during the olfactory testing. On the same scale, they rated the pleasantness (0 = unpleasant, 100 = pleasant) and intensity (0 = very weak, 100 = very strong) of 2 pens: the pen containing n-butanol at 3 concentration steps above the individual threshold concentration and the pen with the highest concentration of n-butanol.
Experimental procedure
The study consisted of 2 parts. Three olfactory detection threshold tests were conducted on day 1 (T1 = 0 min, T2 = 35 min, and T3 = 105 min); a fourth threshold test was conducted on day 2 (T4: mean 35.1 days, standard error of the mean 6.1 days after day 1) (Figure 1). These time points of testing were chosen because the present study constitutes the baseline measurement for a study investigating the effects of acupuncture on olfactory sensitivity (Anzinger A, Albrecht J, Kopietz R, Kleemann AM, Schöpf V, Demmel M, Schreder T, Eichhorn I, Wiesmann M, in preparation). On day 1, the subjects completed the BDI and rated their current state of hunger. This was followed by the first olfactory detection threshold test (T1 = 0 min). Estimated mean duration of a threshold test was 10–15 min. Thirty-five and 105 min after the beginning of the first threshold test, the second and the third threshold test were started, respectively (T2 = 35 min, T3 = 105 min). The break between T1 and T2 was approximately 20–25 min, and the break between T2 and T3 was approximately 55–60 min. On day 2, subjects completed the BDI and rated their current state of hunger again. Subjects were advised to participate in the same state of satiety as they did in the first threshold test (T1). This was followed by the fourth olfactory detection threshold test (T4).
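The interval to T4 is given here with its standard error of the mean, whereas the Results section reports the same interval with an SD of 49.0 days. The two figures can be related with a short calculation. The sketch below assumes that all 64 subjects contributed to the interval; the function name is illustrative and not part of the original analysis.

```python
import math


def standard_error_of_mean(sd: float, n: int) -> float:
    """Return the standard error of the mean for a sample SD and sample size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


# Values as reported in the paper: SD 49.0 days; n = 64 is an assumption
# (the number of subjects in the study).
sem_days = standard_error_of_mean(49.0, 64)
print(round(sem_days, 1))
```

Under this assumption, the SD of 49.0 days and the standard error of 6.1 days describe the same spread of retest intervals.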
Statistics
SPSS (version 15.0 for Windows, SPSS Inc., Chicago, IL) was used for statistical evaluation. Normality of the data was tested using the Kolmogorov–Smirnov test. Normally distributed data (olfactory detection thresholds, pleasantness and intensity of the odor, and fullness of the stomach) were submitted to repeated-measures analyses of variance (ANOVAs) using the general linear model with the "within-subject factor" time (T1/T2/T3/T4 or day 1/day 2) and the "between-subjects factor" sex (male/female). We looked for main effects as well as second-order interactions between these factors. Existing second-order interactions were corrected using the Bonferroni method. Not normally distributed data (age, BDI score, current state of hunger, desire for food, emotional valence, emotional arousal, alertness of the subjects, and pleasantness of the pen with the highest concentration of n-butanol) were submitted to nonparametric tests (Wilcoxon rank sum test/Friedman ANOVA). Pearson's correlation analyses were used to examine the relationship between the olfactory detection thresholds at different points of time. Correlations were corrected using the Bonferroni method. The alpha level for all tests was set at 0.05.
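As a rough illustration of the correlation step, the following sketch computes Pearson's r for every pair of time points and derives a Bonferroni-adjusted alpha level. It is not the SPSS procedure used in the study: the scores are invented for five hypothetical subjects, and the assumption that the correction covers the 6 pairwise comparisons among T1 to T4 is ours, since the text does not state the number of comparisons.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical threshold scores for five subjects (NOT data from the study).
scores = {
    "T1": [8.0, 9.5, 7.25, 10.0, 6.5],
    "T2": [8.5, 9.0, 7.0, 10.5, 7.0],
    "T3": [8.25, 9.75, 7.5, 10.0, 6.75],
    "T4": [9.0, 9.25, 6.75, 11.0, 7.5],
}

# All pairwise combinations of the 4 time points (6 pairs).
pairs = list(combinations(scores, 2))

# Bonferroni: divide the overall alpha by the number of comparisons (assumed 6).
alpha_per_test = 0.05 / len(pairs)

retest_r = {(a, b): pearson_r(scores[a], scores[b]) for a, b in pairs}
for (a, b), r in retest_r.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

With real data, each coefficient would be tested against the adjusted alpha level rather than against 0.05 directly.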
| Results |
|---|
Olfactory sensitivity
The mean olfactory detection threshold for n-butanol in 64 subjects was 8.9 (SD 2.3) at T1 (0 min), 9.1 (SD 2.3) at T2 (35 min), 9.2 (SD 2.3) at T3 (105 min), and 9.3 (SD 2.5) at T4 (mean 35.1 days, SD 49.0 days) (Table 1, Figure 2). No significant differences in olfactory detection threshold for n-butanol were observed across time (F3,189 = 0.59, P = NS). With regard to the variable sex, there were no significant differences in olfactory detection thresholds for n-butanol (F1,62 = 0.03, P = NS). Correlation analyses of the results at different time points revealed significant correlation coefficients of r64 = 0.43–0.85 (P < 0.01) (Table 2, Figures 3 and 4).
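The degrees of freedom attached to the F values above can be checked against textbook formulas. The sketch below is a consistency check only, not a reconstruction of the SPSS model: it assumes 64 subjects, 4 time points, and 2 sex groups, and it uses the formulas for a one-way within-subject effect and a one-way between-subjects effect. The function names are illustrative.

```python
def within_subject_df(n_subjects: int, n_levels: int) -> tuple:
    """Degrees of freedom (effect, error) for a one-way within-subject factor."""
    df_effect = n_levels - 1
    df_error = (n_subjects - 1) * (n_levels - 1)
    return df_effect, df_error


def between_subjects_df(n_subjects: int, n_groups: int) -> tuple:
    """Degrees of freedom (effect, error) for a one-way between-subjects factor."""
    df_effect = n_groups - 1
    df_error = n_subjects - n_groups
    return df_effect, df_error


# Time effect: 64 subjects measured at 4 time points.
print(within_subject_df(64, 4))
# Sex effect: 64 subjects in 2 groups.
print(between_subjects_df(64, 2))
```

Under these assumptions the formulas give 3 and 189 for the time effect and 1 and 62 for the sex effect, in line with the reported F3,189 and F1,62.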
Subjective ratings regarding emotional valence, emotional arousal, alertness, and pleasantness and intensity of the odorants
Emotional arousal (P = 0.005, Friedman ANOVA) and alertness (P = 0.033, Friedman ANOVA) differed significantly between different times of olfactory detection threshold testing, whereas the other parameters did not (Table 1). The ANOVA regarding the pleasantness of the pen containing a concentration of n-butanol that was 3 concentration steps above the individual threshold concentration revealed a significant result (F3,189 = 3.03, P = 0.038), whereas pairwise comparisons revealed no significant results.
Depressive symptoms/state of satiety
BDI scores were within the normal range in all subjects (day 1: mean 1.8 [SD 2.1], range 0–8; day 2: mean 1.3 [SD 1.7], range 0–6). Differences between days 1 and 2 were small, yet statistically significant (P = 0.002, Wilcoxon rank sum test).
On both testing days, the subjects described themselves as slightly hungry (day 1: mean 30.2 [SD 25.3]; day 2: mean 32.0 [SD 23.2]), reported a low desire for food (day 1: mean 25.6 [SD 22.5]; day 2: mean 28.9 [SD 21.3]), and described their stomach as moderately full (day 1: mean 44.0 [SD 21.4]; day 2: mean 46.7 [SD 22.0]). No significant differences between testing days were found with regard to feeling of hunger (P = NS, Wilcoxon rank sum test), desire for food (P = NS, Wilcoxon rank sum test), or filling state of the stomach (F1,63 = 1.28, P = NS).
| Discussion |
|---|
Mean olfactory detection threshold scores of the subjects in the current study did not change significantly over the 4 testing sessions. The correlation coefficients between the different points of testing were relatively high (r64 = 0.43–0.85) compared with the results of other studies (Punter 1983
There are limited data about test–retest reliability of the subtests of the Sniffin' Sticks (Kobal et al. 1996; Hummel et al. 1997), but no study has been published about repeated threshold testing more than once per day and more than once per subject in an adequate number of subjects. Our data confirm and extend the results of Hummel et al. (1997) in that both short (0, 35, and 105 min) and long (35 days, SD 49 days) test–retest intervals were used and that the threshold test was applied 4 times in a sample of 64 subjects.
Because we investigated olfactory sensitivity of only young subjects (mean age 27.9 years, SD 4.6 years, range 21–40 years) and none of these subjects was suffering from olfactory loss, it is assumed that distortions of the reliability coefficients due to age and olfactory performance of the subjects did not appear.
Because there were no significant differences between the olfactory detection threshold tests at the 4 time points, it is concluded that the threshold test does not lead to adaptation, even if it is performed more than once per day. Additionally, one can assume that no learning of the testing method took place during repeated testing.
There are contradictory assumptions regarding the relationship between olfactory sensitivity and depression scores. The results of a study by Pause et al. (2001) suggest a reduced olfactory sensitivity in patients with major depression. Our research group was able to confirm these results (Pollatos, Albrecht, et al. 2007) using a correlative approach, but we were not able to demonstrate a significant difference between groups of subjects with low versus higher depressive symptoms. Doty et al. (1988) and Doty (1994) investigated olfactory sensitivity and BDI scores of healthy subjects and of subjects reporting symptoms of multiple chemical sensitivities. As in the current study, olfactory thresholds did not differ significantly between both groups, but BDI scores did. In our study, there was a small, yet significant difference between BDI scores of the 2 testing days. In accordance with the previous results (Doty et al. 1988; Doty 1994; Pollatos, Albrecht, et al. 2007), this did not lead to a significant difference in olfactory sensitivity between the time points of threshold testing. The reason for the small difference between the BDI scores on the testing days remains unclear. However, the normal range of the BDI score is between 0 and 9. Thus, all our BDI scores lie within the normal range, and the difference may even be considered incidental.
Furthermore, our results indicate that the measures of olfactory sensitivity using the olfactory detection threshold test are independent of subjective ratings regarding the emotional situation of the subjects (emotional valence, emotional arousal, and alertness) and the intensity and pleasantness of the odor. A possible explanation for the difference in emotional arousal and alertness between the testing sessions (mostly occurring between T1, T2, and T3) could be that the longer the testing session on the first day lasted, the calmer and less attentive the subjects felt. However, these findings should be interpreted with caution because the observed variations were relatively small even if some of these parameters differed significantly from each other.
Taken together, the results of the current study show that the olfactory detection threshold test is highly reliable, even if repeated more than once per day and over a long-term period. The present results provide a significant extension of the knowledge about reliability of the olfactory detection threshold test of the Sniffin' Sticks, which is important for application of the test in clinical, industrial, and academic context.
| Acknowledgements |
|---|
Parts of this study were developed in line with the dissertation of Andrea Anzinger at the Medical Faculty of the Ludwig-Maximilians-University of Munich (in preparation).
| References |
|---|
Aitken RC. Measurement of feelings using visual analogue scales. Proc R Soc Med (1969) 62(10):989–993.
Albrecht J, Schreder T, Kleemann AM, Schöpf V, Kopietz R, Anzinger A, Demmel M, Linn J, Kettenmann B, Wiesmann M. Olfactory detection thresholds of food-related and non-food odors in hunger and satiety. (2008).
Anzinger A, Albrecht J, Kopietz R, Kleemann AM, Schöpf V, Demmel M, Schreder T, Eichhorn I, Wiesmann M. Effects of Laserneedle-acupuncture on olfactory sensitivity of healthy human subjects: a placebo-controlled, double-blinded, randomized trial.
Beck AT, Ward CH, Mendelson M, Mock J, Erbaugh J. An inventory for measuring depression. Arch Gen Psychiatry (1961) 4:561–571.
Cain WS, Gent JF. Olfactory sensitivity: reliability, generality, and association with aging. J Exp Psychol Hum Percept Perform (1991) 17(2):382–391.
Cain WS, Gent JF, Goodspeed RB, Leonard G. Evaluation of olfactory dysfunction in the Connecticut Chemosensory Clinical Research Center. Laryngoscope (1988) 98(1):83–88.
Damm M, Eckel HE, Jungehulsing M, Hummel T. Olfactory changes at threshold and suprathreshold levels following septoplasty with partial inferior turbinectomy. Ann Otol Rhinol Laryngol (2003) 112(1):91–97.
Doty RL. Olfaction and multiple chemical sensitivity. Toxicol Ind Health (1994) 10(4–5):359–368.
Doty RL. Office procedures for quantitative assessment of olfactory function. Am J Rhinol (2007) 21(4):460–473.
Doty RL, Bromley SM. Effects of drugs on olfaction and taste. Otolaryngol Clin North Am (2004) 37(6):1229–1254.
Doty RL, Deems DA, Frye RE, Pelberg R, Shapiro A. Olfactory sensitivity, nasal resistance, and autonomic function in patients with multiple chemical sensitivities. Arch Otolaryngol Head Neck Surg (1988) 114(12):1422–1427.
Doty RL, McKeown DA, Lee WW, Shaman P. A study of the test-retest reliability of ten olfactory tests. Chem Senses (1995) 20(6):645–656.
Doty RL, Shaman P, Dann M. Development of the University of Pennsylvania Smell Identification Test: a standardized microencapsulated test of olfactory function. Physiol Behav (1984) 32(3):489–502.
Doty RL, Shaman P, Kimmelman CP, Dann MS. University of Pennsylvania smell identification test: a rapid quantitative olfactory function test for the clinic. Laryngoscope (1984) 94(2 Pt 1):176–178.
Doty RL, Snyder PJ, Huggins GR, Lowry LD. Endocrine, cardiovascular, and psychological correlates of olfactory sensitivity changes during the human menstrual cycle. J Comp Physiol Psychol (1981) 95(1):45–60.
Frye RE, Schwartz BS, Doty RL. Dose-related effects of cigarette smoking on olfactory function. JAMA (1990) 263(9):1233–1236.
Hasegawa M, Kern EB. The human nasal cycle. Mayo Clin Proc (1977) 52(1):28–34.
Hummel T, Gollisch R, Wildt G, Kobal G. Changes in olfactory perception during the menstrual cycle. Experientia (1991) 47(7):712–715.
Hummel T, Jahnke U, Sommer U, Reichmann H, Muller A. Olfactory function in patients with idiopathic Parkinson's disease: effects of deep brain stimulation in the subthalamic nucleus. J Neural Transm (2005) 112(5):669–676.
Hummel T, Kobal G, Gudziol H, Mackay-Sim A. Normative data for the "Sniffin' Sticks" including tests of odor identification, odor discrimination, and olfactory thresholds: an upgrade based on a group of more than 3,000 subjects. Eur Arch Otorhinolaryngol (2007) 264(3):237–243.
Hummel T, Sekinger B, Wolf SR, Pauli E, Kobal G. 'Sniffin' sticks': olfactory performance assessed by the combined testing of odor identification, odor discrimination and olfactory threshold. Chem Senses (1997) 22(1):39–52.
Kirchner A, Landis BN, Haslbeck M, Stefan H, Renner B, Hummel T. Chemosensory function in patients with vagal nerve stimulators. J Clin Neurophysiol (2004) 21(6):418–425.
Kobal G, Hummel T, Sekinger B, Barz S, Roscher S, Wolf S. "Sniffin' sticks": screening of olfactory performance. Rhinology (1996) 34(4):222–226.
Kobal G, Klimek L, Wolfensberger M, Gudziol H, Temmel A, Owen CM, Seeber H, Pauli E, Hummel T. Multicenter investigation of 1,036 subjects using a standardized method for the assessment of olfactory function combining tests of odor identification, odor discrimination, and olfactory thresholds. Eur Arch Otorhinolaryngol (2000) 257(4):205–211.
Koelega HS. Diurnal variations in olfactory sensitivity and the relationship to food intake. Percept Mot Skills (1994) 78(1):215–226.
Lundstrom JN, McClintock MK, Olsson MJ. Effects of reproductive state on olfactory sensitivity suggest odor specificity. Biol Psychol (2006) 71(3):244–247.
Muttray A, Moll B, Faas M, Klimek L, Mann W, Konietzko J. Acute effects of 1,1,1-trichloroethane on human olfactory functioning. Am J Rhinol (2004) 18(2):113–117.
Pause BM, Miranda A, Goder R, Aldenhoff JB, Ferstl R. Reduced olfactory performance in patients with major depression. J Psychiatr Res (2001) 35(5):271–277.
Pause BM, Sojka B, Krauel K, Fehm-Wolfsdorf G, Ferstl R. Olfactory information processing during the course of the menstrual cycle. Biol Psychol (1996) 44(1):31–54.
Pollatos O, Albrecht J, Kopietz R, Linn J, Schoepf V, Kleemann AM, Schreder T, Schandry R, Wiesmann M. Reduced olfactory sensitivity in subjects with depressive symptoms. J Affect Disord (2007) 102(1–3):101–108.
Pollatos O, Kopietz R, Linn J, Albrecht J, Sakar V, Anzinger A, Schandry R, Wiesmann M. Emotional stimulation alters olfactory sensitivity and odor judgment. Chem Senses (2007) 32(6):583–589.
Principato JJ, Ozenberger JM. Cyclical changes in nasal resistance. Arch Otolaryngol (1970) 91(1):71–77.
Punter PH. Measurement of human olfactory thresholds for several groups of structurally related compounds. Chem Senses (1983) 7:215–235.
Robson AK, Woollons AC, Ryan J, Horrocks C, Williams S, Dawes PJ. Validation of the combined olfactory test. Clin Otolaryngol Allied Sci (1996) 21(6):512–518.
Schiffman S. Changes in taste and smell: drug interactions and food preferences. Nutr Rev (1994) 52(8 Pt 2):11–14.
Thomas-Danguin T, Rouby C, Sicard G, Vigouroux M, Farget V, Johanson A, Bengtzon A, Hall G, Ormel W, De Graaf C, et al. Development of the ETOC: a European test of olfactory capabilities. Rhinology (2003) 41(3):142–151.
Accepted 29 February 2008