Gender Differences in Pressure Pain Threshold in Healthy Humans
www.elsevier.com/locate/pain
Abstract
Aims of investigation: To quantify the magnitude of putative gender differences in experimental pressure pain threshold (PPT), and to establish
the relevance of repeated measurements to any such differences. Methods: Two separate studies were undertaken. A pressure algometer was used
in both studies to assess PPT in the first dorsal interosseous muscle. Force was increased at a rate of 5 N/s. In study 1, two measurements were taken
from 240 healthy volunteers (120 males, 120 females; mean age 25 years), giving a power for statistical analysis of β = 0.80 at α = 0.01. In study
two, 30 subjects (15 males, 15 females; mean age 28 years) were randomly selected from study one. Fourteen repeated PPT measurements were
recorded at seven time points, 10 min apart. Mean PPT data for gender groups, from both studies, were analysed using analysis of covariance with repeated
measures, and age as the covariate. Results: The mean PPT for each of the two measurements in study one showed a difference between genders of
12.2 N (f = 30.5 N, m = 42.7 N) and 12.8 N (f = 29.5 N, m = 42.3 N), respectively, representing a difference of 28%, with females exhibiting a
lower threshold. In study two, the mean difference calculated from 14 repeated PPT measurements over a 1 h period was comparable to that in
study one at 12.3 N (range 10.4–14.4 N); again, females exhibited the lower threshold. The differences in mean PPT values between genders were
found to be significant in both study one (P < 0.0005, F = 37.8, df = 1) and study two (P = 0.01, F = 7.6, df = 1). No significant differences
were found in either study with repeated measurement (P = 0.892 and P = 0.280), or on the interaction of gender and repeated measurement after
controlling for age (P = 0.36 and P = 0.62). Conclusion: Healthy females exhibited significantly lower mean PPTs in the first dorsal interosseous
muscle than males, and this difference was maintained for fourteen repeated measures within a 1 h period. This difference is likely to be above clinically
relevant levels of change, and it has clear implications for the use of subjects of different genders in laboratory-based experimental designs utilising
PPT as an outcome measure.
© 2002 International Association for the Study of Pain. Published by Elsevier Science B.V. All rights reserved.
Keywords: Gender differences; Pressure pain threshold; Experimental pain; Algometry
0304-3959/02/$20.00 © 2002 International Association for the Study of Pain. Published by Elsevier Science B.V. All rights reserved.
doi:10.1016/S0304-3959(02)00330-5
260 L.S. Chesterton et al. / Pain 101 (2003) 259–266
Table 1
Summary of studies supporting gender differences in PPT
Hogeweg et al. (1992) | Spinal process of: C6, T1, T3, T6, T10, L1, L3, L5, and articular points at the elbow, knee, ankle | 14 males, 14 females | 12 | Healthy
Jensen et al. (1992) a | M. temporalis left and right | 352 females, 385 males | 2 | Drawn at random from 1000 individuals identified on the National Central Person register – Copenhagen
a Denotes inclusion in the meta-analysis by Riley et al. (1998).
comparative studies to give a power of 0.70 in the analysis. However, details of this calculation were not provided, and the authors concede that their analysis only included studies in which adequate data were provided and it is not therefore a comprehensive interpretation of all the available literature. Table 2 shows a number of studies where gender differences in PPT were not found. Although most of these studies do not satisfy the recommended sample size suggested by Riley et al. (1998), there is a sufficient number to convey a lack of consensus and confusion across the literature.

Interestingly, one large study (n = 207) included in Table 2 by Lee et al. (1994) (reporting no gender differences overall) does identify some significant gender differences within the results. However, these were observed in less than 50% of the measured sites (six of the 13 anatomical points). This would appear to contrast with suggestions by Fillingim et al. (1999) that gender differences are independent of measurement site. Lee et al. (1994) however, propose that age may have played a confounding role in some of the differences noted (although their statistical analysis does not appear to support this notion). Notwithstanding these results, the findings may be in accord with Berkley (1997) who suggests that reported gender differences are small, are inconsistently reported in experimental conditions and are subject to variations based upon experimental protocols. Unruh (1997) and Fillingim and Maixner (1995) also state that approximately 50% of all existing studies find no gender difference, although they suggest these are generally studies of less experimental rigour. Berkley (1997) does, however, agree with Fillingim and Maixner (1995) in that, where differences have been reported, they tend to show females with lower thresholds. Thus, the direction of gender differences is not in dispute, but the magnitude and relevance remain debatable.

Many studies have evaluated protocols of repeated threshold measurements using pressure algometry and
Table 2
Summary of studies which do not support gender differences in PPT
Isselee et al. (1997) | M. temporalis anterior left and right; M. masseter superficialis left and right | 11 females, 11 males | 6 | Healthy
Isselee et al. (1998) | M. temporalis anterior left and right; M. masseter superficialis left and right | 12 males, 9 females | 10 | Healthy
Jensen et al. (1986) | M. temporalis anterior left and right | 12 males, 12 females | 2 | Healthy
Lee et al. (1994) | M. temporalis (anterior, middle, posterior) left and right; M. masseter (deep, anterior, inferior) left and right; M. pterygoid left and right; M. posterior digastric; M. sternocleidomastoid (superior, middle); M. splenius capitus; M. trapezius | 104 males, 103 females | 13 | Healthy
Sandrini et al. (1994) | M. trapezius; M. frontalis | 26 males, 24 females | 11 | Healthy
Vatine et al. (1993) | Mastoid processes, malleoli, and sternum | 14 males, 10 females | 3 | Healthy
found the technique to show high levels of reliability (Antonaci et al., 1998; Nussbaum and Downes, 1998; Isselee et al., 1997; Kosek et al., 1993; Brennum et al., 1989; List et al., 1989). Although a high degree of variability in PPT levels between individual subjects has also been shown, this aspect does not appear to impact upon the reliability of the measurement technique (Fischer, 1987a). One aspect of PPT measurements that has not however been reported is the relevance of gender in response to repeated measurements. This issue may be important since Sarlani and Greenspan (2002) have shown that greater temporal summation can occur in females compared to males, in response to rapidly applied mechanically evoked pain at suprathreshold levels (12 trains of ten repetitive stimuli at intervals of 1–6 s). Whilst this study did not demonstrate an overall gender difference, interaction effects for stimulus order and gender were clearly demonstrated, with stimuli later in the repetition trains causing females to give higher pain ratings. The statistical power of Sarlani and Greenspan's (2002) study was not reported and the sample size is relatively small (n = 20). Nevertheless this effect has also been demonstrated in response to electrical pain (Arendt-Nielsen et al., 1994; Price, 1972) and thermal pain (Fillingim et al., 1998; Price et al., 1977). It is, therefore, possible that there are variable gender responses to repeated threshold measurements. Many prospective experimental studies have used repeated measures as a protocol and include both gender groups, for example, Alves-Guerreiro et al. (2001); Kosek and Ordeberg (2000b); Fischer (1987b); Hong et al. (1993). A differential response between groups to the outcome measure would confound the results recorded and therefore this issue requires further investigation.

This study formed part of a multifaceted investigation into the hypoalgesic effects of electrostimulation. Since gender differences may play an important role in response to such analgesic interventions and ultimately pain management strategies (Wesselman, 1997), this investigation aims to gain a better understanding of the gender differences in response to the pressure pain threshold model of experimental pain used in our larger investigation. The adopted experimental protocol in terms of the number and sequence of repeated measures reflects that used within our laboratory on previous occasions (Barlas et al., 2002; Chesterton et al., 2002). Other centres investigating therapeutic interventions have also used experimental PPT, induced via a pressure algometer, and measured at the first dorsal interosseous muscle or palmar muscles. In these cases similar measurement intervals have been used, but with fewer repetitions (Alves-Guerreiro et al., 2001; McDowell et al., 1999; Walsh et al., 1995; Wylie et al., 1995). Similar repeated measures time protocols have however, been reported using different models of experimental pain, e.g. Johnson and Tabasam (1999). Therefore use of this method allows results from the larger study to be compared across a broad section of the literature and exploring the issue of potentially different gender responses is therefore important.

The purposes of this study were therefore, first, to quantify the magnitude of gender differences in PPT measured at the first dorsal interosseous muscle, and second, to establish the effect of 14 repeated measures over a 1 h period on any recorded gender difference. These effects have not been previously reported. To address the shortcomings identified in previous literature, the first experiment was designed to allow statistical analysis at a power of 80% at α = 0.01.

2. Method

The study was designed as two separate experiments and was granted ethical approval from university research ethics committee. Both experiments used the same equipment and
measurement protocol. Experimenters and subjects were unaware of the purpose of the study and both were unable to see the algometer display. The difference between the two studies lay in the number of repeated measures taken and the number of subjects recruited.

2.1. Subjects

2.1.1. Study 1
A sample of 240 healthy volunteers (120 females, 120 males) was recruited from the University student and staff populations. The mean age of the sample was 25 years (SD = 7, range 19–57 years). Prior to participation, subjects were screened for relevant contraindications: peripheral neuropathy, pain symptoms, and history of trauma or surgery to the dominant hand, current medication, diabetes or pregnancy. The experimental procedure was explained to each subject, who then signed a consent form.

2.1.2. Study 2
From the original sample of 240, 30 subjects (15 females, 15 males) were randomly selected (by computer generated randomised list) and consented to take part in experiment 2. The mean age of this sample was 28 years (SD = 9, range 19–48 years).

[...] approximately 10–15 s apart. For experiment 2, recordings were made at six further time points, each 10 min apart. At each time point, two PPT measures were again taken, approximately 10–15 s apart. Thus for experiment 2, 12 extra measures were taken, giving a total of 14 PPT measures collected per subject, over a period of 60 min. One experimenter collected all PPT data from an individual subject and seven experimenters were used in total.

The inter-rater reliability for the seven experimenters was tested prior to the study (unpublished data Chesterton et al., 2002) with ICC (2,1) = 0.90 (95% CI 0.81–0.96). Shrout and Fleiss (1979) define reliability as excellent where analysis reports ICC to be > 0.75.

2.4. Data analysis
Data for both experiments 1 and 2 were analysed using analysis of covariance (ANCOVA) with time as a within subject factor, gender as a between subject factor and age as a covariate. All statistical analyses were carried out using statistical package for social science (SPSS) for Windows Version 10, at significance level α = 0.01.

3. Results
3.2. Study 2
The mean PPT values for the gender groups at each time point are depicted in Fig. 2. These values are seen to remain relatively stable, giving a mean gender difference over all time points of 12.3 N (range 10.4–14.4 N). The mean PPT values over all time points for females were 27.5 N compared to males at 39.8 N. Repeated measures ANCOVA (correcting for violation of sphericity assumptions using the Greenhouse-Geisser method) showed a significant difference in PPT between gender (P = 0.01, df = 1, F = 7.66), with no significant difference in mean PPT values for repeated measures (P = 0.28, df = 4.7, F = 1.28) or for the interaction between gender and repeated measures (P = 0.62, df = 4.7, F = 0.62) again adjusting for age.

Fig. 2. Experiment 2: Line graph of mean PPT (N) for each gender group with repeated measures at 10 min intervals over one hour. Mean PPT for group (n = 30) and mean of two PPT readings taken at each time point; error bars = standard error of the mean.
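The reported group means can be turned into absolute and relative gender differences with simple arithmetic. The sketch below does this for the means quoted in the abstract and in Section 3.2. Expressing the difference as a percentage of the male mean is an assumption made here for illustration, since the text does not state which denominator was used, and the function name `gender_difference` is hypothetical.

```python
def gender_difference(female_mean_n, male_mean_n):
    """Return (absolute difference in N, difference as % of the male mean).

    Using the male mean as the denominator is an assumption of this sketch.
    """
    diff = male_mean_n - female_mean_n
    return diff, 100.0 * diff / male_mean_n


# Group means in newtons (female, male) as reported in the paper.
reported_means = {
    "study 1, measure 1": (30.5, 42.7),
    "study 1, measure 2": (29.5, 42.3),
    "study 2, all time points": (27.5, 39.8),
}

results = {
    label: gender_difference(female, male)
    for label, (female, male) in reported_means.items()
}

for label, (diff, pct) in results.items():
    print(f"{label}: {diff:.1f} N lower in females ({pct:.1f}% of male mean)")
```

Under this assumption the relative differences fall at roughly 29 to 31%, close to the 28% quoted for experiment 1; the exact figure depends on the denominator and on how the two measurements are pooled.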
Fig. 1. Experiment 1: Data distribution summaries for PPT measures 1 and 2 for male (n = 120) and female (n = 120) groups. The rectangular boxes correspond to the 25–75% inter-quartile ranges for each dataset and indicate the variability for the middle 50% of measures. The thick black line across each box represents the median. The vertical lines outside the box (whiskers) connect the largest and smallest values not categorised as outliers. PPTs outside of this range (more than 1.5 box lengths away from the box) are shown as outliers and are depicted by the small circles. There were no extreme values recorded (values greater than three box lengths away from the box). The location of the median lines and the approximately equal length of the whiskers suggest approximate symmetry in all datasets. The overlapping boxes show that the lower values for the middle 50% of males fall within the same PPT range as the upper values of the middle 50% of females.

4. Discussion

The purpose of this study was first to quantify, with adequate power, the magnitude of gender differences in PPT measured at the first dorsal interosseous muscle. The second objective was to establish the effect of repeated PPT measures on the recorded difference. No previous studies have reported the magnitude of PPT between genders at this anatomical point, or indeed, the effect of repeated measures at 10 min intervals on PPT gender differences.

Results show females to report a lower mean PPT (−12.2 to −12.8 N) compared with males. This value cannot be compared directly with those from other studies because different anatomical measurement sites produce varying levels of normative PPT values (Hogeweg et al., 1992; Petersen et al., 1992; Ohrbach and Gale, 1989b; Gerecz-Simon et al., 1989). However, by calculating the mean percentage difference between genders for the mean PPT recorded in experiment 1 at 28%, the results can be compared with other studies using similar calculations of the mean percentage difference from the data presented. This necessitates the use of only those studies supplying raw data and in this instance, only studies using sample sizes of n > 41 were selected for comparison, as suggested by Riley et al. (1998). Using this approach Jensen et al. (1992) reported females to demonstrate an average PPT of 20% less than males when measured at both left and right temporal muscles of 737 healthy subjects. Data reported by Fischer (1987a) show an average gender difference of 27% (range 5 to 35%) in mean PPT measured at nine common trigger points in 50 healthy volunteers, whereas Lee et al. (1994) report an average difference of 12% (range −6 to 27%) across 13 different measurement sites in 207 subjects. These comparisons tend to suggest that whilst gender differences do occur, the magnitude is not consistent as suggested by Berkley (1997), and is dependent upon the anatomical measurement site.

Whilst the results from our study show statistically significant differences between genders in mean PPT at the first interosseous muscle, the clinical relevance of such differences should also be considered. A review of the literature revealed no investigations devoted to this topic. Therefore, studies using PPT in addition to other clinical indicators of change were reviewed, and three studies identified are discussed in relation to gender effects. In a study of 50 patients, Fischer (1990) used 93 PPT recordings to identify correlations between clinical 'hot spots' identified in thermographs and tender spots identified by PPT measurement. When a tender spot was defined by a PPT measuring 1.5 kg/cm² (14.7 N) less than the corresponding contralateral anatomical site, a correlation of 97.8% was shown with thermograph reports. Where the definition of tender spots was refined to require a 2 kg/cm² (19.6 N) difference in
PPT at contralateral sides, an 87% correlation was still achieved. Hong et al. (1993) reported a double blind, controlled trial of 84 patients and 24 healthy volunteers comparing the effects of heat, ultrasound, stretch and deep massage in relieving myofascial trigger point pain. Statistically significant increases in PPT in response to therapeutic interventions ranged from 1.1 to 2.6 kg (9.9–20.2 N) (P < 0.05), with negligible changes seen in control groups. In a study by Pratzel (1998), similar levels of change in PPT were accompanied by significant improvements in VAS scores of patient reported pain intensity. A double blind, randomised, placebo controlled trial was used to investigate the analgesic effect of a standard physical therapy programme and repeated sulphur bath treatments (bath additives contained polysulfide and huminic acid). Twenty patients with rheumatic disease showed an average increase of 1.1 kg/cm² (9.9 N) in the mean PPT value of 16 anatomical sites measured in each patient. This compared with an increase of 0.4 kg/cm² reported in placebo groups (placebo bath additives contained huminic acids only and baths could not be distinguished by smell or colour). In the same study, the single point defined by each patient as giving maximum pain was recorded separately. In this instance, an average increase of 1.5 kg/cm² in mean PPT was shown in treatment groups compared with controls (standard physical therapy programme only).

Although these examples represent changes in PPT due to pathology or in response to a therapeutic intervention, and are not gender specific, a difference in the level of PPT of greater than 1 kg/cm² has typically implied clinically relevant changes. It is suggested by Price et al. (1986) that clinical and experimental pain are reduced by similar levels in response to therapeutic intervention (specifically to opiate intervention). Therefore, one might conclude that the mean gender difference in PPT seen in this study, i.e. 12.2 N (1.2 kg/cm²), is likely to be above the implied level of clinically relevant change identified, and thus has important implications for experimental research and clinical practice. However, we accept that results using a single anatomical measure will only provide an indication of potential gender differences in other PPT measures. Therefore the recommendations that can be made based on these results are limited to sample selection and analysis techniques. The results suggest that it is prudent to use designs that incorporate either single gender groups, or where analysis requires 'within-subject' comparisons (and gender will not confound results), samples should have equal gender mix within intervention groups. Additionally the generalisation of results across genders would seem inappropriate in experimental studies and further investigations are required to establish clinically relevant changes in PPT for both gender groups. It is also important to note that the relevance of gender dependent differences in experimental pain compared with clinical pain remains speculative and further research is required (Fillingim, 2000). Indeed, the literature widely suggests that gender is only one factor that may influence the experience of pain and, although the underlying reasons for gender differences have been extensively investigated, the precise physiological and psychological mechanisms underlying the difference remain unclear. Other proposals have included the notion of gender role expectancy, age, personality, familial influence, cultural and hormonal factors (Isselee et al., 2001; Rollman and Lautenbacher, 2001; Fillingim, 2000; Fillingim et al., 2000; Fillingim and Ness, 2000; Fillingim et al., 1998; Helme and Gibson, 1997; Thomas and Rose, 1991; Zatzick and Dimsdale, 1990; Otto and Dougher, 1985; Keele, 1972). In experimental designs, the gender of the subject in relation to the experimenter is suggested to influence responses (Levine and De Simone, 1991), while anxiety and fear have also been reported to reduce PPT (Buchanan and Midgley, 1987). Due to the sheer number of variables, Lautenbacher (1997) suggests that it is impossible to account for all relevant characteristics which may contribute to apparent differences in gender reports of pain, and indeed these factors (with the exception of age) have not been considered in this analysis, which is an obvious limitation.

Within the second experiment of this study, results showed a post hoc observed power of 0.76. However, due to the relatively small sample size (n = 30) and known heterogeneity in PPT measures, it would be prudent to consider the conclusions to be indicative of a response and again further large studies, which include more anatomical measurement sites, would be beneficial. Nevertheless, the indication is that the use of repeated measures applied in this way produces a stable response over a 1 h period within both gender groups and also reflects the reliability of the measurement technique. Results from two quick and successive measurements in experiment 1 (n = 240) support this notion of reliability with greater statistical power. Even so, experimental designs should consider the use of standardising heterogeneous baseline PPT measures to reflect change against the standard, rather than using absolute levels of measured PPT. This will reduce the effect of gender differences, and other potentially confounding variables, which have been associated with the PPT measurement, for example age (Edwards and Fillingim, 1999; Jensen et al., 1992; Brennum et al., 1989).

In conclusion, the results from this study show that there is a significant difference between genders in the mean PPT measured at the first interosseous muscle of at least 12.2 N (1.23 kg/cm²), with females showing lower thresholds. The magnitude of this difference was largely unaffected by 14 repeated measurements (seven trains of two repeated stimuli) over a 1 h period and PPT measured in this way is shown to be stable, independent of gender. Nevertheless the magnitude of the observed differences between genders is likely to be at a clinically relevant level. It is therefore recommended that, in order to reduce potential bias in experimental protocols, single gender samples are used or gender is matched across all experimental groups.
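The recommendation to standardise heterogeneous baseline PPT measures can be illustrated with a short sketch. It expresses each repeated measure as a percentage change from the subject's own baseline, so that subjects with different absolute thresholds become comparable. The function name `percent_change_from_baseline` and the example values are hypothetical and are not data from this study; treating the first measurement as the baseline is likewise an assumption.

```python
def percent_change_from_baseline(ppt_series_n):
    """Express each PPT value (N) as % change from the first (baseline) value."""
    if not ppt_series_n:
        raise ValueError("at least one PPT measurement is required")
    baseline = ppt_series_n[0]
    if baseline <= 0:
        raise ValueError("baseline PPT must be positive")
    return [100.0 * (value - baseline) / baseline for value in ppt_series_n]


# Hypothetical subjects with different absolute thresholds (N).
subject_a = [30.0, 33.0, 36.0]  # lower baseline
subject_b = [40.0, 44.0, 48.0]  # higher baseline

change_a = percent_change_from_baseline(subject_a)
change_b = percent_change_from_baseline(subject_b)

print(change_a)
print(change_b)
```

Both hypothetical subjects show the same relative change even though their absolute thresholds differ by 10 N at baseline, which is the property the recommendation relies on.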
of physical therapeutic modalities and drug effects. J Musculoskeletal Pain 1998;6:111–137.
Price DD. Characteristics of second pain and flexion reflexes indicative of prolonged central summation. Exp Neurol 1972;37:371–387.
Price DD, Harkins SW, Rafii A, Price C. A simultaneous comparison of fentanyl's analgesic effects on experimental and clinical pain. Pain 1986;24:197–203.
Price DD, Hu JW, Dubner R, Gracely RH. Peripheral suppression of first pain and central summation of second pain evoked by noxious heat pulses. Pain 1977;3:57–68.
Riley 3rd JL, Robinson ME, Wise EA, Myers CD, Fillingim RB. Sex differences in the perception of noxious experimental stimuli: a meta-analysis. Pain 1998;74:181–187.
Rollman GB, Lautenbacher S. Sex differences in musculoskeletal pain. Clin J Pain 2001;17:20–24.
Sandrini G, Antonaci F, Pucci E, Bono G, Nappi G. Comparative study with EMG, pressure algometry and manual palpation in tension-type headache and migraine. Cephalalgia 1994;14:451–457 discussion 394–5.
Sarlani E, Greenspan JD. Gender difference in temporal summation of mechanically evoked pain. Pain 2002 (in press).
Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin 1979;86:420–428.
Takala EP. Pressure pain threshold on upper trapezius and levator scapulae muscles. Repeatability and relation to subjective symptoms in a working population. Scand J Rehabil Med 1990;22:63–68.
Thomas VJ, Rose FD. Ethnic differences in the experience of pain. Soc Sci Med 1991;32:1063–1066.
Travell JG, Simons DG. Myofascial pain and dysfunction. The trigger point manual. Baltimore, MD: Williams and Wilkins, 1983.
Unruh AM. Why can't a woman be more like a man? Behav Brain Sci 1997;20:467–468.
Vanderweeën L, Oostendorp RAB, Vaes P, Duquet W. Pressure algometry in manual therapy. Manual Therapy 1996;1:258–265.
Vatine JJ, Shapira SC, Magora F, Adler D, Magora A. Electronic pressure algometry of deep pain in healthy volunteers. Arch Phys Med Rehabil 1993;74:526–530.
Walsh DM, Foster NE, Baxter GD, Allen JM. Transcutaneous electrical nerve stimulation. Relevance of stimulation parameters to neurophysiological and hypoalgesic effects. Am J Phys Med Rehabil 1995;74:199–206.
Wesselman U. Gender differences: implications for pain management. Behav Brain Sci 1997;20:470–471.
Wylie L, Baxter GD, Walsh DM, Robinson L. The hypoalgesic effects of low-intensity infrared laser therapy upon mechanical pain threshold. Lasers Surg Med 1995;Suppl:9.
Zatzick DF, Dimsdale JE. Cultural variations in response to painful stimuli. Psychosom Med 1990;52:544–557.