Full text of Economic Brief : Why Use a Diffusion Index?, No. 22-22 : Appendix

NONCONFIDENTIAL // EXTERNAL

Appendix: Why Use a Diffusion Index?
Economic Brief No. 22-22, June 22, 2022
This appendix provides a deeper dive into the methodology of diffusion indexes. A diffusion index is a
statistic used to report on multiple-choice survey questions that measure whether the item being
studied improved, deteriorated or stayed the same. Broadly, a typical sample of responses might be
summarized by the vector {u,s,u,d,d,d,u,s,…} where u indicates an increase (up), s indicates no change
(same), and d indicates a decrease (down).
Diffusion Index Calculation
In this case, we define a diffusion index $D$ as
$$D = \frac{n_u}{n} - \frac{n_d}{n}$$
where $n$ is the total number of respondents, $n_u$ is the number of respondents who reported an
increase and $n_d$ is the number of respondents who reported a decrease.
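As an illustration only, the calculation can be sketched in a few lines of Python. The function name `diffusion_index` and the eight-response sample (the visible part of the vector shown earlier) are assumptions made for this example, not part of the survey methodology.

```python
from collections import Counter

def diffusion_index(responses):
    """Return D = n_u/n - n_d/n for a sequence of 'u', 's' and 'd' answers."""
    n = len(responses)
    if n == 0:
        raise ValueError("need at least one response")
    counts = Counter(responses)  # missing answer types count as zero
    return (counts["u"] - counts["d"]) / n

# Hypothetical sample: 3 "up", 2 "same", 3 "down"
sample = ["u", "s", "u", "d", "d", "d", "u", "s"]
print(diffusion_index(sample))  # ups and downs cancel, so D = 0.0
```

Note that "same" answers affect the index only through the denominator $n$.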
If there are $n$ survey respondents, then the survey process can be interpreted as $n$ independent trials. If
we generalize that the possible responses are $a$ (where $a$ is $u$, $d$ or $s$), then each answer type has a
“success” probability of $p_a$. We then define $\hat{p}_a = n_a/n$ as the proportion of survey answers that are
type $a$. If type-$a$ answers are assigned a value of $\omega_a$ in the diffusion index, where $\omega_a$ is a
number between a lower and an upper bound (that is, values of -1, 0 or 1), then a general diffusion index
statistic can be defined as
$$\hat{D} = \sum_{a=1}^{r} \omega_a \hat{p}_a$$
In the Richmond Fed survey, $(\omega_u, \omega_s, \omega_d) = (1, 0, -1)$, so that $\hat{D} = \hat{p}_u - \hat{p}_d$.
In this case, an expansion would therefore be indicated by a value of $\hat{D}$ greater than 0. One problem
with this interpretation is that $\hat{D}$ can be equal to zero under two completely different conditions:
$$\hat{p}_u = \hat{p}_d \quad \text{or} \quad \hat{p}_s = 1$$
In other words, half of the respondents can say “up,” and half can say “down,” or all respondents can
say “same.” To distinguish between these two cases, we would need information about the variance of
$\hat{D}$. The corresponding confidence interval is as follows:
$$\hat{D} \pm z\sqrt{\frac{(1 - \hat{p}_s) - \hat{D}^2}{n}}$$
Thus, when $p_u = p_d = 0.5$ and $p_s = 0$, $CI = \pm z(1/\sqrt{n})$, and when $p_u = p_d = 0$ and $p_s = 1$, $CI = 0$.
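The two limiting cases can be checked numerically. The following is a minimal sketch, assuming the interval formula above. The function name `diffusion_ci`, the default `z = 1.0` and the sample size of 100 are choices made for this illustration.

```python
import math

def diffusion_ci(p_u, p_d, n, z=1.0):
    """Return (lower, upper) for D_hat +/- z * sqrt((1 - p_s - D_hat^2) / n)."""
    p_s = 1.0 - p_u - p_d   # share answering "same"
    d_hat = p_u - p_d       # diffusion index as a proportion
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(0.0, (1.0 - p_s - d_hat ** 2) / n)
    half_width = z * math.sqrt(variance)
    return d_hat - half_width, d_hat + half_width

split = diffusion_ci(0.5, 0.5, n=100)      # half "up", half "down"
unanimous = diffusion_ci(0.0, 0.0, n=100)  # everyone answers "same"
print(split, unanimous)
```

Both samples give an index of zero, but only the evenly split sample has an interval of nonzero width, here $\pm 1/\sqrt{100}$.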
Confidence Intervals
The table below provides a rule of thumb for calculating the confidence interval and assessing the
statistical significance of the diffusion index. It is constructed based on the Richmond Fed
manufacturing employment index from 2002 through 2018.

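Rows give $p_s$; columns give the sample size $n$.

| $p_s$ | 40 | 60 | 80 | 100 | 120 | 140 | 160 | 180 | 200 |
|-------|----|----|----|-----|-----|-----|-----|-----|-----|
| 0.0 | 16 | 13 | 11 | 10 | 9 | 8 | 8 | 7 | 7 |
| 0.1 | 15 | 12 | 11 | 9 | 9 | 8 | 7 | 7 | 7 |
| 0.2 | 14 | 11 | 10 | 9 | 8 | 8 | 7 | 7 | 6 |
| 0.3 | 13 | 11 | 9 | 8 | 8 | 7 | 7 | 6 | 6 |
| 0.4 | 12 | 10 | 9 | 8 | 7 | 7 | 6 | 6 | 5 |
| 0.5 | 11 | 9 | 8 | 7 | 6 | 6 | 6 | 5 | 5 |
| 0.6 | 10 | 8 | 7 | 6 | 6 | 5 | 5 | 5 | 4 |
| 0.7 | 9 | 7 | 6 | 5 | 5 | 5 | 4 | 4 | 4 |
| 0.8 | 7 | 6 | 5 | 4 | 4 | 4 | 4 | 3 | 3 |
| 0.9 | 5 | 4 | 4 | 3 | 3 | 3 | 2 | 2 | 2 |
| 1.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |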

Each cell in the table reports half of the length of the confidence interval under the null hypothesis
that the diffusion index is equal to zero. Specifically, it shows the values $z\sqrt{(1 - p_s)/n}$, expressed
in index points (that is, multiplied by 100), for different combinations of $p_s$ (the proportion of
individuals that responded “stay the same”) and $n$ (the sample size). The table uses $z = 1$, or a
68 percent confidence level.
In our example, the mean and median share that responded “stay the same” was 0.7 (ranging from 0.4 to
0.8), and $n$ ranged from 60 to 140, with a mean of 90 and a median of 85. The cell for $p_s = 0.7$ and
$n = 80$ shows a value of 6. This means that the diffusion index $D$ is significantly different from zero
at that confidence level when its absolute value is larger than 6.
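The rule of thumb can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names `half_width_points` and `is_significant` are hypothetical, and the factor of 100 reflects a reading of the table values as index points rather than proportions.

```python
import math

def half_width_points(p_s, n, z=1.0):
    """Half-width z * sqrt((1 - p_s) / n) under the null D = 0, in index points."""
    return 100.0 * z * math.sqrt((1.0 - p_s) / n)

def is_significant(d_points, p_s, n, z=1.0):
    """True if an index reading (in points) lies outside the interval around zero."""
    return abs(d_points) > half_width_points(p_s, n, z)

# The cell discussed in the text: p_s = 0.7, n = 80
print(round(half_width_points(0.7, 80)))  # 6
print(is_significant(10, 0.7, 80))        # True
print(is_significant(-5, 0.7, 80))        # False
```

With $z = 1$ this corresponds to the 68 percent level used in the table; a stricter test would pass a larger `z`.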
Variance
In the Richmond Fed monthly business surveys, the variance equals $(1 - p_s - D^2)/n$, where $D = p_u - p_d$.
Thus, we observe the following with respect to the variance of the observed diffusion index ($\hat{D}$):
1) Because $\hat{D}$ is derived from a weighted sum of means, its variance shrinks in proportion to $1/n$. In
other words, the larger the sample, the smaller the uncertainty.
2) The variance decreases with the magnitude of the diffusion index, $D^2 = (p_u - p_d)^2$. Note that the
variance is zero when all survey participants give the same response, whether “up,” “down” or “same”:
$D^2$ is equal to 1 (with $p_s = 0$) when all respond “up” or all respond “down,” and $D^2$ is equal to 0
(with $p_s = 1$) when all respond “same.” In all three cases, $(1 - p_s - D^2) = 0$.
3) On the other hand, the variance will be maximized if $p_u = p_d$ and $p_s = 0$.
Observations (2) and (3) are key to this point: An equal split of “up” and “down” responses might lead to
a reported index of 0 — just like all respondents reporting “same” — but the variance in the former will
be higher than in the latter. This is because uncertainty in the first case is much higher. Formally:
• If $\hat{p}_u = \hat{p}_d$, then $Var(\hat{D}) = (1 - p_s)/n$.
• If $\hat{p}_s = 1$, then $Var(\hat{D}) = 0$.

In this sense, the variance of the diffusion index can be thought of as reflecting polarization or
disagreement across responses.
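For readers who want to see where the variance expression comes from, one possible derivation is sketched here. It assumes the $n$ responses are independent draws from a multinomial distribution with probabilities $(p_u, p_s, p_d)$, which is how the independent-trials description reads; the intermediate steps are not taken from the brief itself.

```latex
\begin{aligned}
Var(\hat{p}_u) &= \frac{p_u(1 - p_u)}{n}, \qquad
Var(\hat{p}_d) = \frac{p_d(1 - p_d)}{n}, \qquad
Cov(\hat{p}_u, \hat{p}_d) = -\frac{p_u p_d}{n} \\
Var(\hat{D}) &= Var(\hat{p}_u) + Var(\hat{p}_d) - 2\,Cov(\hat{p}_u, \hat{p}_d) \\
             &= \frac{p_u + p_d - p_u^2 - p_d^2 + 2 p_u p_d}{n} \\
             &= \frac{(p_u + p_d) - (p_u - p_d)^2}{n}
              = \frac{1 - p_s - D^2}{n}
\end{aligned}
```

The last step uses $p_u + p_d = 1 - p_s$ and $D = p_u - p_d$.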