
Working Paper Series

Optimal Policy with Probabilistic
Equilibrium Selection

WP 01-03

Huberto M. Ennis
Federal Reserve Bank of Richmond
Todd Keister
Department of Economics and Centro
de Investigación Económica, ITAM

This paper can be downloaded without charge from:
http://www.richmondfed.org/publications/

Optimal Policy with Probabilistic Equilibrium Selection*
Huberto Ennis
Research Department, Federal Reserve Bank of Richmond
huberto.ennis@rich.frb.org

Todd Keister
Department of Economics and Centro de Investigación Económica, ITAM
keister@itam.mx

Federal Reserve Bank of Richmond Working Paper No. 01-03
June 2001

Abstract
This paper introduces an approach to the study of optimal government policy in economies
characterized by a coordination problem and multiple equilibria. Such models are often
criticized as not being useful for policy analysis because they fail to assign a unique
prediction to each possible policy choice. We employ a selection mechanism that assigns,
ex ante, a probability to each equilibrium indicating how likely it is to obtain. With this, the
optimal policy is well defined. We show how such a mechanism can be derived as the
natural result of an adaptive learning process. This approach generates a theory of how
government policy affects the process of equilibrium selection; we illustrate this theory by
applying it to problems related to the choice of technology and the optimal sales tax on
Internet commerce.
JEL Classification Nos.: E61, D83, H21
Keywords: Coordination failure, government policy, learning

* We thank seminar participants at Cornell, the Federal Reserve Bank of Richmond, ITAM, and
Purdue. We are also grateful to Preston McAfee for helpful comments and especially to Karl Shell
for emphasizing the importance of studying how government policy can affect the process of
equilibrium selection. The views expressed herein are those of the authors and do not necessarily
reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System.

1 Introduction
The extent to which purchases made over the Internet should be subject to sales tax is currently the
subject of a lively debate in the United States (and elsewhere). The Internet Tax Freedom Act of 1998
was essentially an agreement to postpone any major decisions for at least three years, meaning that
the issue is still very much undecided. The arguments both for and against special tax treatment of
Internet transactions are numerous, but one of the arguments in favor of special treatment strikes us as
particularly interesting. This argument1 relies on the idea that there is a natural network externality:
as more people engage in e-commerce, the efficiency of e-commerce increases for all users. The
future size of the electronic market is currently unclear, and the cost advantages of e-commerce over
traditional retailing methods make a large electronic marketplace a better outcome (in terms of social
welfare). Lower e-commerce taxes are believed to substantially increase the number of people who
use this medium for commerce.2 Preferential tax treatment, therefore, is aimed at trying to generate
a large e-commerce market.
Putting this argument into the language of formal economic models, the claim is that there are
multiple equilibria with differing levels of e-commerce activity. To keep things simple, suppose
there are two equilibria. In one, the e-commerce market is small and engaging in e-commerce is
not very profitable, leading few people to do so. In the other, the market is large and very efficient,
leading many people to engage in e-commerce. Because e-commerce is claimed to be more efficient than traditional distribution methods, the latter equilibrium socially dominates the former. In
this language, the argument is that by giving e-commerce transactions preferential tax treatment the
government can encourage the economy to settle into the better equilibrium. In other words, the
proposed policy is an attempt to affect the process by which an equilibrium is selected.
This argument raises a fundamental issue that reaches far beyond the specific example: whenever
there are multiple equilibria, actions taken by the government may affect which equilibrium is selected. This means that, in such situations, simply formulating the optimal policy problem requires
a theory of how the policy choice affects the equilibrium selection process. In the present paper,
we provide such a theory for exactly the type of situation described above: coordination-problem
economies with multiple equilibria. Our approach is to assign a probability to each equilibrium indicating how likely it is to obtain. These probabilities then generate a well-defined optimal policy
problem. Solving this problem requires taking into account the effect that the policy choice has on
the probabilities. We show how an adaptive learning process naturally generates such probabilities

1 See Zittrain and Nesson (2000) for one statement of this argument in the popular press.
2 Indeed, Goolsbee (2000) estimates that applying existing sales taxes to Internet commerce would reduce the number of online buyers by about 24 percent.


and that they are affected by policy in intuitive and interesting ways.
We apply our methods to both the Internet sales tax issue and to the general problem of the choice
between technologies in the presence of network externalities. In such situations, there are often two
(symmetric, pure-strategy) equilibria – one where the good technology is adopted and one where the
bad technology is used. These are typically both strict equilibria, neither of which is easily refined
away. There is ample evidence from the historical and experimental literatures that both Pareto-dominant and Pareto-dominated equilibria can arise in such settings.3 Hence rules that select only
Pareto-optimal equilibria do not seem to describe such situations very well. On the other hand, it
is often intuitively clear that one equilibrium seems more likely to obtain than the other. As an
example, consider the problem of an agent deciding whether to engage in e-commerce or traditional
commerce. To simplify the story, suppose she believes the e-commerce market will be either “thick”
or “thin.”4 Suppose further that the gain from choosing e-commerce is large if the market is thick
and that the loss from choosing e-commerce is small (near zero) if the market is thin. If all agents are
identical to this one, then there are two strict equilibria. However, in a world with uncertainty and
imperfect information, it seems intuitively more likely that the economy would settle on the thick-market equilibrium since the large payoff of the e-commerce equilibrium will lead expected-utility-maximizing agents to choose e-commerce for a large range of beliefs about the market thickness. In
the words of Harsanyi and Selten (1988), the e-commerce equilibrium is risk dominant. Hence, it
seems intuitively plausible that a risk-dominant equilibrium might be expected to obtain more often
than a risk-dominated one. How much more often would seem to depend on the strength of its risk
dominance.
Experimental evidence lends strong support to the probabilistic view of equilibrium selection. Van
Huyck, Battalio, and Beil (1990, 1991) report that, in a series of coordination-game experiments,
the frequency with which the subjects converged to each equilibrium varied systematically with the
treatment variables. In other words, the ex ante probability distribution across equilibria was nondegenerate and was affected by small changes in the structure of the game.5 We build on the model
of adaptive learning in Howitt and McAfee (1992) and show that it generates precisely this type of
behavior. Under this learning process, the economy can converge to either equilibrium. Where it
converges depends in part on the particular realizations of uncertainty along the learning path. In the
context of our leading example, suppose that the efficiency level of e-commerce is not known with

3 See Cooper (1999) for an excellent review of this literature.
4 Another type of model where expectations about market thickness can generate multiple equilibria is the market game. See Peck, Shell, and Spear (1992).
5 See Crawford (1997) for an interesting discussion of these results and for an estimation of a general learning model using the experimental data.


certainty. If the first few observations make e-commerce seem attractive, agents will begin to coordinate on the high e-commerce decision, making this equilibrium more likely. We demonstrate that
the probability of converging to a particular equilibrium is negatively related to the risk factor of the
equilibrium. (A low risk factor corresponds to a strongly risk-dominant equilibrium.) The risk factor, in turn, depends on the government’s policy choice. This is the mechanism through which policy
affects equilibrium selection – by making a particular choice less risky for agents in the stochastic
environment, the government can make that equilibrium more likely to obtain.
This mechanism assigns an objective probability to each equilibrium and thereby allows us to
address the question of optimal government policy. We take as the government’s objective the maximization of expected utility across equilibria. The optimal policy derived under our approach typically differs from that derived using any deterministic selection criterion. We show, for example, that
it is generally not optimal to choose the policy that maximizes the (utility) value of the good equilibrium. The reason is that, by deviating in some direction, there is a first-order gain in the likelihood
of attaining the equilibrium, but no first-order loss in the value of that equilibrium. We also show
that even when it is feasible to eliminate the bad equilibrium, it may not be optimal to do so. It may
be optimal to allow the bad outcome to occur with low probability in exchange for a higher (utility)
value if the good equilibrium occurs.
Our approach stands in contrast to the recent work of Morris and Shin (1998), which uses informational imperfections to eliminate the multiplicity of equilibria in a related coordination game.
In that approach, economic fundamentals (and the information structure) determine how agents will
coordinate; there is no role for chance. In our approach, chance plays an important role, with the
economic fundamentals determining the probabilities. Our approach is therefore more in the spirit
of Cole and Kehoe (2000). That paper studies a coordination game embedded in a dynamic general equilibrium model and calculates (for given fundamentals) the maximum probability of a bad
outcome that is consistent with equilibrium. Hence chance plays a role and economic fundamentals
determine the possible probabilities. However, our approach assigns a unique probability to each
outcome, eliminating the need to focus on the worst case scenario.
When there are multiple equilibria, policy choices can affect equilibrium selection by acting as a
sunspot variable and guiding the coordination of agents’ actions.6 For example, agents might expect
the high e-commerce equilibrium to obtain if and only if e-commerce is completely tax free. This
would then be a rational expectations equilibrium. It is extremely difficult, however, to formally

6 Manuelli and Peck (1992) show how intrinsically important economic variables can also act as sunspot variables with respect to agents' expectations.


model agents’ expectations regarding such effects.7 A variety of papers have focused instead on the
“worst” equilibria and looked for policies that eliminate these as possible outcomes.8 Our approach
differs from these in that it assigns an objective probability to each equilibrium and hence generates
a unique probability distribution over the set of equilibria. This can be thought of as endogenizing
agents’ expectations, although doing so requires a departure from rational expectations in the learning
process. The distribution tells us how much weight the government should give to each of the possible
outcomes. This allows us to ask not only can the government eliminate a particular bad equilibrium,
but also should the government eliminate it.
The remainder of the paper is organized as follows. In the next section, we lay out our approach
to optimal policy analysis and equilibrium selection in a general setting. The focus is on illustrating the central ideas rather than proving specific results. In Section 3, we specialize to a class of
coordination-problem economies that contains the examples discussed above, and we provide a detailed analysis of both the learning process and the optimal policy problem in that setting. In Section
4, we then apply our approach to simple models of technology choice. Finally, in Section 5 we offer
some concluding remarks.

2 The General Approach
In this section, we outline our approach in a general setting that allows us to highlight the important
features with minimal complications. We consider economies with a continuum of identical agents.
We focus on symmetric equilibria, where the fact that we have identical agents means that social
welfare is the same as individual welfare. Studying symmetric equilibria facilitates the analysis of
the optimal policy, as there is no need to impose a social welfare function.
2.1 The Model and Equilibrium

Each agent must choose an action a from a set A ⊂ ℝ. There is a (benevolent) government that chooses a policy τ from a set T ⊂ ℝ. In addition, there is aggregate uncertainty represented by the variable θ ∈ Θ ⊂ ℝ. The distribution of θ is given by f, a probability measure on B(Θ), the Borel subsets of Θ. The utility of each agent depends on each of these variables and on the average action in the economy ā; this external effect is what will generate multiple equilibria. Since there is a continuum of agents, we have a truly competitive economy and each agent correctly perceives his

7 See Ghiglino and Shell (2000), section 7, for an interesting discussion of this issue.
8 Contributions along these lines in the sunspots literature include Grandmont (1986), Woodford (1986), Smith (1994), Goenka (1994), and Keister (1998). This approach is also common in the literature on financial fragility.
9 We are assuming that both A and T are one-dimensional, but this does not seem to be important for the results.

own action to have no effect on ā. Each agent's utility is given by the function

U : A \times \bar{A} \times T \times \Theta \to \mathbb{R},

where Ā is the convex hull of A. We assume that the set T is convex and that U is (jointly) strictly concave and twice continuously differentiable in all of its arguments. These assumptions are made so that we can use a simple first-order condition to characterize optimal government policy.
Each agent has a belief about the values of ā and θ that is represented by a joint probability measure F on Ā × Θ. The government policy is commonly known. Each agent therefore maximizes expected utility by solving

\max_{a \in A} \int U(a, \bar{a}, \tau, \theta) \, dF(\bar{a}, \theta).    (1)

Rational expectations equilibrium is a natural solution concept in this environment; we focus on symmetric outcomes.
Definition: A symmetric rational-expectations equilibrium is an a* ∈ A such that

(i) Agents have rational expectations; that is, we have

F(\bar{a}, C) = \begin{cases} f(C) & \text{for } \bar{a} = a^* \\ 0 & \text{for } \bar{a} \neq a^* \end{cases} \quad \text{for all } C \in \mathcal{B}(\Theta), \text{ and}

(ii) Given these beliefs, agents are optimizing (that is, a* solves (1)).

We denote the set of equilibria for a given government policy by E(τ). Our interest is in situations where there is more than one such equilibrium for at least some values of τ, so that equilibrium selection is an issue.10 It is well known in the literature on coordination problems that some degree of payoff complementarity is necessary for the existence of multiple equilibria (see Cooper and John [1988]), so we are assuming that the U function has this property. In Section 4, we work through explicit examples where this is the case.
2.2 Optimal Policy

We now turn our attention to the problem of determining the optimal policy. The traditional approach is to focus on a particular equilibrium a* ∈ E(τ). This equilibrium might be selected by a formal rule, or it might be the focus of attention solely because it has some desirable properties. The government recognizes, of course, that the equilibrium action a* is a function of τ. The benevolent government chooses its policy to maximize the expected utility of each agent in this symmetric equilibrium. The

10 The multiplicity here opens the door to a richer set of possible rational expectations equilibria (related to the correlated equilibria of the coordination game). We choose to restrict our definition exclusively for simplicity (see the remarks in Section 3.1).


traditional optimal policy problem is therefore
\max_{\tau \in T} \int U\bigl(a^*(\tau), a^*(\tau), \tau, \theta\bigr) \, df(\theta).

The shortcoming of this approach is that the resulting optimal policy typically depends on which
equilibrium was selected. As we mentioned above, the evidence on coordination games indicates
that in many cases a unique equilibrium simply cannot be singled out as the prediction of the model.
As a result, this approach does not yield a clear policy prescription.
Suppose instead that we allowed for a probabilistic equilibrium selection mechanism. Such a mechanism assigns, for each value of τ, a probability distribution over the set of equilibria E(τ). We denote the probability of equilibrium a* by π(a*, τ). As the notation indicates, this distribution will typically depend on the government's choice of policy. In the next subsection, we provide a foundation for this type of mechanism by showing that it can be derived as the limiting outcome of an adaptive learning process in the model. First, however, we discuss the implications of such a mechanism for the optimal policy problem. Once a probability is assigned to each equilibrium, the natural goal of the government is to maximize the expected utility of agents across equilibria.11 The optimal policy problem is therefore

\max_{\tau \in T} \int \int U\bigl(a^*(\tau), a^*(\tau), \tau, \theta\bigr) \, df(\theta) \, d\pi(a^*(\tau), \tau).    (2)

Notice how τ enters the objective function by (i) directly affecting the value U, (ii) affecting the equilibrium action a*, and (iii) affecting the equilibrium selection mechanism π.
Deriving the optimal policy this way leads to interesting and sometimes surprising results, as we demonstrate in Section 4. To gain some intuition here, suppose that there are two equilibria, one "good" and the other "bad." Let V_g(τ) and V_b(τ) represent the utility level of agents in each of these equilibria. Also, let π(τ) represent the probability of converging to the good equilibrium (that is, π(g; τ)). Then, the optimal policy problem can be written as

\max_{\tau \in T} \; \pi(\tau) V_g(\tau) + (1 - \pi(\tau)) V_b(\tau).    (3)

Suppose (again for illustration) that π is differentiable in τ. Then the first-order condition for this problem is

\pi(\tau) V_g'(\tau) + (1 - \pi(\tau)) V_b'(\tau) + \pi'(\tau) \bigl( V_g(\tau) - V_b(\tau) \bigr) = 0.

This is in many ways the central equation of the paper; its solution is the optimal policy τ*. It says

11 Agents now face a compound lottery – first an equilibrium is selected and then a state of nature is realized. Maximizing expected utility across equilibria amounts to a reduction of this into a single lottery.


that the optimal policy is the result of the balancing of three forces. One must consider not only the
effect of the policy on the utility of agents in each of the two equilibria, but also the effect of the
policy on the equilibrium selection mechanism.
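To see how these three forces interact, consider a purely illustrative numerical sketch in Python. The functional forms for V_g(τ), V_b, and π(τ) below are hypothetical stand-ins (nothing in the paper pins them down); the point is only that the maximizer of (3) differs from the policy that maximizes V_g alone, exactly as the first-order condition suggests.

```python
import numpy as np

# Hypothetical ingredients for problem (3); none of these come from the paper.
def V_g(tau):          # value of the good equilibrium, maximized at tau = 0.3
    return 10.0 - 20.0 * (tau - 0.3) ** 2

V_b = 4.0              # value of the bad equilibrium (assumed policy-independent here)

def pi(tau):           # probability of the good equilibrium, increasing in tau
    return float(np.clip(0.5 + tau, 0.0, 1.0))

def objective(tau):    # problem (3): pi(tau) V_g(tau) + (1 - pi(tau)) V_b
    return pi(tau) * V_g(tau) + (1.0 - pi(tau)) * V_b

taus = np.linspace(0.0, 0.6, 601)
tau_star = taus[int(np.argmax([objective(t) for t in taus]))]
tau_g = taus[int(np.argmax([V_g(t) for t in taus]))]

print(f"policy maximizing V_g alone       : {tau_g:.3f}")
print(f"policy maximizing expected utility: {tau_star:.3f}")
# tau_star exceeds tau_g: moving slightly past tau_g costs only a second-order
# loss in V_g but buys a first-order gain in pi, the tradeoff behind the FOC.
```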
One way to think about this approach is to contrast it with a situation where agents use a sunspot
variable to coordinate their actions. In that case, with probability π there is sunspot activity and with probability (1 − π) there is not, and π is independent of government policy (and everything else in the economy). There is then an equilibrium where agents choose (g, x_g*) when sunspots are present and (b, x_b*) when they are absent.12 If the government must choose τ before the realization of the sunspot variable, it faces exactly problem (3) except that π does not depend on τ. One can
think of our approach, therefore, as endogenizing the properties of the extrinsic variable on which
agents coordinate their actions. This allows us to study the impact that the government has on this
coordination device and the resulting changes in the optimal policy problem.
The crucial question at this point is obviously where the function π comes from. In the next
subsection, we show how it can be obtained as the natural result of an adaptive learning process
within this environment.
2.3 A Probabilistic Equilibrium Selection Mechanism

In this subsection, we describe how a general learning process induces a probabilistic equilibrium
selection mechanism. In Section 3, we provide a more detailed analysis of the approach in a specific
class of models. Studying learning within this model requires changing the information possessed by
agents so that there is something for them to learn about. In addition to the endogenous uncertainty
about each other’s actions, we introduce uncertainty about the distribution s This means that agents
need to learn about the fundamentals of the economy while they learn about market conditions, and
the coevolution of their beliefs about the two objects determines where the economy converges.
Our interpretation of the learning process is similar to that in Lucas (1986), which advocates using
learning to investigate the plausibility of different equilibria. Using adaptive behavior to predict the
actual performance of the economy along the learning transition does not constitute, for Lucas, a
“serious hypothesis.” We also do not think of our learning process as an accurate description of the
short-run behavior of the economy. Rather, we view it as a mechanism that accurately reflects the
likelihood with which the economy will end up in each equilibrium. Our primary interest is in how
these likelihoods are affected by the government’s actions.
We model the learning process as taking place over an infinite sequence of discrete (artificial)

12 In general, however, sunspot equilibria need not be mere randomizations over equilibria of the non-sunspot economy, as shown by Cass and Shell (1983).


time periods t = 0, 1, 2, .... In period t, each agent must choose an action a_t, and then observe a state θ_t. The state θ_t is drawn from the true distribution f over Θ and is drawn independently across periods. As a result, agents will asymptotically learn the true distribution f. The action a_t is chosen to maximize current-period expected utility, given the agent's time-t beliefs (the beliefs at t = 0 are the initial priors). The agent then observes the average action ā_t and updates his beliefs, and the process repeats itself.
Formally, let Y_t denote the random vector that describes the possible outcomes of the aggregate economy in period t of the learning process. A realization of Y_t is denoted y_t = (ā_t, θ_t), and therefore we have y_t ∈ Y = Ā × Θ. Let Φ represent the set of distributions over Y that agents consider possible for Y_t. We may choose to restrict this set to, say, distributions of a particular parametric form. Agents' beliefs at time t are given by a probability measure over Φ that we denote by μ_t.
Agents begin the learning process with a common prior belief μ_0. We consider a recursive learning process with the following structure:

\mu_{t+1} = B_t(\mu_t, y_t).    (4)

Note that Y_t is, in general, an endogenous random variable. Given the current state of beliefs and the structure of the economy, described by a mapping Ω, we have

Y_t = \Omega(\mu_t; \tau).    (5)

Together, equations (4) and (5) and the initial prior beliefs μ_0 fully describe the learning system. The fact that the laws of evolution of the endogenous variables are determined in part by the learning process makes this system self-referential. (Agents are learning about a system that is being influenced by the learning processes of people like themselves.) This property of the system implies that agents are not learning about a fixed data-generating process (see Marcet and Sargent [1989]). The limiting behavior of beliefs is especially complicated due to this fact. We are interested in adaptive rules B that satisfy certain "natural" requirements (see Woodford [1990]). For example, we would like to consider learning algorithms that satisfy asymptotic consistency: beliefs converge almost surely to the "true" distribution. In self-referential systems there is actually no "true" (fixed) distribution. Although we could still check this requirement for the non-self-referential part of the system, we will only consider adaptive rules that satisfy an even stronger requirement: convergence of beliefs to the rational expectations equilibrium beliefs. Using the notation above, we can define a rational expectations equilibrium as a random vector ŷ such that, together with the belief μ̂ that puts full mass on ŷ, we have

\hat{y} = \Omega(\hat{\mu}; \tau).    (6)

In other words, an equilibrium random vector is such that if agents' beliefs put probability one on that vector, then agents' actions will generate that same vector. We are interested in economies with multiple rational expectations equilibria, that is, with more than one random vector solving (6). Let Φ̂(τ) ⊂ Φ be the set of solutions to equation (6) and M(Φ̂(τ)) the corresponding set of belief distributions that put full mass on a particular element of Φ̂(τ). To study the asymptotic behavior of system (4), it is convenient to consider the set Y^∞ of infinite sequences {y_t} and the induced probability measure P^∞ over that space (which depends on the prior distribution μ_0 and, through (5), the government policy τ). We then focus on adaptive rules that for all sequences y ∈ Y^∞ satisfy

\lim_{t \to \infty} \mu_t = \mu_\infty \in M(\hat{\Phi}(\tau)).
Note that this implies two things. First, it implies that the sequence {μ_t} converges and, second, that it converges to a distribution that puts full mass on only one element of the set Φ̂(τ). Finally, note that if the adaptive learning process satisfies this condition, then it will induce a probability distribution over the set Φ̂(τ) that is given by

\Pr(\hat{y}) = P^\infty\bigl( \{\, y \in Y^\infty : \mu_\infty = \hat{\mu} \,\}; \tau \bigr),

where μ̂ is the belief that puts full mass on ŷ. This is the foundation for our proposed equilibrium selection mechanism (ESM). The probability that we assign to each equilibrium is the probability of the set of sequences {y_t} that converge to it.
It is important to note that both the endogenous variables ā_t and the exogenous random variables θ_t are essential to our problem. If the vector Y_t contained only endogenous variables, then given the prior distribution μ_0, the economy would always follow the same path during the learning process. Then, there would be no chance of convergence with positive probability to more than one rational expectations equilibrium. By the same token, if the vector Y_t contained only exogenous (random) variables, then the set Φ̂(τ) would be a singleton (containing only the measure that puts full mass on the true distribution of Y_t) and hence there would be no equilibrium selection problem.
Implicit in the formulation of problem (2) is the assumption that the policy maker knows f, the true distribution of the random variable θ. Recall that our interest is in determining the optimal policy in the static, rational-expectations model of Section 2.1. In that context, the policy maker must make a single choice τ, knowing f. This may seem somewhat at odds with our learning story – the policy maker chooses τ while knowing f, and then agents begin to learn about f. However, this is an unavoidable aspect of using learning to generate an equilibrium selection mechanism. A rational expectations equilibrium of the model in Section 2.1 – our model of interest – requires that everyone, including


the government, know the true distribution f.13 However, if agents began the learning process knowing f, their initial beliefs would uniquely determine the outcome and learning would not be selecting
the equilibrium. In other words, if an equilibrium could be selected using methods that are entirely
consistent with the original model, there would not have been multiple equilibria to begin with.
In a sense, this is the same type of point made by Morris and Shin (1998) in the context of a model
of self-fulfilling currency attacks. However, it is also where our approach differs fundamentally from
theirs. We perform equilibrium selection for a given economy, whereas they change the economy so
that the equilibrium is unique. In particular, they change the informational structure so that agents
receive different signals about economic fundamentals and must act on the basis of this (incomplete)
information. The result is, in general, not an equilibrium of the original economy. We keep the
original economy as our fundamental object of interest and study an adaptive learning process that
converges to an equilibrium of this economy. We think of our approach in the following way. The
model in Section 2.1 abstracts from the uncertainty that is undoubtedly present in any economic
situation. This is a standard approach and, as long as the uncertainty is not too large, seems reasonable
in the context of the static model. We then add this uncertainty in for the learning process, where it
interacts with agents’ beliefs about each other’s actions in a crucial way.
Another implication of this approach is that the policy maker cannot change τ during the learning process or after the economy has converged to an equilibrium. This seems a natural requirement in
any event for two reasons. First, the policymaking process may be such that changing policies is
costly and/or involves time lags. Second, and more importantly, a policy change could easily cause a
“jump” in agents’ beliefs, which would reset the learning process to some new initial condition. The
strategic interplay between policy changes and changes in beliefs is a very difficult problem, and we
defer this to future work.

3 Coordination Problems
In order to simplify the analysis, we specialize the model to economies with a coordination problem and two symmetric, pure-strategy equilibria. This is most easily done by assuming that agents
face a binary choice, such as the choice between two competing technologies.
3.1 A Binary-Choice Model

We now restrict the choice of agents to include a binary decision, in addition to other payoff-relevant
choices (such as how much to produce with the chosen technology). Therefore, we model the agents’

13 Rewriting the model in Section 2.1 to include uncertainty about f would not resolve the issue; it would only move it to the next level: beliefs about the distribution of the true distribution.


choice set as

A = \{g, b\} \times X,

where X ⊂ ℝ. (We will later interpret g as choosing the "good" technology and b as choosing the "bad" one.) The analog of this binary choice in game-theoretic analysis is the 2 × 2 game that has received so much attention in the literature on equilibrium selection.14
Define x_g* as the solution to

x_g^* = \arg\max_x \int U\bigl( (g, x), (g, x_g^*), \tau, \theta \bigr) \, df(\theta).

Then x_g* will be the value taken by x in an equilibrium where all agents choose g (if such an equilibrium exists). We assume that this equation has a unique solution. The utility value to each agent of being in this equilibrium would then be

V_{gg} \equiv \int U\bigl( (g, x_g^*), (g, x_g^*), \tau, \theta \bigr) \, df(\theta).

We similarly define the utility that an agent would receive from choosing b (and then choosing x optimally) when (almost) every other agent is choosing g by

V_{bg} \equiv \max_x \int U\bigl( (b, x), (g, x_g^*), \tau, \theta \bigr) \, df(\theta).

We assume that V_gg > V_bg holds, meaning that there is indeed an equilibrium where all agents choose g. We define x_b*, V_bb, and V_gb in an analogous way, and we assume that V_bb > V_gb holds. Then V_bb is the utility agents receive in the equilibrium where all agents choose b, and V_gb is the utility value of deviating unilaterally from this equilibrium. Finally, we assume that V_gg > V_bb holds, so that g is the "good" equilibrium and b the "bad" one.
There is an important point to mention here. The definition of rational expectations equilibrium that we will be using (see Section 2.1) is somewhat restrictive, even for this specific kind of coordination game. It is well known that coordination games often admit a large number of correlated equilibria. For example, we could have grouped agents within a finite number of different coordinated groups and then constructed correlated equilibria for which the equilibrium probability distribution over the aggregate state ā is non-degenerate. This is important because, during the learning process, agents hold prior probabilities over a set of possible nondegenerate distributions of the aggregate state ā (as opposed to holding prior distributions directly over the possible states ā, or equivalently only over degenerate distributions).15 Agents are trying to learn about equilibrium

14 See, for example, Harsanyi and Selten (1988), Kandori, Mailath, and Rob (1993), and Matsui and Matsuyama (1995).
15 A subset of these correlated equilibria is the set of (symmetric-information) sunspot equilibria. Our definition rules out those as well. For the relationship between sunspots and correlated equilibria, see Peck and Shell (1991).

behavior and hence define prior beliefs broadly enough to include all possible types of equilibrium. As it turns out, without the existence of a generally accepted correlating device (see Fudenberg and Tirole [1991]), the final equilibria are always of the type described by our restrictive definition. In addition, much of the analysis in this paper can be extended to include such correlation with only minor modifications.
3.2 Optimal Policy

The traditional optimal policy problem is to maximize V_ii(τ), where i is either g or b depending on which of the two equilibria was selected. We denote the solution to this problem by τ_i*, where the i subscript indicates that this is the optimal policy when the government knows that equilibrium i will obtain.
In contrast, our approach uses a probability distribution π over the set of equilibria {g, b}. The optimal policy problem is then given by (3). Without placing further structure on the vector of utilities V, little can be said about the optimal policy τ*. It is typically not equal to either τ_g* or τ_b*. It may fall between these two values, or it may be higher than both. We now look at a specific learning model and the properties of the π that it generates.
3.3 Bayesian Learning and Equilibrium Selection

In this section, we specialize to a particular learning process, that of Howitt and McAfee (1992). We
use this process primarily because it has a simple graphical representation that allows us to illustrate
how policy affects the equilibrium selection process. We should emphasize, however, that the details
of the learning process are not critical for our story. As Section 2.3 indicates, any of a broad class of
adaptive processes could be used. The only real requirement is that the process converges to each of
the two equilibria with positive probability.
We assume that the exogenous random variable θ takes on only two values, θ_H and θ_L. We will later interpret θ as a random fixed cost of operating technology g. In that case, θ_L will correspond to a low cost and θ_H to a high cost. This assumption implies that f is a Bernoulli distribution; let p be the (true) probability of θ_L. We use V_ggL (resp. V_ggH) to denote the (ex post) utility value of equilibrium g when θ takes the value θ_L (resp. θ_H). Therefore we have

V_{gg} = p V_{ggL} + (1 - p) V_{ggH}.

Subscripts added to other variables are defined similarly. We use V to denote the vector whose elements are V_{ijk} for i = g, b; j = g, b; k = H, L.



We also assume that each agent knows that ā will be either (g, x_g*) or (b, x_b*). This is a restriction on the support Φ of the set of beliefs. The agent believes that which of these two events occurs is the result of an i.i.d. Bernoulli random variable, with the probability that (g, x_g*) is chosen given by q. Each agent is uncertain about the values of p and q. She has (independent) beliefs about p and q, which are given by beta distributions over [0, 1] with means p̄ and q̄, respectively. The agent begins at t = 0 with diffuse (uniform) priors for each variable and updates these beliefs using Bayes' rule.16 Because all agents begin with the same priors and observe the same information, they hold identical beliefs at every point in time.
We place the following restrictions on the utility values:

V_{gbL} > V_{bbL} \quad \text{and} \quad V_{bgH} > V_{ggH}.    (7)

The first of these inequalities implies that if p were equal to one, then g would be the optimal choice for the agent regardless of q. In other words, if the agent is optimistic enough about the variable θ, she will choose g regardless of her beliefs about market conditions. The second inequality is the reverse; it implies that if p were zero, then b would be the optimal choice regardless of q. This implies that agents believe it is possible that either one of the equilibria exists or that both exist. We will see below that these assumptions guarantee that the economy has positive probability of converging to each equilibrium from any current set of beliefs. This prevents initial beliefs from having too strong of an effect on the final outcome.
Intuitively, the learning process works as follows. The agent knows that either the good technology
will catch on or the bad one will. In other words, one of the two symmetric equilibria will occur.
What she does not know is what probability to assign to the good outcome. In the same way, she
knows that the cost of operating the good technology will be either low or high, but she does not know
what probability to assign to the low cost. She begins the learning process with beliefs about each
of these probabilities. In each period, after making a decision, she receives two signals that she uses
to update these beliefs for the next period. One signal is a random draw from the cost distribution,
either θ_L or θ_H. With this she updates her belief about the likelihood that θ_L will be the true state. The other signal is about the market. She observes either (g, x_g*) or (b, x_b*) and uses this to update her belief about the likelihood that the good technology will catch on. There is no "true" distribution for this signal to be drawn from; instead, the signal is equal to the optimal response given the agents'

16 Diffuse priors are a standard way of representing "minimal" prior knowledge about a parameter. See, for example, Zellner (1971).


current (shared) beliefs. The expected utility given beliefs (p_t, q_t) of each choice is given by

g : \; p_t q_t V_{ggL} + (1 - p_t) q_t V_{ggH} + p_t (1 - q_t) V_{gbL} + (1 - p_t)(1 - q_t) V_{gbH}
b : \; p_t q_t V_{bgL} + (1 - p_t) q_t V_{bgH} + p_t (1 - q_t) V_{bbL} + (1 - p_t)(1 - q_t) V_{bbH}.

Straightforward algebra shows that the agent chooses g if we have17

q_t \;\geq\; \frac{ \left( V_{bbH} - V_{gbH} \right) - \left( V_{gbL} - V_{gbH} - V_{bbL} + V_{bbH} \right) p_t }{ \left( V_{ggH} - V_{gbH} - V_{bgH} + V_{bbH} \right) (1 - p_t) + \left( V_{ggL} - V_{gbL} - V_{bgL} + V_{bbL} \right) p_t }.    (8)

For a given level of p_t, the agent will prefer technology g if she thinks it is likely enough that other agents will be choosing technology g. Following Howitt and McAfee (1992), we can represent this graphically as in Figure 1.

Figure 1: The dynamics of beliefs
The box in Figure 1 represents the set of all possible (p, q) pairs. We define the set G to be the set of points (p_t, q_t) such that (8) holds. The set B is the complement of G. It is straightforward to show that (7) implies that the curve separating these regions must begin to the right of (0, 1) and end to the left of (1, 0). One can show that the curve is smooth, strictly decreasing, and that it can be either convex or concave depending on the functional forms chosen (in Howitt and McAfee [1992] and in the examples in Section 4, it is linear).
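The boundary between G and B can be computed directly from (8). The short Python sketch below does this for a purely hypothetical payoff vector V chosen only so that assumption (7) holds; it evaluates the threshold level of q_t at several beliefs p_t (values above one mean b is chosen for every q_t, values below zero mean g is chosen for every q_t):

```python
import numpy as np

# Hypothetical ex post payoffs V[i, j, k] (own action i, others' action j, cost
# state k), chosen only so that (7) holds: V_gbL > V_bbL and V_bgH > V_ggH.
V = {
    ("g", "g", "L"): 10.0, ("g", "g", "H"): 2.0,
    ("g", "b", "L"): 6.0,  ("g", "b", "H"): 0.0,
    ("b", "g", "L"): 4.0,  ("b", "g", "H"): 3.0,
    ("b", "b", "L"): 5.0,  ("b", "b", "H"): 4.0,
}

def q_threshold(p):
    """Right-hand side of (8): the lowest q_t at which g is chosen, given p_t."""
    num = (V["b", "b", "H"] - V["g", "b", "H"]) \
        - (V["g", "b", "L"] - V["g", "b", "H"] - V["b", "b", "L"] + V["b", "b", "H"]) * p
    den = (V["g", "g", "H"] - V["g", "b", "H"] - V["b", "g", "H"] + V["b", "b", "H"]) * (1 - p) \
        + (V["g", "g", "L"] - V["g", "b", "L"] - V["b", "g", "L"] + V["b", "b", "L"]) * p
    return num / den

for p in np.linspace(0.0, 1.0, 6):
    print(f"p_t = {p:.1f}: g is chosen whenever q_t >= {q_threshold(p):.3f}")
```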
Let (p_t, q_t) denote the expected value of (p, q) according to the beliefs held by agents at date t. In particular, diffuse priors imply that we have p_0 = q_0 = 1/2. In each iteration of the learning algorithm,

17 Here we are imposing the tie-breaking rule that agents choose g when they are indifferent. This is not important for the results; the only real requirement is that all agents take the same action.


the observation of (ā_t, θ_t) provides useful information for updating (p_t, q_t). Bayesian updating of beliefs allows us to write the expected value of the parameters of the distributions after t observations as

p_{t+1} = \begin{cases} \alpha_t p_t + (1 - \alpha_t) & \text{if } \theta_t = \theta_L \\ \alpha_t p_t & \text{if } \theta_t = \theta_H \end{cases}    (9)

and

q_{t+1} = \begin{cases} \alpha_t q_t + (1 - \alpha_t) & \text{if } (p_t, q_t) \in G \\ \alpha_t q_t & \text{if } (p_t, q_t) \in B, \end{cases}    (10)

where α_t = (t + 2)/(t + 3), for t = 0, 1, 2, .... Such Bayesian learning has a nice representation in Figure 1. If the time-t beliefs fall in G, all agents will choose (g, x_g*) and hence the value of q_t will increase. As Howitt and McAfee (1992) point out, the posterior beliefs always lie on the line segment connecting the prior beliefs with one of the corners of the box. From point x, for example, we would move to y if θ_H is observed and to z if θ_L is observed. Similarly, if the original point is x′ (in B), ā = (b, x_b*) will be observed, and q_t will decrease. We move to point y′ if θ_H is observed and to z′ if θ_L is observed.







Suppose that in period t, agents' beliefs are represented by the point x. To which equilibrium will the economy converge? The answer depends crucially on the sequence of realizations {θ_t}, especially the next few observations. Because x falls in G, we know that agents will be choosing g and hence q_t will be rising. Suppose, however, that the economy is "unlucky" and receives a string of realizations of θ_H. Then p_t will be falling and eventually beliefs will cross into region B. At this point, agents will begin to choose b and q_t will start to fall.
Bayesian updating consistently estimates the value of p; that is, we have p_t → p almost surely as t → ∞. Whether q_t converges to zero or one depends (very roughly speaking) on whether beliefs are in G or B when p_t settles down. A sufficiently unlucky sequence of realizations of θ_t will lead the economy into region B and therefore make convergence to the bad equilibrium likely. Conversely, a sufficiently lucky economy will be driven into G, making convergence to the good equilibrium likely.
Howitt and McAfee (1992) formalize this argument, showing that the probability of converging to each equilibrium is positive. We now show that the learning process must converge.18 We then use Monte Carlo simulation to confirm that the system only converges (with positive probability) to the two rational expectations equilibria. These two results combine to show that the learning process generates a valid equilibrium selection mechanism. The proof of the first result and the description of the computations for the second are contained in the appendix.
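The appendix computations are not reproduced here, but their logic is easy to sketch. The Python snippet below (using the same hypothetical payoffs as in the previous sketch and a hypothetical true probability p of the low cost) iterates the updating rules (9) and (10) together with the decision rule (8), and estimates π as the fraction of simulated paths on which q_t ends up near one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical payoffs satisfying (7) and a hypothetical true probability of theta_L.
V_ggL, V_ggH = 10.0, 2.0
V_gbL, V_gbH = 6.0, 0.0
V_bgL, V_bgH = 4.0, 3.0
V_bbL, V_bbH = 5.0, 4.0
p_true = 0.6

def in_G(p, q):
    # Decision rule (8): agents choose g when q_t is at least the threshold.
    num = (V_bbH - V_gbH) - (V_gbL - V_gbH - V_bbL + V_bbH) * p
    den = (V_ggH - V_gbH - V_bgH + V_bbH) * (1 - p) + (V_ggL - V_gbL - V_bgL + V_bbL) * p
    return q >= num / den

def simulate_path(T=5000):
    p, q = 0.5, 0.5                          # diffuse priors: p_0 = q_0 = 1/2
    for t in range(T):
        alpha = (t + 2) / (t + 3)
        good_chosen = in_G(p, q)             # all agents choose g iff beliefs lie in G
        theta_is_low = rng.random() < p_true
        p = alpha * p + (1 - alpha) * (1.0 if theta_is_low else 0.0)   # rule (9)
        q = alpha * q + (1 - alpha) * (1.0 if good_chosen else 0.0)    # rule (10)
    return q > 0.5                           # path headed toward the good equilibrium?

pi_hat = np.mean([simulate_path() for _ in range(2000)])
print(f"estimated probability of the good equilibrium: {pi_hat:.3f}")
```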

18 Convergence is guaranteed in purely Bayesian models because beliefs form a martingale. This result does not apply to our setting, however, because agents learn using a misspecified model. It is well known that in such an environment, learning need not converge (hence the importance of this first proposition). See Blume and Easley (1998) and Nyarko (1991).


Proposition 1 The learning process {p_t, q_t} converges with probability one.

Proposition 2 Let π be the empirical probability of the set of sequences {θ_t} such that {p_t, q_t} → (p, 1). Then the empirical probability of the set of sequences {θ_t} such that {p_t, q_t} → (p, 0) is equal to (1 − π). In other words, the simulation of the learning process always converges to a symmetric rational expectations equilibrium.
The value of π clearly depends on the position of the curve in Figure 1 and will change when this curve is shifted by, say, a policy change. The curve can be interpreted in a way that relates it to the risk factor of each equilibrium. Following Young (1998), we define the risk factor of equilibrium i ∈ {g, b} to be the smallest probability r such that if an agent believes ā will be (i, x_i*) with probability strictly greater than r, then (i, x_i*) is her unique optimal action. In Figure 1, define q̂ to be the level of q where the curve separating the two regions crosses the vertical line at the true value p. Then the risk factor of equilibrium g is given by q̂, while that of equilibrium b is given by (1 − q̂). It is clear from the diagram that this alone is not enough to determine π – the position of the entire curve matters. Because agents do not know p during the learning process, it matters how risky each strategy seems for each possible belief p. This leads us to give the following definition.
Definition: The risk factor of action i given belief p is the smallest probability r such that if an agent's beliefs about p have mean p and the agent believes ā will be (i, x_i*) with probability strictly greater than r, then (i, x_i*) is her unique optimal action.

The risk factor of action g given belief p is equal to the height of the line at p in Figure 1. (Note that this is equal to one for low enough values of p and zero for high enough values.) In the next proposition, we show that enlarging the region G (by shifting the curve down in some way) strictly increases the probability of attaining the good equilibrium. Hence, a change in the economy that uniformly lowers the risk factor of an equilibrium will increase the probability that the economy reaches that equilibrium.
Proposition 3 If the risk factor of action i ∈ {g, b} given belief p decreases for some p and does not increase for any p, then the probability of equilibrium i strictly increases.

Even though the statement of this proposition is rather intuitive, the proof is fairly complex. It involves establishing that when the curve shifts, (i) the area between the new and old curves is visited with positive probability and (ii) the asymptotic behavior of a trajectory that visits this area is changed with positive probability. The proof is contained in the appendix.
The condition that the risk factor not increase for any belief p may seem strong, but changes in the individual elements of the vector V have exactly this effect. This can be seen by working with equation (8). For every value of p_t such that the curve is in the interior of the box, the threshold level of q_t is either strictly increasing or strictly decreasing in each element of V. Using Proposition 3, this implies that π is monotone in each of the elements of the vector V. This is important because it is through these elements that the policy parameter τ affects π. We state this result as a corollary.

Corollary 1 The value of π is strictly increasing in V_ggH, V_ggL, V_gbH, and V_gbL. It is strictly decreasing in V_bbH, V_bbL, V_bgH, and V_bgL.
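A quick numerical check of this monotonicity, again with hypothetical payoff values, confirms the mechanism: raising V_ggL enlarges the denominator of (8), lowering the threshold wherever the curve lies in the interior of the box and hence (by Proposition 3) raising π.

```python
def q_threshold(p, V_ggL, V_ggH=2.0, V_gbL=6.0, V_gbH=0.0,
                V_bgL=4.0, V_bgH=3.0, V_bbL=5.0, V_bbH=4.0):
    # Right-hand side of (8) as a function of the belief p and the payoff vector V.
    num = (V_bbH - V_gbH) - (V_gbL - V_gbH - V_bbL + V_bbH) * p
    den = (V_ggH - V_gbH - V_bgH + V_bbH) * (1 - p) + (V_ggL - V_gbL - V_bgL + V_bbL) * p
    return num / den

for p in (0.2, 0.4, 0.6):
    base, raised = q_threshold(p, V_ggL=10.0), q_threshold(p, V_ggL=12.0)
    print(f"p = {p:.1f}: threshold {base:.3f} -> {raised:.3f} after raising V_ggL")
```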
We now apply this approach to two simple examples and show how it generates interesting insights
into the corresponding optimal policy problems.

4 Applications
In this section, we apply our approach to the type of technology-choice problems that we discussed
in the introduction. We give two examples, one that captures the central features of the debate over
the Internet sales tax and another that applies to problems of technology choice more generally. Each
example puts a different structure on the vector of utilities V and how they are affected by government policy; this allows us to highlight different features of the optimal policy that may emerge under our
approach.
4.1 Taxing Internet Transactions

We first present a simple model that captures some of the crucial features of the Internet sales tax
debate. In line with the analysis above, there is a continuum of identical agents. Each agent gains
utility from "transacting" with other agents. These transactions occur through one of two technologies, which we label g and b. Technology g (the "good" one) represents Internet transactions and technology b (the "bad" one) traditional store-based methods. Transacting requires effort and we denote the agent's choice of effort level by x ∈ [0, ∞). The utility cost of this effort is quadratic. The utility derived from this effort depends on the productivity of the transactions technology, which is characterized by a network externality. The agent's utility level is given by

u = \begin{cases} x f(\bar{x}_g) - \frac{a}{2} x^2 - \theta & \text{if technology } g \text{ is used} \\ x - \frac{a}{2} x^2 & \text{if technology } b \text{ is used.} \end{cases}

The term x̄_g is the total amount of effort employed by agents using technology g, that is, x̄_g = ∫_K x_i di, where K is the set of agents using technology g. The function f is increasing and strictly concave; higher levels of total effort in technology g make each agent using the technology more productive (this is the external effect that will generate multiple equilibria). In other words, Internet-based transacting becomes more efficient as more people use it. We assume that f(0) > 0 holds, as does

the Inada-type condition

\lim_{x \to \infty} f'(x) = 0.    (11)

In addition, the fixed utility cost θ of operating technology g is stochastic. Keeping with the analysis above, we assume θ ∈ {θ_L, θ_H}, with θ_L < θ_H and the probability of θ_L given by p. We denote the expected value of θ by θ̄. Everything is measured in terms of utility, so the agent's objective function will be linear in probabilities and only the expected cost θ̄ will matter in a rational expectations equilibrium.
This is obviously a very stylized model of exchange. A detailed analysis of government policy
and equilibrium selection in an economy with explicit search frictions and decentralized exchange
can be found in Ennis and Keister (2000).
Because of the externalities, it seems natural for the government to consider intervening to encourage effort in technology g. We assume that the government does this by subsidizing such effort. Let τ ∈ [0, 1) be the rate of ad valorem subsidy and let T be the lump-sum tax (on all agents) that finances this subsidy. This subsidy can be thought of as the discount from the standard sales tax that
Internet transactions receive. A passive government would charge the same tax rate on both types
of transactions. An active government would lower the rate on Internet commerce, effectively subsidizing electronic transactions. This subsidy would be paid for by all agents through either higher
tax rates on other activities or reduced government expenditures. For simplicity, we assume that this
policy does not affect the efficiency or the cost of effort for agents using technology b (that is, that the
government does not raise the tax rate on traditional commerce to pay for the e-commerce subsidy).
4.1.1 Equilibrium
We first examine the set of rational expectations equilibria for a given government policy. If the agent chooses technology b, he will choose his effort level to solve

\max_x \; x - \frac{a}{2} x^2 - T.

The solution to this problem is given by x = 1/a. If instead the agent chooses technology g, his problem is

\max_x \; x f(\bar{x}_g) - \frac{a}{2} (1 - \tau) x^2 - \bar{\theta} - T,    (12)

which is solved by

x = \frac{f(\bar{x}_g)}{(1 - \tau) a}.    (13)

The agent then compares the expected utility given by each technology,

g : \; \frac{f(\bar{x}_g)^2}{2 (1 - \tau) a} - \bar{\theta} - T
b : \; \frac{1}{2a} - T,

and chooses the more promising one. This choice clearly depends on the agent's beliefs about x̄_g. In our rational expectations equilibria, this number is known with certainty. We only look at symmetric equilibria, where all agents act identically. We look first for the good equilibrium, where x̄_g is positive. From (13), the equilibrium effort level is given by the unique solution to

x_g^* = \frac{f(x_g^*)}{(1 - \tau) a}.    (14)
This generates utility level

V_{gg}(\tau) = \frac{f(x_g^*)^2}{2 (1 - \tau) a} - \bar{\theta} - T,

where the value of T is given by the government's budget constraint

T = \frac{\tau a}{2} \left( x_g^* \right)^2.

This is an equilibrium as long as no individual agent is made better off by choosing technology b. That is, we need

V_{gg}(\tau) > V_{bg}(\tau) = \frac{1}{2a} - T

to hold. We assume that the parameter values are such that we have V_gg(0) > V_bg(0) = V_bb, so that the good equilibrium exists when the government is passive and Pareto dominates the bad equilibrium. It is straightforward to show that the difference (V_gg − V_bg) is strictly increasing in τ, and therefore the good equilibrium exists for all subsidy levels.
and therefore the good equilibrium exists for all subsidy levels.
We also assume that the parameter values are such that the bad equilibrium exists when the government is passive (otherwise the problem of equilibrium selection does not arise). For this, we require

V_{bb} > V_{gb}(0),

or

\frac{1}{2a} > \frac{f(0)^2}{2a} - \bar{\theta}.

This does not, however, mean that the bad equilibrium exists for all values of τ. By setting

\tau > \bar{\tau} = 1 - \frac{f(0)^2}{1 + 2 a \bar{\theta}},

the government can eliminate the bad equilibrium (and thus be sure that the good equilibrium will obtain). Finally, we assume that the conditions in (7) hold for all τ. Notice that, while all of the previous assumptions depend on θ̄, these two depend on θ_L and θ_H, respectively. As a result, they can be satisfied for any θ̄ by making the spread between θ_L and θ_H large enough.
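A short numerical sketch may help fix ideas. The Python snippet below uses a hypothetical externality function f and hypothetical values for a and θ̄ (none of them from the paper) to solve the fixed-point condition (14) by iteration and to evaluate V_gg(τ), the bad-equilibrium payoff, and the threshold τ̄ derived above:

```python
import numpy as np

# Hypothetical primitives: f is increasing and strictly concave with f(0) > 0.
a, theta_bar = 1.0, 0.1
f = lambda x: 0.9 + 0.5 * np.sqrt(x)

def x_g_star(tau, n_iter=200):
    """Solve the fixed point (14), x = f(x) / ((1 - tau) a), by simple iteration."""
    x = 1.0
    for _ in range(n_iter):
        x = f(x) / ((1.0 - tau) * a)
    return x

def V_gg(tau):
    x = x_g_star(tau)
    T = tau * a / 2.0 * x ** 2                                  # budget constraint
    return f(x) ** 2 / (2.0 * (1.0 - tau) * a) - theta_bar - T

V_bb = 1.0 / (2.0 * a)                                          # bad-equilibrium payoff
tau_bar = 1.0 - f(0.0) ** 2 / (1.0 + 2.0 * a * theta_bar)       # eliminates the bad equilibrium

for tau in (0.0, 0.2, tau_bar):
    print(f"tau = {tau:.3f}: x_g* = {x_g_star(tau):.3f}, V_gg = {V_gg(tau):.3f}")
print(f"V_bb = {V_bb:.3f}, tau_bar = {tau_bar:.3f}")
```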
4.1.2 Learning and Optimal Policy
Learning in this example proceeds exactly as described in Section 3.3. From Proposition 1, we know that the equilibrium selection mechanism π is well defined for this example. We can use Proposition 3 to determine how the policy variable τ affects π.

Proposition 4 In this example, π is strictly increasing in τ for τ < τ̄.

Proving this proposition entails showing that increasing the subsidy rate strictly expands the region G in Figure 1. The details are contained in the appendix, but the intuition is straightforward. Increasing the subsidy to technology g always makes it more attractive relative to b (since the lump-sum tax is paid by the agent regardless of her choice). This leads agents to choose g for a wider range of beliefs,
which makes convergence to the good equilibrium more likely. This does not, however, imply that a
higher level of subsidy is always better from a welfare standpoint, as we show below.
The simplifying assumptions of this example allow us to gain a fair amount of insight into the
nature of the optimal policy problem. The utility value of the bad equilibrium is independent of the
policy chosen (if no one engages in Internet transactions, the sales tax rate on such transactions is
irrelevant). Hence the only relevant factors for determining the expected utility of agents are the
utility value of the good equilibrium V_gg and the probability of reaching that equilibrium π. The optimal policy problem can be written as

\max_{\tau} \; \pi(\tau) \left( V_{gg}(\tau) - V_{bb} \right).    (15)

There are two cases to consider. If we have

\tau_g^* > \bar{\tau},

then the policy that maximizes the value of the good equilibrium also eliminates the bad equilibrium, ensuring that the good equilibrium will obtain. In this case, correcting the externality present in the good technology is sufficient to make that technology a dominant choice and therefore to eliminate the coordination problem. When this happens, τ_g* is clearly the optimal policy.
The more interesting case is where the coordination problem remains even when the externality is

being corrected, or when we have

\tau_g^* < \bar{\tau}.

In this case, the government faces a tradeoff. By increasing the subsidy level above τ_g*, the good equilibrium becomes less attractive. However, deviating from the good equilibrium also becomes less attractive and thereby makes the good equilibrium more likely. This tradeoff is illustrated in the two panels of Figure 2. For each value of τ, we plot the pair (V_gg(τ) − V_bb, π(τ)) generated by the policy.

Figure 2: Optimal policy
The point generated by τ = 0, for example, has a value of π strictly between zero and one, since we assumed that both equilibria (strictly) exist when the government is passive. As we increase τ, we trace out a curve of feasible points. The arrows in the figure indicate the direction of movement along the curve as τ increases. Initially, both V_gg and π are increasing in τ. When we reach the policy τ_g*, we know that V_gg is at its maximum level. For higher subsidies, V_gg starts to fall, but π continues to increase until it reaches unity at subsidy level τ̄. Increasing τ beyond this is clearly inefficient as π cannot increase further and V_gg continues to decrease.
The level curves of the objective function in (15) are hyperbolas. If eliminating the bad equilibrium is not very costly, as in part (a) of the figure, then τ̄ is the optimal policy. If, however, the situation is as in part (b) of the figure, the optimal policy will fall somewhere in between τ_g* and τ̄. In this case, the optimal policy under our approach is different from that which would be derived under any deterministic selection criterion. In particular, it should be clear from the diagram that the optimal policy will never be τ_g*. This is because increasing τ a little past this point causes a small (second-order) loss in the value of the good equilibrium and brings a larger (first-order) increase in the probability of reaching that equilibrium.
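The tradeoff in Figure 2 can also be illustrated numerically. The sketch below reuses the hypothetical primitives from the previous snippet and adds a hypothetical stand-in for π(τ) (in the paper, π comes from the learning process itself); a grid search over (15) then locates the optimal subsidy, which under these assumed forms lies strictly between τ_g* and τ̄, the case shown in part (b) of Figure 2.

```python
import numpy as np

# Hypothetical primitives, repeated from the previous sketch so this runs standalone.
a, theta_bar = 1.0, 0.1
f = lambda x: 0.9 + 0.5 * np.sqrt(x)

def x_g_star(tau, n_iter=200):
    x = 1.0
    for _ in range(n_iter):
        x = f(x) / ((1.0 - tau) * a)       # fixed point (14)
    return x

def V_gg(tau):
    x = x_g_star(tau)
    return f(x) ** 2 / (2.0 * (1.0 - tau) * a) - theta_bar - tau * a / 2.0 * x ** 2

V_bb = 1.0 / (2.0 * a)
tau_bar = 1.0 - f(0.0) ** 2 / (1.0 + 2.0 * a * theta_bar)

def pi_of_tau(tau):
    # Hypothetical stand-in for the learning-based selection probability:
    # increasing in tau and equal to one once tau reaches tau_bar.
    return min(1.0, 0.4 + 0.6 * tau / tau_bar)

taus = np.linspace(0.0, tau_bar, 500)
tau_g = taus[int(np.argmax([V_gg(t) for t in taus]))]
tau_opt = taus[int(np.argmax([pi_of_tau(t) * (V_gg(t) - V_bb) for t in taus]))]

print(f"tau_g* (maximizes V_gg)      : {tau_g:.3f}")
print(f"tau_bar (eliminates bad eq.) : {tau_bar:.3f}")
print(f"maximizer of objective (15)  : {tau_opt:.3f}")
```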
What does this tell us about the debate over taxing Internet transactions? The model tells us that
Internet commerce should be subsidized for two reasons. The first is straightforward – we assumed
that there is a network externality that should be corrected. However, the optimal subsidy level
is necessarily higher than the level that would just correct the externality because higher subsidies
make the good equilibrium more likely to obtain. This second reason for the subsidy is new to our
approach and can only be seen in a model where the subsidy can affect the equilibrium selection
process. In addition, as part (b) of Figure 2 shows, it may not be optimal to subsidize e-commerce so
much that the good equilibrium is certain to obtain. Instead, it may be optimal to face some risk over
which equilibrium is selected because eliminating the bad equilibrium is too costly.19
4.2 The Choice of Technology
We can modify the above example to study situations where there are two available technologies, both
of which are subject to network externalities.20 There is again a continuum of identical agents, whom
we now think of as producing a single commodity and consuming their own output. Each agent has
two production technologies available, g and h, and can operate only one of them. Both technologies
again require (costly) effort as an input, and the agent chooses an effort level x ∈ [0, ∞). Utility is
linear in output and is given by

    u = g x f(x̄_g) − (a/2) x² − c    if technology g is used,
    u = h x f(x̄_h) − (a/2) x²        if technology h is used.

As before, the function f represents the network externality, which now applies to both technologies.
We maintain all of our assumptions about f from the previous section, including f(0) > 0. (This
last condition will help guarantee that there are no zero-effort equilibria in this setting.) We assume
g > h, so that technology g has a higher marginal product than technology h for a given amount of
total effort. As before, there is a stochastic fixed utility cost c of operating technology g.
We again assume that the government subsidizes effort. Let σ be the rate of ad valorem subsidy
and let T be the lump-sum tax that finances this subsidy. We now assume that the same subsidy rate
must apply to all effort (the government cannot distinguish effort devoted to technology h from effort
devoted to technology g) and, likewise, that the same tax is paid by all agents. Hence, we are not allowing

19 Indeed, we have abstracted from the distortions caused by increasing other tax rates, which would increase the cost
of eliminating the bad equilibrium further.

20 A well-known example of such a situation was the adoption of video cassette recorders with the competing Beta and
VHS technologies. See Katz and Shapiro (1986) and the references therein for a discussion of this and other examples.


the government to pick the “winning” technology by subsidizing it and taxing the other. Instead, the
government can only encourage (or discourage) the entire industry. Because the two technologies
differ, the chosen level of encouragement will affect the equilibrium selection process.
4.2.1 Equilibrium
Regardless of the technology i ∈ {g, h} chosen by the agent, his optimization problem will resemble
(12) above. The solution is therefore of the same form as (13), or

    x_i = i f(x̄_i) / ((1 − σ) a).        (16)

The agent compares the expected utility generated by each technology,

    g :  (g f(x̄_g))² / (2(1 − σ)a) − c − T,
    h :  (h f(x̄_h))² / (2(1 − σ)a) − T,

and chooses the more promising one. We again look at symmetric equilibria, where one of the two
numbers x̄_i will be positive and the other zero. We look first for the good equilibrium, where x̄_g is
positive. As above, the equilibrium effort level is given by the unique solution to (14). We then have

    V_gg(σ) = (g f(x̄*_g))² / (2(1 − σ)a) − c − T_g

and

    V_hg(σ) = (h f(0))² / (2(1 − σ)a) − T_g,

where, from the government's budget constraint, we have

    T_g = σ (a/2) (x̄*_g)².

We assume that the parameter values are such that we have V_gg(0) > V_hg(0), and hence the good
equilibrium exists when the government is completely passive. It is straightforward to show that this
implies that the good equilibrium exists for all values of σ.

Next, we look at an equilibrium where x̄_h is positive. Such an equilibrium is a solution to

    x̄_h = h f(x̄_h) / ((1 − σ) a).

This leads to

    V_hh(σ) = (h f(x̄*_h))² / (2(1 − σ)a) − T_h

and

    V_gh(σ) = (g f(0))² / (2(1 − σ)a) − c − T_h,

where T_h is defined similarly to T_g above. We also assume that we have V_hh(0) > V_gh(0), so that
the bad equilibrium exists when there is no government intervention. In this example, the difference
(V_hh − V_gh) is strictly increasing in σ, which implies that the bad equilibrium exists for all values of σ.
Hence, unlike in the previous example, policy cannot be used here to eliminate the bad equilibrium.
We also assume that (7) holds and that we have V_gg(0) > V_hh(0), so that g is indeed the good
equilibrium.
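To illustrate how the equilibrium objects of this section can be computed, the sketch below solves the symmetric-equilibrium condition for each technology by fixed-point iteration and evaluates the four payoffs. It is only a rough reconstruction under our reading of the model, with illustrative parameter values in the spirit of the numerical exercise below; it is not the authors' program.

import numpy as np

def solve_effort(theta, sigma, a, f, x0=1.0, tol=1e-12, max_iter=10_000):
    """Solve x = theta * f(x) / ((1 - sigma) * a) by fixed-point iteration
    (the map is a contraction for the parameter values used here)."""
    x = x0
    for _ in range(max_iter):
        x_new = theta * f(x) / ((1.0 - sigma) * a)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

# Illustrative primitives (our assumptions).
a, g, h, c = 2.0, 1.4, 1.0, 0.5
f = lambda xbar: np.sqrt(xbar + 1.0)        # network externality, f(0) = 1 > 0

def values(sigma):
    """Return (V_gg, V_hg, V_hh, V_gh) at subsidy rate sigma."""
    x_g = solve_effort(g, sigma, a, f)      # effort when everyone uses g
    x_h = solve_effort(h, sigma, a, f)      # effort when everyone uses h
    T_g = sigma * a * x_g**2 / 2.0          # lump-sum tax in the good equilibrium
    T_h = sigma * a * x_h**2 / 2.0          # lump-sum tax in the bad equilibrium
    denom = 2.0 * (1.0 - sigma) * a
    V_gg = (g * f(x_g))**2 / denom - c - T_g
    V_hg = (h * f(0.0))**2 / denom - T_g    # deviating to h when others use g
    V_hh = (h * f(x_h))**2 / denom - T_h
    V_gh = (g * f(0.0))**2 / denom - c - T_h  # deviating to g when others use h
    return V_gg, V_hg, V_hh, V_gh

V_gg, V_hg, V_hh, V_gh = values(0.0)
print(V_gg > V_hg, V_hh > V_gh, V_gg > V_hh)  # all True: both equilibria exist and g is good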
4.2.2 Learning and Optimal Policy
Appealing to Proposition 1 again gives us an equilibrium selection mechanism π. Because of the
additional complexity of this example (all of the elements of the vector V now depend directly on
σ), drawing diagrams such as those in Figure 2 is not possible. For this reason, we move directly to
numerical analysis. We use the following functional form and parameter values:

    f(x̄) = (x̄ + 1)^{1/2},   a = 2,   h = 1,   g = 1.4,   c = 0.5.

Furthermore, we take c_H = 2/3, c_L = 1/3, and p̄ = 1/2.
We use a grid of values for σ ∈ [0, 0.4], with a step size of 0.005. For each value of σ, we simulate
the learning process 500,000 times and set π(σ) equal to the fraction of times the process converges
to the good equilibrium.21
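The following is a schematic version of this Monte Carlo procedure (our sketch, not the authors' FORTRAN program). The decision boundary passed to estimate_pi is a hypothetical placeholder for the payoff-comparison rule at a given subsidy level, the belief updates are simple empirical frequencies with a uniform prior, and the number of runs is far smaller than the 500,000 used in the paper.

import numpy as np

rng = np.random.default_rng(0)

def estimate_pi(boundary, p_bar=0.5, runs=500, horizon=5_000):
    """Estimate the probability that learning selects the good equilibrium.
    `boundary` maps the belief p about the cost process into the threshold that
    the belief q about others' behavior must exceed for g to be chosen."""
    good_runs = 0
    for _ in range(runs):
        n_low = 0      # number of c_t = c_L realizations observed so far
        n_good = 0     # number of past periods in which everyone chose g
        for t in range(horizon):
            p_t = (1 + n_low) / (2 + t)     # belief after t observations (uniform prior)
            q_t = (1 + n_good) / (2 + t)
            chose_g = q_t > boundary(p_t)   # identical agents all act the same way
            n_good += chose_g
            n_low += rng.random() < p_bar   # period-t cost realization
        good_runs += (1 + n_good) / (2 + horizon) > 0.5
    return good_runs / runs

# A hypothetical downward-sloping decision boundary, chosen so that early cost
# realizations can tip the economy toward either equilibrium; in the model the
# boundary would be derived from the payoff comparison at a given subsidy level.
print(estimate_pi(lambda p: 1.4 - 2.0 * p))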
The results of this exercise are presented in a series of graphs in Figure 3. The first four graphs plot
the elements of V as functions of the policy σ. (These are just the graphs of the expressions given in
the previous section.) The last two graphs are the results of the simulations; they give π and expected
utility as functions of σ. The former shows π to be increasing, as expected.22 The latter shows that the
optimal policy is σ* = 0.24, larger than both σ*_h = 0.185 and σ*_g = 0.225.

This last result clearly demonstrates the importance of considering equilibrium-selection effects in
determining the optimal policy. In this example, σ*_h and σ*_g are the obvious candidates for the optimal
policy. A policy maker who faces uncertainty about which equilibrium will obtain might be tempted
to choose something in between these two values. This would be the correct approach if π did not
depend on σ. Our analysis shows, however, that such a choice is not correct for the chosen parameter
values. As the subsidy level is increased beyond σ*_g, the utility value is decreasing for both equilibria.
21 The computations are done in FORTRAN, using the RAN2 algorithm for generating random numbers. The source
code is available from the authors upon request.

22 The discontinuities in π come from the discrete nature of the learning process. Some of the points in the box in
Figure 4 are visited much more often than others. The value of σ that first makes such an "important" point fall in region
G is accompanied by a discrete jump in π.

Figure 3: Computational results

The good technology suffers less from this increase, though, and hence becomes a relatively more
attractive option. As a result, the probability of the good equilibrium is still increasing. As long as
this effect is large enough (relative to the decreases in the utility values), expected utility continues
to increase.
Notice that the government in this example does not need to know which technology is good and
which is bad since the same subsidy level is applied to both. The point is that increasing the subsidy
to the entire industry will have a positive effect of equilibrium selection, simply because the good
technology always bene¿ts more (or suffers less) from such an increase. This example shows that
policies designed to correct for network externalities can be more powerful (and therefore important)
than is currently recognized. As a result, calculations that ignore the process by which an equilibrium
is selected can substantially underestimate the optimal subsidy level.

5 Concluding Remarks
The main point of this paper is that using a probabilistic equilibrium selection mechanism can
bring models of multiple equilibria to bear on policy questions in interesting and informative ways.
We believe that the probabilistic view of equilibrium selection is both appealing and plausible, and
we have shown how adaptive learning naturally generates such a mechanism. We have also shown
through examples that taking the equilibrium-selection effect into account can reveal that policy
may be more potent than is commonly recognized. We have illustrated our approach using a very
simple model: a static model with identical agents making a binary choice and a focus on symmetric
equilibria. This allowed us to present the issues involved and to discuss the workings of the learning
process as transparently as possible, but it certainly does not mean that the general approach applies
only in such models. In other work (Ennis and Keister [2000]), we use this approach in a model with
explicit search frictions and decentralized exchange. In that context we discuss optimal aggregate
demand management policies, taking into account the effect that policy can have on the equilibrium
selection process. This is only one of many possible applications. We conclude by mentioning two
other areas that we ¿nd promising.
Financial Crises: Models of financial crises are often of the coordination-problem type that we
have studied here. We discussed above the relationship between our approach and that of Morris
and Shin (1998), who deal with currency crises. Our approach could easily be applied to their
model and would allow one to answer questions such as: under what conditions is it optimal for
the government to allow currency crises to occur with positive probability? Both Cole and Kehoe
(2000) and Chari and Kehoe (2000) consider similar issues, and our approach could be brought to
bear on their questions. Cooper and Corbae (2000) study financial collapses (such as occurred in the
Great Depression) as a form of coordination problem. Again, our approach could be used to study
the question of optimal policy and whether or not that entails completely eliminating the possibility
of a collapse.
Monetary Economics: It is well known that monetary models typically exhibit multiplicity of
equilibrium, with at least one equilibrium in which money has no value. This is true in overlapping-generations
as well as search-theoretic environments. Suppose an economy is in a monetary steady
state and the government is considering a substantial change in policy. Extensions of our approach to
these settings would allow one to address questions such as: What is the probability that the policy
change will cause the economy to switch to another (possibly hyperinflationary) equilibrium? Taking
this into account, what is the optimal policy? Although extending our analysis to these dynamic
settings is far from trivial, the potential payoff seems very high.


Appendix A. Proofs
Proposition 1: The learning process {p_t, q_t} converges with probability one.

Proof: The dynamics of p_t are independent of q_t and represent a standard statistical learning process.
Let Ω be the set of all possible sequences {c_t}_{t=1}^∞ and let ω be an element of this set. Let Ω₁ ⊆ Ω be the
set of ω such that p_t(ω) → p̄; by the strong law of large numbers we know that the probability of the
set Ω₁ is one. We will show that for each ω in Ω₁, the learning process converges.

Figure 4: Convergence of beliefs

Let the function q = f(p) represent the curve separating the two regions in Figure 4. Consider a
sequence of small numbers {ε_n} converging to zero. For each n, define

    q_n¹ = f(p̄ + ε_n),
    q_n² = f(p̄ − ε_n).

Notice that we have q_n¹ < q̄ < q_n² for all n. Fix a particular ω in Ω₁, so that we know p_t(ω) converges
to p̄. Then for each ε_n > 0, there exists a t_n such that t > t_n implies

    |p_t(ω) − p̄| < ε_n.

That is, for any small band around p̄, the sequence p_t(ω) will eventually enter the band and never
leave. If q_t(ω) is ever sufficiently low after this happens, that is, if we have

    q_t(ω) < q_n¹   for some t ≥ t_n,        (A-1)

then the trajectory will never switch regions again (because doing so would require leaving the ε-band).
Hence all future observations will involve all agents choosing h, and therefore q_t will converge
to zero. Similarly, if we have

    q_t(ω) > q_n²   for some t ≥ t_n,        (A-2)

all future observations will involve all agents choosing g, and q_t(ω) will converge to one. Therefore
if, for any n, either (A-1) or (A-2) is satisfied, the learning process converges to one of the two
equilibria. The conclusion of the proposition is therefore established unless we have

    q_t(ω) ∈ [q_n¹, q_n²]   for all t ≥ t_n, for all n.

The continuity of f implies that q_n¹ and q_n² both converge to q̄ as n goes to infinity, so in this case
q_t(ω) must converge to q̄. This establishes our claim. ∎

Proposition 2: Let π be the empirical probability of the set of sequences {c_t} such that {p_t, q_t} →
(p̄, 1). Then the empirical probability of the set of sequences {c_t} such that {p_t, q_t} → (p̄, 0) is equal
to (1 − π). In other words, the simulation of the learning process always converges to a symmetric
rational expectations equilibrium.

Proof: Note that, because of Proposition 1, we only need to verify that the system does not converge
to the point (p̄, q̄). The law of motion for q_t has a discontinuity along the curve in Figure 4, which
passes through (p̄, q̄). This makes the asymptotic behavior around that point very difficult to study
analytically. For this reason, we turn to a numerical methodology. From (8), we have that the curve
separating the two regions is of the form

    q = α p / (β(1 − p) + γ p),

where α, β, and γ satisfy

    0.5 > α > γ,   β ≥ α,   and   β + γ > 2α − γ > 0.

We fix p̄ = 0.5 and construct a grid over the three-dimensional parameter space containing a total
of 20 points in each dimension. Points in this grid that satisfy the previous inequalities correspond
to different shapes of the curve in Figure 4.23 For each such point, we simulate 1,000 runs of the
learning process.

23 To cover uniformly all possible shapes of the curve, we construct the grid using an auxiliary variable ν = β − γ. The
restrictions on the parameters imply that ν > −1. When ν < 0 the curve is concave and when ν > 0 the curve is convex.
Hence, we consider a grid that concentrates half of the points in the negative range of ν and half in the positive range.
We set convergence bounds for q_t in the following manner. First we compute q̄. Then we define
the variables bound_G = 1 − min{(1 − q̄)/5, 0.1} and bound_B = min{q̄/5, 0.1}. When q_t goes above
bound_G and the system has not switched zones for the last 2,000 iterations, we say the economy
has converged to the good equilibrium. For the bad equilibrium we use a similar procedure when q_t
goes below bound_B. If neither of these events has occurred after 100,000 iterations, we say that the
economy did not converge to one of the two equilibria. (This would be the case, for example, if q_t
were to converge to q̄.)
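A possible implementation of this convergence check, under our reading of the bounds just described, is sketched below; it is illustrative only and simplifies the region test to a comparison with q̄ (in the model the region depends on the whole point (p_t, q_t) through the curve).

def classify_run(q_path, q_bar, window=2_000, cap=100_000):
    """Classify a simulated path of q_t as converging to the good equilibrium,
    the bad equilibrium, or neither, following the bounds described above."""
    bound_g = 1.0 - min((1.0 - q_bar) / 5.0, 0.1)   # threshold near q = 1
    bound_b = min(q_bar / 5.0, 0.1)                  # threshold near q = 0
    region = None
    last_switch = 0
    for t, q in enumerate(q_path[:cap]):
        new_region = 'G' if q > q_bar else 'B'       # simplified zone check
        if region is not None and new_region != region:
            last_switch = t
        region = new_region
        settled = (t - last_switch) >= window        # no switch in the last `window` steps
        if settled and q > bound_g:
            return 'good'
        if settled and q < bound_b:
            return 'bad'
    return 'no convergence'

# Example: a path drifting monotonically toward q = 1 is classified as 'good'.
print(classify_run([1.0 - 1.0 / (t + 2) for t in range(10_000)], q_bar=0.5))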
The convergence bounds may seem somewhat large, but it should be kept in mind that the step
size of a Bayesian learning process decreases fairly rapidly. As an example, a process that reaches
q_t = 0.1 after 2,000 steps and that continues monotonically approaching the bad equilibrium will
take over 18,000 more steps to reach q_t = 0.01.24 As a result, tightening the convergence bounds is
computationally very expensive. However, this small step size also means that the probability of a
sequence switching regions after not having switched in the previous 2,000 iterations is likely to be
minuscule.

24 Using the equation q_{t+1} = [(t + 2)/(t + 3)] q_t, it can be shown that q_{t+n} = [(t + 2)/(t + 2 + n)] q_t and hence that
n = (t + 2)[(q_t / q_{t+n}) − 1] holds. For the numbers above, this gives us n = 2,002(10 − 1) = 18,018.
In every case, the system converged to one of the two rational expectations equilibria. In fact, the
process never switches regions after the first quarter of the total number of possible iterations. Based
on this, we claim that the empirical support of the limit of the learning process is the two rational
expectations equilibria. The FORTRAN code is available from the authors upon request. ∎
Proposition 3: If the risk factor of action i ∈ {g, h} given belief p decreases for some p and does
not increase for any p, then the probability of equilibrium i strictly increases.

The proof of this proposition is fairly long, so we first offer a brief discussion. It is fairly straightforward
to see that the probability of equilibrium i cannot decrease. Suppose the curve in Figure 4 shifts
in such a way that the new region G strictly contains the old one. Pick any ω in Ω and let (p_t, q_t) be
the sequence generated by ω before the shift, and (p̃_t, q̃_t) the sequence afterwards. Then we have

    p̃_t = p_t  and  q̃_t ≥ q_t  for all t.

This implies that if (p_t, q_t) converged to (p̄, 1) before the change, it will still do so after the change,
and therefore π cannot decrease.

Showing that the probability actually increases is much more difficult because it requires establishing
specific properties of the (probabilistic) behavior of trajectories in the box. We break this task
into parts. We present and prove three lemmas and then use these results to prove the proposition.
Our first lemma applies for a fixed curve f and shows that any open set near enough to the center of
the box is visited with positive probability. Let p₁ and p₂ denote the values of p at which f intersects
the top and the bottom of the box, respectively (see Figure 5). We then have the following.

Lemma 1: Fix any t₀ ≥ 0, any starting point (p_{t₀}, q_{t₀}), any target point (p̂, q̂) with p₁ < p̂ < p₂, and
any ε > 0. Then there exists a finite number T ≥ t₀ and a sequence {c_t}_{t=t₀}^{T} such that the trajectory
from (p_{t₀}, q_{t₀}) is within ε of (p̂, q̂) at time T.

Proof of Lemma 1: Suppose (p̂, q̂) is below the curve f, as depicted in Figure 5. (The reverse case
is completely symmetric.) Draw the line segment starting at the origin, running through (p̂, q̂), and
ending on f. Let (p̃, q̃) denote the endpoint of this segment on f, and let x denote the entire segment.
Consider a band around this segment with width δ = 2ε (so that a δ-square around (p̂, q̂) falls both
inside this band and inside the ε-ball). Notice that if a trajectory enters this band between (p̂, q̂) and
(p̃, q̃) when the maximum step size is less than δ, a long enough sequence of consecutive c_t = H
realizations will lead the trajectory to land in the ε-ball, as desired.

Figure 5: Visiting a neighborhood of (p̂, q̂)

Next, draw the line from (0, 1) passing through (p̃, q̃); denote this line y. Also draw the parallel
line segment that intersects f at the same point as the lower bound of the δ-band and continues to
the right. Denote this segment by y′. Suppose that a trajectory lands in the strip between y and y′
when the maximum step size is less than δ. Then a sufficiently long sequence of c_t = H realizations
will lead the trajectory to first cross f into the δ-band around x, and to then land in the ε-ball around
(p̂, q̂), as desired. All that remains, therefore, is to show that for an arbitrary starting point and time,
there exists a finite sequence of realizations that will lead the trajectory to land in this strip at a time
when the maximum step size is less than δ.

To do this, we use the line y and the curve f to divide the box into three regions, as labelled
in Figure 5. First, suppose (p_{t₀}, q_{t₀}) is in region 1. Then a long enough sequence of consecutive
c_t = H realizations will bring the trajectory into region 2. From any point in region 2, a long enough
sequence of c_t = H realizations will make q_t < q̃, and then a long enough sequence of c_t = L
realizations will take the trajectory into region 3.25

From any point in region 3, a long enough sequence of c_t = H realizations will lead the trajectory
to either (i) land in the strip between y and y′ or (ii) step across this strip and land in region 1. Notice
that if t is large enough (so that the step size is small enough), the former will necessarily occur. If
(ii) occurs, the above process can be repeated to construct a (long but finite) sequence of realizations
that leads the trajectory to cycle until (i) occurs. Because the maximum step size is converging to
zero, (i) must occur with a maximum step size of less than δ in finite time. ∎

25 It is also possible that the trajectory will enter the δ-band between (p̂, q̂) and (p̃, q̃) before getting to region 3, at
which point switching to c_t = H will lead to the desired result if the maximum step size is less than δ.
Note that all the arguments in the proof of Lemma 1 involve finite sequences of specific realizations
of c_t and hence involve events that would occur with (perhaps very low but) positive probability.
In other words, starting from any point, the probability of entering any open set containing values of
p between p₁ and p₂ is positive. The next lemma shows that once a target neighborhood is reached,
p_t can stay in that neighborhood for arbitrarily long periods of time.

Lemma 2: Pick any p_T ∈ (0, 1), any ε > 0, and any N ≥ 1. Let T be large enough that the maximum
step size of p_t is less than ε. Then there exists a sequence of realizations {c_t}_{t=T+1}^{T+N} such that p_t remains
in the interval (p_T − ε, p_T + ε) for all t satisfying T ≤ t ≤ T + N.

Proof of Lemma 2: If p_t ≤ p_T, then a realization of c_t = L will ensure that p_{t+1} is in the desired
interval. If p_t ≥ p_T, then a realization of c_t = H will do the same. This allows one to construct a
sequence of realizations of arbitrary length that keeps p_t within ε of p_T. ∎
Lemma 2 shows that the behavior of p_t can be "controlled" for finite periods of time using events
of positive probability. The next lemma provides an infinite-period counterpart, showing that the
probability of staying in any neighborhood of p̄ is positive in the long run.

Lemma 3: Fix any ε > 0 and any T₁ large enough that the maximum step size of p at T₁ is less than ε.
Suppose we have a partial history of realizations ω^{T₁} = {c_t}_{t=1}^{T₁} such that p_{T₁} ∈ (p̄ − ε, p̄ + ε). Then
we have

    Pr[ p_t ∈ (p̄ − ε, p̄ + ε) for all t ≥ T₁ | ω^{T₁} ] > 0.

Proof of Lemma 3: Suppose this is not true. Then there exists an ε > 0, a T₁ (where the
maximum step size of p is less than ε), and a partial history ω^{T₁} with p_{T₁} ∈ (p̄ − ε, p̄ + ε) such that

    Pr[ p_t ∉ (p̄ − ε, p̄ + ε) for some t > T₁ | ω^{T₁} ] = 1.        (A-3)

Pick an arbitrary partial history ω̃^{T̃} such that the generated partial trajectory {p̃_t}_{t=1}^{T̃} (which does
not necessarily pass through the point p_{T₁} above) has p̃_{T̃} ∈ (p̄ − ε, p̄ + ε).

Returning to our original partial history ω^{T₁}, we can construct a finite sequence of N realizations
that will lead the trajectory {p_t}_{t=T₁+1}^{T₁+N} to (i) stay in the interval (p̄ − ε, p̄ + ε) and (ii) land on the point
p̃_{T̃} at time T₁ + N. To show this, we define

    χ_t = 1 if c_t = L,  and  χ_t = 0 if c_t = H.

We can then write the period-t belief as

    p_t = (1 + Σ_{τ=1}^{t} χ_τ) / (2 + t).
Notice that p̃_{T̃} must be a rational number and can therefore be written as the ratio of two integers I
and J. For the partial history ω^{T₁}, define

    N_L = I − (1 + Σ_{τ=1}^{T₁} χ_τ)   and   N_H = J − I − (1 + T₁ − Σ_{τ=1}^{T₁} χ_τ).

The integers I and J can be chosen large enough that N_L and N_H are both non-negative and that
T₁ + N ≥ T̃, where N = N_H + N_L. Then appending a sequence of N realizations, N_L of which are
c_t = L, to the partial history ω^{T₁} will lead the trajectory to land on p̃_{T̃} at time T₁ + N, satisfying (ii).
These realizations can be ordered as in Lemma 2 to keep p_t in the ε-band around p̄ so that (i) is also
satisfied. (As long as positive numbers of both types of realizations remain, move in the direction of
p̄. Then the last string of (identical) realizations will lead monotonically to p̃_{T̃}.)
Because it is of finite length, this string of N realizations follows ω^{T₁} with positive probability. By
(i), the trajectory has not exited the ε-band around p̄ between periods T₁ and T₁ + N. Therefore, by
(A-3) it must do so after time T₁ + N with probability one. In other words, we have

    Pr[ p_t ∉ (p̄ − ε, p̄ + ε) for some t > T₁ + N | ω^{T₁+N} ] = 1.

However, any continuation history that, when appended to ω^{T₁+N}, causes p_t to exit the ε-band for
some t will also cause p̃_t to exit the ε-band when appended to ω̃^{T̃}. This is because the step size at
T₁ + N is smaller than at T̃, so that p_t will be closer to their common starting point p̃_{T̃} = p_{T₁+N} than
is p̃_t for every t. Recall that (by independence) the set of continuation histories and their probabilities
is the same after every partial history. Therefore the probability of exiting the ε-band following ω̃^{T̃}
is at least as great as that following ω^{T₁+N}, which is unity. Therefore we have

    Pr[ p̃_t ∉ (p̄ − ε, p̄ + ε) for some t > T̃ | ω̃^{T̃} ] = 1.

This is true for any partial history ω̃^{T̃} at any time that it enters the ε-band – it must exit the band with
probability one. This implies that if we look at complete histories ω, we must have

    Pr[ p_t ∉ (p̄ − ε, p̄ + ε) infinitely often ] = 1,

which contradicts the strong law of large numbers. ∎
In other words, this lemma shows that if the trajectories following some partial history were driven
away from p̄ with probability one, the same would be true for the trajectories following every partial
history. This would imply that convergence to p̄ is a zero-probability event, which we know is false.
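As a quick numerical check of the construction used in the proof of Lemma 3 (our illustration, with arbitrary numbers for the partial history and the target belief), the following confirms that appending N_L low-cost and N_H high-cost realizations defined in this way lands the belief exactly on I/J:

from fractions import Fraction

# An arbitrary partial history of length T1 with S low-cost realizations,
# so that p_{T1} = (1 + 23) / (2 + 40).
T1, S = 40, 23
target = Fraction(5, 9)             # a rational target belief p~ = I/J

# Scale I and J up (i.e., pick a large enough representation of the same
# rational number) until N_L and N_H are both non-negative.
k = 1
while True:
    I, J = k * target.numerator, k * target.denominator
    N_L = I - (1 + S)               # additional low-cost realizations needed
    N_H = (J - (2 + T1)) - N_L      # additional high-cost realizations needed
    if N_L >= 0 and N_H >= 0:
        break
    k += 1

N = N_L + N_H
p_final = Fraction(1 + S + N_L, 2 + T1 + N)
print(N_L, N_H, p_final == target)   # the appended string lands exactly on I/J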
We now use these lemmas to prove Proposition 3.
Proof of Proposition 3: We will focus on the case where the curve shifts in such a way that the
risk factor of action g decreases for some p < p̄, as depicted in Figure 6. The other cases are
completely symmetric and therefore omitted. Let f₁ denote the curve before the change and f₂ the
curve afterwards.26 Let Δ denote the set of points lying between the two curves and between the
vertical lines at p₁ and p₂ (where the original curve intersects the top and the bottom of the box).
Note that the continuity of the curves implies that Δ has a non-empty interior.

Pick an arbitrary open ball in the set Δ, and let 2ε be the radius of this ball. Consider the ε-ball
centered at the same point. Pick t₀ large enough so that the maximum step size for p_t is less than
ε for all t ≥ t₀. Then Lemma 1 tells us that there is a finite number T ≥ t₀ and a sequence of
realizations {c_t}_{t=t₀}^{T} such that, under the dynamics generated by curve f₁, (p_T, q_T) falls in the ε-ball.
Let ω^T denote this partial history. The dynamics generated by curve f₂ under ω^T will lead to the
same p_T (since the behavior of p_t is independent of the curve) and to some q̃_T ≥ q_T.

Figure 6: Asymptotic behavior depends on f

Notice that the ε-ball around (p_T, q_T) is also contained in the set Δ (this was the reason for the 2ε
radius of the original ball). Compute the number of steps N that would be required to move q from
q_T to below q̄. Then from Lemma 2 we know that there is a sequence of N realizations such that p_t
stays in the interval (p_T − ε, p_T + ε), and therefore the trajectory does not switch regions between
periods T and T + N (using either curve). Let ω^{T+N} denote the partial history in which these N
realizations are appended to ω^T. This partial history leads the trajectory to a point like a in Figure 6
at time T + N when the curve is f₁, and to a point like b when the curve is f₂.

Next, draw a line from (p_T + ε, q̄) to (0, 1). Let p̂ be the value of p where this line crosses the
curve f₁. Note that p̂ > p̄ must hold. Also note that if p_t stays in the interval (p_T − ε, p̂) for all
future t, the trajectory from point a will never change regions and will therefore converge to q = 0.
Likewise, under the same restriction, the trajectory from point b will converge to q = 1. Lemma 3
tells us that we have27

    Pr[ p_t ∈ (p_T − ε, p̂) for all t > T + N | ω^{T+N} ] > 0.

Let D be the set of ω ∈ Ω that begin with the partial history ω^{T+N} and then remain in the interval
(p_T − ε, p̂) for all t ≥ T + N. By construction, the trajectory generated by any ω ∈ D would have
converged to (p̄, 0) with the curve f₁ but converges to (p̄, 1) with the curve f₂. The probability of the
set D is given by

    Pr[D] = Pr[ω^{T+N}] · Pr[ p_t ∈ (p_T − ε, p̂) for all t > T + N | ω^{T+N} ],

which is positive because both terms on the right-hand side are positive. In addition, as discussed
above, any ω that led the economy to converge to (p̄, 1) under the curve f₁ will do the same under
the curve f₂. Therefore the probability of the set of sequences that lead the economy to converge to
(p̄, 1) has strictly increased. ∎

26 The continuity and decreasing nature of the curves are the only properties that we use here. Equation (8) actually
puts much more structure on the nature of the possible changes. However, this structure does not seem to simplify the
proofs in any way, and we therefore do not use it.

27 The asymmetry of the bounds around p̄ is not important here. We could impose symmetric bounds as in Lemma 3
and then use a finite sequence of realizations to lead the trajectory into these bounds. We skip this simply to avoid
introducing further notation.
Proposition 4: In this example, π is strictly increasing in σ for σ < σ̄.

Proof: In this example, using (14) allows us to write the curve dividing the regions G and B in
Figure 1 as a straight line in (p_t, q_t)-space: the indifference condition is linear in p_t because utility is
linear in the cost c, and its coefficients depend on the policy σ through the terms (a/2)(1 − σ)(x̄*_g)²,
the constant φ ≡ f(0)²/(2a), which has been introduced to simplify the notation, and (1 − σ)(c_H − c_L).

We first look at the value of p_t when q_t is equal to unity (that is, the value of p_t at which agents would
need to be certain that x̄_g = x̄*_g in order to be willing to choose technology g). This intersection of the
curve with the top of the box, denoted p₁, satisfies

    (c_H − c_L) p₁ = c_H + k − (a/2)(1 − σ)(x̄*_g)²,

where k collects terms that do not depend on σ. From (14), we see that x̄*_g is strictly increasing in σ,
and also that (a/2)(1 − σ)(x̄*_g)² can be replaced by (1/2) x̄*_g f(x̄*_g). This latter term is strictly increasing
in x̄*_g, and hence in σ. This demonstrates that p₁ is strictly decreasing in σ. In Figure 1, this means that
as σ increases, the intersection of the curve with the top of the box moves to the left.

We next examine the change in the intersection of the curve with the bottom of the box. This is
the value of p_t when q_t is equal to zero, denoted p₂, which satisfies

    (c_H − c_L) p₂ = c_H + k − φ/(1 − σ).

This is clearly also decreasing in σ, and hence this intersection moves to the left as well. Since in this
example the curve is a straight line for all values of σ, this implies that a small increase in σ decreases
the risk factor of action g given belief p for all p in (p₁, p₂). The risk factor for all other values is
unchanged. Therefore, by Proposition 3, π strictly increases. ∎
References
BLUME, L. and EASLEY, D. (1998), “Rational Expectations and Rational Learning”, in M. Majumdar (ed.), Organizations with Incomplete Information (Cambridge: Cambridge University Press).
CASS, D. and SHELL, K. (1983), "Do Sunspots Matter?", Journal of Political Economy 91, 193-227.
CHARI, V.V. and KEHOE, P.J. (2000), “Financial Crises as Herds”, (Working paper no. 600, Federal
Reserve Bank of Minneapolis, March).
COLE, H. and KEHOE, T. (2000), "Self-fulfilling Debt Crises", Review of Economic Studies 67,
91-116.
COOPER, R. (1999) Coordination Games: Complementarities and Macroeconomics (Cambridge:
Cambridge University Press).
COOPER, R. and CORBAE, D. (2000), “Financial Fragility and Active Monetary Policy: A Lesson
from the Great Depression” (mimeo., May).
COOPER, R. and JOHN, A. (1988), “Coordinating Coordination Failures in Keynesian Models”,
Quarterly Journal of Economics 103, 441-463.
CRAWFORD, V. (1997), “Learning Dynamics, Lock-in, and Equilibrium Selection in Experimental
Coordination Games”, (Discussion paper 97-19, University of California, San Diego).
ENNIS, H. and KEISTER, T. (2000), “Aggregate Demand Management and Equilibrium Selection”,
(mimeo., December).
FUDENBERG, D. and TIROLE, J. (1991) Game Theory (Cambridge, MA: MIT Press).


GHIGLINO, C. and SHELL, K. (2000), "The Economic Effects of Restrictions on Government Budget Deficits", Journal of Economic Theory 94, 106-137.
GOENKA, A. (1994), "Fiscal Rules and Extrinsic Uncertainty", Economic Theory 4, 401-416.
GOOLSBEE, A. (2000), “In a World Without Borders: The Impact of Taxes on Internet Commerce”,
Quarterly Journal of Economics 115, 561-576.
GRANDMONT, J.M. (1986), “Stabilizing Competitive Business Cycles”, Journal of Economic Theory 40, 57-76.
GUESNERIE, R. and WOODFORD, M. (1992), “Endogenous Fluctuations”, in J.-J. Laffont (ed.),
Advances in Economic Theory, Sixth World Congress, Vol II (Cambridge: Cambridge University
Press).
HARSANYI, J. and SELTEN, R. (1988) A General Theory of Equilibrium Selection in Games (Cambridge, MA: MIT Press).
HOWITT, P. and MCAFEE, R. P. (1992), “Animal Spirits”, American Economic Review 82, 493-507.
KANDORI, M., MAILATH, G. J. and ROB, R. (1993) “Learning, Mutation, and Long Run Equilibria
in Games”, Econometrica 61, 29-56.
KATZ, M. L. and SHAPIRO, C. (1986) “Technology Adoption in the Presence of Network Externalities”, Journal of Political Economy 94, 822-841.
KEISTER, T. (1998), "Money Taxes and Efficiency when Sunspots Matter", Journal of Economic
Theory 83, 43-68.
LUCAS, R. E. (1986), "Adaptive Behavior and Economic Theory", Journal of Business 59, S401-S426.
MANUELLI, R. and PECK, J. (1992), "Sunspot-like Effects of Random Endowments", Journal of
Economic Dynamics and Control 16, 193-206.
MARCET, A. and SARGENT, T. (1988), "The Fate of Systems with 'Adaptive' Expectations", American Economic Review 78, 168-172.
MARCET, A. and SARGENT, T. (1989), "Convergence of Least Squares Learning Mechanisms in
Self-referential Linear Stochastic Models", Journal of Economic Theory 48, 337-368.
MATSUI, A. and MATSUYAMA, K. (1995), "An Approach to Equilibrium Selection", Journal of
Economic Theory 65, 415-434.
MORRIS, S. and SHIN, H.S. (1998), "Unique Equilibrium in a Model of Self-fulfilling Currency
Attacks", American Economic Review 88, 587-597.
NYARKO, Y. (1991), "Learning in Mis-specified Models and the Possibility of Cycles", Journal of
Economic Theory 55, 416-427.
PECK, J. and SHELL, K. (1991), "Market Uncertainty: Correlated and Sunspot Equilibria in Imperfectly
Competitive Economies", Review of Economic Studies 58, 1011-1029.
PECK, J., SHELL, K. and SPEAR, S. (1992), "The Market Game: Existence and Structure of Equilibrium", Journal of Mathematical Economics 21, 271-299.
SHELL, K. (1977), "Monnaie et Allocation Intertemporelle" (CNRS Séminaire Roy-Malinvaud,
Paris, November. Title and abstract in French, text in English. Translation forthcoming in Macroeconomic Dynamics.)
SMITH, B. (1994), "Efficiency and Determinacy of Equilibrium under Inflation Targeting", Economic Theory 4, 327-344.
VAN HUYCK, J., BATTALIO, R. and BEIL, R. (1990), “Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure”, American Economic Review, 80, 234-248.
VAN HUYCK, J., BATTALIO, R. and BEIL, R. (1991), “Strategic Uncertainty, Equilibrium Selection Principles, and Coordination Failure in Average Opinion Games”, Quarterly Journal of Economics, 106, 885-910.
WOODFORD, M. (1986), “Stationary Sunspot Equilibria in a Finance Constrained Economy”, Journal of Economic Theory, 40, 128-137.
WOODFORD, M. (1990), “Learning to Believe in Sunspots”, Econometrica 58, 277-307.
YOUNG, H. P. (1998) Individual Strategy and Social Structure: An Evolutionary Theory of Institutions (Princeton: Princeton University Press).
ZELLNER, A. (1971) An Introduction to Bayesian Inference in Econometrics (New York: John
Wiley & Sons).
ZITTRAIN, J. and NESSON, R. (2000), "How to Think about the Internet Sales Tax Quandary", The
New Republic Online (May, available at http://thenewrepublic.com/online/jzrn050200.html).
