Notes

Emp IO

Nice things to know

What is IO?

This section is not meant to define the field of IO formally. I am just writing down some things that IO research seems to focus on.

IO is about understanding market structure and how it affects equilibrium outcomes such as price and quantity. By market structure, we mainly mean the various features of the supply side of the market (the firms). IO tries to understand the characteristics of market structure and how they affect welfare (consumer surplus, profit) in equilibrium.

Old IO used to analyze empirical associations between market structure and outcomes across industries. But this was problematic because market structure is endogenous. This led to the period of game theory. Game theory was useful because it provided tools to characterize the strategic interactions of the individuals that make up the market.

New empirical IO seems to focus on a specific industry and to employ economic theory and econometric methods to fully characterize the market of interest. In the process, researchers developed many useful tools (demand estimation, production function estimation, dynamic models, etc.) that are also helpful to non-IO researchers.

Why do we start with indirect utility in Discrete choice model?

Actually, this can be derived from the usual utility function. Suppose a consumer maximizes:

\[ \max_{q_1, q_2, c} U(q_1, q_2, c) \]

subject to

\[\begin{align} p_1 q_1 + p_2 q_2 + c = m \\ q_1 q_2 = 0 \end{align}\]

\(c\) is the numeraire good with its price normalized to 1. The constraint \(q_1 q_2 = 0\) means the consumer buys at most one of the two goods. To make this simpler, we will assume that each quantity has to be either 1 or 0.

Then we can get the conditional indirect utility function. Conditioning on \(q_1 = 0\), the consumer solves

\[ \max_{q_2, c} U(0, q_2, c) \]

subject to

\[ p_2 q_2 + c = m \]

This problem has standard demand functions \(q_2(p_2, m)\) and \(c(p_2, m)\) as its solution. Plugging these into the utility function gives the conditional indirect utility function:

\[ V_2(p_2, m) = U(0, q_2(p_2, m), c(p_2, m)) = U(0, 1, m-p_2) \]

where the last equality holds because each of \(q_1, q_2\) is either 0 or 1, so conditional on buying good 2 we have \(q_2 = 1\) and the remaining income \(m - p_2\) is spent on the numeraire.

Doing the same conditioning on \(q_2 = 0\), we can see that the solution to the discrete choice part of the decision problem is given by a choice between the two conditional indirect utility functions:

\[ \max_{j \in \{1, 2\}} V_j (p_j, m) \]

This is the discrete choice model we usually see. We then usually add random shocks to accommodate the variation in people’s choices in real-world data.
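
To make the derivation concrete, here is a minimal sketch in Python. The functional form of `utility` and all the numbers are assumptions made up for illustration; they are not from the notes above. It builds \(V_j(p_j, m) = U(\cdot)\) with one unit of good \(j\) and \(m - p_j\) of the numeraire, then picks the larger of the two.

```python
import math

def utility(q1, q2, c):
    # Hypothetical utility: fixed values for each good plus log of the numeraire.
    return 2.0 * q1 + 1.5 * q2 + math.log(c)

def conditional_indirect_utility(j, prices, m):
    """V_j(p_j, m): buy one unit of good j, spend the rest on the numeraire."""
    q = [0, 0]
    q[j] = 1
    return utility(q[0], q[1], m - prices[j])

def discrete_choice(prices, m):
    """Return the index of the chosen good and both conditional utilities."""
    values = [conditional_indirect_utility(j, prices, m) for j in range(2)]
    best = max(range(2), key=lambda j: values[j])
    return best, values

# Index 0 is good 1, index 1 is good 2.
best, values = discrete_choice(prices=[3.0, 1.0], m=10.0)
print(best, values)
```

Raising the price of good 1 enough (for example to 6.0 in this toy setup) flips the choice to good 2, which is all the discrete choice part of the problem does.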

What is \(\varepsilon_{ijt}\)? (in progress)

I think the \(\varepsilon\) part is a random shock to the conditional indirect utility of product \(j\). But this becomes a bit odd as we add more random components to the model, as in the mixed logit of BLP. In that setup, people usually describe \(\varepsilon\) as superimposed noise that is there to make estimation convenient: having \(\varepsilon\) lets us derive the conditional choice probabilities in multinomial logit form. In this case, \(\varepsilon\) does not seem to have much economic meaning. But sometimes I feel people think of this term as an idiosyncratic error that is specific to an individual. If we treat it as something specific to a particular individual (and uncorrelated with other unobserved systematic factors), this interpretation could make sense.

Ackerberg et al. (2006)

Demand systems

Before modern empirical IO, demand systems were usually based on “representative agent models” in a “product space.” This led to various problems:

Problems of representative agent models

Since a lot of the demand systems were estimated from aggregate market-level data, it was hard to generalize the analysis of a particular market to different settings, because agents are heterogeneous across markets. To alleviate this, researchers tried methods such as imposing an a priori distribution of consumer characteristics and aggregating up to the market level. The problem with this approach was that the distribution was ad hoc and unrealistic.

Solutions of representative agent problem

One solution to this problem was the use of simulation methods. Instead of assuming a particular distribution, researchers would draw vectors of consumer characteristics from observed population data (e.g. the CPS) for that market. Then they would determine the choices made by these individuals for given parameter values and aggregate the choices to get predicted demand. After that, they would apply a search algorithm to find the parameter values that make predicted demand match observed demand.
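
The simulation idea can be sketched as follows. Everything here is a toy assumption: the lognormal `incomes` stand in for draws from real population data, the utility form is invented, and the "observed" shares are generated from the model itself so that the search has a known answer. The search is a crude one-dimensional grid, not any particular paper's algorithm.

```python
import numpy as np

def predicted_shares(theta, incomes, x, prices):
    """Share of sampled consumers choosing each option (index 0 = outside good)."""
    beta, alpha = theta
    # Hypothetical utility: price sensitivity falls with income.
    u = beta * x[None, :] - alpha * prices[None, :] / incomes[:, None]
    u = np.column_stack([np.zeros(len(incomes)), u])  # outside good has utility 0
    choices = u.argmax(axis=1)                         # each consumer picks the best option
    return np.bincount(choices, minlength=u.shape[1]) / len(incomes)

rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=1.0, sigma=0.5, size=5000)  # stand-in for population draws
x = np.array([1.0, 2.0])        # one characteristic per product
prices = np.array([1.0, 3.0])

# Pretend these are the shares observed in the data (true alpha = 2.0).
observed = predicted_shares((1.0, 2.0), incomes, x, prices)

# Grid search over alpha, holding beta fixed at 1.0, to match predicted and observed shares.
grid = np.linspace(0.5, 4.0, 36)
losses = [np.sum((predicted_shares((1.0, a), incomes, x, prices) - observed) ** 2)
          for a in grid]
alpha_hat = grid[int(np.argmin(losses))]
print(observed, alpha_hat)
```

Because each consumer's choice is a hard argmax, the predicted shares are step functions of the parameters, which is one reason smooth choice probabilities (such as logit) are convenient in practice.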

Problems of product space

“Too many parameters problem”: If we model the demand system in product space, we need to estimate a large number of parameters. With \(J\) products there are on the order of \(J^2\) own- and cross-price elasticities, which can easily become impossible to estimate with the data we have.
“New goods problem”: Since we are in product space, we can only think about goods that already exist. We cannot analyze the demand for new goods.

Solutions of product space problem

One solution was to transition from product space to characteristic space. Now a product is a bundle of characteristics and individuals’ preferences are defined on those characteristics.

Background on characteristics space

In fact, papers such as McFadden (1974, 1981) and others had already provided well-defined econometric models consistent with this setup. But they were not widely used in IO, for two reasons:

The logit model’s tendency to give rise to the IIA problem.
Early models did not account for unobserved product characteristics.

IIA problem

Before we go deep into the IIA problem, please be aware that IIA is not always an unrealistic assumption. In fact, there are settings where the assumption is plausible. What matters is the objective of your research question.

Anyway, IIA stands for Independence of Irrelevant Alternatives. It means that for any two alternatives \(i\) and \(k\), the ratio of their logit probabilities does not depend on the other alternatives. So why is this property problematic? It becomes an issue when we are interested in substitution patterns across products in a market. A nice illustration is the famous red bus and blue bus example. Suppose that in some market consumers can take either a car or a blue bus to work, and that the representative utilities of the two modes are the same. Then we have \(P_c = P_{bb} = \frac{1}{2}\), so the ratio of the logit probabilities is 1. Now suppose a red bus is introduced. Since the only difference between the red bus and the blue bus is color, the ratio of their logit probabilities is also one: \(P_{rb} / P_{bb} = 1\). But in the logit model, the ratio for car and blue bus must still be one as well. The only probabilities consistent with both ratios are \(P_c = P_{bb} = P_{rb} = \frac{1}{3}\). This is unrealistic: the probability of taking the car should not be affected by the introduction of another bus, so we would expect the red bus to draw only from blue bus riders, giving \(P_c = \frac{1}{2}\) and \(P_{bb} = P_{rb} = \frac{1}{4}\).
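
The red bus and blue bus numbers are easy to check directly. This is a minimal sketch; the representative utilities are all set to zero, which is just one way to make them equal.

```python
import numpy as np

def logit_probs(v):
    """Multinomial logit choice probabilities for representative utilities v."""
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()

before = logit_probs(np.array([0.0, 0.0]))       # car, blue bus: each 1/2
after = logit_probs(np.array([0.0, 0.0, 0.0]))   # car, blue bus, red bus: each 1/3

# The car / blue bus ratio is unchanged by the new alternative (IIA).
print(before[0] / before[1], after[0] / after[1])
```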

This problem leads to proportional substitution. In the logit model, the cross-elasticity of the probability of any alternative \(j\) with respect to an attribute of product \(i\) is the same for all \(j \neq i\). So a change in one product’s attribute leads to the same proportional shift in the probabilities of all other alternatives. Of course this is unrealistic behavior (especially if you are interested in how substitution patterns vary across products).
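
Proportional substitution can also be seen numerically. In this sketch the mean utilities in `delta` and the size of the shock are arbitrary assumptions; the point is only that the percentage change in every other product's share is identical.

```python
import numpy as np

def logit_shares(delta):
    """Plain logit shares with the outside good's utility normalized to 0."""
    e = np.exp(delta)
    return e / (1.0 + e.sum())

delta = np.array([1.0, 0.5, -0.2])   # hypothetical mean utilities of three products
base = logit_shares(delta)

# Make product 0 less attractive (e.g. a price increase) and leave the others alone.
shocked = logit_shares(delta - np.array([0.3, 0.0, 0.0]))

# Percentage change in the shares of products 1 and 2: the two entries coincide.
pct_change = shocked[1:] / base[1:] - 1.0
print(pct_change)
```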

Solutions to IIA

There are many ways to solve it. Mostly, people use other models such as the nested logit model or the random coefficients (mixed logit) model.

Unobserved product characteristics problem

This is similar to the traditional simultaneous equations problem in demand and supply estimation. Basically, price is correlated with the error term (here the unobserved product characteristic \(\xi_j\)). A simple solution would be to apply IV using supply shifters. But this is not easy for the demand system, because the unobservable is embedded in complex non-linear functional forms.

Solutions to unobserved product characteristics problem

This was solved by Berry (1994) and BLP (1995). By applying a non-linear change of variables or a contraction mapping, we can recover an equation that is linear in the unobservables. Then we can apply traditional IV estimation to overcome the endogeneity problem.
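
For the plain logit case (no random coefficients) the change of variables has a closed form: \(\delta_j = \log s_j - \log s_0\), where \(s_0\) is the outside good's share. Here is a minimal sketch of that special case; the values in `delta_true` are made up, and the shares are generated from the model so the inversion can be checked.

```python
import numpy as np

def logit_shares(delta):
    """Plain logit market shares with the outside good's utility set to 0."""
    e = np.exp(delta)
    return e / (1.0 + e.sum())

def berry_inversion(shares):
    """Recover mean utilities from shares: delta_j = log(s_j) - log(s_0)."""
    s0 = 1.0 - shares.sum()  # outside good share
    return np.log(shares) - np.log(s0)

delta_true = np.array([1.0, -0.5, 0.3])   # hypothetical mean utilities
shares = logit_shares(delta_true)
delta_hat = berry_inversion(shares)       # agrees with delta_true up to rounding
print(delta_hat)
```

Once \(\delta_j\) is recovered, it is linear in the characteristics and in \(\xi_j\), so it can serve as the dependent variable in a linear IV regression.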

Simple model

\[ u_{ijt} = U(\tilde{x}_{jt}, \xi_{jt}, z_{it}, v_{it}, y_{it} - p_{jt}, \theta). \]

Usually we will just drop the index \(t\), which stands for the market. We assume there are \(k\) dimensions of product characteristics. In practice, we do not explicitly model expenditure in other markets. Instead, income is subsumed into either \(v_i\) or \(z_i\) and utility is modelled as depending explicitly on price.

\[ u_{ij} = U(\tilde{x}_j, \xi_j, z_i, v_i, p_j, \theta). \]

We then parameterize the model in linear fashion and let

\[ U_{ij} = \sum_k x_{jk}\theta_{ik} + \xi_j + \varepsilon_{ij}, \]

where \(\theta_{ik} = \overline{\theta_k} + \theta_k^{o\prime} z_i + \theta_k^{u\prime} v_i\). We normalize the utility of the outside good to \(U_{i,0} = 0\).

We can then fully write the model as

\[ U_{ij} = \overbrace{\delta_j}^{\sum_k x_{jk} \overline{\theta_k} + \xi_j} + \sum_{kr} x_{jk} z_{ir} \theta_{rk}^o + \sum_{kl} x_{jk} v_{il} \theta_{kl}^u + \varepsilon_{ij}. \]

Steps in estimation

Step I

Approximate the aggregate shares conditional on a particular value of \((\delta, \theta)\). McFadden (1974) showed that under the logit assumption, we can find the choice probabilities implied by the model analytically, conditional on \(v_i\).

\[ \sigma_j(\theta, \delta) = \int \overbrace{\frac{\exp [ \delta_j + \sum_{kl} x_{jk} v_{il} \theta_{kl}^u ]}{1 + \sum_q \exp[ \delta_q + \sum_{kl} x_{qk} v_{il} \theta_{kl}^u ]}}^{\text{by McFadden}} f(v) d(v). \]

This integral generally has no closed form. We then use simulation to approximate it, following Pakes (1986).

\[ \sigma_j(\theta, \delta, P^{ns}) = \sum_{r=1}^{ns} \ldots \]
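
The sum above is left elided in these notes. As a rough sketch only, the code below assumes one common form: a simple average of the logit probabilities over \(ns\) draws of \(v_i\). The array names, dimensions, and the standard normal draws are all assumptions for illustration.

```python
import numpy as np

def simulated_shares(delta, x, theta_u, v):
    """Average logit probabilities over simulated taste draws.

    delta:   (J,)    mean utilities
    x:       (J, K)  product characteristics
    theta_u: (K, L)  coefficients on unobserved tastes
    v:       (ns, L) simulated draws of v_i
    """
    # mu[i, j] = sum_{k,l} x[j, k] * v[i, l] * theta_u[k, l]
    mu = v @ theta_u.T @ x.T
    e = np.exp(delta[None, :] + mu)
    probs = e / (1.0 + e.sum(axis=1, keepdims=True))  # outside good has utility 0
    return probs.mean(axis=0)                         # assumed simple average over draws

rng = np.random.default_rng(0)
v = rng.standard_normal((500, 1))          # ns = 500 draws, L = 1 (assumed distribution)
x = np.array([[1.0], [2.0], [0.5]])        # J = 3 products, K = 1 characteristic
theta_u = np.array([[0.8]])
delta = np.array([0.5, -0.2, 0.1])
shares = simulated_shares(delta, x, theta_u, v)
print(shares)
```

Setting `theta_u` to zero removes the heterogeneity, and the function then returns the plain logit shares.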

Step II

Then apply the contraction mapping of BLP (1995), iterating on \(\delta\) until the simulated shares match the observed shares \(s^n\):

\[ \delta_j^k(\theta) = \delta_j^{k-1} (\theta) + \log[s_j^n] - \log[\sigma_j(\theta, \delta^{k-1}, P^{ns})]. \]

Then we can get the unobservables in linear form:

\[ \xi_j(\theta, s^n, P^{ns}) = \delta_j(\theta, s^n, P^{ns}) - \sum_k x_{jk} \overline{\theta_k}. \]
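
A minimal sketch of this step, under assumptions: `mu` holds hypothetical individual deviations from mean utility (the \(\sum_{kl} x_{jk} v_{il} \theta^u_{kl}\) term for fixed \(\theta\)), the "observed" shares are generated from a known `delta_true` so the result can be checked, and `x` and `theta_bar` are made-up values used only to back out \(\xi_j\).

```python
import numpy as np

def simulated_shares(delta, mu):
    """Simulated shares: average logit probabilities over the rows of mu."""
    e = np.exp(delta[None, :] + mu)
    return (e / (1.0 + e.sum(axis=1, keepdims=True))).mean(axis=0)

def solve_delta(observed_shares, mu, tol=1e-12, max_iter=5000):
    """Iterate delta <- delta + log(s_obs) - log(sigma(delta)) until it stops moving."""
    # Start from the plain logit inversion.
    delta = np.log(observed_shares) - np.log(1.0 - observed_shares.sum())
    for _ in range(max_iter):
        new = delta + np.log(observed_shares) - np.log(simulated_shares(delta, mu))
        if np.max(np.abs(new - delta)) < tol:
            return new
        delta = new
    raise RuntimeError("contraction mapping did not converge")

rng = np.random.default_rng(0)
mu = 0.5 * rng.standard_normal((200, 3))   # hypothetical deviations: ns = 200, J = 3
delta_true = np.array([1.0, 0.0, -1.0])
s_obs = simulated_shares(delta_true, mu)   # stand-in for observed shares
delta_hat = solve_delta(s_obs, mu)

# Back out the unobservable given hypothetical characteristics and mean coefficients.
x = np.array([[1.0, 2.0], [1.0, 0.5], [1.0, 1.0]])
theta_bar = np.array([0.2, 0.3])
xi_hat = delta_hat - x @ theta_bar
print(delta_hat, xi_hat)
```

The tolerance is kept tight here because errors in \(\delta\) carry through to \(\xi\) and then into the estimation step that uses it.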

Step III

Good ol’ GMM.

\[ G_{J, n, ns} (\theta) = \sum_j \xi_j(\theta, s^n, P^{ns}) f_j(w). \]
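
As a sketch of the moment condition, take \(\delta\) as given (that is, hold the non-linear parameters fixed), so that \(\xi\) is linear in \(\overline{\theta}\) and the GMM minimizer has a closed form. The simulated data, the instrument choice in `Z`, the weight matrix `W`, and the use of a sample average instead of a sum are all assumptions for illustration.

```python
import numpy as np

def gmm_objective(theta_bar, delta, X, Z, W):
    """Quadratic form in the sample moments (1/J) * sum_j xi_j * z_j."""
    xi = delta - X @ theta_bar
    g = Z.T @ xi / len(xi)
    return float(g @ W @ g)

def linear_gmm(delta, X, Z, W):
    """Closed-form minimizer of gmm_objective when xi is linear in theta_bar."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ delta))

# Simulated data, purely illustrative: price is correlated with xi.
rng = np.random.default_rng(0)
J = 2000
cost = rng.standard_normal(J)       # hypothetical supply shifter
other = rng.standard_normal(J)      # hypothetical second instrument
xi = rng.standard_normal(J)
price = 1.0 + cost + 0.5 * other + 0.5 * xi + 0.1 * rng.standard_normal(J)

X = np.column_stack([np.ones(J), price])
Z = np.column_stack([np.ones(J), cost, other])
delta = X @ np.array([2.0, -1.0]) + xi

W = np.linalg.inv(Z.T @ Z / J)      # this choice of W reproduces 2SLS
theta_hat = linear_gmm(delta, X, Z, W)
print(theta_hat, gmm_objective(theta_hat, delta, X, Z, W))
```

In the full problem the non-linear parameters are searched over in an outer loop, with \(\delta\) recomputed at each candidate value.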

Additional sources of info on demand parameters

Adding equilibrium assumptions (e.g. a supply equation or pricing equation) can make the estimation more precise.
Adding micro data: Sometimes we can get more precise estimates if we have individual-level data on consumers’ characteristics. This is becoming more relevant now that a lot of micro data is available to researchers.