Introduction
This vignette covers the simplest demand models in the BLP framework:
the plain logit and the logit with product fixed effects. These models
serve as useful starting points before moving to random coefficients
specifications (see vignette("random-coefficients")).
The Logit Demand Model
Utility Specification
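In the logit demand model, consumer $i$ in market $t$ receives utility from product $j$:
$$u_{ijt} = \delta_{jt} + \varepsilon_{ijt}, \qquad \delta_{jt} = x_{jt}'\beta + \alpha p_{jt} + \xi_{jt}$$
where $\delta_{jt}$ is the mean utility that is common across all consumers, and $\varepsilon_{ijt}$ is an i.i.d. Type I Extreme Value (Gumbel) taste shock. The term $\xi_{jt}$ is product-market quality that is unobserved by the econometrician.
The consumer chooses the outside option (good 0) if no inside good provides higher utility than $u_{i0t} = \varepsilon_{i0t}$, where the mean utility of the outside option is normalized to zero.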
The Logit Inversion
Under the Type I EV assumption, the market share of product $j$ takes the well-known logit form:
$$s_{jt} = \frac{\exp(\delta_{jt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})}$$
Berry (1994) showed that this can be analytically inverted to recover mean utilities from observed shares:
$$\delta_{jt} = \log s_{jt} - \log s_{0t}$$
where $s_{0t} = 1 - \sum_{k=1}^{J_t} s_{kt}$ is the outside good share. This inversion is exact and requires no iteration, a major computational advantage of the plain logit.
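As a minimal sketch of the inversion in base R, the snippet below builds logit shares from a hypothetical vector of mean utilities and then recovers those utilities from the shares alone. The numbers are made up for illustration and nothing here calls rblp.

```r
# Hypothetical mean utilities for three inside goods in one market
delta <- c(1.0, 0.2, -0.5)

# Logit shares: exp(delta_j) / (1 + sum_k exp(delta_k));
# the 1 in the denominator is the outside good with mean utility 0
denom  <- 1 + sum(exp(delta))
shares <- exp(delta) / denom
s0     <- 1 - sum(shares)

# Berry (1994) inversion: delta_j = log(s_j) - log(s_0)
delta_hat <- log(shares) - log(s0)

print(round(shares, 4))
print(all.equal(delta_hat, delta))
```

The round trip is exact up to floating-point error, which is why no fixed-point iteration is needed for the plain logit.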
Why Instruments Are Needed
Substituting the linear specification for $\delta_{jt}$, the estimating equation is:
$$\log s_{jt} - \log s_{0t} = x_{jt}'\beta + \alpha p_{jt} + \xi_{jt}$$
Prices are set by firms that observe $\xi_{jt}$ (unobserved quality), creating endogeneity: $\mathrm{Cov}(p_{jt}, \xi_{jt}) \neq 0$. OLS estimation of $\alpha$ is therefore biased (typically toward zero, understating price sensitivity).
Instrumental variables (IV) are needed to consistently estimate the price coefficient. The moment condition is:
$$E[z_{jt}\,\xi_{jt}] = 0$$
where $z_{jt}$ are instruments that are correlated with prices but uncorrelated with unobserved quality. Common choices include:
- BLP instruments: sums of characteristics of other own-firm and rival products
- Differentiation instruments (Gandhi & Houde 2020): measures of product isolation in characteristic space
- Cost shifters: input prices, exchange rates, etc.
The Nevo cereal dataset comes with 20 pre-computed excluded
instruments (demand_instruments0 through
demand_instruments19).
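To illustrate why instruments matter, here is a small base R simulation, offered as a sketch rather than as what rblp does internally. All variables and parameter values are hypothetical: `z` plays the role of a cost shifter, `xi` is unobserved quality, and `y` stands in for the inverted share.

```r
set.seed(1)
n     <- 5000
alpha <- -2                        # true price coefficient (hypothetical)
z     <- rnorm(n)                  # cost shifter: moves price, unrelated to quality
xi    <- rnorm(n)                  # unobserved quality
p     <- 1 + 0.8 * z + 0.6 * xi + rnorm(n, sd = 0.5)  # firms price on quality
y     <- 0.5 + alpha * p + xi      # stands in for log(s_j) - log(s_0)

# OLS ignores the positive correlation between p and xi
ols <- unname(coef(lm(y ~ p))["p"])

# Two-stage least squares by hand: project p on z, then regress y on the fit
p_hat <- fitted(lm(p ~ z))
iv    <- unname(coef(lm(y ~ p_hat))["p_hat"])

cat(sprintf("true: %.2f  OLS: %.3f  IV: %.3f\n", alpha, ols, iv))
```

In this setup the OLS estimate is pulled toward zero while the IV estimate lands close to the true value. The manual second stage gives the right point estimate but not valid standard errors, so it is for illustration only.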
Example: Plain Logit with Nevo Data
library(rblp)
# Load the Nevo (2000) cereal data
products <- load_nevo_products()
cat(sprintf("Observations: %d\n", nrow(products)))
#> Observations: 2256
cat(sprintf("Markets: %d\n", length(unique(products$market_ids))))
#> Markets: 94
cat(sprintf("Products per market: %d\n", length(unique(products$product_ids))))
#> Products per market: 24
The plain logit includes an intercept and observable product characteristics in the linear index:
# Define the linear demand formulation: intercept + prices + sugar + mushy
f1 <- blp_formulation(~ prices + sugar + mushy)
# Create the problem (single formulation = logit)
logit_problem <- blp_problem(list(f1), products)
# Solve with one-step GMM
logit_results <- logit_problem$solve(method = "1s")
print(logit_results)
#> BLP Estimation Results
#> Method: 1S GMM
#> Objective: 2.047932e+02
#> Optimization converged: TRUE
#> FP converged: TRUE (94 total iterations)
#>
#> Parameter Estimates:
#> parameter estimate se t_stat
#> (Intercept) -2.810674 0.109432 -25.684
#> prices -11.699731 0.858444 -13.629
#> sugar 0.048381 0.004208 11.498
#> mushy 0.043127 0.052717 0.818
Key observations:
- The price coefficient ($\hat{\alpha} \approx -11.70$) is negative, as expected: higher prices reduce utility.
- The exogenous product characteristics (intercept, sugar, mushy) instrument for themselves, and the 20 pre-computed instruments serve as excluded instruments for the price variable.
- This is an IV-GMM estimator, not OLS, so it addresses price endogeneity.
Post-Estimation Checks
# Elasticities for the first market
first_market <- logit_results$problem$unique_market_ids[1]
E <- logit_results$compute_elasticities(first_market)
cat("Own-price elasticities (first 5 products):\n")
#> Own-price elasticities (first 5 products):
print(round(diag(E)[1:5], 3))
#> [1] -0.833 -1.325 -1.529 -1.516 -1.779
# Consumer surplus
cs <- logit_results$compute_consumer_surplus()
cat("\nConsumer surplus (first 5 markets):\n")
#>
#> Consumer surplus (first 5 markets):
print(round(cs[1:5], 4))
#> C01Q1 C03Q1 C04Q1 C05Q1 C07Q1
#> 0.0503 0.0458 0.0944 0.0472 0.0772
A well-known limitation of the plain logit is the Independence of Irrelevant Alternatives (IIA) property: the ratio of any two products' market shares depends only on their own mean utilities, not on what other products are available. This generates unrealistic substitution patterns. Product fixed effects (below) improve the price estimate but do not remove IIA; the nested logit (below) and random coefficients (vignette("random-coefficients")) relax it.
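One way to see IIA at work is through the logit elasticity matrix. The sketch below assumes the standard logit formulas (own-price elasticity equal to the price coefficient times price times one minus share, cross-price elasticity equal to minus the price coefficient times the other product's price and share) and uses hypothetical prices and shares; it does not call rblp.

```r
alpha  <- -11.7                    # roughly the plain logit price estimate above
prices <- c(0.10, 0.12, 0.15)      # hypothetical prices
shares <- c(0.05, 0.10, 0.02)      # hypothetical market shares

J <- length(prices)
# E[j, k] = elasticity of product j's share with respect to product k's price
# cross (j != k): -alpha * p_k * s_k ; own (j == k): alpha * p_j * (1 - s_j)
E <- matrix(-alpha * prices * shares, nrow = J, ncol = J, byrow = TRUE)
diag(E) <- alpha * prices * (1 - shares)

print(round(E, 4))
```

Within each column, every off-diagonal entry is identical: when product k raises its price, all other products gain share in the same proportion, regardless of how similar they are to k.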
Example: Logit with Product Fixed Effects
When panel data is available (the same products observed across multiple markets/time periods), product fixed effects can absorb all time-invariant product characteristics. This is the within estimator applied to the logit share inversion.
The specification absorbs $\xi_j$ (product dummies), so that the only remaining coefficient is the price coefficient $\alpha$:
$$\log s_{jt} - \log s_{0t} = \alpha p_{jt} + \xi_j + \Delta\xi_{jt}$$
In rblp, the absorb argument implements the
Frisch-Waugh-Lovell (FWL) theorem, demeaning all variables within the
groups defined by the absorb formula.
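The equivalence behind the FWL theorem can be checked in base R. The sketch below uses simulated data with hypothetical names and values, and compares a regression with explicit product dummies to a regression on within-product demeaned variables. It illustrates the idea only and is not rblp's internal code.

```r
set.seed(42)
n_prod <- 6
n_mkt  <- 40
d <- expand.grid(product = factor(seq_len(n_prod)), market = seq_len(n_mkt))

quality <- rnorm(n_prod)[as.integer(d$product)]   # time-invariant product effect
d$p <- 1 + 0.5 * quality + rnorm(nrow(d))         # price correlated with quality
d$y <- -3 * d$p + quality + rnorm(nrow(d), sd = 0.3)

# (a) Explicit product dummies
b_dummy <- unname(coef(lm(y ~ p + product, data = d))["p"])

# (b) Within transformation: subtract product means, regress without intercept
demean   <- function(v, g) v - ave(v, g)
b_within <- unname(coef(lm(demean(y, product) ~ 0 + demean(p, product), data = d))[1])

print(all.equal(b_dummy, b_within))
```

Both routes give the same price coefficient; demeaning simply avoids estimating one dummy per product.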
# Absorb product-level fixed effects
f1_fe <- blp_formulation(~ prices, absorb = ~ product_ids)
fe_problem <- blp_problem(list(f1_fe), products)
fe_results <- fe_problem$solve(method = "1s")
print(fe_results)
#> BLP Estimation Results
#> Method: 1S GMM
#> Objective: 1.797148e+02
#> Optimization converged: TRUE
#> FP converged: TRUE (664 total iterations)
#>
#> Parameter Estimates:
#> parameter estimate se t_stat
#> prices -30.420493 1.030311 -29.526
The price coefficient of -30.42 is substantially more negative than the plain logit estimate. This is because the product fixed effects control for time-invariant unobserved quality, reducing the upward bias from the positive correlation between prices and quality.
This result matches the product fixed effects logit specification in the pyblp Nevo tutorial.
Comparing Elasticities
first_market <- fe_results$problem$unique_market_ids[1]
E_fe <- fe_results$compute_elasticities(first_market)
cat("Own-price elasticities with FE (first 5 products):\n")
#> Own-price elasticities with FE (first 5 products):
print(round(diag(E_fe)[1:5], 3))
#> [1] -2.166 -3.446 -3.975 -3.942 -4.625
The fixed-effects estimates yield much larger own-price elasticities (in absolute value), reflecting the more negative price coefficient. This is a common finding: controlling for product quality reveals greater consumer price sensitivity.
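The link between the two sets of elasticities can be checked with simple arithmetic. Assuming the standard logit own-price elasticity (price coefficient times price times one minus share), and given that prices and observed shares are the same in both models, the elasticities should differ only by the ratio of the two price coefficients. The sketch below uses the numbers printed above.

```r
alpha_logit <- -11.699731          # plain logit price coefficient (from above)
alpha_fe    <- -30.420493          # fixed-effects price coefficient (from above)

e_logit       <- c(-0.833, -1.325, -1.529, -1.516, -1.779)  # printed plain logit
e_fe_reported <- c(-2.166, -3.446, -3.975, -3.942, -4.625)  # printed with FE

# Rescale the plain logit elasticities by the coefficient ratio
ratio        <- alpha_fe / alpha_logit
e_fe_implied <- e_logit * ratio

print(round(ratio, 3))
print(round(e_fe_implied, 3))
print(max(abs(e_fe_implied - e_fe_reported)))
```

The implied and reported values agree up to the rounding of the printed inputs, confirming that the larger elasticities come entirely from the change in the price coefficient.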
Nested Logit
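The nested logit relaxes IIA by grouping products into nests (categories). Substitution within a nest is stronger than substitution across nests. The utility specification is:
$$u_{ijt} = \delta_{jt} + \zeta_{igt} + (1 - \rho)\,\varepsilon_{ijt}$$
where $g$ indexes the nest that product $j$ belongs to, $\zeta_{igt}$ is a nest-specific shock, and $\rho \in [0, 1)$ is the nesting parameter. When $\rho = 0$, the nested logit reduces to the plain logit; as $\rho \to 1$, products within a nest become perfect substitutes.
The share equation becomes:
$$\log s_{jt} - \log s_{0t} = x_{jt}'\beta + \alpha p_{jt} + \rho \log s_{j|g,t} + \xi_{jt}$$
where $s_{j|g,t}$ is the within-nest share of product $j$.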
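The nested logit inversion can be verified numerically. The sketch below is a base R illustration under the usual one-level nested logit share formulas; the function name `nested_shares`, the mean utilities, and the nest assignment are all hypothetical, and none of this is rblp code.

```r
# Nested logit shares for one market (hypothetical helper)
nested_shares <- function(delta, nest, rho) {
  v     <- exp(delta / (1 - rho))
  D_j   <- ave(v, nest, FUN = sum)              # nest sum D_g, repeated per product
  D_g   <- tapply(v, nest, sum)                 # one entry per nest
  denom <- 1 + sum(D_g^(1 - rho))               # the 1 is the outside good
  s_within <- v / D_j                           # within-nest share s_{j|g}
  list(s = s_within * D_j^(1 - rho) / denom, s_within = s_within)
}

delta <- c(1.0, 0.4, -0.2, 0.3)    # hypothetical mean utilities
nest  <- c(1, 1, 2, 2)             # hypothetical nest membership
rho   <- 0.5

out <- nested_shares(delta, nest, rho)
s0  <- 1 - sum(out$s)

# Inversion: delta_j = log(s_j) - log(s_0) - rho * log(s_{j|g})
delta_hat <- log(out$s) - log(s0) - rho * log(out$s_within)
print(all.equal(delta_hat, delta))

# With rho = 0 the shares collapse to the plain logit
plain <- exp(delta) / (1 + sum(exp(delta)))
print(all.equal(nested_shares(delta, nest, 0)$s, plain))
```

As in the plain logit, the inversion is analytical once the nesting parameter is given, so estimation only needs to search over that one parameter.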
Nested Logit with rblp
To estimate a nested logit in rblp, include a
nesting_ids column in the product data and pass the
rho parameter to solve():
# Suppose products have a 'nesting_ids' column (e.g., product category)
# products$nesting_ids <- products$brand_category # hypothetical
# Same formulation as logit
f1 <- blp_formulation(~ prices + sugar + mushy)
nested_problem <- blp_problem(list(f1), products)
# Solve with nesting parameter (starting value)
nested_results <- nested_problem$solve(
  rho = 0.5, # starting value for the nesting parameter
  method = "1s"
)
# The estimated rho captures within-nest correlation
print(nested_results)
The nested logit offers a middle ground between the plain logit (overly restrictive substitution patterns) and the full random coefficients model (computationally demanding). It is particularly useful when products have a natural categorical structure (e.g., cereal brands by manufacturer, cars by segment).
Summary
| Model | IIA | Computation | Parameters |
|---|---|---|---|
| Plain logit | Yes | Analytical inversion, very fast | $\beta$, $\alpha$ |
| Logit + FE | Yes | Analytical + demeaning, fast | $\alpha$ (FE absorbed) |
| Nested logit | Relaxed within nests | Analytical + optimization | $\beta$, $\alpha$, $\rho$ |
| Random coefficients | No | Contraction mapping + optimization | $\beta$, $\alpha$, $\Sigma$, $\Pi$ |
For richer substitution patterns that do not rely on pre-specified
nesting structures, see
vignette("random-coefficients").
References
- Berry, S. (1994). Estimating Discrete-Choice Models of Product Differentiation. RAND Journal of Economics, 25(2), 242-262.
- Berry, S., Levinsohn, J., & Pakes, A. (1995). Automobile Prices in Market Equilibrium. Econometrica, 63(4), 841-890.
- Nevo, A. (2000). A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of Demand. Journal of Economics & Management Strategy, 9(4), 513-548.
- Gandhi, A. & Houde, J.-F. (2020). Measuring Substitution Patterns in Differentiated-Products Industries. NBER Working Paper.