Introduction
The random coefficients (RC) logit – also known as the mixed logit or BLP model – is the workhorse of modern demand estimation for differentiated products. Unlike the plain logit, it generates flexible substitution patterns that are not subject to the IIA property.
This vignette covers:
- The model and its parameters ($\Sigma$, $\Pi$)
- Integration methods for simulating the mixing distribution
- Estimation with and without demographics
- GMM methods and convergence diagnostics
For the plain logit and nested logit, see
vignette("logit-nested-logit"). For post-estimation
analysis, see vignette("post-estimation-mergers").
The Random Coefficients Model
Utility Specification
Consumer $i$ in market $t$ receives utility from product $j$:

$$u_{ijt} = \delta_{jt} + \mu_{ijt} + \varepsilon_{ijt}$$

where:
- $\delta_{jt}$ is the mean utility (common to all consumers)
- $\mu_{ijt}$ is the individual deviation from the mean
- $\varepsilon_{ijt}$ is a Type I extreme value idiosyncratic shock

The individual-specific component $\mu_{ijt} = x_{jt}'(\Sigma \nu_i + \Pi d_i)$ has two sources of heterogeneity:
- Unobserved heterogeneity ($\Sigma \nu_i$): the $\nu_i$ are standard normal draws, and $\sigma_k$ captures the dispersion of tastes for characteristic $k$
- Observed heterogeneity ($\Pi d_i$): the $d_i$ are observed demographics (income, age, etc.), and $\pi_{kd}$ captures how demographic $d$ shifts preferences for characteristic $k$
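To make the algebra concrete, here is a small base-R sketch (illustrative only, not part of the rblp API; all names and values are hypothetical) that builds the deviations $\mu_{ij}$ for one product from Sigma, Pi, taste draws, and demographics:

```r
# Illustrative base-R sketch: mu_ij = x_j' (Sigma nu_i + Pi d_i)
K2 <- 4; D <- 4; I <- 20
set.seed(1)
x_j   <- c(1, 2.5, 0.3, 1)              # one product's X2 row (intercept, price, sugar, mushy)
Sigma <- diag(c(0.5, 3.3, 0.01, 0.09))  # diagonal: independent taste shocks
Pi    <- matrix(0, K2, D); Pi[2, 1] <- 5  # price sensitivity shifts with income
nu    <- matrix(rnorm(I * K2), I, K2)   # unobserved tastes, one row per agent
d     <- matrix(rnorm(I * D),  I, D)    # demographics, one row per agent

# Vectorized over the I agents: each row of (nu Sigma' + d Pi') is (Sigma nu_i + Pi d_i)'
mu <- as.vector((nu %*% t(Sigma) + d %*% t(Pi)) %*% x_j)
length(mu)  # one deviation per agent
```

Because Pi has a single non-zero entry, only the price taste varies with demographics here; the other three deviations come purely from the unobserved draws.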
The Sigma Matrix
The parameter $\Sigma$ is a $K_2 \times K_2$ lower-triangular matrix (the Cholesky root of the random-coefficient covariance). In many applications, $\Sigma$ is restricted to be diagonal, meaning that unobserved taste shocks for different characteristics are independent.
Convention: when specifying starting values for sigma:
- Zero on the diagonal means that parameter is fixed at zero (not estimated)
- Non-zero values are starting values for optimization
For example, with $K_2 = 4$ characteristics (intercept, prices, sugar, mushy):
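A minimal base-R sketch of what such a starting matrix looks like (the numbers are arbitrary illustrations, not estimates):

```r
# Starting Sigma for (intercept, prices, sugar, mushy):
# diagonal entries are starting values; a zero fixes that parameter at zero
sigma_start <- diag(c(0.5, 2.0, 0, 0.1))
rownames(sigma_start) <- colnames(sigma_start) <-
  c("intercept", "prices", "sugar", "mushy")
sigma_start
# The sugar random coefficient is fixed at zero and will not be estimated.
```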
Integration Methods
The RC logit requires numerical integration over the distribution of unobserved heterogeneity $\nu$. The choice of integration method affects both accuracy and computation time.
Product Rule (Gauss-Hermite)
The product rule uses Gauss-Hermite quadrature nodes, taking the tensor product across dimensions. With `size` nodes per dimension and $K_2$ random coefficients, this yields $\text{size}^{K_2}$ integration points per market.
```r
# 5 nodes per dimension, K2 = 4 -> 5^4 = 625 points per market
int_product <- blp_integration("product", size = 5)

# 3 nodes per dimension (faster, less accurate)
int_small <- blp_integration("product", size = 3)  # 3^4 = 81 points
```

The product rule is deterministic (no simulation noise) and provides exact integration for polynomials up to degree $2 \cdot \text{size} - 1$. However, it suffers from the curse of dimensionality: the number of points grows exponentially with $K_2$.
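The tensor-product construction can be sketched in base R. Using the 3-node Gauss-Hermite rule for a standard normal (nodes $0, \pm\sqrt{3}$ with weights $2/3, 1/6, 1/6$), `expand.grid` builds the $3^4 = 81$-point grid:

```r
# 3-node Gauss-Hermite rule for N(0,1): exact for polynomials up to degree 5
nodes1d   <- c(-sqrt(3), 0, sqrt(3))
weights1d <- c(1/6, 2/3, 1/6)

# Tensor product across K2 = 4 dimensions
K2    <- 4
grid  <- expand.grid(rep(list(nodes1d), K2))            # 3^4 = 81 nodes
wgrid <- apply(expand.grid(rep(list(weights1d), K2)), 1, prod)

nrow(grid)                # 81 integration points
sum(wgrid)                # weights sum to 1
sum(wgrid * grid[, 1]^2)  # recovers Var(nu) = 1 in each dimension
```

The weight of each multidimensional node is the product of the one-dimensional weights, which is why the point count (and the work per market) grows as $\text{size}^{K_2}$.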
Monte Carlo
Standard Monte Carlo draws random points from the mixing distribution. With `size` draws per market:

```r
# 1000 Monte Carlo draws per market
int_mc <- blp_integration("monte_carlo", size = 1000, seed = 42)
```

Monte Carlo integration is flexible but introduces simulation noise that can cause roughness in the GMM objective function, making optimization difficult.
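The simulation noise is easy to see in base R: re-estimating the same moment, $E[\nu^2] = 1$, with different seeds gives a slightly different answer each time, whereas a quadrature rule returns the same value at every objective evaluation (a toy illustration, unrelated to the rblp internals):

```r
# Monte Carlo estimates of E[nu^2] = 1 jitter with the seed
mc_moment <- function(seed, size = 1000) {
  set.seed(seed)
  mean(rnorm(size)^2)
}
estimates <- sapply(1:5, mc_moment)
round(estimates, 3)  # each near 1, but none exactly 1
sd(estimates)        # the simulation noise the optimizer has to fight
```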
Halton Sequences
Halton sequences are quasi-random low-discrepancy sequences that provide more uniform coverage than pseudo-random draws:
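A Halton sequence is the radical inverse of the integers in a prime base. A base-R sketch of the construction (illustrative; the package delegates the actual draws to randtoolbox):

```r
# Radical-inverse Halton sequence in a given prime base
halton <- function(n, base = 2) {
  sapply(seq_len(n), function(i) {
    h <- 0; f <- 1 / base
    while (i > 0) {
      h <- h + (i %% base) * f   # next digit of i in the given base
      i <- i %/% base
      f <- f / base
    }
    h
  })
}
halton(5, base = 2)
# 0.500 0.250 0.750 0.125 0.625 -- evenly spread over (0, 1)
# qnorm() then maps the uniform points to standard-normal nodes
```

Multidimensional sequences use a different prime base per dimension, which keeps the points from lining up across coordinates.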
```r
# 100 Halton draws per market (requires the randtoolbox package)
int_halton <- blp_integration("halton", size = 100, seed = 42)
```

RC Logit without Demographics
The simplest RC logit has only unobserved heterogeneity (no demographics). This requires two formulations: one for the linear component ($X_1$) and one for the nonlinear component ($X_2$).
Problem Setup
```r
library(rblp)
products <- load_nevo_products()

# X1: linear demand (intercept + prices + sugar + mushy)
f1 <- blp_formulation(~ prices + sugar + mushy)
# X2: characteristics with random coefficients (same as X1 here)
f2 <- blp_formulation(~ prices + sugar + mushy)

# Create the problem with Gauss-Hermite integration
rc_problem <- blp_problem(
  product_formulations = list(f1, f2),
  product_data = products,
  integration = blp_integration("product", size = 3)  # 3^4 = 81 points
)

cat(sprintf("K1 (linear params): %d\n", rc_problem$K1))
#> K1 (linear params): 4
cat(sprintf("K2 (random coefficients): %d\n", rc_problem$K2))
#> K2 (random coefficients): 4
cat(sprintf("Total integration nodes: %d (%d per market)\n",
            rc_problem$I, rc_problem$I / rc_problem$T))
#> Total integration nodes: 7614 (81 per market)
cat(sprintf("Total observations: %d\n", rc_problem$N))
#> Total observations: 2256
cat(sprintf("Markets: %d\n", rc_problem$T))
#> Markets: 94
```

The problem has 4 random coefficients (intercept, prices, sugar, mushy), so with `size = 3` Gauss-Hermite nodes per dimension we get $3^4 = 81$ integration points per market.
Estimation
```r
# Sigma starting values (4x4 diagonal)
initial_sigma <- diag(c(0.5, 0.5, 0.5, 0.5))

# Solve with one-step GMM
rc_results <- rc_problem$solve(
  sigma = initial_sigma,
  method = "1s",
  optimization = blp_optimization("l-bfgs-b",
    method_options = list(maxit = 200, factr = 1e7))
)
print(rc_results)
```

Interpretation:
- The estimated $\sigma$ values capture the standard deviation of taste heterogeneity for each characteristic
- A large $\sigma_{\text{prices}}$ means consumers differ substantially in price sensitivity
- These taste differences generate realistic substitution: consumers who are less price-sensitive substitute toward premium products
RC Logit with Demographics (Full Nevo Specification)
The full Nevo (2000) specification includes observed demographics (income, age, child presence) interacting with product characteristics. This is the canonical BLP specification that matches the pyblp Nevo tutorial.
Problem Setup
```r
agents <- load_nevo_agents()

cat("Agent data columns:\n")
#> Agent data columns:
cat(paste(names(agents), collapse = ", "), "\n\n")
#> market_ids, city_ids, quarter, weights, nodes0, nodes1, nodes2, nodes3, income, income_squared, age, child
cat(sprintf("Agents: %d (%d per market)\n", nrow(agents),
            nrow(agents) / length(unique(agents$market_ids))))
#> Agents: 1880 (20 per market)

# Inspect the agent data
head(agents[, c("market_ids", "weights", "income", "income_squared", "age", "child")])
#>   market_ids weights     income income_squared          age      child
#> 1      C01Q1    0.05  0.4951235       8.331304 -0.230109009 -0.2308511
#> 2      C01Q1    0.05  0.3787622       6.121865 -2.532694102  0.7691489
#> 3      C01Q1    0.05  0.1050146       1.030803 -0.006965458 -0.2308511
#> 4      C01Q1    0.05 -1.4854809     -25.583605 -0.827946010  0.7691489
#> 5      C01Q1    0.05 -0.3165969      -6.517009 -0.230109009 -0.2308511
#> 6      C01Q1    0.05 -0.3372762      -6.878070  0.851696161 -0.2308511

# X1: linear demand with product fixed effects (absorb product dummies)
# Only price is estimated; intercept, sugar, mushy are absorbed
f1 <- blp_formulation(~ 0 + prices, absorb = ~ product_ids)
# X2: nonlinear demand (intercept + prices + sugar + mushy get RC)
f2 <- blp_formulation(~ prices + sugar + mushy)
# Agent formulation: demographics that interact with X2
f_demo <- blp_formulation(~ 0 + income + income_squared + age + child)

# Create the problem (agent data provides nodes, weights, and demographics)
demo_problem <- blp_problem(
  product_formulations = list(f1, f2),
  product_data = products,
  agent_formulation = f_demo,
  agent_data = agents
)

cat(sprintf("K1 (linear params): %d (only prices, FE absorbed)\n", demo_problem$K1))
#> K1 (linear params): 1 (only prices, FE absorbed)
cat(sprintf("K2 (random coefficients): %d\n", demo_problem$K2))
#> K2 (random coefficients): 4
cat(sprintf("D (demographics): %d\n", demo_problem$D))
#> D (demographics): 4
cat(sprintf("Total agents (from data): %d\n", demo_problem$I))
#> Total agents (from data): 1880
cat(sprintf("Sigma matrix: %d x %d\n", demo_problem$K2, demo_problem$K2))
#> Sigma matrix: 4 x 4
cat(sprintf("Pi matrix: %d x %d\n", demo_problem$K2, demo_problem$D))
#> Pi matrix: 4 x 4
```

The sigma and pi matrices encode the parameter structure:
```r
# Sigma (4x4 diagonal): RC standard deviations
# Starting values near pyblp's converged estimates
sigma0 <- diag(c(0.558, 3.312, -0.006, 0.093))

# Pi (4x4): interactions between X2 characteristics and demographics
# Rows = X2 vars (intercept, prices, sugar, mushy)
# Cols = demographics (income, income_squared, age, child)
pi0 <- matrix(c(
   2.292,      0,  1.284,     0,  # intercept interactions
   588.3, -30.19,      0, 11.05,  # price interactions
  -0.384,      0, 0.0524,     0,  # sugar interactions
   0.748,      0, -1.354,     0   # mushy interactions
), nrow = 4, ncol = 4, byrow = TRUE)

# Solve
demo_results <- demo_problem$solve(
  sigma = sigma0,
  pi = pi0,
  method = "1s",
  optimization = blp_optimization("l-bfgs-b",
    method_options = list(maxit = 1000, factr = 1e7))
)
print(demo_results)
# Price coefficient ~ -63, matching pyblp's Nevo tutorial
```

Key points about the Nevo specification:
- The agent data includes 20 agents per market with pre-computed nodes (columns `nodes0`–`nodes3`), integration weights, and demographic variables
- Product fixed effects (via `absorb`) soak up brand-level unobserved quality, so only the price coefficient is estimated in the linear part
- The large number of moment conditions (20 excluded instruments plus the exogenous characteristics) helps identify the many nonlinear parameters
- The zeros in the Pi matrix encode economic restrictions (e.g., `income_squared` interacts only with the price coefficient)
Two-Step vs. One-Step GMM
One-Step GMM (method = "1s")
Uses the initial 2SLS weighting matrix $W_1 = (Z'Z)^{-1}$ throughout. This produces consistent estimates but is not asymptotically efficient.
Two-Step GMM (method = "2s")
- Step 1: Estimate $\hat\theta_1$ with the 2SLS weight $W_1 = (Z'Z)^{-1}$
- Update: Compute the optimal weight $\hat W$ from the Step 1 residuals
- Step 2: Re-estimate with $\hat W$
Two-step GMM is efficient (achieves the smallest asymptotic variance among GMM estimators) but can exhibit finite-sample bias, especially with many moment conditions.
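The weight-matrix update is the same as in any linear GMM problem. A self-contained base-R sketch with simulated IV data (a toy example, not the cereal application) shows both steps:

```r
# Two-step linear IV GMM: y = x*beta + e, instruments Z, true beta = 2
set.seed(123)
n  <- 2000
z1 <- rnorm(n); z2 <- rnorm(n); u <- rnorm(n)
x  <- z1 + 0.5 * z2 + u              # endogenous regressor (correlated with u)
y  <- 2 * x + u + rnorm(n)
Z  <- cbind(z1, z2)

gmm_beta <- function(W) {
  zx <- crossprod(Z, x); zy <- crossprod(Z, y)
  drop(solve(t(zx) %*% W %*% zx, t(zx) %*% W %*% zy))
}

# Step 1: 2SLS weight W1 = (Z'Z)^{-1}
b1 <- gmm_beta(solve(crossprod(Z)))
# Update: optimal weight from step-1 residuals, moments g_i = z_i * e_i
e  <- y - x * b1
W2 <- solve(crossprod(Z * e))
# Step 2: efficient estimate
b2 <- gmm_beta(W2)
c(step1 = b1, step2 = b2)  # both near the true beta = 2
```

In the BLP setting the residual is the structural error $\xi(\theta)$ recovered from the contraction, but the update logic is identical.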
```r
# One-step (faster, consistent)
results_1s <- problem$solve(sigma = sigma0, method = "1s")

# Two-step (efficient, slower)
results_2s <- problem$solve(sigma = sigma0, method = "2s")
```

Convergence Diagnostics
The RC logit objective function can be non-convex, so checking convergence is important.
Fixed-Point Convergence
The BLP contraction mapping must converge in every market at every evaluation:
```r
# Did all fixed points converge?
results$fp_converged

# Total contraction iterations (across all markets and evaluations)
results$fp_iterations
```

Best Practices
- Try multiple starting values: the GMM objective may have local minima
- Start from the logit solution: use logit estimates as a sanity check
- Use tight contraction tolerances: a loose inner loop contaminates the objective and its gradient
- Accelerate the contraction: `blp_iteration("squarem")` is the default and is faster than the simple contraction
- Increase integration accuracy: more nodes reduce simulation error
- Check the gradient norm: it should be small at convergence
Optimization Options
rblp supports several optimizers:
```r
# L-BFGS-B (default): quasi-Newton with bounds, uses gradient
opt_lbfgsb <- blp_optimization("l-bfgs-b",
  method_options = list(maxit = 1000, factr = 1e7))

# Nelder-Mead: derivative-free simplex method
opt_nm <- blp_optimization("nelder-mead",
  method_options = list(maxit = 5000))

# nlminb: PORT optimization, good with bounds
opt_nlminb <- blp_optimization("nlminb",
  method_options = list(iter.max = 1000, eval.max = 2000))
```

For most applications, L-BFGS-B is recommended because it uses analytical gradients (computed via the implicit function theorem) and respects parameter bounds.
Iteration Options
The BLP contraction mapping can be accelerated with SQUAREM:
```r
# SQUAREM (default): accelerated fixed-point iteration
iter_squarem <- blp_iteration("squarem")

# Simple contraction: safer but slower
iter_simple <- blp_iteration("simple",
  list(atol = 1e-14, max_evaluations = 5000))
```

SQUAREM typically converges in 10–50% fewer iterations than the simple contraction, with no loss of accuracy.
Summary of Key Formulas
| Component | Formula |
|---|---|
| Mean utility | |
| Individual deviation | |
| Choice probability | |
| Predicted share | |
| Contraction | |
| Moment condition | |
| GMM objective | , where |
References
- Berry, S., Levinsohn, J., & Pakes, A. (1995). Automobile Prices in Market Equilibrium. Econometrica, 63(4), 841-890.
- Nevo, A. (2000). A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of Demand. Journal of Economics & Management Strategy, 9(4), 513-548.
- Conlon, C. & Gortmaker, J. (2020). Best Practices for Differentiated Products Demand Estimation with pyblp. RAND Journal of Economics, 51(4), 1108-1161.
- Varadhan, R. & Roland, C. (2008). Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm. Scandinavian Journal of Statistics, 35(2), 335-353.