Research Overview:
I am an applied methodologist, interested in the intersection of modern probabilistic machine
learning and marketing. I focus on problems in customer analytics, preference measurement, and
design, with an eye to developing and applying flexible, interpretable, computational tools to drive
insights in these domains. I also study how rich, unstructured data like text and images can be used
in classic data-driven marketing contexts, like preference measurement, and in contexts that
have previously been difficult to study from a data-driven perspective, like branding and design.
Methodologically, I am interested in Bayesian nonparametrics, Bayesian computation, deep generative
models, and representation learning. I'm honored to have received several awards for my research,
including the 2022 Frank M. Bass Award, the 2018 INFORMS Society for Marketing Science Doctoral
Dissertation Award, and the 2018 Marketing Section of the American Statistical Association's
Doctoral Research Award. My papers have also been finalists for the John D.C. Little Award and the Paul Green Award. In recognition of my research, I was named a 2023 MSI Young Scholar.
Publications:
-
Bayesian Nonparametric Customer Base Analysis with Model-based Visualizations
Ryan Dew and Asim Ansari
Marketing Science, 2018
- Finalist, 2019 Frank M. Bass Award
[Paper (SSRN)] [Paper (Journal)] [Code Notebook] [Replication Data] [Stan Code]
Modern marketers are responsible for understanding and managing customer
spending behavior across many different products. Dynamics in spending result from
both predictable customer-level effects, which are characterized by interpurchase
time, customer lifetime, and past purchase frequency, as well as calendar time
effects, which are driven by managerial actions such as product changes and
promotions, and by general trends and random shocks. Understanding these dynamics in
spending is further complicated by a lack of knowledge of all of the factors that
influence spending for a given product, a problem exacerbated in large multiproduct
firms by information asymmetries that can exist between the product teams that
execute marketing actions and the marketing analytics team responsible for customer
base analysis. A comprehensive understanding of customer base dynamics therefore
requires a modeling framework that flexibly integrates both known and unknown
calendar time determinants of spending with the individual-level effects that
robustly predict spend activity. In this paper, we develop a Bayesian nonparametric
framework based on Gaussian process priors to understand and predict customer
spending. Our model separates out calendar time effects from individual-level
dynamics by modeling both sets of factors as unknown latent functions that jointly
determine spend propensity. The primary output of our Gaussian Process Propensity
Model (GPPM) is a set of estimated curves that provides a visual and easily
comprehensible representation of purchasing dynamics, which we call the model-based
dashboard. We illustrate the utility of our modeling framework on data from two
popular free-to-play mobile video games. We show how the GPPM's model-based
dashboard can be useful for assessing patterns and disruptions in spending. We also
show how the GPPM exhibits superior forecasting ability compared to existing
customer base analysis benchmarks, including hazard and buy-till-you-die models.
-
Modeling Dynamic Heterogeneity using Gaussian Processes
Ryan Dew, Asim Ansari, Yang Li
Journal of Marketing Research, 2020
- Finalist, 2020 Paul Green Award
[Paper (Open Access)]
Marketing research relies on individual-level estimates to understand the rich
heterogeneity that exists in consumers, firms, and products. While much of the
literature focuses on capturing static cross-sectional heterogeneity, little
research has been done on modeling dynamic heterogeneity, or the heterogeneous
evolution of individual-level model parameters. In this work, we propose a novel
framework for capturing the dynamics of heterogeneity, using individual-level,
latent, Bayesian nonparametric Gaussian processes. Similar to standard heterogeneity
specifications, our Gaussian Process Dynamic Heterogeneity (GPDH) specification
models individual-level parameters as flexible variations around population-level
trends, allowing for sharing of statistical information both across individuals and
within individuals over time. This hierarchical structure provides precise
individual-level insights regarding parameter dynamics. We show that GPDH nests
existing heterogeneity specifications, and that not flexibly capturing
individual-level dynamics may result in biased parameter estimates. Substantively,
we apply GPDH to two problems: understanding preference dynamics, and modeling the
evolution of online reviews. Across both applications, we find robust evidence of
dynamic heterogeneity, and illustrate GPDH's rich managerial insights, with
implications for targeting, pricing, and market structure analysis.
-
Letting Logos Speak: Leveraging Multiview Representation Learning for Data-Driven Logo Design
Ryan Dew, Asim Ansari, Olivier Toubia
Marketing Science, 2022
- Winner, 2022 Frank M. Bass Award
- Finalist, 2022 John D.C. Little Award
[Paper (SSRN)] [Paper (Journal)] [Explore Our Data] [Personality-based Logo Generator]
Logos serve a fundamental role in branding as the visual figurehead of the
brand. Yet, due to the difficulty of using unstructured image data, prior research
on logo design has been largely limited to non-quantitative studies. In this work,
we explore logo design from a data-driven perspective. In particular, we aim to
answer several key questions: first, to what degree can logos represent a brand's
personality? Second, what are the key visual elements in logos that elicit brand and
firm relevant associations, such as brand personality traits? Finally, given text
describing a firm's brand or function, can we suggest features of a logo that elicit
the firm's desired image? To answer these questions, we develop a novel logo feature
extraction algorithm that uses modern image processing tools to decompose
unstructured pixel-level image data into meaningful visual features. We then analyze
the links between firm identity and the features of logos through a deep, multiview
generative model, which links visual features of logos with textual descriptions of
firms and consumer ratings of brand personality by learning representations of brand
identity. We apply our modeling framework to a dataset of hundreds of logos, textual
descriptions from firms' websites, third party descriptions of firms, and consumer
evaluations of brand personality to explore these questions.
-
Detecting Routines: Applications to Ridesharing CRM
Ryan Dew, Eva Ascarza, Oded Netzer, and Nachum Sicherman
Journal of Marketing Research, 2024
[Paper (Open Access)] [Code]
Routines shape many aspects of day-to-day consumption. While prior work has established
the importance of habits in consumer behavior, little work has been done to understand
the implications of routines (which we define as repeated behaviors with recurring,
temporal structures) for customer management. One reason for this dearth is the difficulty
of measuring routines from transaction data, particularly when routines vary substantially
across customers. We propose a new approach for doing so, which we apply in the context of
ridesharing. We model customer-level routines with Bayesian nonparametric Gaussian
processes (GPs), leveraging a novel kernel that allows for flexible yet precise estimation
of routines. These GPs are nested in inhomogeneous Poisson processes of usage, allowing us
to estimate customers' routines, and decompose their usage into routine and non-routine
parts. We show the value of detecting routines for customer relationship management (CRM)
in the context of ridesharing, where we find that routines are associated with higher
future usage and activity rates, and more resilience to service failures. Moreover, we
show how these outcomes vary by the types of routines customers have, and by whether
trips are part of the customer's routine, suggesting a role for routines in segmentation
and targeting.
-
Mega or Micro? Influencer Selection Using Follower Elasticity
Zijun Tian, Ryan Dew, and Raghu Iyengar
Journal of Marketing Research, 2024
[Paper (PDF)] [Paper (Journal)] [Web Appendix] [Knowledge@Wharton] [YouTube]
Despite the explosive growth of influencer marketing, wherein companies sponsor social
media personalities to promote their brands, there is little research to guide
companies' selection of influencer partners. One common criterion is popularity: while
some firms sponsor “mega” influencers with millions of followers, other firms partner
with “micro” influencers, who may only have several thousands of followers, but may also
cost less to sponsor. To quantify this trade-off between reach and cost, we develop a
framework for estimating the follower elasticity of impressions, or FEI, which measures
a video’s percentage gain in impressions corresponding to a percentage increase in the
follower size of its creator. Computing FEI involves estimating the causal effect of an
influencer’s popularity on the view counts of their videos, which we achieve through a
combination of a unique dataset collected from TikTok, a representation learning model
for quantifying video content, and a machine learning-based causal inference method. We
find that FEI is always positive, but often nonlinearly related to follower size,
suggesting different optimal sponsorship strategies than those observed in practice. We
examine the factors that predict variation in these FEI curves, and show how firms can
use these results to better determine influencer partnerships.
-
Adaptive Preference Measurement with Unstructured Data
Ryan Dew
Management Science
[Paper (Journal)] [Paper (SSRN)] [Web Appendix] [Code]
Many products are most meaningfully described using unstructured data like text or images. Unstructured data are also common in e-commerce, where products are often described by photos and text, but not with standardized sets of attributes. While much is known about how to efficiently measure consumer preferences when products can be meaningfully described by structured attributes, there is scant research on doing the same for unstructured data. This paper introduces a real-time, adaptive survey design framework for measuring preferences over unstructured data, leveraging Bayesian optimization. By adaptively choosing items to display based on uncertainty around a nonparametric utility model, the proposed method maximizes information gain per question, enabling quick estimation of individual-level preferences. The approach operates on embeddings of the unstructured data, thereby eliminating the requirement for manual coding of product attributes. We apply the method to measuring preferences over clothing, and highlight its potential both for the general task of marketing research, and for the specific task of designing customer onboarding surveys to mitigate the cold-start recommendation problem. We also develop methods for interpreting the nonparametric utility functions, which allow us to reconstruct consumer valuations of discrete attributes, even for attributes that were not considered or available a priori.
-
Probabilistic Machine Learning: New Frontiers for Modeling Consumers and their Choices
Ryan Dew, Nicolas Padilla, Lan E. Luo, Shin Oblander, Asim Ansari, Khaled Boughanmi, Michael Braun, Fred Feinberg, Jia Liu, Thomas Otter, Longxiu Tian, Yixin Wang, and Mingzhang Yin
Forthcoming, International Journal of Research in Marketing
[Paper (SSRN)] [Code Companion]
Making sense of massive, individual-level data is challenging: marketing researchers and analysts need flexible models that can accommodate rich patterns of heterogeneity and dynamics, work with and link diverse data types, and scale to modern data sizes. Practitioners also need tools that can quantify uncertainty in models and predictions of consumer behavior to inform optimal decision-making. In this paper, we demonstrate the promise of probabilistic machine learning (PML), which refers to the pairing of probabilistic modeling and machine learning methods, in pushing the frontier of combining flexibility, scalability, interpretability, and uncertainty quantification for building better models of consumers and their choices. Specifically, we overview both PML models and inference methods, and highlight their utility for addressing four common classes of marketing problems: (1) uncovering heterogeneity, (2) flexibly modeling nonlinearities and dynamics, (3) handling high-dimensional and unstructured data, and (4) addressing missingness, often via data fusion. We also discuss promising directions in enriching marketing models, reflecting recent developments in representation learning, causal inference, experimentation and decision-making, and theory-based behavioral modeling.
Working Papers:
-
Correlated Dynamics in Marketing Sensitivities
Ryan Dew, Yuhao Fan
Last updated: March 2024
[Working Paper]
Understanding individual customers' sensitivities to prices, promotions, brands, and other marketing mix elements is fundamental to a wide swath of marketing problems. An important but understudied aspect of this problem is the dynamic nature of these sensitivities, which change over time and vary across individuals. Prior work has developed methods for capturing such dynamic heterogeneity within product categories, but neglected the possibility of correlated dynamics across categories. In this work, we introduce a framework to capture such correlated dynamics using a hierarchical dynamic factor model, where individual preference parameters are influenced by common cross-category dynamic latent factors, estimated through Bayesian nonparametric Gaussian processes. We apply our model to grocery purchase data, and find that a surprising degree of dynamic heterogeneity can be accounted for by only a few global trends. We also characterize the patterns in how consumers' sensitivities evolve across categories. Managerially, the proposed framework not only enhances predictive accuracy by leveraging cross-category data, but also enables more precise estimation of quantities of interest, like price elasticity.
-
Your MMM Is Broken: Identification of Nonlinear and Dynamic Effects in Marketing Mix Models
Ryan Dew, Nicolas Padilla, Anya Shchetkina
Authors contributed equally. Last updated: August 2024
[Working Paper] [Web Appendix]
Recent years have seen a resurgence of interest in marketing mix models (MMMs), which are aggregate-level models of marketing effectiveness. Often these models incorporate nonlinear effects, and either implicitly or explicitly assume dynamic, or time-varying, effects. In this paper, we show that nonlinear and dynamic effects are often not identifiable from standard marketing mix data: while certain data patterns may be suggestive of nonlinear effects, such patterns may also emerge under simpler models that incorporate dynamics in marketing effectiveness. This lack of identification is problematic because nonlinearities and dynamics suggest fundamentally different optimal marketing allocations. We examine this identification issue through theory and simulations, wherein we explore the exact conditions under which conflation between the two types of models is likely to occur. In doing so, we introduce a flexible Bayesian nonparametric model that allows us to both flexibly simulate and estimate different data generating processes. We show that conflating the two types of effects is especially likely in the presence of autocorrelated marketing variables, which is common in practice, especially given the common use of stock variables to capture long-run effects of advertising. We illustrate these ideas through numerous empirical applications to real-world marketing mix data, showing the prevalence of the conflation issue in practice. Finally, we show how marketers can avoid this conflation, by designing experiments that strategically manipulate spending in ways that pin down model form.
Selected Research in Progress:
-
Optimal Product Design Synthesis: Pairing Generative Models with Adaptive Preference Measurement
with Weixin He
-
Using Haptic Response to Understand and Predict Consumer Preferences and Behavior
with Maximilian Gaerth, Cait Lamberton, & Stefano Puntoni
-
Unified Marketing Measurement and Optimal Test Timing
with Nicolas Padilla
-
Graph Representation Learning for Inferring Market Structure
with Mingyung Kim
-
How Do Influencers Learn From Feedback?
with Zijun Tian and Raghu Iyengar