Ryan Dew: Homepage

Research Overview:

I am an applied methodologist, interested in the intersection of modern probabilistic machine learning and marketing. I focus on problems in customer analytics, preference measurement, and design, with an eye to developing and applying flexible, interpretable, computational tools to drive insights in these domains. I also study how rich, unstructured data like text and images can be used in classic data-driven marketing contexts, like preference measurement, and in contexts that have previously been difficult to study from a data-driven perspective, like branding and design. Methodologically, I am interested in Bayesian nonparametrics, Bayesian computation, deep generative models, and representation learning. I'm honored to have received several awards for my research, including the 2022 Frank M. Bass Award, the 2018 INFORMS Society for Marketing Science Doctoral Dissertation Award, and the 2018 Marketing Section of the American Statistical Association's Doctoral Research Award. My papers have also been finalists for the John D.C. Little and Paul Green Awards. In recognition of my research, I was named a 2023 MSI Young Scholar, and was the Govil Family Faculty Scholar at the Wharton School from 2024-2026.

Publications:

Bayesian Nonparametric Customer Base Analysis with Model-based Visualizations
Ryan Dew and Asim Ansari
Marketing Science, 2018
- Finalist, 2019 Frank M. Bass Award
[ Paper (SSRN) ] [ Paper (Journal) ]
[ Code Notebook ] [ Replication Data ] [ Stan Code ]

Modern marketers are responsible for understanding and managing customer spending behavior across many different products. Dynamics in spending result from both predictable customer-level effects, which are characterized by interpurchase time, customer lifetime, and past purchase frequency, as well as calendar time effects, which are driven by managerial actions such as product changes and promotions, and by general trends and random shocks. Understanding these dynamics in spending is further complicated by a lack of knowledge of all of the factors that influence spending for a given product, a problem exacerbated in large multiproduct firms by information asymmetries that can exist between the product teams that execute marketing actions and the marketing analytics team responsible for customer base analysis. A comprehensive understanding of customer base dynamics therefore requires a modeling framework that flexibly integrates both known and unknown calendar time determinants of spending with the individual-level effects that robustly predict spend activity. In this paper, we develop a Bayesian nonparametric framework based on Gaussian process priors to understand and predict customer spending. Our model separates out calendar time effects from individual-level dynamics by modeling both sets of factors as unknown latent functions that jointly determine spend propensity. The primary output of our Gaussian Process Propensity Model (GPPM) is a set of estimated curves that provides a visual and easily comprehensible representation of purchasing dynamics, which we call the model-based dashboard. We illustrate the utility of our modeling framework on data from two popular free-to-play mobile video games. We show how the GPPM's model-based dashboard can be useful for assessing patterns and disruptions in spending. We also show how the GPPM exhibits superior forecasting ability compared to existing customer base analysis benchmarks, including hazard and buy-till-you-die models.

Modeling Dynamic Heterogeneity using Gaussian Processes
Ryan Dew, Asim Ansari, Yang Li
Journal of Marketing Research, 2020
- Finalist, 2020 Paul Green Award
[ Paper (Open Access) ]

Marketing research relies on individual-level estimates to understand the rich heterogeneity that exists in consumers, firms, and products. While much of the literature focuses on capturing static cross-sectional heterogeneity, little research has been done on modeling dynamic heterogeneity, or the heterogeneous evolution of individual-level model parameters. In this work, we propose a novel framework for capturing the dynamics of heterogeneity, using individual-level, latent, Bayesian nonparametric Gaussian processes. Similar to standard heterogeneity specifications, our Gaussian Process Dynamic Heterogeneity (GPDH) specification models individual-level parameters as flexible variations around population-level trends, allowing for sharing of statistical information both across individuals and within individuals over time. This hierarchical structure provides precise individual-level insights regarding parameter dynamics. We show that GPDH nests existing heterogeneity specifications, and that not flexibly capturing individual-level dynamics may result in biased parameter estimates. Substantively, we apply GPDH to two problems: understanding preference dynamics, and modeling the evolution of online reviews. Across both applications, we find robust evidence of dynamic heterogeneity, and illustrate GPDH's rich managerial insights, with implications for targeting, pricing, and market structure analysis.

Letting Logos Speak: Leveraging Multiview Representation Learning for Data-Driven Logo Design
Ryan Dew, Asim Ansari, Olivier Toubia
Marketing Science, 2022
- Winner, 2022 Frank M. Bass Award
- Finalist, 2022 John D.C. Little Award
[ Paper (SSRN) ] [ Paper (Journal) ]
[ Explore Our Data ] [ Personality-based Logo Generator ]

Logos serve a fundamental role in branding as the visual figurehead of the brand. Yet, due to the difficulty of using unstructured image data, prior research on logo design has been largely limited to non-quantitative studies. In this work, we explore logo design from a data-driven perspective. In particular, we aim to answer several key questions: first, to what degree can logos represent a brand's personality? Second, what are the key visual elements in logos that elicit brand- and firm-relevant associations, such as brand personality traits? Finally, given text describing a firm's brand or function, can we suggest features of a logo that elicit the firm's desired image? To answer these questions, we develop a novel logo feature extraction algorithm that uses modern image processing tools to decompose unstructured pixel-level image data into meaningful visual features. We then analyze the links between firm identity and the features of logos through a deep, multiview generative model, which links visual features of logos with textual descriptions of firms and consumer ratings of brand personality by learning representations of brand identity. We apply our modeling framework to a dataset of hundreds of logos, textual descriptions from firms' websites, third-party descriptions of firms, and consumer evaluations of brand personality to explore these questions.

Detecting Routines: Applications to Ridesharing CRM
Ryan Dew, Eva Ascarza, Oded Netzer, and Nachum Sicherman
Journal of Marketing Research, 2024
[ Paper (Open Access) ] [ Code ]

Routines shape many aspects of day-to-day consumption. While prior work has established the importance of habits in consumer behavior, little work has been done to understand the implications of routines (which we define as repeated behaviors with recurring, temporal structures) for customer management. One reason for this dearth is the difficulty of measuring routines from transaction data, particularly when routines vary substantially across customers. We propose a new approach for doing so, which we apply in the context of ridesharing. We model customer-level routines with Bayesian nonparametric Gaussian processes (GPs), leveraging a novel kernel that allows for flexible yet precise estimation of routines. These GPs are nested in inhomogeneous Poisson processes of usage, allowing us to estimate customers' routines, and decompose their usage into routine and non-routine parts. We show the value of detecting routines for customer relationship management (CRM) in the context of ridesharing, where we find that routines are associated with higher future usage and activity rates, and more resilience to service failures. Moreover, we show how these outcomes vary by the types of routines customers have, and by whether trips are part of the customer's routine, suggesting a role for routines in segmentation and targeting.
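As a rough illustration of the decomposition described in this abstract, the sketch below splits a Poisson usage rate into a routine part and a non-routine part and integrates each over the observation window to get expected trip counts. It is not the paper's model: a fixed weekly Gaussian bump stands in for the GP-estimated routine, and every name and default (routine_rate, expected_trips, peak_hour, baseline, and so on) is hypothetical.

```python
import numpy as np

def routine_rate(hour_of_week, peak_hour=8.0, width=1.0, height=0.5):
    # Hypothetical routine: a bump in the usage rate (trips per hour) that
    # recurs every week around peak_hour. A stand-in for a learned routine.
    return height * np.exp(-0.5 * ((hour_of_week - peak_hour) / width) ** 2)

def expected_trips(baseline=0.01, weeks=4, steps_per_hour=10):
    # For an inhomogeneous Poisson process, the expected count over a window
    # is the integral of the rate. With an additive rate (routine + baseline),
    # each part can be integrated separately (here with a Riemann sum).
    n_steps = 168 * weeks * steps_per_hour
    hours = np.arange(n_steps) / steps_per_hour
    routine = routine_rate(hours % 168.0).sum() / steps_per_hour
    non_routine = baseline * 168 * weeks
    return routine, non_routine

routine, non_routine = expected_trips()
print(f"routine share of expected trips: {routine / (routine + non_routine):.2f}")
```

Under these assumed values, the routine share is simply the routine integral divided by the total; in an estimated model the same ratio would be computed from the fitted rate components.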

Mega or Micro? Influencer Selection Using Follower Elasticity
Zijun Tian, Ryan Dew, and Raghu Iyengar
Journal of Marketing Research, 2024

Adaptive Preference Measurement with Unstructured Data
Ryan Dew
Management Science, 2024
[ Paper (Journal) ] [ Paper (SSRN) ] [ Web Appendix ] [ Code ]

Many products are most meaningfully described using unstructured data like text or images. Unstructured data are also common in e-commerce, where products are often described by photos and text, but not with standardized sets of attributes. While much is known about how to efficiently measure consumer preferences when products can be meaningfully described by structured attributes, there is scant research on doing the same for unstructured data. This paper introduces a real-time, adaptive survey design framework for measuring preferences over unstructured data, leveraging Bayesian optimization. By adaptively choosing items to display based on uncertainty around a nonparametric utility model, the proposed method maximizes information gain per question, enabling quick estimation of individual-level preferences. The approach operates on embeddings of the unstructured data, thereby eliminating the requirement for manual coding of product attributes. We apply the method to measuring preferences over clothing, and highlight its potential both for the general task of marketing research, and for the specific task of designing customer onboarding surveys to mitigate the cold-start recommendation problem. We also develop methods for interpreting the nonparametric utility functions, which allow us to reconstruct consumer valuations of discrete attributes, even for attributes that were not considered or available a priori.
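The idea of choosing the next item based on uncertainty around a nonparametric utility model can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a zero-mean Gaussian process over item embeddings with a squared-exponential kernel, numeric ratings as responses, and a plain maximum-variance selection rule. All function names and parameters are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and the rows of b.
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def posterior(embeddings, shown_idx, ratings, noise=0.1):
    # GP posterior mean and variance of utility at every item, given ratings
    # of the items shown so far (zero prior mean, unit prior variance).
    x_shown = embeddings[shown_idx]
    k_ss = rbf_kernel(x_shown, x_shown) + noise**2 * np.eye(len(shown_idx))
    k_as = rbf_kernel(embeddings, x_shown)
    mean = k_as @ np.linalg.solve(k_ss, ratings)
    var = 1.0 - np.einsum("ij,ji->i", k_as, np.linalg.solve(k_ss, k_as.T))
    return mean, var

def next_item(embeddings, shown_idx, ratings):
    # Show next the unseen item whose utility is currently most uncertain.
    _, var = posterior(embeddings, shown_idx, ratings)
    var[shown_idx] = -np.inf
    return int(np.argmax(var))

# Toy one-dimensional embeddings: item 1 is close to the rated item 0,
# item 2 is far away, so item 2 carries more remaining uncertainty.
items = np.array([[0.0], [0.1], [5.0]])
print(next_item(items, [0], np.array([1.0])))
```

Because the selection works directly on embeddings, no hand-coded product attributes are needed; other acquisition rules could be swapped in for the maximum-variance rule used here.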

Probabilistic Machine Learning: New Frontiers for Modeling Consumers and their Choices
Ryan Dew, Nicolas Padilla, Lan E. Luo, Shin Oblander, Asim Ansari, Khaled Boughanmi, Michael Braun, Fred Feinberg, Jia Liu, Thomas Otter, Longxiu Tian, Yixin Wang, and Mingzhang Yin
International Journal of Research in Marketing, 2026
[ Paper (Journal) ] [ Paper (SSRN) ] [ Code Companion ]

Making sense of massive, individual-level data is challenging: marketing researchers and analysts need flexible models that can accommodate rich patterns of heterogeneity and dynamics, work with and link diverse data types, and scale to modern data sizes. Practitioners also need tools that can quantify uncertainty in models and predictions of consumer behavior to inform optimal decision-making. In this paper, we demonstrate the promise of probabilistic machine learning (PML), which refers to the pairing of probabilistic modeling and machine learning methods, in pushing the frontier of combining flexibility, scalability, interpretability, and uncertainty quantification for building better models of consumers and their choices. Specifically, we overview both PML models and inference methods, and highlight their utility for addressing four common classes of marketing problems: (1) uncovering heterogeneity, (2) flexibly modeling nonlinearities and dynamics, (3) handling high-dimensional and unstructured data, and (4) addressing missingness, often via data fusion. We also discuss promising directions in enriching marketing models, reflecting recent developments in representation learning, causal inference, experimentation and decision-making, and theory-based behavioral modeling.

Working Papers:

Modeling Correlated Dynamics in Marketing Sensitivities
Ryan Dew, Yuhao Fan
Last updated: June 2025
[ Working Paper ]

Customer preferences for brands, prices, and other marketing mix elements vary over time, differ across individuals, and can be correlated across product categories. Leveraging these correlations allows data-rich categories to improve insights in sparser ones. Yet, while methods exist for modeling static cross-category correlations and single-category dynamics, none model correlated preference dynamics across categories. To fill this gap, the authors develop a Hierarchical Dynamic Factor (HDF) model that captures such correlated dynamics. HDF represents preference parameters via heterogeneous weights on common cross-category dynamic latent factors, estimated using Bayesian nonparametric Gaussian processes, enabling precise individual-level predictions of evolving preferences. Applying HDF to grocery purchase data reveals that preference changes are indeed correlated across categories, and that modeling these correlations improves pricing and targeting decisions.

Identification of Nonlinear and Dynamic Effects in Marketing Mix Models
Ryan Dew, Nicolas Padilla, Anya Shchetkina
Authors contributed equally. Last updated: June 2026
[ Working Paper ]

Recent years have seen a resurgence in interest in marketing mix models (MMMs), which are aggregate-level models of marketing effectiveness. Often these models incorporate nonlinear effects, and either implicitly or explicitly assume dynamic, or time-varying, effects. In this paper, we show that nonlinear and dynamic effects are often not identifiable from standard marketing mix data: while certain data patterns may be suggestive of nonlinear effects, such patterns may also emerge under simpler models that incorporate dynamics in marketing effectiveness. This lack of identification is problematic because nonlinearities and dynamics suggest fundamentally different optimal marketing allocations. We examine this identification issue through theory and simulations, wherein we explore the exact conditions under which conflation between the two types of models is likely to occur. In doing so, we introduce a flexible Bayesian nonparametric model that allows us to both flexibly simulate and estimate different data generating processes. We show that conflating the two types of effects is especially likely in the presence of autocorrelated marketing variables, which are common in practice, especially given the common use of stock variables to capture long-run effects of advertising. We illustrate these ideas through numerous empirical applications to real-world marketing mix data, showing the prevalence of the conflation issue in practice. Finally, we show how marketers can avoid this conflation, by designing experiments that strategically manipulate spending in ways that pin down model form.

Learning Heterogeneity from Unstructured Data: An Application to Chatbot Personalization
Khai Chiong, Ryan Dew
Last updated: May 2026
[ Working Paper ]

Understanding and leveraging individual customer heterogeneity is fundamental to effective marketing personalization, yet little work has examined how unstructured data describing individual consumers can predict their preferences. We develop a hierarchical Bayesian framework that maps consumer-generated unstructured data, represented as embeddings, to heterogeneous preference parameters. Our approach generalizes the classic multilevel model to allow for neural networks to parameterize the link from customer data to model parameters. We estimate the model using amortized variational inference, enabling efficient, real-time targeting and preference learning. The Bayesian formulation naturally accommodates both observed heterogeneity (via embeddings) and unobserved heterogeneity (via posterior updating). We apply our method to chatbot personalization, where a user's initial query is mapped to expected stylistic preferences in the response. Our results demonstrate that substantial heterogeneity in style preferences can be predicted from query content alone, but incorporating choice data to uncover unobserved heterogeneity significantly improves prediction. Moreover, such improvements are systematic: specific attributes of conversations and specific types of prompts benefit more from eliciting choice data than others, suggesting a role for adaptivity in how data are gathered. Our framework enables scalable chatbot personalization through prompt modifications without requiring model retraining. Beyond this application, our framework demonstrates how consumer-side unstructured data can enhance preference measurement in choice models, with applications extending to any setting where consumer-generated content can inform personalization strategies.

Optimal Product Design Synthesis: Pairing Generative Models with Adaptive Preference Measurement
Weixin He, Ryan Dew
[ Working Paper ]

Designing good products requires understanding consumer preferences across various product attributes. In many industries, like fashion, product attributes are high-dimensional and inseparable, and are most naturally described by unstructured data like images. While various approaches have been proposed for optimizing product designs over structured attributes, there is limited research on identifying ideal product designs using unstructured data. In this work, we propose pairing an adaptive nonparametric utility optimization framework with a generative image model based on Stable Diffusion in a preference measurement framework to identify individual-level ideal product designs. The framework uses hypothetical images generated on-the-fly in a survey. This approach allows the designer to provide an initial outline of the design, leaving detailed features to be generated by the model. The model produces query images that are designed to efficiently learn the types of products a consumer prefers. By the end of the survey, the framework can generate several distinct yet closely related product designs, each aligning with the consumer's preferences, which can aid designers in refining product concepts. We validate our method in two ways: first, in simulations, where we show that for a wide range of synthetic preference targets, the framework recovers a matching product; second, with real consumers, using a custom-built web application that deploys our framework to generate new shoe concepts. The empirical results show that designs identified by our framework are rated substantially more favorably than randomly generated benchmark designs from the same generative model.

Selected Research in Progress:

Unified Marketing Measurement and Optimal Test Timing
with Nicolas Padilla and Connor Campbell

Graph Representation Learning for Inferring Market Structure
with Mingyung Kim

Modeling Habit Formation
with Christophe Van den Bulte

How Do Influencers Learn From Feedback?
with Zijun Tian and Raghu Iyengar