Research Overview:
I am an applied methodologist, interested in the intersection of modern probabilistic machine learning and marketing. I focus on problems in customer analytics, preference measurement, and design, with an eye to developing and applying flexible, interpretable, computational tools to drive insights in these domains. I also study how rich, unstructured data like text and images can be used in classic data-driven marketing contexts, like preference measurement, and in contexts which previously have been difficult to study from a data-driven perspective, like branding and design. Methodologically, I am interested in Bayesian nonparametrics, Bayesian computation, deep generative models, and representation learning. I'm honored to have received several awards for my research, including the 2018 INFORMS Society for Marketing Science Doctoral Dissertation Award, and the 2018 Marketing Section of the American Statistical Association's Doctoral Research Award, as well as to have been a finalist for the Frank M. Bass and Paul Green Awards.
Publications:
- Bayesian Nonparametric Customer Base Analysis with Model-based Visualizations
Ryan Dew and Asim Ansari
Marketing Science, 2018
This paper was a finalist for the 2019 Frank M. Bass Award.
[Paper] [Code Notebook] [Replication Data] [Stan Code]
Modern marketers are responsible for understanding and managing customer spending behavior across many different products. Dynamics in spending result from both predictable customer-level effects, which are characterized by interpurchase time, customer lifetime, and past purchase frequency, as well as calendar time effects, which are driven by managerial actions such as product changes and promotions, and by general trends and random shocks. Understanding these dynamics in spending is further complicated by a lack of knowledge of all of the factors that influence spending for a given product, a problem exacerbated in large multiproduct firms by information asymmetries that can exist between the product teams that execute marketing actions and the marketing analytics team responsible for customer base analysis. A comprehensive understanding of customer base dynamics therefore requires a modeling framework that flexibly integrates both known and unknown calendar time determinants of spending with the individual-level effects that robustly predict spend activity. In this paper, we develop a Bayesian nonparametric framework based on Gaussian process priors to understand and predict customer spending. Our model separates out calendar time effects from individual-level dynamics by modeling both sets of factors as unknown latent functions that jointly determine spend propensity. The primary output of our Gaussian Process Propensity Model (GPPM) is a set of estimated curves that provides a visual and easily comprehensible representation of purchasing dynamics, which we call the model-based dashboard. We illustrate the utility of our modeling framework on data from two popular free-to-play mobile video games. We show how the GPPM's model-based dashboard can be useful for assessing patterns and disruptions in spending. We also show how the GPPM exhibits superior forecasting ability compared to existing customer base analysis benchmarks, including hazard and buy-till-you-die models.
- Modeling Dynamic Heterogeneity using Gaussian Processes
Ryan Dew, Asim Ansari, Yang Li
Journal of Marketing Research, 2020
This paper was a finalist for the 2020 Paul Green Award.
[Paper]
Marketing research relies on individual-level estimates to understand the rich heterogeneity that exists in consumers, firms, and products. While much of the literature focuses on capturing static cross-sectional heterogeneity, little research has been done on modeling dynamic heterogeneity, or the heterogeneous evolution of individual-level model parameters. In this work, we propose a novel framework for capturing the dynamics of heterogeneity, using individual-level, latent, Bayesian nonparametric Gaussian processes. Similar to standard heterogeneity specifications, our Gaussian Process Dynamic Heterogeneity (GPDH) specification models individual-level parameters as flexible variations around population-level trends, allowing for sharing of statistical information both across individuals and within individuals over time. This hierarchical structure provides precise individual-level insights regarding parameter dynamics. We show that GPDH nests existing heterogeneity specifications, and that not flexibly capturing individual-level dynamics may result in biased parameter estimates. Substantively, we apply GPDH to two problems: understanding preference dynamics, and modeling the evolution of online reviews. Across both applications, we find robust evidence of dynamic heterogeneity, and illustrate GPDH's rich managerial insights, with implications for targeting, pricing, and market structure analysis.
- Letting Logos Speak: Leveraging Multiview Representation Learning for Data-Driven Logo Design
Ryan Dew, Asim Ansari, Olivier Toubia
Marketing Science, 2022
This paper is a finalist for the 2022 Frank M. Bass and John D.C. Little Awards.
[Paper] [Explore Our Data] [Personality-based Logo Generator]
Logos serve a fundamental role in branding as the visual figurehead of the brand. Yet, due to the difficulty of using unstructured image data, prior research on logo design has been largely limited to non-quantitative studies. In this work, we explore logo design from a data-driven perspective. In particular, we aim to answer several key questions: first, to what degree can logos represent a brand's personality? Second, what are the key visual elements in logos that elicit brand- and firm-relevant associations, such as brand personality traits? Finally, given text describing a firm's brand or function, can we suggest features of a logo that elicit the firm's desired image? To answer these questions, we develop a novel logo feature extraction algorithm that uses modern image processing tools to decompose unstructured pixel-level image data into meaningful visual features. We then analyze the links between firm identity and the features of logos through a deep, multiview generative model, which links visual features of logos with textual descriptions of firms and consumer ratings of brand personality by learning representations of brand identity. We apply our modeling framework to a dataset of hundreds of logos, textual descriptions from firms' websites, third-party descriptions of firms, and consumer evaluations of brand personality to explore these questions.
Working Papers:
- Detecting Routines in Ride-sharing: Implications For Customer Management
Ryan Dew, Eva Ascarza, Oded Netzer, and Nachum Sicherman
Conditionally accepted at Journal of Marketing Research
[Working Paper]
Routines are central to consumer behavior in many industries, including ride-sharing, where consumers may use the same app to take the same trips on a regular basis. While prior work has established the importance of repeat behavior for marketing, little work has been done to understand the implications of routines, which we define as repeated behavior with a distinct, recurring, temporal structure. Partly, this lack of research stems from the statistical problem of estimating routines. In this paper, we propose a new approach to measuring routine usage, which we apply in the context of ride-sharing. Specifically, we model usage of the platform as an individual-level inhomogeneous Poisson point process, where the rate of usage is determined partly by a Bayesian nonparametric Gaussian process. In estimating this rate function, we leverage a unique cyclical kernel structure that allows for precise estimation of recurrent behavior. We then use this model to estimate individual-level routines in usage of a ride-sharing service. We show that more routine users tend to be more valuable customers, with high individual-level “routineness” being significantly associated with higher future usage and lower churn rates.
- Mega or Micro? Influencer Selection Using Follower Elasticity
Zijun Tian, Ryan Dew, and Raghu Iyengar
R+R at Journal of Marketing Research
[Working Paper]
Despite the explosive growth of influencer marketing, wherein companies sponsor social media personalities to promote their brands, there is little research to guide companies’ selection of influencer partners. One common criterion is popularity: while some firms sponsor “mega” influencers with millions of followers, other firms partner with “micro” influencers, who may only have several thousand followers, but may also cost less to sponsor. To quantify this trade-off between reach and cost, we develop a framework for estimating the follower elasticity of impressions, or FEI, which measures a video’s percentage gain in impressions corresponding to a percentage increase in the follower size of its creator. Computing FEI involves estimating the causal effect of an influencer’s popularity on the view counts of their videos, which we achieve through a combination of a unique dataset collected from TikTok, a representation learning model for quantifying video content, and a machine learning-based causal inference method. We find that FEI is always positive, but often nonlinearly related to follower size, suggesting different optimal sponsorship strategies than those observed in practice. We examine the factors that predict variation in these FEI curves, and show how firms can use these results to better determine influencer partnerships.
- A Gaussian Process Model of Cross-Category Dynamics in Brand Choice
Ryan Dew, Yuhao Fan
[Working Paper]
Understanding individual customers’ sensitivities to prices, promotions, brand, and other aspects of the marketing mix is fundamental to a wide swath of marketing problems, including targeting and pricing. Companies that operate across many product categories have a unique opportunity, insofar as they can use purchasing data from one category to augment their insights in another. Such cross-category insights are especially crucial in situations where purchasing data may be rich in one category, and scarce in another. An important aspect of how consumers behave across categories is dynamics: preferences are not stable over time, and changes in individual-level preference parameters in one category may be indicative of changes in other categories, especially if those changes are driven by external factors. Yet, despite the rich history of modeling cross-category preferences, the marketing literature lacks a framework that flexibly accounts for correlated dynamics, or the cross-category interlinkages of individual-level sensitivity dynamics. In this work, we propose such a framework, leveraging individual-level, latent, multi-output Gaussian processes to build a nonparametric Bayesian choice model that allows information sharing of preference parameters across customers, time, and categories. We apply our model to grocery purchase data, and show that our model detects interesting dynamics of customers’ price sensitivities across multiple categories. Managerially, we show that capturing correlated dynamics yields substantial predictive gains, relative to benchmarks. Moreover, we find that capturing correlated dynamics can have implications for understanding changes in consumers’ preferences over time, and developing targeted marketing strategies based on those dynamics.
- Preference Measurement with Unstructured Data, with Applications to Adaptive Onboarding Surveys
Ryan Dew
[Request Working Paper]
A common problem in recommendation engines is the cold start problem: how can we make a recommendation to a new customer, without any prior purchase data? Such problems are particularly salient for increasingly common online subscription businesses, where initial recommendations can shape whether potential customers decide to subscribe, and how their preferences evolve subsequently. The need to assess a new customer’s preferences quickly, and without prior purchase data, has led to the increasing prevalence of customer onboarding surveys, wherein companies ask potential or current customers a series of questions aimed at understanding their preferences, without having observed any purchasing. While such onboarding surveys are a relatively recent development in e-commerce, the idea of learning as much as possible about a customer’s preferences using the fewest questions has been studied extensively in “offline” marketing research, in the context of adaptive conjoint analysis. In this work, I bridge these two domains using a combination of representation learning for unstructured data, and Bayesian optimization for on-the-fly estimation of preferences. I apply this framework both in the context of an onboarding survey for an online subscription business, and in the context of traditional preference measurement.
Selected Research in Progress:
- Bayesian Analysis of A/B Tests with Partially Observed Assignment: An Application to Free Cancellation Programs
Yuhao Fan, Ryan Dew, Eric T. Bradlow, Peter Fader
- Unified Marketing Measurement under Privacy Regulations
with Nicolas Padilla