Adaptive Experimentation with Delayed Binary Feedback
(with Carlos Carrion, Xiliang Lin, Fuhua Ji, Yongjun Bao, and Weipeng Yan)
WWW 2022, Simulation Code, Blog Post
Abstract
Conducting experiments with objectives that take significant delays to materialize (e.g., conversions and add-to-cart events) is challenging. Although classical "split sample testing" remains valid for delayed feedback, the experiment takes longer to complete, which also means spending more resources on worse-performing strategies because of the fixed allocation schedule. Adaptive approaches such as "multi-armed bandits" can effectively reduce the cost of experimentation, but these methods generally cannot handle delayed objectives out of the box. This paper presents an adaptive experimentation solution tailored for delayed binary feedback objectives by estimating the real underlying objectives before they materialize and dynamically allocating variants based on the estimates. Experiments show that the proposed method is more efficient for delayed feedback than various other approaches and is robust in different settings. In addition, we describe an experimentation product powered by this algorithm. This product is currently deployed in the online experimentation platform of JD.com, a large e-commerce company and a publisher of digital ads.
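To make the delayed-feedback setting concrete, here is a minimal sketch of a naive baseline, not the method proposed in the paper: Beta-Bernoulli Thompson sampling that updates an arm only once its binary outcome has been revealed. The fixed `delay`, the uniform Beta(1, 1) prior, and all function names are assumptions made for illustration. The paper instead estimates the underlying objectives before they materialize, which this sketch does not attempt.

```python
import random
from collections import deque


def thompson_allocate(successes, failures, rng):
    """Pick the arm with the largest draw from its Beta(1 + s, 1 + f) posterior."""
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def simulate(true_rates, delay, horizon, seed=0):
    """Simulate Thompson sampling when each binary outcome arrives `delay` steps late.

    Returns the number of times each arm was allocated.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures, pulls = [0] * k, [0] * k, [0] * k
    pending = deque()  # (reveal_time, arm, outcome); FIFO because the delay is constant

    for t in range(horizon):
        # Fold in only those outcomes whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, outcome = pending.popleft()
            if outcome:
                successes[arm] += 1
            else:
                failures[arm] += 1

        arm = thompson_allocate(successes, failures, rng)
        pulls[arm] += 1
        outcome = rng.random() < true_rates[arm]
        pending.append((t + delay, arm, outcome))

    return pulls


if __name__ == "__main__":
    print(simulate([0.05, 0.20], delay=50, horizon=5000))
```

Because unrevealed outcomes are simply ignored, a longer delay makes this baseline slower to shift traffic toward the better variant. That lag is the inefficiency an estimate-before-materialization approach aims to reduce.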
Blending Advertising with Organic Content in E-Commerce: A Virtual Bids Optimization Approach
(with Carlos Carrion, Harikesh Nair, Xianghong Luo, Yulin Lei, Xiliang Lin, Wenlong Chen, Qiyu Hu, Changping Peng, Yongjun Bao, and Weipeng Yan)
Abstract
In e-commerce platforms, sponsored and non-sponsored content are jointly displayed to users and both may interactively influence their engagement behavior. The former content helps advertisers achieve their marketing goals and provides a stream of ad revenue to the platform. The latter content contributes to users’ engagement with the platform, which is key to its long-term health. A burning issue for e-commerce platform design is how to blend advertising with content in a way that respects these interactions and balances these multiple business objectives. This paper describes a system developed for this purpose in the context of blending personalized sponsored content with non-sponsored content on the product detail pages of JD.COM, an e-commerce company. This system has three key features: (1) optimization of multiple competing business objectives through a new virtual bids approach and the expressiveness of the latent, implicit valuation of the platform for the multiple objectives via these virtual bids; (2) modeling of users’ click behavior, through a deep learning approach, as a function of their characteristics, the individual characteristics of each piece of sponsored content, and the influence exerted by other sponsored and non-sponsored content displayed alongside; and (3) consideration of externalities in the allocation of ads, thereby making it directly compatible with a Vickrey-Clarke-Groves (VCG) auction scheme for the computation of payments in the presence of these externalities. The system is currently deployed and serving all traffic through JD.COM’s mobile application. Experiments demonstrating the performance and advantages of the system are presented.
Causal Meta-Mediation Analysis: Inferring Dose-Response Function From Summary Statistics of Many Randomized Experiments
(with Xuan Yin, Tianbo Li, and Liangjie Hong)
KDD 2020, Simulation Code, Blog Post
Abstract
It is common in the internet industry to use offline-developed algorithms to power online products that contribute to the success of a business. Offline-developed algorithms are guided by offline evaluation metrics, which are often different from online business key performance indicators (KPIs). To maximize business KPIs, it is important to pick a north star among all available offline evaluation metrics. By noting that online products can be measured by online evaluation metrics, the online counterparts of offline evaluation metrics, we decompose the problem into two parts. Because the offline A/B test literature works out the first part (counterfactual estimators of offline evaluation metrics that move the same way as their online counterparts), we focus on the second part: the causal effects of online evaluation metrics on business KPIs. The north star of offline evaluation metrics should be the one whose online counterpart causes the most significant lift in the business KPI. We model the online evaluation metric as a mediator and formalize its causal relationship with the business KPI as a dose-response function (DRF). Our novel approach, causal meta-mediation analysis, leverages summary statistics of many existing randomized experiments to identify, estimate, and test the mediator DRF. It is easy to implement and to scale up, and has many advantages over the existing literature on mediation analysis and meta-analysis. We demonstrate its effectiveness by simulation and implementation on real data.
Research Transparency Is on the Rise in Economics
(with Garret Christensen, Elizabeth Paluck, Nicholas Swanson, David J. Birke, Edward Miguel, and Rebecca Littman)
AEA Papers and Proceedings, 2020
Abstract
Has there been meaningful movement toward open science practices within the social sciences in recent years? Discussions about changes in practices such as posting data and pre-registering analyses have been marked by controversy, including controversy over the extent to which change has taken place. This study, based on the State of Social Science (3S) Survey, provides the first comprehensive assessment of awareness of, attitudes towards, perceived norms regarding, and adoption of open science practices within a broadly representative sample of scholars from four major social science disciplines: economics, political science, psychology, and sociology. We observe a steep increase in adoption: as of 2017, over 80% of scholars had used at least one such practice, rising from one quarter a decade earlier. Attitudes toward research transparency are on average similar between older and younger scholars, but the pace of change differs by field and methodology. In accordance with theories of normal science and scientific change, the timing of increases in adoption coincides with technological innovations and institutional policies. Patterns are consistent with most scholars underestimating the trend toward open science in their discipline.
The Environmental and Economic Consequences of Internalizing Border Spillovers
(with Shaoda Wang)
(Revise and Resubmit), American Economic Journal: Applied Economics
Abstract
This paper studies how centralized decision-making can help local governments internalize regional environmental spillovers, and investigates the associated economic and welfare consequences. Utilizing novel firm-level geocoded emission and production panel datasets, and exploiting more than 3,000 cases of township mergers in China, we find that as township mergers eliminate the borders between neighboring townships, the negative externalities of polluting firms located on these borders are suddenly internalized by the new jurisdiction. As a result, these firms spend more effort on emission abatement, which leads to lower emissions, as well as lower output and profit levels. Further analysis suggests that household welfare improves with the internalization of border spillovers, as reflected by increased residential land prices around the merging borders.
SmartStorage: Automated Storage System with Reinforcement Learning
(with Fengshi Niu)
GitHub Repo
Abstract
This paper applies model-based deep reinforcement learning to solve a simplified storage assignment problem. First, we train an LSTM order predictor from a long historical order sequence and use it for state transformation and reward variance reduction. Second, we run an approximate value iteration until convergence. Our algorithm is specifically designed to address the tradeoff between travel-time efficiency and repositioning costs. Our experiments evaluate this algorithm in a variety of simulated environments with a varying number of products (up to 1,000) and different stochastic order processes. In all cases, our algorithm significantly reduces overall storage costs compared to a random assignment heuristic. The performance gap between our algorithm and the oracle tabular value iteration with access to the latent order probabilities is shown to be small. Our experiments also show tentative evidence that our algorithm's training time per iteration scales linearly with the number of products, despite the factorial growth of the permutations.
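For readers unfamiliar with the tabular baseline mentioned above, here is a minimal sketch of exact value iteration on a small, fully known Markov decision process. It is a generic textbook version rather than the paper's approximate, LSTM-assisted variant. The data layout (`transitions[s][a]` as a list of `(probability, next_state)` pairs), the discount `gamma`, and the toy two-state example are all assumptions made for illustration.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Return optimal state values for a finite MDP.

    transitions[s][a] is a list of (probability, next_state) pairs and
    rewards[s][a] is the immediate reward for taking action a in state s.
    """
    values = [0.0] * len(transitions)
    while True:
        # Bellman optimality backup for every state.
        new_values = [
            max(
                rewards[s][a]
                + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])
                for a in range(len(transitions[s]))
            )
            for s in range(len(transitions))
        ]
        # Stop once the largest per-state change falls below the tolerance.
        if max(abs(n - o) for n, o in zip(new_values, values)) < tol:
            return new_values
        values = new_values


if __name__ == "__main__":
    # Toy example: state 0 can stay (action 0) or move to state 1 (action 1);
    # state 1 loops on itself and pays a reward of 1 per step.
    transitions = [[[(1.0, 0)], [(1.0, 1)]], [[(1.0, 1)]]]
    rewards = [[0.0, 0.0], [1.0]]
    print(value_iteration(transitions, rewards))
```

With `gamma = 0.9`, the looping state is worth 1 / (1 - 0.9) = 10 and the starting state is worth 0.9 × 10 = 9, so the toy case is easy to check by hand.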