9.57am-10.00am Zoltan Szabo
Opening Words
10.00am-10.30am Wicher Bergsma (Social Statistics)
Model-based estimation of a Gaussian covariance or precision kernel
In a Gaussian setting, a variety of useful models involve linear restrictions on the covariance kernel or the precision matrix. A key example is graphical models, which involve patterns of zeroes in the precision matrix. Alternatively, stationary Gaussian distributions involve linear restrictions on the covariance kernel or, equivalently, the precision kernel. Furthermore, covariate information can be encoded via linear restrictions, improving both estimation and understanding of the population distribution.
As a mathematical framework for sets of linearly restricted positive definite kernels that incorporates the aforementioned examples, we introduce a class of families of reproducing kernel Krein spaces. For each family, a generalized Wishart/inverse-Wishart prior can serve as a prior on the convex cone of positive definite kernels, yielding an (empirical) Bayes estimator of the covariance or precision kernel. This approach also addresses the difficulty of ensuring that the estimated covariance/precision kernel is positive definite.
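To fix ideas (this sketch is not from the talk): in a Gaussian graphical model, a zero pattern in the precision matrix is exactly a linear restriction encoding conditional independence. A minimal numpy illustration with made-up numbers:

```python
import numpy as np

# Hypothetical 3-variable Gaussian graphical model: the zeroes in the
# precision matrix at positions (0, 2) and (2, 0) encode the linear
# restriction "X0 and X2 are conditionally independent given X1".
K = np.array([[ 2.0, -0.8,  0.0],
              [-0.8,  2.0, -0.8],
              [ 0.0, -0.8,  2.0]])   # precision (inverse covariance)

Sigma = np.linalg.inv(K)             # implied covariance kernel

# Sanity checks: K is positive definite, and the marginal covariance
# between X0 and X2 is nonzero even though K[0, 2] == 0.
assert np.all(np.linalg.eigvalsh(K) > 0)
print("Sigma[0, 2] =", Sigma[0, 2])  # nonzero marginal dependence
```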
10.30am-11.00am Tom Dorrington Ward (Engage Smarter)
Evaluating and assuring AI Agents for financial services
The arrival of ChatGPT in November 2022 started a new wave of “generative AI” applications. One area where generative AI has huge potential to make a difference is in helping people make better financial decisions. However, using AI Agents to provide consumer financial guidance requires assurance: there are risks in providing incorrect guidance, and advice is a regulated activity. In this short talk, Tom Dorrington Ward, CTO & Co-Founder of Engage Smarter AI, will survey emerging techniques – including AI architectures and evaluation processes – for evaluating and assuring AI Agents. He will also highlight the elements that make expert financial guidance a particularly complex use case. Finally, he will describe how Engage Smarter AI’s own framework for evaluating and assuring AI Agents in financial services brings together these different elements.
11.00am-11.30am Ieva Kazlauskaite (Data Science - from August)
Calculating exposure to extreme sea level risk will require high resolution ice sheet models
The West Antarctic Ice Sheet (WAIS) is losing ice, and its annual contribution to sea level is increasing. The future behaviour of WAIS will impact societies worldwide, yet deep uncertainty remains in the expected rate of ice loss. High-impact, low-likelihood scenarios of sea-level rise are needed by risk-averse stakeholders but are particularly difficult to constrain. In this work, we combine traditional model simulations of the Amundsen Sea sector of WAIS with Gaussian process emulation to show that ice-sheet models capable of resolving kilometre-scale basal topography will be needed to assess the probability of extreme scenarios of sea-level rise. This resolution exceeds that of many state-of-the-art continent-scale simulations. Our ice-sheet model simulations show that coarser resolutions tend to project higher sea-level contributions than finer resolutions, inflating the tails of the distribution. We therefore caution against relying purely upon simulations coarser than 4-5 km when assessing the potential for societally important high-impact sea-level rise.
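As a rough illustration of the Gaussian process emulation idea (synthetic numbers only; the data, kernel, and inputs below are stand-ins, not the study's), one might emulate an expensive ice-sheet simulator across grid resolutions as follows:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

# Toy training set: x = model grid resolution in km (coarse to fine),
# y = synthetic sea-level contribution (made-up numbers, chosen only to
# mimic the abstract's "coarser resolutions project higher contributions").
x_train = np.array([[16.0], [8.0], [4.0], [2.0], [1.0]])
y_train = np.array([0.30, 0.24, 0.20, 0.18, 0.17]) \
    + 0.005 * rng.standard_normal(5)

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(length_scale=4.0),
                              normalize_y=True)
gp.fit(x_train, y_train)

# Emulate the (expensive) ice-sheet model at untried resolutions,
# with uncertainty from the GP posterior.
x_new = np.linspace(1.0, 16.0, 50).reshape(-1, 1)
mean, std = gp.predict(x_new, return_std=True)
```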
(11.30am-12.00pm Break)
12.00pm-12.30pm Despoina Makariou (St. Gallen)
Estimation of heterogeneous treatment effects in the primary catastrophe bond market using causal forests
We introduce a causal random forest approach to predict treatment heterogeneity in alternative capital markets. We focus on predicting the effect of issuance timing on the spreads of an insurance-linked security called a catastrophe bond. Studying issuance timing is important for optimising the cost of capital and ensuring the success of the bond offering. We construct a causal random forest and find that issuing a catastrophe bond in the first half of a calendar year is associated with a lower spread, and that this result varies according to several factors, such as market conditions, the type of the underlying asset, and the size of the issuance.
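For readers unfamiliar with causal forests, here is a minimal sketch of heterogeneous treatment-effect estimation on synthetic data, assuming the open-source econml package (the variables and data-generating process are illustrative, not the paper's):

```python
import numpy as np
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for bond-level covariates (e.g. issuance size,
# a market-conditions index) -- illustrative only.
X = rng.normal(size=(n, 2))
T = rng.integers(0, 2, size=n)     # 1 = issued in first half of year
tau = -0.5 + 0.3 * X[:, 0]         # heterogeneous timing effect on spread
Y = 2.0 + X @ np.array([0.4, -0.2]) + tau * T \
    + rng.normal(scale=0.3, size=n)

cf = CausalForestDML(model_y=RandomForestRegressor(),
                     model_t=RandomForestClassifier(),
                     discrete_treatment=True, random_state=0)
cf.fit(Y, T, X=X)
print(cf.effect(X[:5]))  # estimated bond-level treatment effects
```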
12.30pm-1.00pm Zoltan Szabo (Data Science)
Minimax Rate of HSIC Estimation
Kernel techniques, such as the Hilbert-Schmidt independence criterion (HSIC; also called distance covariance), are among the most powerful approaches in data science and statistics for measuring the statistical independence of M ≥ 2 random variables. Despite the various HSIC estimators designed since its introduction close to two decades ago, the fundamental question of the rate at which HSIC can be estimated remains open; this forms the focus of the talk for translation-invariant kernels on R^d. [This is joint work with Florian Kalinke. Preprint: https://arxiv.org/abs/2403.07735]
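To make the object of study concrete, the standard biased (V-statistic) HSIC estimator with Gaussian kernels takes a few lines of numpy; this is a textbook estimator, shown only to fix ideas, and is not the talk's contribution:

```python
import numpy as np

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased (V-statistic) HSIC estimate with Gaussian kernels."""
    n = len(x)
    def gram(z, sigma):
        d2 = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    K, L = gram(x, sigma_x), gram(y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
# Dependent pair (large HSIC) vs independent pair (HSIC near zero).
print(hsic(x, x ** 2), hsic(x, rng.normal(size=(200, 1))))
```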
(1.00pm-2.30pm Lunch)
2.30pm-3.00pm Yining Chen (Data Science / Time Series and Statistical Learning)
Detecting changes in production frontiers
3.00pm-3.30pm Kostas Kardaras (Probability in Finance and Insurance)
Equilibrium models of production and capacity expansion
We consider a model in which producers decide how much to produce and how much to invest in expanding capacity for future production. With demand functions exogenously given, we study a multi-agent setting where prices are formed in equilibrium. Depending on the form of the production function, this leads to either a singular or a standard control problem. The solutions to the latter are either given explicitly or characterised via a second-order non-linear ODE. (Based on works with Junchao Jia, Alexander Pavlis and Michael Zervos.)
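As background (a generic textbook formulation, not necessarily the talk's model), an irreversible capacity-expansion problem of this kind takes the singular-control form:

```latex
% A producer chooses a nondecreasing investment process I to maximise
% discounted production revenue net of investment costs:
\[
  \sup_{I} \; \mathbb{E}\!\left[ \int_0^\infty e^{-r t}
    \bigl( p_t \, f(C_t) \, dt - \kappa \, dI_t \bigr) \right],
  \qquad dC_t = -\delta C_t \, dt + dI_t ,
\]
% where C is capacity, f a production function, p the price (formed in
% equilibrium in the multi-agent setting), r a discount rate, kappa the
% unit cost of capacity, and delta a depreciation rate.
```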
3.30pm-4.00pm Dima Karamshuk (Meta)
Content Moderation at Scale – Protecting Integrity of Online Communities on Meta Platforms
Content moderation on large social media platforms requires the timely detection of harmful viral content. The detection problem is difficult because content virality results from interactions between user interests, content characteristics, feed ranking, and community structure.
This talk will shed light on the design of algorithms that can efficiently solve this problem at Meta's scale.
(4.00pm-4.30pm Break)
4.30pm-5.00pm Giulia Livieri (Probability in Finance and Insurance)
On Mean Field Games and Applications
Mean field game theory is a branch of game theory: a set of concepts, mathematical tools, theorems, and algorithms which, like all game theory, helps economists (micro and macro), sociologists, engineers, and even urban planners model situations in which agents make decisions in a context of strategic interaction. In this talk, I will introduce mean field games through some “toy models”, progressively uncovering the concepts and the mathematics behind the theory. Time permitting, I will conclude with some very preliminary results on a mean field game model of shipping, where the model is also calibrated on real data (co-authors: Michele Bergami, Simone Moawad, Barath Raaj Suria Narayanan (PG students, LSE); Evan Chien Yi Chow (ADIA); Charles-Albert Lehalle).
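To fix ideas ahead of the talk, the canonical (Lasry-Lions) mean field game system couples a backward Hamilton-Jacobi-Bellman equation for a representative agent's value function u with a forward Fokker-Planck equation for the population distribution m:

```latex
\[
\begin{cases}
  -\partial_t u - \nu \Delta u + H(x, \nabla u) = f(x, m), \\
  \;\;\,\partial_t m - \nu \Delta m
    - \operatorname{div}\!\bigl( m \, \partial_p H(x, \nabla u) \bigr) = 0, \\
  \;\; m(0) = m_0, \qquad u(T, x) = g(x, m(T)),
\end{cases}
\]
% where H is the Hamiltonian, nu a diffusion coefficient, f the running
% cost coupling each agent to the population, and g the terminal cost.
```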
5.00pm-5.30pm Chengchun Shi (Data Science / Time Series and Statistical Learning)
Switchback designs can enhance policy evaluation in reinforcement learning
Time series experiments, in which experimental units receive a sequence of treatments over time, are prevalent in technology companies, including ride-sharing platforms and trading firms. These companies frequently employ such experiments for A/B testing, to evaluate the performance of a newly developed policy, product, or treatment relative to a baseline control. Many existing solutions require that the experimental environment be fully observed to ensure that the data collected satisfy the Markov assumption. This condition, however, is often violated in real-world scenarios. This gap between theoretical assumptions and practical realities challenges the reliability of existing approaches and calls for more rigorous investigation of A/B testing procedures.
In this paper, we study the optimal experimental design for A/B testing in partially observable environments. We introduce a controlled (vector) autoregressive moving average model to effectively capture a rich class of partially observable environments. Within this framework, we derive closed-form expressions, i.e., efficiency indicators, to assess the statistical efficiency of various sequential experimental designs in estimating the average treatment effect (ATE). A key innovation of our approach lies in the introduction of a weak signal assumption, which significantly simplifies the computation of the asymptotic mean squared errors of ATE estimators in time series experiments. We then develop two data-driven algorithms to estimate the optimal design: one utilizing constrained optimization and the other employing reinforcement learning. We demonstrate the superior performance of our designs using a dispatch simulator and two real datasets from a ride-sharing company.
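As a toy illustration of what a switchback design looks like (a crude AR(1) stand-in for a partially observed environment, not the paper's controlled ARMA framework or its efficiency indicators):

```python
import numpy as np

rng = np.random.default_rng(0)
T_len, ate = 2000, 0.5

# Switchback design: alternate treatment and control in fixed-length blocks.
block = 50
A = (np.arange(T_len) // block) % 2          # 0/1 treatment sequence

# Toy environment: an unobserved AR(1) state drives the outcomes,
# so the observed series alone is not Markov.
state = 0.0
Y = np.empty(T_len)
for t in range(T_len):
    state = 0.8 * state + rng.normal(scale=0.5)
    Y[t] = 1.0 + ate * A[t] + state + rng.normal(scale=0.2)

# Difference-in-means ATE estimate under the switchback design.
ate_hat = Y[A == 1].mean() - Y[A == 0].mean()
print(f"true ATE = {ate}, estimate = {ate_hat:.3f}")
```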
6.00pm-8.00pm Poster Session & Reception