Representation learning for relational data
In this talk, we will consider representation learning of relational data which can be represented by a hypergraph. A hypergraph is defined by a set of nodes and a set of hyperedges (subsets of nodes). Many relational data can be naturally represented by hypergraphs – e.g. sets of items in recommender systems, entities and relations in knowledge graphs, and teams of online users in online platforms. For example, in online labor platforms, nodes may represent online workers, hyperedges may represent teams of online workers, and the existence of a hyperedge may indicate a successful team collaboration. A challenge is to learn representations of individual items that explain observed data at a hyperedge level, e.g. to learn skills of individual users from observed team performance outputs. Based on these representations, we are able to answer queries such as identifying workers with high abilities and predicting the outcome for new projects.
We will consider a model of random hypergraphs known as the generalized beta model, where each node is associated with an unknown scalar parameter and a hyperedge exists with a probability that is according to a logistic function of the sum of individual node parameters. The well-known beta model of random graphs is accommodated as a special case. There are several fundamental open questions for generalized beta models such as (a) statistical inference questions on the existence and uniqueness of the maximum likelihood estimate of the model parameters and how this is related to graph-theoretic properties of the observed data, and (b) computational and statistical efficiency of algorithms for parameter estimation.
We will discuss relations between different conditions for the existence and uniqueness of the maximum likelihood parameter estimator. A key property for bounding the mean squared error of the maximum likelihood parameter estimator is the smallest eigenvalue of the Gram matrix of the design matrix. For beta models, we will show that this eigenvalue is strictly positive if, and only if, a certain graph-theoretic measure of non-bipartiteness is strictly positive. We will further show how this correspondence can be extended to certain classes of hypergraphs.
For the rest of the talk, we will review known results on the efficiency of algorithms for parameter estimation and certain relevant statistical hypotheses testing problems. The goal is to identify all items or a set of items with high abilities with a minimum sampling complexity. This will cover the work on group testing which are commonly studied for Boolean parameter vectors. This existing work provides results on the computation complexity of algorithms under certain assumptions how hyperedges are tested (either in an adaptive or a non-adaptive manner). We will discuss interesting open questions in this direction, as well as some alternative models that try to capture intricate interactions between items and their group performance.