Overview

Link prediction is a task with a variety of real-world applications, such as recommendation and data mining. Link prediction models are often transductive, meaning that they learn a fixed representation for each node in a graph. This approach is powerful because the representations can be pre-trained or optimized end-to-end with downstream objectives, but it has drawbacks that can limit its applicability in realistic settings where graphs evolve over time. As new nodes and edges appear, these models become stale and may struggle to generalize to unseen nodes whose representations have never been optimized. The alternative, inductive learning, typically relies on node attributes, which are not always available in practice.
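To make the transductive setting concrete, here is a minimal sketch (not code from this project) of a transductive link predictor: each node gets a learned embedding, and an edge is scored by the dot product of its endpoints' embeddings. The dot-product scorer, embedding dimension, and training setup are illustrative assumptions; the key point is that a node added after training has no embedding row at all, which is exactly the staleness issue described above.

```python
# Hypothetical sketch of a transductive link predictor (fixed per-node embeddings).
import torch
import torch.nn as nn

class TransductiveLinkPredictor(nn.Module):
    def __init__(self, num_nodes: int, dim: int = 64):
        super().__init__()
        # One fixed, learnable vector per node seen at training time.
        # Nodes that appear later have no row here.
        self.emb = nn.Embedding(num_nodes, dim)

    def forward(self, src: torch.Tensor, dst: torch.Tensor) -> torch.Tensor:
        # Score for each candidate edge (higher = more likely to exist).
        return (self.emb(src) * self.emb(dst)).sum(dim=-1)

# One training step on observed edges and uniformly sampled negative edges
# (placeholder random data stands in for a real edge list).
model = TransductiveLinkPredictor(num_nodes=1000)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

pos_src, pos_dst = torch.randint(0, 1000, (2, 256))   # observed edges
neg_src, neg_dst = torch.randint(0, 1000, (2, 256))   # sampled non-edges

scores = model(torch.cat([pos_src, neg_src]), torch.cat([pos_dst, neg_dst]))
labels = torch.cat([torch.ones(256), torch.zeros(256)])

opt.zero_grad()
loss = loss_fn(scores, labels)
loss.backward()
opt.step()
```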

The goal of this project is to quantify the limitations of transductive approaches to link prediction under an evolving distribution. We will evaluate these methods both intrinsically, by examining how the learned representations change relative to the evolution of the graph over time, and extrinsically, by measuring each model's ability to maintain its performance as time passes beyond the period on which it was originally trained. These experiments will yield insight into each model's ability to generalize to unseen nodes and edges, deepening our understanding of generalization in graph representation learning methods and quantitatively demonstrating the need (or lack thereof) for "fresh" models that stay up-to-date with the graph as it evolves.
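The extrinsic part of this evaluation could be set up roughly as follows: train once on edges up to a cutoff time, then track link-prediction AUC on edges from successive later windows to see whether performance decays. This is a hedged sketch of one possible protocol, not the project's actual evaluation code; `train_model`, `score_edges`, and `sample_negatives` are hypothetical placeholders for a model-specific training routine, edge scorer, and negative sampler.

```python
# Hypothetical sketch: measure how link-prediction AUC changes in time windows
# after the training cutoff.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_over_time(edges, timestamps, train_cutoff, window, num_windows,
                       train_model, score_edges, sample_negatives):
    """edges: (E, 2) array of node pairs; timestamps: (E,) array of edge times."""
    train_mask = timestamps <= train_cutoff
    model = train_model(edges[train_mask])            # fit on the "past" only

    aucs = []
    for k in range(num_windows):
        lo = train_cutoff + k * window
        hi = train_cutoff + (k + 1) * window
        test_mask = (timestamps > lo) & (timestamps <= hi)
        pos = edges[test_mask]                        # edges that appeared in this window
        neg = sample_negatives(len(pos))              # non-edges for the same window
        y_true = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
        y_score = np.concatenate([score_edges(model, pos), score_edges(model, neg)])
        aucs.append(roc_auc_score(y_true, y_score))   # does performance decay over time?
    return aucs
```

A parallel intrinsic analysis would compare the representations themselves across retraining points (for example, how far each node's embedding moves as the graph grows) rather than downstream accuracy.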