Our work on how LLMs store relations selected as a NeurIPS Spotlight paper
When we ask large language models questions, they often seem to “know” facts about the world, such as which city is in which country, or what language is spoken where. But how do these models store and retrieve that knowledge?
A recent study (Hernandez et al., 2023) found that relations like “capital of” are often well approximated by a linear map: an affine transformation applied to the model’s internal representation of the subject. Our research takes this line of work further by studying whole collections of such relation operators at once.
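To make the idea concrete, here is a minimal sketch of how such a linear relation operator could be fitted from paired subject/object hidden states. The function names and the least-squares fitting choice are ours for illustration, not taken from the paper; we assume the hidden states have already been extracted from the model.

```python
import numpy as np

def fit_linear_relation(subject_states: np.ndarray, object_states: np.ndarray):
    """Fit an affine operator o ≈ W s + b from paired hidden states.

    subject_states, object_states: arrays of shape (n_pairs, d_model),
    e.g. hidden activations at the subject token and at the answer token.
    """
    n, d = subject_states.shape
    # Append a constant column so least squares recovers W and b jointly.
    X = np.hstack([subject_states, np.ones((n, 1))])
    coef, *_ = np.linalg.lstsq(X, object_states, rcond=None)
    W, b = coef[:d].T, coef[d]          # W: (d, d), b: (d,)
    return W, b

def apply_relation(W: np.ndarray, b: np.ndarray, subject_state: np.ndarray):
    """Predict the object representation for a new subject."""
    return W @ subject_state + b
```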
Our first observation is that these operators don’t really act like unique keys for each type of relation. Instead, many of them overlap, pointing to the same underlying “properties” of things. For example, “in which country do they speak language X” and “in which country is landmark X” both tie back to the broad property “country of X.”
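One way to probe this overlap is to compare two fitted operators directly, or to check how well one relation’s operator predicts another relation’s objects. The sketch below assumes operators fitted as in the previous snippet; the helper names and the cosine-similarity scoring are illustrative choices of ours, not the paper’s exact evaluation.

```python
import numpy as np

def operator_similarity(W1, b1, W2, b2):
    """Cosine similarity between two affine operators,
    treating each (W, b) pair as one flattened parameter vector."""
    v1 = np.concatenate([W1.ravel(), b1])
    v2 = np.concatenate([W2.ravel(), b2])
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

def cross_generalization(W, b, subject_states, object_states):
    """Mean cosine similarity between the objects predicted by one
    relation's operator and another relation's true object states."""
    preds = subject_states @ W.T + b
    num = np.sum(preds * object_states, axis=1)
    den = np.linalg.norm(preds, axis=1) * np.linalg.norm(object_states, axis=1)
    return float(np.mean(num / den))
```

If the “language spoken” and “landmark location” operators score highly on each other’s data, that is evidence they are both reading off the same underlying “country of X” property.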
We have found that, without losing much accuracy, a large set of such linear operators can be compressed into a much more compact representation called a tensor network. A particularly interesting case is mathematical relations like “X is larger than Y by Z,” where such tensor networks achieve almost perfect generalization.
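As an illustration of the compression idea, here is a simple low-rank factorization of a stack of relation matrices along the relation axis: every operator becomes a weighted sum of a few shared components. This is only a minimal stand-in for the tensor network used in our paper, with hypothetical helper names, and it makes no claim to match the exact construction.

```python
import numpy as np

def compress_relation_operators(W_stack: np.ndarray, rank: int):
    """Compress a stack of relation matrices W_stack of shape (R, d, d)
    into `rank` shared component matrices plus per-relation mixing weights.

    A plain low-rank factorization along the relation axis, used here
    as a toy version of the tensor-network compression described above.
    """
    R, d, _ = W_stack.shape
    flat = W_stack.reshape(R, d * d)                  # each operator as a vector
    U, S, Vt = np.linalg.svd(flat, full_matrices=False)
    mixing = U[:, :rank] * S[:rank]                   # (R, rank) per-relation weights
    components = Vt[:rank].reshape(rank, d, d)        # shared operator basis
    return mixing, components

def reconstruct_operator(mixing_row: np.ndarray, components: np.ndarray):
    """Rebuild one relation's matrix as a weighted sum of the shared components."""
    return np.tensordot(mixing_row, components, axes=1)   # (d, d)
```

The intuition is the same as in the overlap observation: if many relations secretly share a few underlying properties, then a few shared components are enough to reconstruct all of their operators.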