HUN-REN Alfréd Rényi Institute of Mathematics

Artificial Intelligence Group

Publications

Diverse beam search to find densest-known planar unit distance graphs

Experimental Mathematics

13 Jun 2025

Peter Engel (Cornell University)

Owen Hammond-Lee (Georgia Institute of Technology)

Yiheng Su (University of Wisconsin–Madison)

Dániel Varga

Pál Zsámboki

Beam Search

Unit Distance Graphs

Exploitation-Exploration

Planar Geometry

Combinatorial Optimization

This paper addresses the problem of determining the maximum number of edges in a unit distance graph (UDG) of n vertices using computer search. An unsolved problem of Paul Erdős asks the maximum number of edges 𝑢⁡(𝑛) a UDG of n vertices can have. Those UDGs that attain 𝑢⁡(𝑛) are called “maximally dense.” In this paper, we seek to demonstrate a computer algorithm to generate dense UDGs for vertex counts up to at least 100.

Piercing intersecting convex sets

Linear Algebra and its Applications

1 April 2025

Imre Bárány

Travis Dillon

Dömötör Pálvölgyi

Dániel Varga

Helly-type theorems

Line transversals

Linear programming

Assume two finite families A and B of convex sets in R3 have the property that A ∩ B ≠ ∅ for every A ∈ A and B∈B. Is there a constant γ > 0 (independent of A and B) such that there is a line intersecting γ|A| sets in A or γ|B| sets in B? This is an intriguing Helly-type question from a paper by Martínez, Roldan and Rubin. We confirm this in the special case when all sets in A lie in parallel planes and all sets in B lie in parallel planes; in fact, one of the two families has a transversal by a single line.

A characterization of complex Hadamard matrices appearing in families of MUB triplets

Designs, Codes and Cryptography

3 October 2024

Ákos K. Matszangosz

Ferenc Szöllősi (Shimane University)

Complex Hadamard matrix

Mutually unbiased bases

Gröbner bases

It is shown that a normalized complex Hadamard matrix of order 6 having three distinct columns each containing at least one -1 entry, necessarily belongs to the transposed Fourier family, or to the family of 2-circulant complex Hadamard matrices. The proofs rely on solving polynomial systems of equations by Gröbner basis techniques, and make use of a structure theorem concerning regular Hadamard matrices. As a consequence, members of these two families can be easily recognized in practice.

Towards Unbiased Exploration in Partial Label Learning

JMLR

24 December 2024

Zsolt Zombori

Agapi Rissaki (Northeastern University)

Kristóf Szabó

Wolfgang Gatterbauer (Northeastern University)

Michael Benedikt (University of Oxford)

Machine learning

Partial label learning

We consider learning a probabilistic classifier from partially-labelled supervision (inputs denoted with multiple possibilities) using standard neural architectures with a softmax as the final layer. We identify a bias phenomenon that can arise from the softmax layer in even simple architectures that prevents proper exploration of alternative options, making the dynamics of gradient descent overly sensitive to initialization. We introduce a novel loss function that allows for unbiased exploration within the space of alternative outputs.

Global Sinkhorn Autoencoder — Optimal Transport on the latent representation of the full dataset

Annales Univ. Sci. Budapest., Sect. Comp. 57 (2024) 101–115

26 June 2024

Adrián Csiszárik

Melinda F. Kiss

Balázs Maga

Ákos Matszangosz

Dániel Varga

Optimal Transport

Wasserstein distance

Sinkhorn Autoencoder

Generative models

We propose an Optimal Transport (OT)-based generative model from the Wasserstein Autoencoder (WAE) family of models, with the following innovative property: the optimization of the latent point positions takes place over the full training dataset rather than over a minibatch. Our contributions are the following: We define a new class of global Wasserstein Autoencoder models, and implement an Optimal Transport-based incarnation we call the Global Sinkhorn Autoencoder. We implement several metrics for evaluating such models, both in the unsupervised setting, and in a semi-supervised setting, which are the following: the global OT loss, which measures the OT loss on the full test dataset; the reconstruction error on the full test dataset; a so-called covered area which measures how well the latent points are matched; and two types of clustering measures.

Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

Neural Networks, Elsevier

23 February 2024

Adrián Csiszárik

Melinda F. Kiss

Péter Kőrösi-Szabó

Márton Muntag

Gergely Papp

Dániel Varga

Representation learning

Deep learning

We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability.

Training BERT Models to Carry Over a Coding System Developed on One Corpus to Another

LREC-COLING

20-25 May 2024

Dalma Galambos (Pázmány Péter Catholic University)

Pál Zsámboki

Language modeling

Domain adaptation

This paper describes how we train BERT models to carry over a coding system developed on the paragraphs of a Hungarian literary journal to another. The aim of the coding system is to track trends in the perception of literary translation around the political transformation in 1989 in Hungary. To evaluate not only task performance but also the consistence of the annotation, moreover, to get better predictions from an ensemble, we use 10-fold crossvalidation.

The density of planar sets avoiding unit distances

Mathematical Programming, Springer

06 October 2023

Gergely Ambrus

Adrián Csiszárik

Máté Matolcsi

Dániel Varga

Pál Zsámboki

Discrete geometry

Computer aided profs

By improving upon previous estimates on a problem posed by L. Moser, we prove a conjecture of Erdős that the density of any measurable planar set avoiding unit distances is less than 1/4. Our argument implies the upper bound of 0.2470.

Lemmas: Generation, Selection, Application

TABLEAUX 2023

14 September 2023

Michael Rawson (TU Wien)

Christoph Wernhard (University of Potsdam)

Zsolt Zombori

Wolfgang Bibel (Technical University Darmstadt)

Automated reasoning

Noting that lemmas are a key feature of mathematics, we engage in an investigation of the role of lemmas in automated theorem proving. The paper describes experiments with a combined system involving learning technology that generates useful lemmas for automated theorem provers, demonstrating improvement for several representative systems and solving a hard problem not solved by any system for twenty years. By focusing on condensed detachment problems we simplify the setting considerably, allowing us to get at the essence of lemmas and their role in proof search.

Reproducibility Study of ”Label-Free Explainability for Unsupervised Models”

The ML Reproducibility Challenge 2022

31 July 2023

Gergely Papp

Julius Wagenbach

Laurens Jans de Vries

Niklas Mather

Reproducibility Study

Explainable AI

Unsupervised Learning

In this work, we present our reproducibility study of Label-Free Explainability for Unsupervised Models, a paper that introduces two post‐hoc explanation techniques for neural networks: (1) label‐free feature importance and (2) label‐free example importance. Our study focuses on the reproducibility of the authors’ most important claims: (i) perturbing features with the highest importance scores causes higher latent shift than perturbing random pixels, (ii) label‐free example importance scores help to identify training examples that are highly related to a given test example, (iii) unsupervised models trained on different tasks show moderate correlation among the highest scored features and (iv) low correlation in example scores measured on a fixed set of data points, and (v) increasing the disentanglement with β in a β‐VAE does not imply that latent units will focus on more different features.

Piercing the chessboard

SIAM Journal on Discrete Mathematics

13 July 2023

Gergely Ambrus

Imre Bárány

Péter Frankl

Dániel Varga

Discrete geometry

Computer aided profs

We consider the minimum number of lines $h_n$ and $p_n$ needed to intersect or pierce, respectively, all the cells of the $n \times n$ chessboard. Determining these values can also be interpreted as a strengthening of the classical plank problem for integer points. Using the symmetric plank theorem of K. Ball, we prove that $h = \lceil \frac{n}{2} \rceil$ for each $n \leq 1$. Studying the piercing problem, we show that $0.

On higher order Fourier analysis in characteristic p

Ergodic theory and dynamical systems, 2023

27 January 2023

Pablo Candela

Diego González-Sánchez

Balázs Szegedy

Dynamical Systems

Ergodic theory

In this paper, the nilspace approach to higher-order Fourier analysis is developed in the setting of vector spaces over a prime field $\mathbb{F}_p$, with applications mainly in ergodic theory. A key requisite for this development is to identify a class of nilspaces adequate for this setting. We introduce such a class, whose members we call $p$-homogeneous nilspaces. One of our main results characterizes these objects in terms of a simple algebraic property.

Regularity and inverse theorems for uniformity norms on compact abelian groups and nilmanifolds

Journal fur die Reine und Angewandte Mathematik, Volume 2022, Issue 789, Pages 1 - 42, 1 August 2022

01 August 2022

Pablo Candela

Balázs Szegedy

Nilspaces

Cubic Couplings

We prove a general form of the regularity theorem for uniformity norms, and deduce an inverse theorem for these norms which holds for a class of compact nilspaces including all compact abelian groups, and also nilmanifolds; in particular we thus obtain the first non-abelian versions of such theorems. We derive these results from a general structure theorem for cubic couplings, thereby unifying these results with the Host–Kra Ergodic Structure Theorem. A unification of this kind had been propounded as a conceptual prospect by Host and Kra.

Negative Sampling in Variational Autoencoders

IEEE 2nd Conference on Information Technology and Data Science (CITDS), 2022

16-18 May 2022

Adrián Csiszárik

Beatrix Benkő

Dániel Varga

Representation learning

Autoencoders

Deep learning

Modern deep artificial neural networks have achieved great success in the domain of computer vision and beyond. However, their application to many real-world tasks is undermined by certain limitations, such as overconfident uncertainty estimates on out-of-distribution data or performance deterioration under data distribution shifts. Several types of deep learning models used for density estimation through probabilistic generative modeling have been shown to fail to detect out-of-distribution samples by assigning higher likelihoods to anomalous data.

Optimal transport with f-divergence regularization and generalized Sinkhorn algorithm

AISTATS 2022

28-30 March 2022

Dávid Terjék

Diego González-Sánchez

Optimal transport

Deep learning

Entropic regularization provides a generalization of the original optimal transport problem. It introduces a penalty term defined by the Kullback-Leibler divergence, making the problem more tractable via the celebrated Sinkhorn algorithm. Replacing the Kullback-Leibler divergence with a general $f$-divergence leads to a natural generalization. The case of divergences defined by superlinear functions was recently studied by Di Marino and Gerolin. Using convex analysis, we extend the theory developed so far to include all $f$-divergences defined by functions of Legendre type, and prove that under some mild conditions, strong duality holds, optimums in both the primal and dual problems are attained, the generalization of the $c$-transform is well-defined, and we give sufficient conditions for the generalized Sinkhorn algorithm to converge to an optimal solution.

On F ω 2-affine-exchangeable probability measures

Studia Mathematica 279 (2024), 1-69

2024

Pablo Candela

Diego González-Sánchez

Balázs Szegedy

Affine-Exchangeability

Random Infinite-Dimensional Cube

For any standard Borel space , let denote the space of Borel probability measures on . In relation to a difficult problem of Aldous in exchangeability theory, and in connection with arithmetic combinatorics, Austin raised the question of describing the structure of affine-exchangeable probability measures on product spaces indexed by the vector space F, i.e., the measures in F that are invariant under the coordinate permutations on F induced by all affine automorphisms of F.

Similarity and Matching of Neural Network Representations

NeurIPS, 2021

6-14 December 2021

Adrián Csiszárik

Péter Kőrösi-Szabó

Ákos K. Matszangosz

Gergely Papp

Dániel Varga

Representation learning

Similarity of representations

Deep learning

We employ a toolset — dubbed Dr. Frankenstein — to analyse the similarity of representations in deep neural networks. With this toolset we aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer. We demonstrate that the inner representations emerging in deep convolutional neural networks with the same architecture but different initialisations can be matched with a surprisingly high degree of accuracy even with a single, affine stitching layer.

The Role of Entropy in Guiding a Connection Prover

TABLEAUX 2021

6-9 September 2021

Zsolt Zombori (Rényi AI, Eötvös Loránd University)

Jozef Urban (Czech Technical University in Prague)

Miroslav Olšák (University of Innsbruck)

Formal reasoning

In this work we study how to learn good algorithms for selecting reasoning steps in theorem proving. We explore this in the connection tableau calculus implemented by leanCoP where the partial tableau provides a clean and compact notion of a state to which a limited number of inferences can be applied. We start by incorporating a state-of-the-art learning algorithm — a graph neural network (GNN) – into the plCoP theorem prover.

Towards Finding Longer Proofs

TABLEAUX 2021

6-9 September 2021

Zsolt Zombori (Rényi AI, Eötvös Loránd University)

Adrián Csiszárik (Rényi AI, Eötvös Loránd University)

Henryk Michalewski (University of Warsaw, Google Inc.)

Cezary Kaliszyk (University of Warsaw, University of Innsbruck)

Josef Urban (Czech Technical University in Prague)

Formal reasoning

In this work we study how to learn good algorithms for selecting reasoning steps in theorem proving. We explore this in the connection tableau calculus implemented by leanCoP where the partial tableau provides a clean and compact notion of a state to which a limited number of inferences can be applied. We start by incorporating a state-of-the-art learning algorithm — a graph neural network (GNN) – into the plCoP theorem prover.

On measure-preserving Fp-omega-systems of order k

Journal d'Analyse Mathématique, 2024

5-11 September 2021

Pablo Candela

Diego González-Sánchez

Balázs Szegedy

Dynamical Systems

Combinatorics

Building on previous work in the nilspace-theoretic approach to the study of Host–Kra factors of measure-preserving systems, we prove that every ergodic $\mathbb{F}_p^\omega$-system of order $k$ is a factor of an Abramov $\mathbb{F}_p^\omega$-system of order $k$. This answers a question of Jamneshan, Shalom and Tao.

Ordering Subgoals in a Backward Chaining Prover

Proceedings of The 6th Conference on Artificial Intelligence and Theorem Proving (AITP), 2021

5-11 September 2021

Gergő Csaba Kertész (Budapest University of Technology and Economics, Budapest)

Gergely Papp

Péter Szeredi (Budapest University of Technology and Economics, Budapest)

Dániel Varga

Zsolt Zombori (Rényi AI, Eötvös Loránd University)

Formal reasoning

Many automated theorem provers are based on backward chaining: reasoning starts from a goal statement which we aim to prove and each inference step reduces one of the goals to a (possibly empty) set of new subgoals. We thus maintain a set of open goals that need to be proven and the proof is complete once the open goal set becomes empty. For each goal, there can be several valid inferences, resulting in different successor goal sets and selecting the right inference constitutes the core problem of such theorem provers which has been thoroughly studied in the past half century.

Ordering Subgoals in a Backward Chaining Prover

Proceedings of The 6th Conference on Artificial Intelligence and Theorem Proving (AITP), 2021

5-11 September 2021

Kristóf Szabó (Eötvös Loránd University)

Zsolt Zombori

Formal reasoning

World models represent the basic mechanisms of a system and can provide predictions about how transformations (actions) affect the state of the system. Such models have recently gained attention in Reinforcement Learning (RL) and in several domains model based learning systems performed similarly or better than highly tuned model free variants [1, 8, 12]. World models can increase sample efficiency since trajectories can be generated without interacting with the environment, and they can aid exploration by yielding a semantically meaningful latent structure that allows for identifying promising directions.

A Refinement of Cauchy-Schwarz Complexity, with Applications

European journal of combinatorics, 2022

24 August 2021

Pablo Candela

Diego González-Sánchez

Balázs Szegedy

Cauchy-Schwarz complexity

generalized von Neumann theorem

We introduce a notion of complexity for systems of linear forms called \emph{sequential Cauchy-Schwarz complexity}, which is parametrized by two positive integers $k,\ell$ and refines the notion of Cauchy-Schwarz complexity introduced by Green and Tao. We prove that if a system of linear forms has sequential Cauchy-Schwarz complexity at most $(k,\ell)$ then any average of 1-bounded functions over this system is controlled by the $2^{1-\ell}$-th power of the Gowers $U^{k+1}$-norms of the functions.

Towards solving the 7-in-a-row game

IEEE Conference on Games (CoG), 2021

17-20 August 2021

Czifra Domonkos

Csóka Endre

Zombori Zsolt

Makay Géza (University of Szeged)

Computer aided profs

Our paper explores the game theoretic value of the 7-in-a-row game. We reduce the problem to solving a finite board game, which we target using Proof Number Search. We present a number of heuristic improvements to Proof Number Search and examine their effect within the context of this particular game. Although our paper does not solve the 7-in-a-row game, our experiments indicate that we have made significant progress towards it.

Moreau-Yosida f-divergences

International Conference on Machine Learning, 2021

18-24 Jul 2021

Terjék Dávid

f-divergences

Variational representations of $f$-divergences are central to many machine learning algorithms, with Lipschitz constrained variants recently gaining attention. Inspired by this, we define the Moreau-Yosida approximation of $f$-divergences with respect to the Wasserstein-$1$ metric. The corresponding variational formulas provide a generalization of a number of recent results, novel special cases of interest and a relaxation of the hard Lipschitz constraint. Additionally, we prove that the so-called tight variational representation of $f$-divergences can be to be taken over the quotient space of Lipschitz functions, and give a characterization of functions achieving the supremum in the variational representation.

Action convergence of operators and graphs

Canadian Journal of Mathematics, Volume 74, Issue 1, February 2022 , pp. 72 - 121

17 September 2020

Ágnes Backhausz

Balázs Szegedy

Graph Limits

Operator

Random Matrix

We present a new approach to graph limit theory that unifies and generalizes the two most well-developed directions, namely dense graph limits (even the more general Lp limits) and Benjamini–Schramm limits (even in the stronger local-global setting). We illustrate by examples that this new framework provides a rich limit theory with natural limit objects for graphs of intermediate density. Moreover, it provides a limit theory for bounded operators (called P-operators) of the form L∞(Ω)→L1(Ω) for probability spaces Ω .

Az Erdős–Moser sejtés bizonyítása

Érintő

September 2023

Ambrus Gergely

Varga Dániel

Discrete Geometry

Mathematical Proofs

A sík legfeljebb mekkora hányada színezhető ki úgy, hogy két kiszínezett pont nem lehet pontosan egység távolságra egymástól? Ezt a geometriai kérdést Leo Moser fogalmazta meg az 1960-as évek elején, Hadwiger és Nelson egy rokon problémájával kapcsolatban. Leo Moser és Erdős Pál sejtése szerint ez a hányad nem érheti el az $\frac{1}{4}$-et; a jelenleg ismert legerősebb, 0,2293 értékű alsó korlátot Hallard Croft 1967-es konstrukciója adja. A problémával kapcsolatban számos kutatócsoport publikált már részeredményeket, amelyek a kezdeti 0,2857-es felső sűrűség-becslést az elmúlt 60 évben fokozatosan 0.

1
2
3