Publications
Diverse beam search to find densest-known planar unit distance graphs
Experimental Mathematics
13 Jun 2025
Peter Engel (Cornell University)
Owen Hammond-Lee (Georgia Institute of Technology)
Yiheng Su (University of Wisconsin–Madison)
Dániel Varga
Pál Zsámboki
Beam Search
Unit Distance Graphs
Exploitation-Exploration
Planar Geometry
Combinatorial Optimization
This paper addresses the problem of determining the maximum number of edges of a unit distance graph (UDG) on $n$ vertices using computer search. An unsolved problem of Paul Erdős asks for the maximum number of edges $u(n)$ that a UDG on $n$ vertices can have; UDGs attaining $u(n)$ are called “maximally dense.” In this paper, we present a computer algorithm that generates dense UDGs for vertex counts up to at least 100.
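As a rough illustration of the beam-search idea described in the abstract (a minimal sketch, not the paper's implementation: the seed configuration, candidate-generation rule, beam width and numerical tolerance are all illustrative assumptions), candidate point sets can be grown one vertex at a time, scored by their number of unit-distance pairs, and pruned to the best few at each step:

```python
import itertools, math, random

EPS = 1e-9  # numerical tolerance for "distance exactly 1" (illustrative)

def unit_edges(points):
    """Count pairs of points at unit distance."""
    return sum(abs(math.dist(p, q) - 1.0) < EPS
               for p, q in itertools.combinations(points, 2))

def extensions(points):
    """Candidate new vertices: intersections of unit circles around existing
    pairs, which forces at least two new unit distances (an illustrative rule)."""
    cands = []
    for (ax, ay), (bx, by) in itertools.combinations(points, 2):
        d2 = (ax - bx) ** 2 + (ay - by) ** 2
        if d2 < EPS or d2 > 4:          # coincident points or circles too far apart
            continue
        mx, my = (ax + bx) / 2, (ay + by) / 2
        h = math.sqrt(max(1.0 - d2 / 4, 0.0))
        ux, uy = (by - ay) / math.sqrt(d2), (ax - bx) / math.sqrt(d2)
        cands += [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]
    return cands

def beam_search(n_vertices, beam_width=32):
    beam = [[(0.0, 0.0), (1.0, 0.0)]]   # seed: a single unit edge
    while len(beam[0]) < n_vertices:
        pool = []
        for pts in beam:
            for c in extensions(pts):
                if all(math.dist(c, p) > EPS for p in pts):
                    pool.append(pts + [c])
        random.shuffle(pool)            # random tie-breaking keeps some diversity
        pool.sort(key=unit_edges, reverse=True)
        beam = pool[:beam_width]        # keep only the densest partial graphs
    return beam[0], unit_edges(beam[0])

points, edges = beam_search(10)
print(len(points), "vertices,", edges, "unit edges")
```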
Piercing intersecting convex sets
Linear Algebra and its Applications
1 April 2025
Imre Bárány
Travis Dillon
Dömötör Pálvölgyi
Dániel Varga
Helly-type theorems
Line transversals
Linear programming
Assume two finite families $\mathcal{A}$ and $\mathcal{B}$ of convex sets in $\mathbb{R}^3$ have the property that $A \cap B \neq \emptyset$ for every $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Is there a constant $\gamma > 0$ (independent of $\mathcal{A}$ and $\mathcal{B}$) such that there is a line intersecting $\gamma|\mathcal{A}|$ sets in $\mathcal{A}$ or $\gamma|\mathcal{B}|$ sets in $\mathcal{B}$? This is an intriguing Helly-type question from a paper by Martínez-Sandoval, Roldán-Pensado and Rubin. We confirm this in the special case when all sets in $\mathcal{A}$ lie in parallel planes and all sets in $\mathcal{B}$ lie in parallel planes; in fact, one of the two families has a transversal by a single line.
A characterization of complex Hadamard matrices appearing in families of MUB triplets
Designs, Codes and Cryptography
3 October 2024
Ákos K. Matszangosz
Ferenc Szöllősi (Shimane University)
Complex Hadamard matrix
Mutually unbiased bases
Gröbner bases
It is shown that a normalized complex Hadamard matrix of order 6 that has three distinct columns each containing at least one -1 entry necessarily belongs to the transposed Fourier family or to the family of 2-circulant complex Hadamard matrices. The proofs rely on solving polynomial systems of equations by Gröbner basis techniques, and make use of a structure theorem concerning regular Hadamard matrices. As a consequence, members of these two families can be easily recognized in practice.
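As a toy illustration of the Gröbner-basis step mentioned in the abstract (the two-variable system below is invented for demonstration and does not encode the actual Hadamard-matrix constraints), SymPy can bring a polynomial system into triangular form and read off its solution set:

```python
# Toy example of solving a polynomial system via a Groebner basis with SymPy.
# The real systems in the paper encode orthogonality and unimodularity
# constraints of order-6 complex Hadamard matrices; this one is made up.
from sympy import symbols, groebner, solve

x, y = symbols('x y')
system = [x**2 + y**2 - 1,   # a unimodularity-style constraint
          x*y - y]           # an extra algebraic relation

G = groebner(system, x, y, order='lex')  # lexicographic order eliminates variables
print(G)                                 # triangular-form generators
print(solve(G.exprs, [x, y]))            # the finite solution set: (1, 0), (-1, 0)
```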
Towards Unbiased Exploration in Partial Label Learning
JMLR
24 December 2024
Zsolt Zombori
Agapi Rissaki (Northeastern University)
Kristóf Szabó
Wolfgang Gatterbauer (Northeastern University)
Michael Benedikt (University of Oxford)
Machine learning
Partial label learning
We consider learning a probabilistic classifier from partially-labelled supervision (inputs denoted with multiple possibilities) using standard neural architectures with a softmax as the final layer. We identify a bias phenomenon that can arise from the softmax layer in even simple architectures that prevents proper exploration of alternative options, making the dynamics of gradient descent overly sensitive to initialization. We introduce a novel loss function that allows for unbiased exploration within the space of alternative outputs.
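For context, here is a minimal sketch of the standard partial-label baseline that the paper analyses, the marginal log-likelihood of the candidate set computed through a softmax (tensor names and shapes are illustrative; the unbiased loss proposed in the paper is not reproduced here):

```python
import torch
import torch.nn.functional as F

def partial_label_nll(logits, candidate_mask):
    """Marginal negative log-likelihood: -log sum_{y in S} softmax(logits)_y.

    logits:         (batch, num_classes) raw scores from the network
    candidate_mask: (batch, num_classes), 1.0 where the class is a possible label
    This softmax-based baseline exhibits the exploration bias studied in the paper.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    masked = log_probs.masked_fill(candidate_mask == 0, float('-inf'))
    return -torch.logsumexp(masked, dim=-1).mean()   # log of the candidate-set mass

# Toy usage: 3 classes, the true label is only known to lie in {0, 2}.
logits = torch.randn(4, 3, requires_grad=True)
mask = torch.tensor([[1.0, 0.0, 1.0]] * 4)
loss = partial_label_nll(logits, mask)
loss.backward()
print(loss.item())
```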
Global Sinkhorn Autoencoder — Optimal Transport on the latent representation of the full dataset
Annales Univ. Sci. Budapest., Sect. Comp. 57 (2024) 101–115
26 June 2024
Adrián Csiszárik
Melinda F. Kiss
Balázs Maga
Ákos Matszangosz
Dániel Varga
Optimal Transport
Wasserstein distance
Sinkhorn Autoencoder
Generative models
We propose an Optimal Transport (OT)-based generative model from the Wasserstein Autoencoder (WAE) family of models, with the following innovative property: the optimization of the latent point positions takes place over the full training dataset rather than over a minibatch. Our contributions are the following:
We define a new class of global Wasserstein Autoencoder models, and implement an Optimal Transport-based incarnation we call the Global Sinkhorn Autoencoder. We implement several metrics for evaluating such models, both in the unsupervised and in a semi-supervised setting: the global OT loss, which measures the OT loss on the full test dataset; the reconstruction error on the full test dataset; a so-called covered area metric, which measures how well the latent points are matched; and two types of clustering measures.
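A compact sketch of the Sinkhorn iteration at the core of such models, matching a batch of latent codes to prior samples by entropically regularized OT (cost function, regularization strength and iteration count are illustrative, and this toy version runs on a single array rather than the full-dataset machinery described above):

```python
import numpy as np

def sinkhorn(latent, prior, eps=0.1, iters=200):
    """Entropic OT between two point clouds via Sinkhorn's alternating scaling.

    Returns the transport plan and the associated transport cost, using a
    squared Euclidean cost and uniform marginals on both clouds.
    """
    C = ((latent[:, None, :] - prior[None, :, :]) ** 2).sum(-1)  # cost matrix
    K = np.exp(-C / eps)                                         # Gibbs kernel
    a = np.full(len(latent), 1.0 / len(latent))                  # source marginal
    b = np.full(len(prior), 1.0 / len(prior))                    # target marginal
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]
    return plan, (plan * C).sum()

rng = np.random.default_rng(0)
z = rng.normal(size=(64, 2))      # stand-in for encoder outputs (latent points)
p = rng.normal(size=(64, 2))      # samples from the latent prior
plan, cost = sinkhorn(z, p)
print(f"transport cost: {cost:.3f}")
```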
Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models
Neural Networks, Elsevier
23 February 2024
Adrián Csiszárik
Melinda F. Kiss
Péter Kőrösi-Szabó
Márton Muntag
Gergely Papp
Dániel Varga
Representation learning
Deep learning
We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability.
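A minimal sketch of the object being studied: an element-wise convex combination of two flattened, permutation-aligned parameter vectors, with the coefficient vector drawn from the hypercube $[0,1]^d$ (uniform sampling is only one of the distributions examined in the paper, and the random vectors below merely stand in for real network weights):

```python
import numpy as np

def combine(theta_a, theta_b, lam):
    """Element-wise convex combination (1 - lam) * theta_a + lam * theta_b,
    where lam is a per-coordinate coefficient vector in [0, 1]^d."""
    return (1.0 - lam) * theta_a + lam * theta_b

rng = np.random.default_rng(0)
d = 1_000_000                              # stand-in for a flattened network
theta_a = rng.normal(size=d)               # permutation-aligned model A
theta_b = rng.normal(size=d)               # permutation-aligned model B

theta_mid = combine(theta_a, theta_b, np.full(d, 0.5))        # midpoint of the segment
theta_cube = combine(theta_a, theta_b, rng.uniform(size=d))   # random hypercube point
# In the experiments, vectors like theta_cube are loaded back into the network
# and evaluated to map out the low-loss regions of the hypercube.
print(theta_mid[:3], theta_cube[:3])
```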
Training BERT Models to Carry Over a Coding System Developed on One Corpus to Another
LREC-COLING
20-25 May 2024
Dalma Galambos (Pázmány Péter Catholic University)
Pál Zsámboki
Language modeling
Domain adaptation
This paper describes how we train BERT models to carry over a coding system developed on the paragraphs of one Hungarian literary journal to another. The aim of the coding system is to track trends in the perception of literary translation around the political transformation of 1989 in Hungary. To evaluate not only task performance but also the consistency of the annotation, and to obtain better predictions from an ensemble, we use 10-fold cross-validation.
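A schematic of the cross-validation and ensembling logic (a sketch only: a generic scikit-learn classifier stands in for the fine-tuned BERT models, and synthetic data stands in for the annotated paragraphs):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-ins for paragraph representations and their codes.
X, y = make_classification(n_samples=500, n_features=32, n_classes=3,
                           n_informative=8, random_state=0)
X_new = np.random.default_rng(0).normal(size=(10, 32))   # paragraphs from the other corpus

fold_scores, fold_probas = [], []
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, val_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    fold_scores.append(clf.score(X[val_idx], y[val_idx]))   # per-fold task performance
    fold_probas.append(clf.predict_proba(X_new))            # predictions on the new corpus

# Agreement across folds gauges annotation consistency; averaging gives the ensemble.
ensemble_labels = np.mean(fold_probas, axis=0).argmax(axis=1)
print("mean fold accuracy:", round(float(np.mean(fold_scores)), 3))
print("ensemble labels for new paragraphs:", ensemble_labels)
```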
The density of planar sets avoiding unit distances
Mathematical Programming, Springer
06 October 2023
Gergely Ambrus
Adrián Csiszárik
Máté Matolcsi
Dániel Varga
Pál Zsámboki
Discrete geometry
Computer aided proofs
By improving upon previous estimates on a problem posed by L. Moser, we prove a conjecture of Erdős that the density of any measurable planar set avoiding unit distances is less than 1/4. Our argument implies the upper bound of 0.2470.
Lemmas: Generation, Selection, Application
TABLEAUX 2023
14 September 2023
Michael Rawson (TU Wien)
Christoph Wernhard (University of Potsdam)
Zsolt Zombori
Wolfgang Bibel (Technical University Darmstadt)
Automated reasoning
Noting that lemmas are a key feature of mathematics, we engage in an investigation of the role of lemmas in automated theorem proving. The paper describes experiments with a combined system involving learning technology that generates useful lemmas for automated theorem provers, demonstrating improvement for several representative systems and solving a hard problem not solved by any system for twenty years. By focusing on condensed detachment problems we simplify the setting considerably, allowing us to get at the essence of lemmas and their role in proof search.
Reproducibility Study of “Label-Free Explainability for Unsupervised Models”
The ML Reproducibility Challenge 2022
31 July 2023
Gergely Papp
Julius Wagenbach
Laurens Jans de Vries
Niklas Mather
Reproducibility Study
Explainable AI
Unsupervised Learning
In this work, we present our reproducibility study of Label-Free Explainability for Unsupervised Models, a paper that introduces two post‐hoc explanation techniques for neural networks: (1) label‐free feature importance and (2) label‐free example importance. Our study focuses on the reproducibility of the authors’ most important claims: (i) perturbing features with the highest importance scores causes higher latent shift than perturbing random pixels, (ii) label‐free example importance scores help to identify training examples that are highly related to a given test example, (iii) unsupervised models trained on different tasks show moderate correlation among the highest scored features and (iv) low correlation in example scores measured on a fixed set of data points, and (v) increasing the disentanglement with β in a β‐VAE does not imply that latent units will focus on more different features.
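As a sketch of the label-free feature importance idea examined in claim (i) (gradient-times-input attribution of the inner product with a fixed latent code; the encoder, attribution choice and shapes are illustrative assumptions, not the authors' exact implementation):

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 8))  # toy encoder

def label_free_importance(f, x0):
    """Attribute the scalar g(x) = <f(x0), f(x)> at x = x0 to the input features."""
    x = x0.clone().requires_grad_(True)
    with torch.no_grad():
        anchor = f(x0)                  # fixed latent representation f(x0)
    g = (f(x) * anchor).sum()           # inner product with the anchor
    g.backward()
    return (x.grad * x.detach()).abs()  # gradient x input importance scores

x0 = torch.randn(1, 20)
scores = label_free_importance(encoder, x0)
print("most important input features:", scores.squeeze().topk(5).indices.tolist())
```

Perturbing the top-scoring features and measuring the resulting latent shift, as in claim (i), would then be compared against perturbing randomly chosen features.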
Piercing the chessboard
SIAM Journal on Discrete Mathematics
13 July 2023
Gergely Ambrus
Imre Bárány
Péter Frankl
Dániel Varga
Discrete geometry
Computer aided proofs
We consider the minimum number of lines $h_n$ and $p_n$ needed to intersect or pierce, respectively, all the cells of the $n \times n$ chessboard. Determining these values can also be interpreted as a strengthening of the classical plank problem for integer points. Using the symmetric plank theorem of K. Ball, we prove that $h_n = \lceil \frac{n}{2} \rceil$ for each $n \geq 1$. Studying the piercing problem, we establish a lower bound on $p_n$ that is linear in $n$.
On higher order Fourier analysis in characteristic p
Ergodic theory and dynamical systems, 2023
27 January 2023
Pablo Candela
Diego González-Sánchez
Balázs Szegedy
Dynamical Systems
Ergodic theory
In this paper, the nilspace approach to higher-order Fourier analysis is developed in the setting of vector spaces over a prime field $\mathbb{F}_p$, with applications mainly in ergodic theory. A key requisite for this development is to identify a class of nilspaces adequate for this setting. We introduce such a class, whose members we call $p$-homogeneous nilspaces. One of our main results characterizes these objects in terms of a simple algebraic property.
Regularity and inverse theorems for uniformity norms on compact abelian groups and nilmanifolds
Journal fur die Reine und Angewandte Mathematik, Volume 2022, Issue 789, Pages 1 - 42, 1 August 2022
01 August 2022
Pablo Candela
Balázs Szegedy
Nilspaces
Cubic Couplings
We prove a general form of the regularity theorem for uniformity norms, and deduce an inverse theorem for these norms which holds for a class of compact nilspaces including all compact abelian groups, and also nilmanifolds; in particular we thus obtain the first non-abelian versions of such theorems. We derive these results from a general structure theorem for cubic couplings, thereby unifying these results with the Host–Kra Ergodic Structure Theorem. A unification of this kind had been propounded as a conceptual prospect by Host and Kra.
Negative Sampling in Variational Autoencoders
IEEE 2nd Conference on Information Technology and Data Science (CITDS), 2022
16-18 May 2022
Adrián Csiszárik
Beatrix Benkő
Dániel Varga
Representation learning
Autoencoders
Deep learning
Modern deep artificial neural networks have achieved great success in the domain of computer vision and beyond. However, their application to many real-world tasks is undermined by certain limitations, such as overconfident uncertainty estimates on out-of-distribution data or performance deterioration under data distribution shifts. Several types of deep learning models used for density estimation through probabilistic generative modeling have been shown to fail to detect out-of-distribution samples by assigning higher likelihoods to anomalous data.
Optimal transport with f-divergence regularization and generalized Sinkhorn algorithm
AISTATS 2022
28-30 March 2022
Dávid Terjék
Diego González-Sánchez
Optimal transport
Deep learning
Entropic regularization provides a generalization of the original optimal transport problem. It introduces a penalty term defined by the Kullback-Leibler divergence, making the problem more tractable via the celebrated Sinkhorn algorithm. Replacing the Kullback-Leibler divergence with a general $f$-divergence leads to a natural generalization. The case of divergences defined by superlinear functions was recently studied by Di Marino and Gerolin. Using convex analysis, we extend the theory developed so far to all $f$-divergences defined by functions of Legendre type. We prove that, under mild conditions, strong duality holds, optima in both the primal and dual problems are attained, and the generalization of the $c$-transform is well defined; we also give sufficient conditions for the generalized Sinkhorn algorithm to converge to an optimal solution.
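For orientation, the regularized problem has the following schematic form, where $\varepsilon > 0$ is the regularization strength and the choice $f(t) = t \log t$ recovers the classical entropic case solved by the Sinkhorn algorithm (a sketch consistent with the abstract; see the paper for the precise setting and conventions):

```latex
\[
  \min_{\pi \in \Pi(\mu,\nu)} \int c \, d\pi \;+\; \varepsilon\, D_f(\pi \,\|\, \mu \otimes \nu),
  \qquad
  D_f(\pi \,\|\, \rho) = \int f\!\left(\frac{d\pi}{d\rho}\right) d\rho .
\]
```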
On $\mathbb{F}_2^\omega$-affine-exchangeable probability measures
Studia Mathematica 279 (2024), 1-69
2024
Pablo Candela
Diego González-Sánchez
Balázs Szegedy
Affine-Exchangeability
Random Infinite-Dimensional Cube
For any standard Borel space $X$, let $\mathcal{P}(X)$ denote the space of Borel probability measures on $X$. In relation to a difficult problem of Aldous in exchangeability theory, and in connection with arithmetic combinatorics, Austin raised the question of describing the structure of affine-exchangeable probability measures on product spaces indexed by the vector space $\mathbb{F}_2^\omega$, i.e., the measures in $\mathcal{P}(X^{\mathbb{F}_2^\omega})$ that are invariant under the coordinate permutations of $X^{\mathbb{F}_2^\omega}$ induced by all affine automorphisms of $\mathbb{F}_2^\omega$.
Similarity and Matching of Neural Network Representations
NeurIPS, 2021
6-14 December 2021
Adrián Csiszárik
Péter Kőrösi-Szabó
Ákos K. Matszangosz
Gergely Papp
Dániel Varga
Representation learning
Similarity of representations
Deep learning
We employ a toolset — dubbed Dr. Frankenstein — to analyse the similarity of representations in deep neural networks. With this toolset we aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer. We demonstrate that the inner representations emerging in deep convolutional neural networks with the same architecture but different initialisations can be matched with a surprisingly high degree of accuracy even with a single, affine stitching layer.
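A minimal sketch of a stitching layer in PyTorch (the tiny CNN front and tail, the layer choice and the 1x1 convolution as the affine map are illustrative assumptions): the front of network A and the tail of network B are frozen, and only the stitching layer is trained on the downstream task.

```python
import torch
import torch.nn as nn

class Stitched(nn.Module):
    """Front of model A -> trainable affine stitching layer -> tail of model B."""
    def __init__(self, front_a, tail_b, channels):
        super().__init__()
        self.front_a, self.tail_b = front_a, tail_b
        for p in list(front_a.parameters()) + list(tail_b.parameters()):
            p.requires_grad_(False)                     # only the stitcher is trained
        self.stitch = nn.Conv2d(channels, channels, kernel_size=1)  # per-pixel affine map

    def forward(self, x):
        return self.tail_b(self.stitch(self.front_a(x)))

# Toy stand-ins for two independently trained convolutional networks.
front_a = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
tail_b = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))

model = Stitched(front_a, tail_b, channels=16)
opt = torch.optim.Adam(model.stitch.parameters(), lr=1e-3)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
print(loss.item())
```

High task accuracy of the stitched model is then read as evidence that the two representations match up to an affine transformation.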
The Role of Entropy in Guiding a Connection Prover
TABLEAUX 2021
6-9 September 2021
Zsolt Zombori (Rényi AI, Eötvös Loránd University)
Jozef Urban (Czech Technical University in Prague)
Miroslav Olšák (University of Innsbruck)
Formal reasoning
In this work we study how to learn good algorithms for selecting reasoning steps in theorem proving. We explore this in the connection tableau calculus implemented by leanCoP, where the partial tableau provides a clean and compact notion of a state to which a limited number of inferences can be applied. We start by incorporating a state-of-the-art learning algorithm, a graph neural network (GNN), into the plCoP theorem prover.
Towards Finding Longer Proofs
TABLEAUX 2021
6-9 September 2021
Zsolt Zombori (Rényi AI, Eötvös Loránd University)
Adrián Csiszárik (Rényi AI, Eötvös Loránd University)
Henryk Michalewski (University of Warsaw, Google Inc.)
Cezary Kaliszyk (University of Warsaw, University of Innsbruck)
Josef Urban (Czech Technical University in Prague)
Formal reasoning
In this work we study how to learn good algorithms for selecting reasoning steps in theorem proving. We explore this in the connection tableau calculus implemented by leanCoP, where the partial tableau provides a clean and compact notion of a state to which a limited number of inferences can be applied. We start by incorporating a state-of-the-art learning algorithm, a graph neural network (GNN), into the plCoP theorem prover.
On measure-preserving $\mathbb{F}_p^\omega$-systems of order k
Journal d'Analyse Mathématique, 2024
5-11 September 2021
Pablo Candela
Diego González-Sánchez
Balázs Szegedy
Dynamical Systems
Combinatorics
Building on previous work in the nilspace-theoretic approach to the study of Host–Kra factors of measure-preserving systems, we prove that every ergodic $\mathbb{F}_p^\omega$-system of order $k$ is a factor of an Abramov $\mathbb{F}_p^\omega$-system of order $k$. This answers a question of Jamneshan, Shalom and Tao.
Ordering Subgoals in a Backward Chaining Prover
Proceedings of The 6th Conference on Artificial Intelligence and Theorem Proving (AITP), 2021
5-11 September 2021
Gergő Csaba Kertész (Budapest University of Technology and Economics, Budapest)
Gergely Papp
Péter Szeredi (Budapest University of Technology and Economics, Budapest)
Dániel Varga
Zsolt Zombori (Rényi AI, Eötvös Loránd University)
Formal reasoning
Many automated theorem provers are based on backward chaining: reasoning starts from a goal statement that we aim to prove, and each inference step reduces one of the goals to a (possibly empty) set of new subgoals. We thus maintain a set of open goals that need to be proven, and the proof is complete once the open goal set becomes empty. For each goal there can be several valid inferences, resulting in different successor goal sets; selecting the right inference constitutes the core problem of such theorem provers and has been thoroughly studied over the past half century.
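A toy sketch of such a backward-chaining loop, with a pluggable rule for deciding which open goal to reduce next (the propositional Horn-clause knowledge base and the alphabetical ordering are illustrative; real provers also handle unification and variable bindings):

```python
# Toy propositional backward chaining over Horn rules of the form head <- body.
RULES = {
    "mortal":      [["human"]],
    "human":       [["philosopher"], ["mathematician"]],
    "philosopher": [[]],                 # a fact: empty body
}

def prove(goals, depth=0, max_depth=8, order=sorted):
    """Return True if all open goals can be reduced to facts.

    `order` decides which open goal to tackle first; this subgoal-ordering
    choice (here: plain alphabetical order) is what the paper studies.
    """
    if not goals:
        return True                      # open goal set empty: proof complete
    if depth > max_depth:
        return False                     # give up on this branch
    goal, *rest = order(goals)
    for body in RULES.get(goal, []):     # each applicable rule is a candidate inference
        if prove(body + rest, depth + 1, max_depth, order):
            return True
    return False

print(prove(["mortal"]))   # True: mortal <- human <- philosopher (a fact)
```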
Ordering Subgoals in a Backward Chaining Prover
Proceedings of The 6th Conference on Artificial Intelligence and Theorem Proving (AITP), 2021
5-11 September 2021
Kristóf Szabó (Eötvös Loránd University)
Zsolt Zombori
Formal reasoning
World models represent the basic mechanisms of a system and can provide predictions about how transformations (actions) affect the state of the system. Such models have recently gained attention in Reinforcement Learning (RL), and in several domains model-based learning systems have performed comparably to or better than highly tuned model-free variants [1, 8, 12]. World models can increase sample efficiency, since trajectories can be generated without interacting with the environment, and they can aid exploration by yielding a semantically meaningful latent structure that allows for identifying promising directions.
A Refinement of Cauchy-Schwarz Complexity, with Applications
European journal of combinatorics, 2022
24 August 2021
Pablo Candela
Diego González-Sánchez
Balázs Szegedy
Cauchy-Schwarz complexity
generalized von Neumann theorem
We introduce a notion of complexity for systems of linear forms called \emph{sequential Cauchy-Schwarz complexity}, which is parametrized by two positive integers $k,\ell$ and refines the notion of Cauchy-Schwarz complexity introduced by Green and Tao. We prove that if a system of linear forms has sequential Cauchy-Schwarz complexity at most $(k,\ell)$ then any average of 1-bounded functions over this system is controlled by the $2^{1-\ell}$-th power of the Gowers $U^{k+1}$-norms of the functions.
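Schematically, the control statement has the shape of a generalized von Neumann inequality (notation heavily simplified, with the average taken over the relevant abelian group variables; the precise hypotheses and constants are in the paper): for 1-bounded functions $f_1,\dots,f_m$ and a system $L_1,\dots,L_m$ of sequential Cauchy-Schwarz complexity at most $(k,\ell)$,

```latex
\[
  \Bigl|\, \mathbb{E}_{x} \, \prod_{i=1}^{m} f_i\bigl(L_i(x)\bigr) \Bigr|
  \;\le\; \min_{1 \le i \le m} \, \| f_i \|_{U^{k+1}}^{\, 2^{1-\ell}} .
\]
```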
Towards solving the 7-in-a-row game
IEEE Conference on Games (CoG), 2021
17-20 August 2021
Domonkos Czifra
Endre Csóka
Zsolt Zombori
Géza Makay (University of Szeged)
Computer aided proofs
Our paper explores the game theoretic value of the 7-in-a-row game. We reduce the problem to solving a finite board game, which we target using Proof Number Search. We present a number of heuristic improvements to Proof Number Search and examine their effect within the context of this particular game. Although our paper does not solve the 7-in-a-row game, our experiments indicate that we have made significant progress towards it.
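A compact sketch of Proof Number Search on an abstract AND/OR game tree (the node interface, expansion policy and the toy Nim example are illustrative; the paper's heuristics concern how proof and disproof numbers are initialized and propagated for 7-in-a-row positions):

```python
import math

class Node:
    """Node of an AND/OR search tree for Proof Number Search (PNS)."""
    def __init__(self, state, is_or):
        self.state, self.is_or = state, is_or
        self.children = None             # None means not yet expanded
        self.pn, self.dn = 1, 1          # proof number / disproof number

def update(node, terminal):
    """Set pn/dn from a terminal evaluation or from the children's numbers."""
    value = terminal(node.state, node.is_or)
    if value is not None:
        node.pn, node.dn = (0, math.inf) if value else (math.inf, 0)
    elif node.children:
        pns, dns = [c.pn for c in node.children], [c.dn for c in node.children]
        node.pn = min(pns) if node.is_or else sum(pns)
        node.dn = sum(dns) if node.is_or else min(dns)

def pns(root_state, moves, terminal, budget=100_000):
    """Best-first PNS: returns True (proved), False (disproved) or None."""
    root = Node(root_state, is_or=True)
    update(root, terminal)
    for _ in range(budget):
        if root.pn == 0: return True
        if root.dn == 0: return False
        node, path = root, [root]
        while node.children:             # descend to the most-proving leaf
            key = (lambda c: c.pn) if node.is_or else (lambda c: c.dn)
            node = min(node.children, key=key)
            path.append(node)
        node.children = [Node(s, not node.is_or) for s in moves(node.state)]
        for child in node.children:
            update(child, terminal)
        for n in reversed(path):         # back up the new numbers towards the root
            update(n, terminal)
    return None                          # budget exhausted, value unknown

# Toy game: Nim where a move removes 1 or 2 tokens and taking the last token wins.
moves = lambda n: [n - k for k in (1, 2) if n >= k]
def terminal(n, is_or):
    return (not is_or) if n == 0 else None   # no tokens left: the player to move lost

print(pns(4, moves, terminal), pns(3, moves, terminal))   # True False
```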
Moreau-Yosida f-divergences
International Conference on Machine Learning, 2021
18-24 Jul 2021
Dávid Terjék
f-divergences
Variational representations of $f$-divergences are central to many machine learning algorithms, with Lipschitz constrained variants recently gaining attention. Inspired by this, we define the Moreau-Yosida approximation of $f$-divergences with respect to the Wasserstein-$1$ metric. The corresponding variational formulas provide a generalization of a number of recent results, novel special cases of interest and a relaxation of the hard Lipschitz constraint. Additionally, we prove that the so-called tight variational representation of $f$-divergences can be taken over the quotient space of Lipschitz functions, and give a characterization of functions achieving the supremum in the variational representation.
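For reference, the classical Moreau-Yosida (infimal convolution) approximation of a functional $F$ with respect to a metric $d$ and parameter $\lambda > 0$ is defined below; the paper adapts this construction to $f$-divergences with $d$ the Wasserstein-$1$ metric, and the exact penalty and scaling conventions used there may differ from this schematic form:

```latex
\[
  F_{\lambda}(x) \;=\; \inf_{y}\,\Bigl\{\, F(y) \;+\; \tfrac{1}{2\lambda}\, d(x,y)^{2} \,\Bigr\}.
\]
```

Replacing the quadratic penalty by a linear one, $\lambda\, d(x,y)$, yields envelopes that are $\lambda$-Lipschitz, which is the standard route from such approximations to relaxed Lipschitz constraints.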
Action convergence of operators and graphs
Canadian Journal of Mathematics, Volume 74, Issue 1, February 2022 , pp. 72 - 121
17 September 2020
Ágnes Backhausz
Balázs Szegedy
Graph Limits
Operator
Random Matrix
We present a new approach to graph limit theory that unifies and generalizes the two most well-developed directions, namely dense graph limits (even the more general $L^p$ limits) and Benjamini–Schramm limits (even in the stronger local-global setting). We illustrate by examples that this new framework provides a rich limit theory with natural limit objects for graphs of intermediate density. Moreover, it provides a limit theory for bounded operators (called $P$-operators) of the form $L^\infty(\Omega) \to L^1(\Omega)$ for probability spaces $\Omega$.
Az Erdős–Moser sejtés bizonyítása
Érintő
September 2023
Gergely Ambrus
Dániel Varga
Discrete Geometry
Mathematical Proofs
At most what fraction of the plane can be coloured so that no two coloured points are exactly unit distance apart? This geometric question was posed by Leo Moser in the early 1960s, in connection with a related problem of Hadwiger and Nelson. According to the conjecture of Leo Moser and Pál Erdős, this fraction cannot reach $\frac{1}{4}$; the strongest currently known lower bound, 0.2293, comes from a 1967 construction of Hallard Croft. Numerous research groups have published partial results on the problem, gradually improving the initial upper density estimate of 0.2857 over the past 60 years.