I'm broadly interested in computer vision problems.
If I have to be more specific, I like working on learning visual representations from imagery data with different forms of supervision (including no supervision at all!) so that they are useful for a range of vision tasks.
Ret4Loc | Tailoring Retrieval Representations to Long-term Visual Localization
Yannis Kalantidis*, Mert Bulent Sariyildiz*, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus and Gabriela Csurka
Visual localization methods generally rely on a first image retrieval step whose role is crucial.
In this paper, we improve this retrieval step and tailor it to the final localization task.
We propose to synthesize variants of the training set images, obtained from
generative text-to-image models, in order to automatically expand the training set
towards a number of nameable variations that particularly hurt visual localization.
ImageNet-SD | Fake it till you make it: Learning(s) from a synthetic ImageNet clone
Mert Bulent Sariyildiz, Karteek Alahari, Diane Larlus and Yannis Kalantidis
Recent text-to-image generative models generate fairly realistic images.
Could such models render real images obsolete for training image prediction models?
We answer part of this provocative question by examining the need for real images when training models for ImageNet-1K classification.
We show that models trained on synthetic images exhibit strong generalization properties and perform on par with models trained on real data.
t-ReX | No reason for no supervision: Improving the generalization of supervised models
Mert Bulent Sariyildiz, Yannis Kalantidis, Karteek Alahari and Diane Larlus
We revisit supervised learning on ImageNet-1K and propose a training setup which
improves transfer learning performance of supervised models.
ImageNet-CoG | Concept Generalization in Visual Representation Learning
Mert Bulent Sariyildiz, Yannis Kalantidis, Diane Larlus and Karteek Alahari
We propose a benchmark tailored for measuring concept generalization
capabilities of models trained on ImageNet-1K.
MoCHi | Hard Negative Mixing for Contrastive Learning
Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel and Diane Larlus
For contrastive learning, sampling more or harder negatives often improves performance.
We propose two ways to synthesize more negatives using the MoCo framework.
ICMLM | Learning Visual Representations with Caption Annotations
Mert Bulent Sariyildiz, Julien Perez and Diane Larlus
Images often come with accompanying text describing the scene in images.
We propose a method to learn visual representations using (image, caption) pairs.
Key protected classification for collaborative learning
Mert Bulent Sariyildiz, Ramazan Gokberk Cinbis and Erman Ayday
Pattern Recognition, Vol. 104, August 2020
Vanilla collaborative learning frameworks are vulnerable to an active adversary
that runs a generative adversarial network attack.
We propose a classification model that is resilient against such attacks by
GMN | Gradient Matching Generative Networks for Zero-Shot Learning
Mert Bulent Sariyildiz and Ramazan Gokberk Cinbis
CVPR 2019, oral presentation
Zero-shot learning models may suffer from domain shift due to the difference
between data distributions of seen and unseen concepts.
We propose a generative model to synthesize samples for unseen concepts given
their visual attributes and use these samples for training a classifier for both
seen and unseen concepts.
Huge thanks to Jon Barron, who provides
the template for this website.