Peter Kocsis

PhD student in 2D/3D Material and Lighting Reconstruction

Supervisor: Prof. Dr. Matthias Niessner
Visual Computing & Artificial Intelligence Lab, Technical University of Munich
peter.kocsis(at)tum.de

About

PhD student in Computer Vision & Graphics, focusing on 2D/3D material and lighting estimation for relightable reconstruction and generation. Experienced in training GenAI models for intrinsic image decomposition and lifting image understanding to 3D. Started as a mechatronics engineer, which gave me a strong physical understanding, then did research in monocular localization, neural control, reinforcement learning for motion planning, and active learning for low-data generalization.

Publications

IntrinsiX: High-Quality PBR Generation using Image Priors

NeurIPS 2025
Peter Kocsis, Lukas Höllein, Matthias Nießner
We introduce direct PBR generation from text. Instead of training control modules for image generators, direct PBR generation allows for explicit control. First, we train separate LoRA modules for each intrinsic component. Then, we introduce cross-intrinsic attention with a rerendering loss to achieve coherent PBR generation. Our method outperforms text->image->PBR baselines. Our predictions can also be distilled into 3D scenes, opening up large-scale PBR texture generation.

LightIt: Illumination Modeling and Control for Diffusion Models

CVPR 2024
Peter Kocsis, Julien Philip, Kalyan Sunkavalli, Matthias Nießner, Yannick Hold-Geoffroy
Recent generative methods lack lighting control, which is crucial to numerous artistic aspects of image generation such as setting the overall mood or cinematic appearance. To overcome this limitation, we propose to condition the generation on shading and normal maps. We model the lighting with single-bounce shading, which includes cast shadows. We first train a shading estimation module to generate a dataset of real-world images and shading pairs. Then, we train a control network using the estimated shading and normals as input. Our method demonstrates high-quality image generation and lighting control in numerous scenes.
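As a rough illustration of what single-bounce shading with cast shadows means at a single surface point, the sketch below evaluates a clamped cosine term gated by a visibility flag. It is a textbook diffuse model under a directional light, not the LightIt implementation; the function name `direct_shading` and its parameters are assumptions made for this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def direct_shading(normal, light_dir, visible=True, intensity=1.0):
    """Single-bounce diffuse shading at one surface point (hypothetical helper).

    normal:    surface normal
    light_dir: direction from the surface point towards the light
    visible:   False if an occluder blocks the light (cast shadow)
    """
    if not visible:
        # The point lies in a cast shadow, so it receives no direct light.
        return 0.0
    n = normalize(normal)
    l = normalize(light_dir)
    cos_theta = sum(a * b for a, b in zip(n, l))
    # Surfaces facing away from the light receive nothing.
    return intensity * max(0.0, cos_theta)


# A surface facing the light is fully lit; the same surface in shadow is dark.
lit = direct_shading((0, 0, 1), (0, 0, 1))
shadowed = direct_shading((0, 0, 1), (0, 0, 1), visible=False)
```

A shading map in this sense would hold one such value per pixel, with visibility obtained from the scene geometry.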

Intrinsic Image Diffusion for Single-view Material Estimation

CVPR 2024
Peter Kocsis, Vincent Sitzmann, Matthias Nießner
Intrinsic image decomposition is a highly ambiguous task. Deep-learning-based methods often fail due to the lack of large-scale real-world data. We propose to formulate the problem probabilistically and generate possible decompositions using a generative model. This way, we can also utilize the strong image prior of diffusion models for the task of material estimation, which greatly helps generalization.

The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

NeurIPS 2022
Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Niessner, Laura Leal-Taixé, Ismail Elezi
Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers or MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for use in low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes.

Projects

Active Learning with Transformers

2021
Technical University of Munich
During my master's thesis, I worked on using inter-sample message passing for active learning. Active learning requires uncertainty estimation over the unlabeled pool. Providing inter-sample information to the network helps it better find out-of-domain samples.
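To make the role of uncertainty estimation concrete, here is a minimal sketch of a generic entropy-based acquisition step: score each unlabeled sample by the entropy of its predicted class probabilities and query the highest-scoring ones. This is a standard baseline, not the inter-sample message-passing approach from the thesis; the function names and the toy probabilities are assumptions for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` samples with the highest entropy.

    pool_probs: one predicted class distribution per unlabeled sample
    Ties are broken by the lower sample index.
    """
    scores = [(entropy(p), i) for i, p in enumerate(pool_probs)]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:budget]]


# Toy pool: a confident prediction, a near-uniform one, and one in between.
pool = [
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
]
queried = select_most_uncertain(pool, budget=2)
```

The near-uniform prediction is queried first, followed by the moderately uncertain one; the confident sample stays in the unlabeled pool.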

Reinforcement Learning for Motion Planning

2020
Technical University of Munich
CommonRoad is a generic framework for developing and testing motion planning algorithms for autonomous vehicles. Besides working on the platform as a working student, I also took part in research on reinforcement-learning-based motion planning with dense and sparse rewards.

Neural Ball-Balancing Table

2019
Budapest University of Technology and Economics
During my bachelor's thesis, I constructed a ball-balancing table and implemented various control algorithms. I implemented a virtual twin in Unity, trained a neural-network-based controller on it, and then transferred the controller to the real-world device.

Monocular Localization

2017
Machine Perception Research Laboratory
The goal of the project was to reimplement and potentially improve on the method of the paper "Visual localization within LIDAR maps for automated urban driving" (Ryan W. Wolcott and Ryan M. Eustice, 2014). Given a pre-scanned map, we render synthetic views around an estimated pose. Then, we match the synthetic views to the camera feed.

Teaching

Intrinsic Image Distillation for Room-Scale Material Reconstruction

Thesis: Bendeguz Timar M.Sc.
Technical University of Munich

Variance Reduction Techniques for Inverse Path Tracing

Thesis: Youssef Hafez M.Sc.
Technical University of Munich

ShadedSDF: Volumetric Surface Reconstruction with Appearance Decomposition Constraints

Thesis: Mohamed Ebbed M.Sc.
Technical University of Munich

Multi-Bounce Appearance Decomposition

Thesis: Yue Chen M.Sc.
Technical University of Munich

3D Scanning & Spatial Learning Practical

Technical University of Munich
Master's-level practical course with hands-on research projects. Topics include 3D reconstruction, material estimation, inverse rendering, and 3D deep learning.

3D Vision Seminar

Technical University of Munich
Master's-level seminar course exploring the latest research in 3D computer vision and graphics. Topics include 3D reconstruction techniques and inverse rendering.