PhD student in 2D/3D Material and Lighting Reconstruction
Supervisor: Prof. Dr. Matthias Nießner
Visual Computing & Artificial Intelligence Lab, Technical University of Munich
peter.kocsis(at)tum.de
About
PhD student in Computer Vision & Graphics, focusing on 2D/3D material and lighting estimation for relightable reconstruction and generation. Experienced in training generative AI models for intrinsic image decomposition and in lifting image understanding to 3D. Originally trained as a mechatronics engineer, which provides a strong physical understanding; previous research spans monocular localization, neural control, reinforcement learning for motion planning, and active learning for low-data generalization.
IntrinsiX: High-Quality PBR Generation using Image Priors
NeurIPS 2025
Peter Kocsis, Lukas Höllein, Matthias Nießner
We introduce direct PBR generation from text. Instead of attaching control modules to an image generator, generating the PBR maps directly allows for explicit control. First, we train separate LoRA modules for each intrinsic component. Then, we introduce cross-intrinsic attention together with a rerendering loss to achieve coherent PBR generation. Our method outperforms text-to-image-to-PBR baselines, and our predictions can also be distilled into 3D scenes, opening up large-scale PBR texture generation.
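For intuition, the rerendering loss can be sketched in a heavily simplified form (an illustrative assumption, not the paper's full PBR model): treat an image as the per-pixel product of albedo and shading, recompose the image from the predicted intrinsic maps, and penalize the distance to the target image. The function names below are hypothetical.

```python
def rerender(albedo, shading):
    """Simplified diffuse rerendering: per-pixel product of intrinsic maps.
    Real PBR rendering also involves roughness, metallic, and lighting."""
    return [a * s for a, s in zip(albedo, shading)]

def rerendering_loss(albedo, shading, target):
    """Mean L1 distance between the rerendered image and the target image."""
    pred = rerender(albedo, shading)
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

# A consistent decomposition rerenders exactly to the target image:
target = [0.2, 0.5, 0.8]
albedo = [0.4, 0.5, 1.0]
shading = [0.5, 1.0, 0.8]
print(rerendering_loss(albedo, shading, target))  # 0.0
```

A loss of this shape ties the separately generated intrinsic components together: each component can look plausible on its own, but only mutually consistent components recompose into the target image.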
LightIt: Illumination Modeling and Control for Diffusion Models
CVPR 2024
Peter Kocsis, Julien Philip, Kalyan Sunkavalli, Matthias Nießner, Yannick Hold-Geoffroy
Recent generative methods lack lighting control, which is crucial to numerous artistic aspects of image generation such as setting the overall mood or cinematic appearance. To overcome these limitations, we propose to condition the generation on shading and normal maps. We model the lighting with single-bounce shading, which includes cast shadows. We first train a shading estimation module to generate a dataset of paired real-world images and shading maps. Then, we train a control network using the estimated shading and normals as input. Our method demonstrates high-quality image generation and lighting control in numerous scenes.
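At its core, single-bounce diffuse shading at a surface point is the clamped cosine between the surface normal and the light direction, set to zero wherever a cast shadow blocks the light. The sketch below is a minimal illustration of that quantity (names and signature are hypothetical, not the paper's shading module):

```python
def single_bounce_shading(normal, light_dir, in_shadow=False):
    """Diffuse single-bounce shading: clamped cosine n.l, with cast shadows.
    Vectors are (x, y, z) tuples assumed to be unit length; illustrative only."""
    if in_shadow:                      # the light is occluded at this point
        return 0.0
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, n_dot_l)          # back-facing surfaces receive no light

print(single_bounce_shading((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))   # 1.0
print(single_bounce_shading((0.0, 0.0, 1.0), (0.0, 0.0, -1.0)))  # 0.0
```

A per-pixel map of this value, together with the normal map, is the kind of conditioning signal the control network consumes.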
Intrinsic Image Diffusion for Single-view Material Estimation
CVPR 2024
Peter Kocsis, Vincent Sitzmann, Matthias Nießner
Intrinsic image decomposition is a highly ambiguous task. Deep-learning-based methods often fail due to the lack of large-scale real-world data. We propose to formulate the problem probabilistically and to generate possible decompositions using a generative model. This way, we can also exploit the strong image prior of diffusion models for material estimation, which greatly helps generalization.
The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes
NeurIPS 2022
Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Nießner, Laura Leal-Taixé, Ismail Elezi
Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers and MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes.
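A hedged sketch of the architectural change, in pure Python with illustrative shapes (not the paper's exact configuration): the common pooled-linear classification head averages away the spatial layout of the feature map, while an FC-augmented head flattens the full feature map so spatial detail reaches the classifier.

```python
import random

random.seed(0)

def matmul(x, w):
    """Multiply a batch of row vectors x (list of lists) by weight matrix w."""
    return [[sum(xi * wij for xi, wij in zip(row, col)) for col in zip(*w)]
            for row in x]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# Stand-in feature map from a CNN backbone: batch of 2, 3 channels, 2x2 spatial
features = [[[[random.random() for _ in range(2)] for _ in range(2)]
             for _ in range(3)] for _ in range(2)]

# Baseline head: global average pooling (discards spatial layout) + classifier
pooled = [[sum(sum(row) for row in ch) / 4.0 for ch in sample]
          for sample in features]
logits_base = matmul(pooled, rand_matrix(3, 5))          # shape (2, 5)

# FC-augmented head: flatten all 3*2*2 = 12 values, keeping spatial detail,
# then pass through an FC layer (with ReLU) before the classifier
flat = [[v for ch in sample for row in ch for v in row] for sample in features]
hidden = [[max(0.0, h) for h in row] for row in matmul(flat, rand_matrix(12, 8))]
logits_fc = matmul(hidden, rand_matrix(8, 5))            # shape (2, 5)

print(len(logits_fc), len(logits_fc[0]))  # 2 5
```

The extra FC parameters would normally invite overfitting, but in the low-data regime studied here the added capacity at the head is what drives the reported generalization gains.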
During my master's thesis, I worked on inter-sample message passing for active learning. Active learning requires uncertainty estimation over the unlabeled pool, and providing inter-sample information to the network helps identify out-of-domain samples.
CommonRoad is a generic framework for developing and testing motion planning algorithms for autonomous vehicles. Besides working on the platform as a working student, I also researched reinforcement-learning-based motion planning with dense and sparse rewards.
During my bachelor's thesis, I constructed a ball-balancing table and implemented various control algorithms. I built a digital twin in Unity, trained a neural-network-based controller in simulation, and transferred it to the real-world device.
The goal of the project was to reimplement and potentially improve the paper "Visual Localization within LIDAR Maps for Automated Urban Driving" (Wolcott and Eustice, 2014). Given a pre-scanned map, we render synthetic views around an estimated pose and match them against the camera feed.
Teaching
Intrinsic Image Distillation for Room-Scale Material Reconstruction
Thesis: Bendeguz Timar, M.Sc.
Technical University of Munich
Variance Reduction Techniques for Inverse Path Tracing
Thesis: Youssef Hafez, M.Sc.
Technical University of Munich
ShadedSDF: Volumetric Surface Reconstruction with Appearance Decomposition Constraints
Thesis: Mohamed Ebbed, M.Sc.
Technical University of Munich
Multi-Bounce Appearance Decomposition
Thesis: Yue Chen, M.Sc.
Technical University of Munich
3D Scanning & Spatial Learning Practical
Technical University of Munich
Master's-level practical course with hands-on research projects. Topics include 3D reconstruction, material estimation, inverse rendering, and 3D deep learning.
3D Vision Seminar
Technical University of Munich
Master's-level seminar course exploring the latest research in 3D computer vision and graphics. Topics include 3D reconstruction techniques and inverse rendering.