Hello! I am currently the Chief Technology Officer and Chief Scientist at Molecule.one, a biotech startup that combines a high-throughput organic chemistry laboratory with machine learning models in a closed loop. I am passionate about improving fundamental aspects of deep learning and using it to empower scientific discovery.

In my scientific work, I have largely focused on the role of optimization in the success of deep learning. My main contribution is the discovery of the break-even point phenomenon, which governs how the learning rate shapes the final trained network in most deep learning experiments. For more details, see our ICLR 2020 paper (spotlight).

I completed a post-doc with Kyunghyun Cho and Krzysztof Geras at New York University, and was also an Assistant Professor at Jagiellonian University (member of GMUM.net). I received my PhD from Jagiellonian University, co-supervised by Jacek Tabor and Amos Storkey (University of Edinburgh). During my PhD, I spent two summers as a visiting researcher with Yoshua Bengio, and collaborated with Google Research in Zurich.

I do my best to contribute to the broad machine learning community. Currently, I serve as an Action Editor for TMLR and an Area Chair for ICLR 2022 (previously NeurIPS 2020-22, ICML 2020-22, ICLR 2020-21).

My email is staszek.jastrzebski (on gmail).


Current students

  • [PhD] Łukasz Maziarka (UJ), co-advised with Jacek Tabor
  • [PhD] Mateusz Pyla (UJ & IDEAS), co-advised with Tomasz Trzciński
  • [MSc] Piotr Helm (UJ)

Previous students

  • [MSc] Przemysław Kaleta (PW) - Speeding up retrosynthesis, co-advised with Piotr Miłoś
  • [MSc] Aleksandra Talar (UJ) - Out-of-distribution generalization in molecule property prediction
  • [PhD] Maciej Szymczak (UJ), co-advised with Jacek Tabor
  • [MSc] Sławomir Mucha (UJ) - Pretraining in deep learning in cheminformatics
  • [MSc] Tobiasz Ciepliński (UJ) - Evaluating generative models in chemistry using docking simulators
  • [BSc] Michał Zmysłowski (UW) - Is the noisy quadratic model of training deep neural networks realistic enough?
  • [MSc] Olivier Astrand (NYU) - Memorization in deep learning
  • [MSc] Tomasz Wesołowski (UJ) - Relevance of enriching word embeddings in modern deep natural language processing
  • [MSc] Andrii Krutsylo (UJ) - Physics-aware representations for drug discovery
  • [BSc] Michał Soboszek (UJ) - Evaluating word embeddings
  • [MSc] Jakub Chłędowski (UJ) - Representation learning for textual entailment
  • [MSc] Mikołaj Sacha (UJ) - Meta learning and sharpness of the minima

Selected Publications

For a full list please see my Google Scholar profile.


Differences between human and machine perception in medical diagnosis

T. Makino, S. Jastrzębski, W. Oleszkiewicz, [...], K. Cho, K. J. Geras

Nature Scientific Reports 2022


Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization

S. Jastrzebski, D. Arpit, O. Astrand, G. Kerg, H. Wang, C. Xiong, R. Socher, K. Cho*, K. Geras*

International Conference on Machine Learning 2021


The Break-Even Point on the Optimization Trajectories of Deep Neural Networks

S. Jastrzębski, M. Szymczak, S. Fort, D. Arpit, J. Tabor, K. Cho*, K. Geras*

International Conference on Learning Representations 2020 (Spotlight)


Three Factors Influencing Minima in SGD

S. Jastrzębski*, Z. Kenton*, D. Arpit, N. Ballas, A. Fischer, Y. Bengio, A. Storkey

International Conference on Artificial Neural Networks 2018 (oral), International Conference on Learning Representations 2018 (workshop)


A Closer Look at Memorization in Deep Networks

D. Arpit*, S. Jastrzębski*, N. Ballas*, D. Krueger*, T. Maharaj, E. Bengio, A. Fischer, A. Courville, S. Lacoste-Julien, Y. Bengio

International Conference on Machine Learning 2017