I am a graduate student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where I am very fortunate to be advised by Noah Smith. In addition, I am currently a visiting researcher at Facebook AI Research on Luke Zettlemoyer’s team, where I mainly work with Mike Lewis.

In my research I try to better understand the building blocks of neural NLP—the embedding/softmax layers, and the transformer architecture—in order to make them faster, smaller, and more accurate.

In the summer of 2019 I interned at Facebook AI Research with Omer Levy.

Previously, I completed my Bachelor’s and Master’s degrees in Computer Science at Tel Aviv University (where I was advised by Lior Wolf and also worked with Jonathan Berant) and briefly worked as a software developer.

My brother Ori Press is a computer vision researcher.

Contact me

@ofirpress on Twitter

Papers (Google Scholar)

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, Mike Lewis
[paper] [code] [bib]
[Yannic Kilcher’s video]

Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press, Noah A. Smith, Mike Lewis
ACL 2021
[paper] [code] [bib]
[ACL video (summarizes the important bits, 12 min)] [video (detailed overview, 1 hour)]

Improving Transformer Models by Reordering their Sublayers
Ofir Press, Noah A. Smith, Omer Levy
ACL 2020
[paper] [summary] [code] [bib]
[ACL video (summarizes the important bits, 12 min)] [video (detailed overview, 35 min)]

You May Not Need Attention
Ofir Press, Noah A. Smith
[paper] [summary] [code] [bib]

Language Generation with Recurrent Generative Adversarial Networks without Pre-training
Ofir Press*, Amir Bar*, Ben Bogin*, Jonathan Berant, Lior Wolf
1st Workshop on Learning to Generate Natural Language at ICML 2017
[paper] [summary] [code] [bib]

Using the Output Embedding to Improve Language Models
Ofir Press, Lior Wolf
EACL 2017
Introduced the weight tying method, which is now used in GPT, BERT, and many other state-of-the-art language and translation models.
[paper] [summary] [blog post] [code] [bib]
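The idea behind weight tying can be sketched in a few lines: the output (softmax) projection reuses the input embedding matrix, so tokens are embedded and scored with the same weights. This is a minimal toy illustration, not the paper's actual code; the function names and the tiny matrix below are made up for the example.

```python
# Weight tying, toy sketch: one matrix E serves as both the input
# embedding table and the output projection (logits = h @ E^T),
# instead of learning a separate output matrix.

def embed(E, token_id):
    # Input embedding: look up the row of E for this token.
    return E[token_id]

def output_logits(E, h):
    # Tied output layer: score each vocabulary item by its dot
    # product with the hidden state, reusing the same matrix E.
    return [sum(e * x for e, x in zip(row, h)) for row in E]

# Toy example: vocabulary of 3 tokens, hidden size 2.
E = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

h = embed(E, 2)               # embedding of token 2 -> [1.0, 1.0]
logits = output_logits(E, h)  # -> [1.0, 1.0, 2.0]
```

Besides roughly halving the parameter count of the embedding layers, tying means gradient updates to the output layer also improve the input representations.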