ASSET: Autoregressive Semantic Scene Editing With Transformers at High Resolutions
Description: We present ASSET, a transformer architecture able to produce state-of-the-art results in image editing at high resolutions. The key idea of ASSET is a novel attention mechanism that efficiently captures long-range interactions important for realistic image synthesis. We present qualitative and quantitative results, along with user studies, demonstrating ASSET's effectiveness.