Google Proposes ARDMs: Efficient Autoregressive Models That Learn to Generate in any Order – Synced

Deep generative models that fit a likelihood function to a data distribution have made impressive progress in modelling diverse sources of data such as images, text and video. A popular model class is autoregressive models (ARMs), which, although effective, require a pre-specified order for data generation. This fixed ordering can be unnatural for certain types of data, such as images, where no canonical generation order exists.

In a new paper, a Google Research team proposes Autoregressive Diffusion Models (ARDMs), a model class that encompasses and generalizes order-agnostic autoregressive models and discrete diffusion models. ARDMs do not require causal masking of model representations and can be trained with an efficient objective that scales favourably to high-dimensional data.

The team summarises the main contributions of their work as follows:

- Introducing ARDMs, a model class that generalizes order-agnostic autoregressive models and discrete diffusion models;
- An efficient training objective that optimizes only a single generation step at a time and avoids causal masking;
- Support for parallel, independent generation of multiple variables per step;
- Competitive performance on language modelling and per-image lossless compression.

The researchers explain that, from an engineering perspective, the main challenge in parameterizing an ARM is the need to enforce triangular or causal dependence. To address this, they took inspiration from modern diffusion-based generative models and derived an objective that is optimized for only a single step at a time. In this way, they obtained a new objective for training an order-agnostic ARM.
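The single-step idea can be illustrated with a minimal numpy sketch of one training step: sample a random generation order and a random step, condition on the variables "generated" so far, and score all remaining variables at once. The network here is a hypothetical uniform placeholder (`dummy_model`), not the paper's architecture, and constants are simplified.

```python
import numpy as np

rng = np.random.default_rng(0)

def dummy_model(x, mask):
    """Stand-in network: returns uniform logits over K classes for every
    position. A real ARDM would be a neural net conditioned on the
    unmasked variables. (Hypothetical placeholder, not the paper's model.)"""
    D, K = x.shape[0], 4
    return np.zeros((D, K))

def ardm_training_step(x, K=4):
    """One Monte Carlo estimate of an order-agnostic single-step objective:
    sample a random order sigma and step t, condition on the first t-1
    variables, and score every not-yet-generated variable in parallel."""
    D = x.shape[0]
    sigma = rng.permutation(D)          # random generation order
    t = rng.integers(1, D + 1)          # uniform step in 1..D
    mask = np.zeros(D, dtype=bool)
    mask[sigma[:t - 1]] = True          # variables already "generated"
    logits = dummy_model(np.where(mask, x, 0), mask)
    log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
    remaining = ~mask
    # Mean NLL over the remaining variables, rescaled by D so the
    # estimate corresponds to a full D-step likelihood bound.
    nll = -log_probs[np.arange(D), x][remaining].mean()
    return D * nll
```

Because only one step is optimized per example, the cost of a training step does not grow with the number of generation steps.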

The team then leveraged an important property of this parametrization: because the distributions over multiple variables are predicted at the same time, those variables can be generated in parallel and independently.
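A hedged sketch of what this buys at sampling time, under simplified assumptions (a toy `model` returning per-position logits, chunks of the random order sampled independently per forward pass; the function name and interface are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_generate(model, D, K, steps):
    """Sketch of parallel ARDM-style generation: since the model predicts
    distributions over *all* remaining variables at once, several of them
    can be sampled independently from a single network call, trading a
    small likelihood cost for far fewer steps."""
    x = np.zeros(D, dtype=int)
    mask = np.zeros(D, dtype=bool)
    order = rng.permutation(D)
    # Split the random order into `steps` chunks; each chunk is sampled
    # in parallel from one forward pass.
    for chunk in np.array_split(order, steps):
        logits = model(x, mask)                     # (D, K) logits
        probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
        for i in chunk:                             # independent draws
            x[i] = rng.choice(K, p=probs[i])
        mask[chunk] = True
    return x
```

With `steps = D` this reduces to ordinary one-variable-at-a-time order-agnostic sampling; smaller `steps` means fewer network calls.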

The researchers also identified an interesting property of Upscale ARDM training: complexity is unchanged by modelling multiple stages. This enabled them to experiment with adding an arbitrary number of stages during training without any increase in computational cost.
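One way to picture such a staged, coarse-to-fine decomposition is bit upscaling, illustrated below as a hypothetical sketch (the function and stage scheme here are illustrative, not lifted from the paper): each stage reveals one more bit-plane of the data, most significant first, and since training optimizes only one randomly chosen stage per example, adding stages adds no compute.

```python
import numpy as np

def bit_stages(x, num_bits=8):
    """Decompose integer data into coarse-to-fine stages: stage s keeps
    the top (s + 1) bits and zeroes the rest, so later stages refine
    earlier ones. (Illustrative decomposition, not the paper's code.)"""
    x = np.asarray(x)
    return [(x >> (num_bits - 1 - s)) << (num_bits - 1 - s)
            for s in range(num_bits)]
```

For a pixel value of 200, the stages progress 128, 192, 192, 200, ... until the final stage reproduces the value exactly.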

The team applied two methods to the parametrization of the upscaling distributions: direct parametrization, which requires only the distribution parameter outputs relevant to the current stage, making it efficient; and data parametrization, which can automatically compute the appropriate probabilities for experimentation with new downscaling processes, but may be expensive when a large number of classes is involved.
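The data parametrization can be sketched roughly as follows, under the assumption that stage probabilities are obtained by pushing a predicted distribution over the final data classes through downscaling transition matrices (the function name and exact mechanics here are a simplified illustration, not the paper's implementation):

```python
import numpy as np

def stage_probs(theta, transitions):
    """Sketch of a data parametrization: the network outputs a
    distribution `theta` over all K final classes, and per-stage
    probabilities are derived by applying downscaling transition
    matrices. Swapping in a new downscaling process only changes the
    matrices, at the cost of always handling all K classes."""
    probs = theta
    for T in transitions:     # each T maps finer classes to coarser ones
        probs = probs @ T     # (K_fine,) @ (K_fine, K_coarse)
    return probs
```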

In their empirical study, the team compared ARDMs to other order-agnostic generative models, evaluating performance on a character modelling task using the text8 dataset. The proposed ARDMs performed competitively with existing generative models, and outperformed competing approaches on per-image lossless compression.

Overall, the study validates the effectiveness of the proposed ARDMs as a new class of models at the intersection of autoregressive and discrete diffusion models. Their benefits include training without architectural constraints such as causal masking, generation in any order with parallel sampling of multiple variables per step, and strong performance on lossless compression.

The paper Autoregressive Diffusion Models is on arXiv.

Author: Hecate He | Editor: Michael Sarazen




