|
|
|
|
|
|
|
Experimental design for optimizing black-box functions is a fundamental problem in many science and engineering fields. In this problem, sample efficiency is crucial due to the time, money, and safety costs of real-world design evaluations. Existing approaches either rely on active data collection or access to large, labeled datasets of past experiments, making them impractical in many real-world scenarios. In this work, we address the more challenging yet realistic setting of few-shot experimental design, where only a few labeled data points of input designs and their corresponding values are available. We introduce Experiment Pretrained Transformers (ExPT), a foundation model for few-shot experimental design that combines unsupervised learning and in-context pretraining. In ExPT, we only assume knowledge of a finite collection of unlabeled data points from the input domain and pretrain a transformer neural network to optimize diverse synthetic functions defined over this domain. Unsupervised pretraining allows ExPT to adapt to any design task at test time in an in-context fashion by conditioning on a few labeled data points from the target task and generating the candidate optima. We evaluate ExPT on few-shot experimental design in challenging domains and demonstrate its superior generality and performance compared to existing methods.
|
ExPT follows a pretraining-adaptation approach to create and adapt a foundation model for experimental design. In the pretraining phase, the model is given a large unlabeled dataset; the domain of this dataset can range from robot designs to DNA sequences. The unlabeled dataset is then used to generate a synthetic pretraining dataset via a novel approach that samples diverse functions from Gaussian Processes, allowing pretraining to scale without any labeled data. During adaptation, the model conditions on a few real-world designs and their corresponding values and produces new, high-quality designs.
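To make the synthetic-data step concrete, below is a minimal sketch of sampling one synthetic function over the unlabeled input domain from a Gaussian Process prior. This is an illustration under stated assumptions, not the paper's exact recipe: the RBF kernel choice, the hyperparameter ranges, and the function names are all illustrative.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix for inputs X of shape (n, d)."""
    sq = np.sum(X**2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def sample_synthetic_task(unlabeled_x, n_points=128, rng=None):
    """Draw one synthetic (x, y) pretraining task over the unlabeled domain.

    Sub-samples inputs from the unlabeled pool, then draws their function
    values jointly from a GP prior with randomized hyperparameters, so each
    call yields a different synthetic function on the same input domain.
    """
    rng = rng if rng is not None else np.random.default_rng()
    idx = rng.choice(len(unlabeled_x), size=n_points, replace=False)
    X = unlabeled_x[idx]
    lengthscale = rng.uniform(0.5, 2.0)  # randomized so tasks are diverse
    K = rbf_kernel(X, lengthscale) + 1e-6 * np.eye(n_points)  # jitter for stability
    y = rng.multivariate_normal(np.zeros(n_points), K)
    return X, y

# Example: a pool of 10,000 unlabeled 8-dimensional designs.
pool = np.random.default_rng(0).uniform(-1.0, 1.0, size=(10_000, 8))
X_task, y_task = sample_synthetic_task(pool)
```

Repeating this sampling procedure yields an effectively unlimited stream of labeled synthetic tasks defined over the real input domain, which is what enables pretraining at scale without any real objective values.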
|
ExPT has an encoder-decoder architecture. A transformer-based encoder takes in the context points (y_1, x_1), …, (y_m, x_m) together with the target values y_{m+1}, …, y_N and produces embeddings h_{m+1}, …, h_N. A VAE decoder then generates the designs x_{m+1}, …, x_N corresponding to these target objective values.
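The sketch below illustrates this forward pass in PyTorch, assuming embedded context pairs and target values are jointly processed by a transformer encoder and the target embeddings feed a VAE-style decoder. The class name, layer sizes, and depth are assumptions for illustration, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class ExPTSketch(nn.Module):
    """Illustrative encoder-decoder in the spirit of ExPT (sizes are assumed)."""

    def __init__(self, x_dim, d_model=128, z_dim=32):
        super().__init__()
        self.embed_pair = nn.Linear(x_dim + 1, d_model)  # context (y_i, x_i) pairs
        self.embed_y = nn.Linear(1, d_model)             # target y values alone
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        # VAE-style decoder: embedding -> latent distribution -> design x
        self.to_mu = nn.Linear(d_model, z_dim)
        self.to_logvar = nn.Linear(d_model, z_dim)
        self.decode = nn.Sequential(
            nn.Linear(z_dim + d_model, d_model), nn.ReLU(), nn.Linear(d_model, x_dim)
        )

    def forward(self, ctx_x, ctx_y, tgt_y):
        # ctx_x: (B, m, x_dim), ctx_y: (B, m, 1), tgt_y: (B, n, 1)
        ctx = self.embed_pair(torch.cat([ctx_x, ctx_y], dim=-1))
        tgt = self.embed_y(tgt_y)
        h = self.encoder(torch.cat([ctx, tgt], dim=1))[:, ctx.shape[1]:]  # target embeddings
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decode(torch.cat([z, h], dim=-1))  # generated designs x
```

At test time, the few labeled real data points serve as the context and desired high objective values as the targets, so the decoded x's are candidate optimal designs.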
|
|
|
|
Four few-shot tasks are constructed using data from the Design-Bench benchmark: D'Kitty and Ant from the domain of robotics, and TFBind-8 and TFBind-10 from the domain of genetics. We consider two scenarios, each giving the models access to only 1% of the total available data. Both are few-shot settings that demonstrate the model's performance even when the available designs are not high-performing; a sketch of constructing these subsets follows the list.

- Poorest: the models are given access to the 1% worst-performing data points.

- Random: the models are given access to a random 1% subset of the data points.
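The sketch below shows how such 1% subsets might be selected, assuming the objective is maximized (so the worst-performing points are those with the lowest values); the function name and signature are hypothetical.

```python
import numpy as np

def few_shot_subset(X, y, fraction=0.01, mode="random", rng=None):
    """Select the few-shot labeled data for a task.

    mode="poorest": the 1% of points with the lowest objective values.
    mode="random":  a uniformly random 1% of the points.
    """
    rng = rng if rng is not None else np.random.default_rng()
    n = max(1, int(fraction * len(X)))
    if mode == "poorest":
        idx = np.argsort(y)[:n]  # lowest-scoring designs first
    else:
        idx = rng.choice(len(X), size=n, replace=False)
    return X[idx], y[idx]
```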
|
Further details about these experiments, as well as other experimental setups such as synthetic tasks, multi-objective tasks, and ablation studies comparing forward and inverse modeling, can be found in Section 3 of the paper.
|