Using DALL·E to Generate Photorealistic Stimuli for Cognitive Psychology Experiments – Part 1

By Aedan Li – July 17, 2022

As someone who spends a lot of time manually creating stimuli, I am well aware of the challenge (and time) it takes to handcraft images for cognitive psychology and neuroscience experiments. What if there were a way to dramatically speed up the creation of complex photorealistic stimuli? Tools like DALL·E may be able to accelerate the stimulus design process (e.g., Goetschalckx, Andonian, & Wagemans, 2021).

DALL·E is a 12-billion-parameter version of GPT-3 trained to generate images from text descriptions. Basically, you input a sentence and the algorithm automatically generates high-resolution, photorealistic images corresponding to that text.
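I used DALL·E through its beta web interface, but the same text-to-image workflow can in principle be scripted. Below is a minimal sketch assuming the `openai` Python package and an `OPENAI_API_KEY` environment variable; the helper names and parameter choices are my own, not anything the beta interface exposes.

```python
import os


def build_generation_request(prompt: str, n: int = 1, size: str = "1024x1024") -> dict:
    """Assemble parameters for one text-to-image request (pure function, no network)."""
    return {"prompt": prompt, "n": n, "size": size}


def generate_stimuli(prompt: str, n: int = 1) -> list:
    """Send the request and return image URLs; requires an API key."""
    from openai import OpenAI  # assumes `pip install openai`

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = client.images.generate(**build_generation_request(prompt, n))
    return [image.url for image in response.data]
```

In practice you would call `generate_stimuli("Two pitbulls", n=2)` once per stimulus category and save the returned images.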

I received beta access and like a kid eating candy for the first time, I spent the past few days obsessively exploring whether DALL·E could be used to generate stimuli for my experiments.

As a prelude, the results were even better than I had originally expected. Every image below was generated by DALL·E. 🤯

A kid eating candy, vibrant colors

1. Object Categories

In the most straightforward application, I could create photorealistic object images, including animals, scenes, plants, and tools. This functionality may be useful for any general cognitive experiment, such as studies manipulating object exemplars and categories (Murphy, 2022). Each image below took about 10 seconds to create.

Two pitbulls
Two hammers
Two plants

2. Similar Object Lures

Another experimental manipulation involves showing participants similar images of study objects (i.e., a similar lure; Kim & Yassa, 2013). These object images may take considerable time to develop, especially for experiments with hundreds of study objects.

DALL·E provides a “variations” option that allows the generation of similar variants of any image with a single mouse click.

A tambourine on a beach

You can also generate variations of your own uploaded image. To test this capability, I uploaded a fribble (Barry, Griffith, De Rossi, & Hermans, 2014). See the results for yourself.

Variations of a “fribble” image
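For readers who eventually get programmatic access, the same similar-lure workflow could be scripted around an uploaded image. The sketch below assumes the `openai` Python package, a square PNG upload, and helper names of my own invention.

```python
from pathlib import Path


def variation_params(image_path: str, n: int = 4, size: str = "512x512") -> dict:
    """Validate the upload and assemble parameters for a variations request."""
    if Path(image_path).suffix.lower() != ".png":
        raise ValueError("variations expect a PNG upload")
    return {"n": n, "size": size}


def make_variations(image_path: str, n: int = 4) -> list:
    """Generate n similar lures from one study image (requires an API key)."""
    from openai import OpenAI  # assumes `pip install openai`

    client = OpenAI()
    with open(image_path, "rb") as f:
        response = client.images.create_variation(image=f, **variation_params(image_path, n))
    return [image.url for image in response.data]
```

Looping `make_variations` over a folder of study objects would give one similar lure set per object, which is exactly the structure mnemonic-similarity designs need.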

3. Object Position

In some cases, the experimenter may wish to display objects in different spatial positions, for example to test object-in-place memory, which is thought to depend on the human medial temporal lobes (e.g., Yeung et al., 2019).

DALL·E can generate the relative locations of realistic objects using modifiers such as “on top” or “below”.

A beaver on top of a table
A beaver below a table

Furthermore, object position can be directly edited by the experimenter. I uploaded an image of a farmers market and then changed the location of a cat. Can you spot the change? (e.g., Simons, Franconeri, & Reimer, 2000).

A farmers market
A cat at the farmers market
A cat at the farmers market
A cat at the farmers market
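This kind of targeted edit corresponds to a mask-based inpainting call: the experimenter erases a region of the uploaded scene and a prompt describes what should fill it. A hedged sketch, again assuming the `openai` Python package (the function names here are mine, not DALL·E's):

```python
def edit_params(prompt: str, n: int = 1, size: str = "1024x1024") -> dict:
    """Assemble parameters for an image-edit (inpainting) request."""
    return {"prompt": prompt, "n": n, "size": size}


def relocate_object(image_path: str, mask_path: str, prompt: str, n: int = 1) -> list:
    """Regenerate the transparent region of the mask to match the prompt."""
    from openai import OpenAI  # assumes `pip install openai`

    client = OpenAI()
    with open(image_path, "rb") as image, open(mask_path, "rb") as mask:
        response = client.images.edit(image=image, mask=mask, **edit_params(prompt, n))
    return [item.url for item in response.data]
```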

4. Object Features

DALL·E is fascinatingly good at editing the features associated with object concepts. For example, I manipulated the texture of the frogs below, which would be challenging and time-consuming to create manually with existing tools.

In addition to features like texture, you can manipulate abstract features like complexity, emotional content, and even traits like “old-fashioned”.

Happy scene of a truck
Complex plate of food
Simple plate of food
Sad scene of a truck
Upset toy set
Abstract object
Shiny toy set
Enthusiastic toy set
Old-fashioned toy set
Spooky toy set
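Prompt sets like the one above can be generated systematically by crossing feature modifiers with object nouns, which keeps the stimulus set balanced across conditions. A small sketch (the helper name and template are my own):

```python
from itertools import product


def feature_prompts(features, objects, template="{feature} {obj}"):
    """Cross every feature modifier with every object noun to build a prompt set."""
    return [template.format(feature=f, obj=o) for f, o in product(features, objects)]


prompts = feature_prompts(["Happy", "Sad"], ["scene of a truck", "plate of food"])
# yields four prompts, e.g. "Happy scene of a truck" and "Sad plate of food"
```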

5. Object Dimensionality

I was able to easily generate line drawings as well as more complex 3D renders. This latter capability might be especially useful for future studies pairing realistic objects with virtual reality (e.g., Bohil, Alicea, & Biocca, 2011).

Line drawing of a dog
Line drawing of an airplane
3D render of a clay chair
3D render of a character from Animal Crossing

6. Object Relations in Complex Scenes

Intriguingly, DALL·E could manipulate the position and quality of multiple objects in a scene, including their realism, resolution, and real-world locations, and even whether a photo appeared poorly shot. The one limitation was that comprehensible text could not be rendered on signs or buildings.

A small duck standing next to a drawing of a swan
Zoomed in picture of the Eiffel Tower
A hyper-realistic photo of a small corgi next to a street sign during winter
A blurry photo of a small corgi next to a street sign during winter
A badly shot photo of a small corgi next to a street sign during winter

7. General Image Style

DALL·E may have originally been designed to generate art, and indeed, the algorithm is fantastic at doing so. This functionality may be applicable for experiments that manipulate a general style or category shared across multiple images.

A painting of Toronto in the style of Anime
A painting of Toronto in the style of Abstract art
A painting of Toronto in the style of Cubism

Conclusion – Part 1

Some researchers have suggested that artificial intelligence is the “new electricity”, which will transform every industry and create enormous economic value. In this preliminary investigation, I have likely only scratched the surface of what artificial neural networks can do for stimulus generation in cognitive psychology and neuroscience experiments.

It was really fun exploring DALL·E (thank you to OpenAI)! If anyone reading has future ideas or general thoughts about where to take this project next, please feel free to shoot me a message on Twitter or by email (aedanyue.li@utoronto.ca).
