**Frederik Brudy** @Kopfnuss · Oct 31, 2023

Controlling images of imaginative worlds made with generative AI can be a challenge when using text-to-image alone. At #uist2023, Hai Dang will present #WorldSmith today, work we did while he was interning with us at Autodesk Research in Toronto (with @fraseranderson, George Fitzmaurice, and me).

A screenshot of WorldSmith depicting the high-level interface. The screenshot shows the Global Tile View, Detail View, Tree View, and Result View.

WorldSmith is a novel UI & workflow for creators to composite scenes using iterative, multi-modal prompts for #generativeAI. It allows creators to specify their intent through text, sketching, or region-based input.

A screenshot showing a user sketch and the resulting generated images based on the sketch.

A screenshot showing 3 user-created regions highlighted in different colors. For each region, the user has created a corresponding region description.

Images can be blended together to form one cohesive depiction of a fictional world.

A screenshot showing four image tiles. Tiles are blended into a cohesive image.

Oct 31, 2023, 03:35 PM · Elk

Progress is tracked in an interactive graph, giving creators a dynamic way to explore and evolve their creations.

A screenshot with enlarged elements that depict the contents of a tree view at different iteration stages.

#WorldSmith conceptualizes two expressive prompting techniques for #generativeAI: hierarchical prompting (which attaches prompts to different layers of the composition) and spatial prompting (allowing users to specify spatial relations through direct input).
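
To make the two ideas concrete, here is a minimal Python sketch of how hierarchical and spatial prompts might be represented as data. It is not WorldSmith's implementation: the names `Region`, `PromptNode`, and `effective_prompts`, the rectangular regions, and the rule of joining prompts with commas are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical spatial prompt: a rectangle on the canvas with its own description."""
    x: int
    y: int
    width: int
    height: int
    description: str


@dataclass
class PromptNode:
    """Hypothetical hierarchical prompt: one layer of the composition with a prompt attached."""
    prompt: str
    regions: list[Region] = field(default_factory=list)
    children: list["PromptNode"] = field(default_factory=list)


def effective_prompts(node, inherited=()):
    """Yield (combined_prompt, region) pairs, joining each layer's prompt with its ancestors'.

    A layer without regions yields one pair with region None (the whole canvas).
    Joining with commas is an assumption, not the paper's method.
    """
    chain = (*inherited, node.prompt)
    combined = ", ".join(chain)
    if not node.regions:
        yield combined, None
    for region in node.regions:
        yield f"{combined}, {region.description}", region
    for child in node.children:
        yield from effective_prompts(child, chain)


# Example: a world-level prompt with one child layer that has two regions.
world = PromptNode(
    "a floating island world",
    children=[
        PromptNode(
            "a harbor town",
            regions=[
                Region(0, 256, 256, 256, "wooden docks"),
                Region(256, 0, 256, 256, "a lighthouse"),
            ],
        )
    ],
)

results = list(effective_prompts(world))
for prompt, region in results:
    print(prompt, "->", region)
```

Each pair could then be passed to whatever image generator is in use, with the region (if any) restricting where that prompt applies.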