The realm of digital art is being revolutionised by artificial intelligence (AI) technologies, which enable striking visual expression through text-to-image synthesis. This has been made possible by pioneering AI models, such as diffusion models and OpenAI’s CLIP, which allow artists to generate images from textual descriptions.
Given the novelty of the field and its rapid evolution, crafting effective prompts for these AI models remains challenging. This study therefore set out to address this gap by investigating the impact of various prompt structures and guidelines on AI art generation, focusing on the connection between textual prompts and the resulting images.
Central to this study is an experimental approach exploring the effectiveness of the following distinct prompt structures:
- [Medium] of [Subject] in the style of [Style]
- [Style] [Medium] of [Subject]
- [Medium] of [Subject], [Style]
- [Medium] of [Subject] with [Style] style
- [Medium], [Subject], [Style]
- [Subject], [Medium], [Style]
The prompts had three main keywords in common, namely Subject, Style, and Medium, with three variations for each. The variations for Subject were ‘dog’, ‘forest’, and ‘person’; for Style, ‘baroque’, ‘neo-plasticism’, and ‘digital art’; and for Medium, ‘3D render’, ‘painting’, and ‘photograph’. These variations were selected on the basis of an analysis of prompt inputs for two popular text-to-image generators, Stable Diffusion and DALL-E 2, as well as a review of existing guidelines and popular terms in the AI art community.
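The way the six structures combine with the keyword variations can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual tooling: the names `STRUCTURES` and `build_prompts` are hypothetical, and generating the full Cartesian product of keywords for every structure is an assumption, since the text does not state which combinations were actually rendered.

```python
from itertools import product

# Templates transcribed from the six prompt structures listed above.
STRUCTURES = [
    "{medium} of {subject} in the style of {style}",
    "{style} {medium} of {subject}",
    "{medium} of {subject}, {style}",
    "{medium} of {subject} with {style} style",
    "{medium}, {subject}, {style}",
    "{subject}, {medium}, {style}",
]

# Keyword variations as given in the text.
SUBJECTS = ["dog", "forest", "person"]
STYLES = ["baroque", "neo-plasticism", "digital art"]
MEDIUMS = ["3D render", "painting", "photograph"]


def build_prompts():
    """Return a (template, prompt) pair for every structure and keyword combination.

    Assumption: every subject/style/medium combination is used with every
    structure; the study may have used only a subset.
    """
    prompts = []
    for template in STRUCTURES:
        for subject, style, medium in product(SUBJECTS, STYLES, MEDIUMS):
            prompt = template.format(subject=subject, style=style, medium=medium)
            prompts.append((template, prompt))
    return prompts


if __name__ == "__main__":
    # Print a few generated prompts as a quick sanity check.
    for _, prompt in build_prompts()[:3]:
        print(prompt)
```

Keeping the template alongside each generated prompt makes it possible to group the resulting images by structure later, which is the comparison the survey was designed around.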
A survey was conducted whereby respondents were required to evaluate image sets generated using these prompts across various criteria, such as aesthetics, adherence to the combination of keywords, and distinctiveness. The accompanying image provides a visual representation of the outcomes from the selected prompts.
Survey respondents were drawn from the art community, including digital artists, painters, and graphic designers, to ensure a reliable evaluation of the image sets. In addition, interviews with a renowned visual-effects artist and an AI artist provided further insight, with both acknowledging the future potential of AI art and its significant influence on the film and art production scene.
The study consisted of analysing the survey data to extract insights into the effectiveness of the different prompt structures. By identifying trends and patterns, with the help of plots to visualise the data, the research sought to reveal the relationship between the selected prompts and the visual outcomes. This shed light on the performance of the prompts, in particular which prompts resulted in high-quality, meaningful images that successfully conveyed the intended artistic vision, and whether there was any association between the prompts and demographic variables.
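One simple way to compare structures on survey data of this kind is to average the ratings per structure and criterion and then rank the structures. The sketch below is illustrative only: the record layout (a tuple of structure, criterion, and score) and the function names `mean_scores` and `rank_structures` are assumptions, and the text does not specify the rating scale or the statistical method the study actually used.

```python
from collections import defaultdict
from statistics import mean


def mean_scores(responses):
    """Average the ratings for each (structure, criterion) pair.

    `responses` is an iterable of (structure, criterion, score) tuples;
    this record layout is assumed for illustration only.
    """
    grouped = defaultdict(list)
    for structure, criterion, score in responses:
        grouped[(structure, criterion)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def rank_structures(responses, criterion):
    """Order structures from highest to lowest mean score on one criterion."""
    averages = mean_scores(responses)
    relevant = {s: v for (s, c), v in averages.items() if c == criterion}
    return sorted(relevant, key=relevant.get, reverse=True)
```

Calling `rank_structures(responses, "aesthetics")` on the collected ratings would return the structures ordered by their mean aesthetics score; the same call with another criterion name gives the ranking for adherence or distinctiveness.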
The study also explored the potential of AI tools, such as GPT-4, the latest chatbot model as at April 2023, to refine prompt formation by ranking prompt effectiveness. This aspect of the research highlighted the exciting possibility of AI assisting in the development of its own creative potential.
As AI art continues to evolve, this study offers a timely and useful investigation into the intricacies of prompt design for the generation of AI art. By providing insights and guidance for artists and designers, the study sought to empower them to harness the full potential of AI technologies, pushing the boundaries of creativity, thus paving the way for potentially groundbreaking artistic expression.
Figure 1. Artificial intelligence images generated by different prompt structures
Student: Gabriel Vella
Supervisor: Dr Vanessa Camilleri