Script-Generated Picture Book Technology Based on Large Language Models and AIGC

Dejiang Wang, Zhuoran Zhai, Ngai Cheong, Li Peng

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

This paper discusses how to combine large language models such as GPT and ERNIE with AIGC tools, represented by Stable Diffusion, to turn a random story script into images with a fixed style, consistent character traits, and a continuous plot. It describes in detail how to build a pipeline that uses a large language model and a story script to generate the prompts required by Stable Diffusion. By comparing the characteristics of traditional picture book production with the images produced from language-model-generated prompts, it then summarizes the limitations of text-to-image generation. These limitations motivate a supervised, multi-round iterative LoRA training scheme that uses CLIP to fix the character IP. In addition, using ControlNet and inpainting to preprocess and reprocess the images makes character poses controllable and keeps backgrounds fixed throughout the picture book. Finally, the paper evaluates and summarizes the new scheme and analyzes its strengths for picture book creation.
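The prompt-generation step of the pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the prompt format, so the style tags, the character tags, and the names `Scene`, `split_script`, `build_prompt`, and `build_prompts` are all hypothetical. The large language model call is replaced by a stand-in `rewrite` function that returns the scene text unchanged; in a real pipeline that function would query the model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical fixed tags. Repeating them in every prompt is one simple way
# to keep style and character description constant across pages.
STYLE_TAGS = ("children's picture book illustration", "watercolor", "soft lighting")
CHARACTER_TAGS = ("a small red fox with a blue scarf",)


@dataclass(frozen=True)
class Scene:
    index: int        # page number, starting at 1
    description: str  # one sentence taken from the story script


def split_script(script: str) -> list[Scene]:
    """Split a story script into one scene per sentence (naive split on '.')."""
    sentences = [s.strip() for s in script.split(".") if s.strip()]
    return [Scene(i, s) for i, s in enumerate(sentences, start=1)]


def identity(text: str) -> str:
    """Stand-in for the LLM call (assumption): returns the text unchanged."""
    return text


def build_prompt(scene: Scene, rewrite: Callable[[str], str] = identity) -> str:
    """Combine fixed character tags, the rewritten scene text, and style tags."""
    parts = [*CHARACTER_TAGS, rewrite(scene.description), *STYLE_TAGS]
    return ", ".join(parts)


def build_prompts(script: str, rewrite: Callable[[str], str] = identity) -> list[str]:
    """Produce one text-to-image prompt per scene of the script."""
    return [build_prompt(scene, rewrite) for scene in split_script(script)]


if __name__ == "__main__":
    for prompt in build_prompts("The fox wakes up. The fox finds a river."):
        print(prompt)
```

Each resulting string would then be passed to a text-to-image model. Fixed tags alone are a weak form of consistency, which is in line with the abstract's point that plain text prompts have limits and that LoRA, ControlNet, and inpainting are added to address them.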

Original language: English
Title of host publication: ICDTE 2023 - 2023 7th International Conference on Digital Technology in Education
Publisher: Association for Computing Machinery
Pages: 104-108
Number of pages: 5
ISBN (Electronic): 9798400708527
Publication status: Published - 8 Sept 2023
Event: 7th International Conference on Digital Technology in Education, ICDTE 2023 - Virtual, Online, China
Duration: 8 Sept 2023 to 10 Sept 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 7th International Conference on Digital Technology in Education, ICDTE 2023
Country/Territory: China
City: Virtual, Online
Period: 8/09/23 to 10/09/23
