TY - GEN
T1 - Script-Generated Picture Book Technology Based on Large Language Models and AIGC
AU - Wang, Dejiang
AU - Zhai, Zhuoran
AU - Cheong, Ngai
AU - Peng, Li
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/9/8
Y1 - 2023/9/8
AB - This paper discusses how to combine large language models such as GPT and Ernie with AIGC tools, represented by Stable Diffusion, to generate images with a fixed style, fixed character characteristics, and a continuous plot from an arbitrary story script. It describes in detail how to build a pipeline in which a large language model turns the story script into the prompts required by Stable Diffusion. By comparing the characteristics of traditional picture book production with the images produced from language-model-generated prompts, we then summarize the limitations of text-to-image generation. These limitations motivate a supervised, multi-round iterative LoRA model scheme that uses CLIP to fix the character IP. In addition, using the ControlNet model and inpainting to preprocess and reprocess the images makes character poses controllable and backgrounds fixed in the picture book. Finally, we evaluate and summarize the new scheme and analyze its strengths for picture book creation.
UR - http://www.scopus.com/inward/record.url?scp=85182738158&partnerID=8YFLogxK
U2 - 10.1145/3626686.3626704
DO - 10.1145/3626686.3626704
M3 - Conference contribution
AN - SCOPUS:85182738158
T3 - ACM International Conference Proceeding Series
SP - 104
EP - 108
BT - ICDTE 2023 - 2023 7th International Conference on Digital Technology in Education
PB - Association for Computing Machinery
T2 - 7th International Conference on Digital Technology in Education, ICDTE 2023
Y2 - 8 September 2023 through 10 September 2023
ER -