TY - GEN
T1 - An artificial intelligence approach to automatically generate Cantonese meeting minutes for e-government
AU - Li, Pinyan
AU - Hoi, Lap Man
AU - Wang, Yapeng
AU - Im, Sio Kei
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - As artificial intelligence (AI) technology enters the field of acoustics, many audio-processing tasks are becoming automated and no longer require much manual work. In particular, using the latest AI technologies to recognize speech and identify speakers makes it possible to generate meeting minutes entirely by machine. In the southeastern region of China (for example, Hong Kong and Macao), people use Cantonese as the official language. Internal government meetings are usually conducted in Cantonese, and the executive department requests that the minutes be submitted as soon as possible after each meeting. In addition to the content of the speeches, the minutes must also include the identities of the corresponding speakers. During periods of intensive meetings on new policy releases, our interpreters faced great pressure to transcribe the meetings and prepare the minutes. Because of local terms and personal names, even state-of-the-art large language models (LLMs) are not fully sufficient. We therefore propose a novel approach to these problems, built on a three-tier software architecture: the data tier (data processing), the service tier (AI models and web services), and the application tier (user interfaces). The implementation uses modern AI models (OpenAI's Whisper and NVIDIA's TitaNet) and a dataset we created (Cantonese Policy Address, CPA). Training results (a word error rate of 33.81% and an equal error rate of 0.54) and validation results (confusion matrix up to 97%) show that our proposed approach improves automatic recognition precision, thus helping people understand the spirit of the meeting more effectively and quickly.
UR - https://www.scopus.com/pages/publications/105033151144
U2 - 10.1109/SMC58881.2025.11342794
DO - 10.1109/SMC58881.2025.11342794
M3 - Conference contribution
AN - SCOPUS:105033151144
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 7053
EP - 7059
BT - 2025 IEEE International Conference on Systems, Man, and Cybernetics
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2025
Y2 - 5 October 2025 through 8 October 2025
ER -