On the design of web crawlers for constructing an efficient Chinese-Portuguese bilingual corpus system

Sio Tai Cheong, Jiabo Xu, Yue Liu

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Citations (Scopus)

Abstract

Machine Translation has been a very popular and important topic in Natural Language Processing (NLP) over the last few decades. This paper focuses on the design of web crawlers for Chinese-Portuguese bilingual corpus construction; the resulting corpus is intended for use in corresponding Machine Translation systems. It accomplishes a bilingual corpus construction process, beginning with bilingual corpus collection by web crawlers based on different sources. By this means, the system can be considered an innovative and reasonable attempt at setting up bilingual corpora in Chinese and Portuguese, and it has solved some practical problems at the initial stage of corpus construction.

Original language: English
Title of host publication: International Conference on Electronics, Information and Communication, ICEIC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-4
Number of pages: 4
ISBN (Electronic): 9781538647547
DOIs
Publication status: Published - 2 Apr 2018
Event: 17th International Conference on Electronics, Information and Communication, ICEIC 2018 - Honolulu, United States
Duration: 24 Jan 2018 to 27 Jan 2018

Publication series

Name: International Conference on Electronics, Information and Communication, ICEIC 2018
Volume: 2018-January

Conference

Conference: 17th International Conference on Electronics, Information and Communication, ICEIC 2018
Country/Territory: United States
City: Honolulu
Period: 24/01/18 to 27/01/18

Keywords

  • Bilingual Corpus
  • Machine Learning
  • Machine Translation
  • NLP
  • Web Crawler
