TY - GEN
T1 - Enhancement Spatial Transformer Networks for Text Classification
AU - Chan, Ka Hou
AU - Im, Sio Kei
AU - Ian, Vai Kei
AU - Chan, Ka Man
AU - Ke, Wei
N1 - Publisher Copyright: © 2020 ACM.
PY - 2020/6/26
Y1 - 2020/6/26
N2 - This paper introduces a 2D transformation-based framework for arbitrary-oriented text detection in natural scene images. We present localization networks within Spatial Transformer Networks (STN), designed to generate proposals that carry affine information about text orientation, including translation, scaling and rotation. This information is then adapted as learning parameters so that the proposals are fitted more accurately to the regular form of the text in terms of orientation. The localization network projects arbitrary-oriented proposals onto a feature map for a text region classifier. Compared with previous text detection systems, this work preserves the relationship between the learning parameters, which can lead to a better approximation of orientation. As a result, the new layer greatly enhances training accuracy. Moreover, the design and implementation can be easily deployed in current systems built upon standard CNN architectures.
AB - This paper introduces a 2D transformation-based framework for arbitrary-oriented text detection in natural scene images. We present localization networks within Spatial Transformer Networks (STN), designed to generate proposals that carry affine information about text orientation, including translation, scaling and rotation. This information is then adapted as learning parameters so that the proposals are fitted more accurately to the regular form of the text in terms of orientation. The localization network projects arbitrary-oriented proposals onto a feature map for a text region classifier. Compared with previous text detection systems, this work preserves the relationship between the learning parameters, which can lead to a better approximation of orientation. As a result, the new layer greatly enhances training accuracy. Moreover, the design and implementation can be easily deployed in current systems built upon standard CNN architectures.
KW - Affine Transformation
KW - Homogeneous Matrix
KW - Learning Parameters
KW - Spatial Transformer Networks
UR - http://www.scopus.com/inward/record.url?scp=85090396915&partnerID=8YFLogxK
U2 - 10.1145/3406971.3406981
DO - 10.1145/3406971.3406981
M3 - Conference contribution
AN - SCOPUS:85090396915
T3 - ACM International Conference Proceeding Series
SP - 5
EP - 10
BT - ICGSP 2020 - Proceedings of the 4th International Conference on Graphics and Signal Processing
PB - Association for Computing Machinery
T2 - 4th International Conference on Graphics and Signal Processing, ICGSP 2020
Y2 - 26 June 2020 through 28 June 2020
ER -