Volume 12 (2022): Issue 2 (April 2022)
Journal Details
License
Format
Journal
eISSN
2449-6499
First Published
30 Dec 2014
Publication Frequency
4 issues per year
Languages
English
Access type: Open Access

Position-Encoding Convolutional Network to Solving Connected Text Captcha

Published online: 23 Feb 2022
Volume & Issue: Volume 12 (2022) - Issue 2 (April 2022)
Page range: 121 - 133
Received: 06 Oct 2021
Accepted: 12 Oct 2021
Abstract

Text-based CAPTCHA is a convenient and effective security mechanism that has been widely deployed across websites. Efficient end-to-end scene text recognition models, which combine a CNN with an attention-based RNN, show limited performance in solving text-based CAPTCHAs. Unlike the text in street-view images and documents, the character sequence in a CAPTCHA is non-semantic, so the RNN loses its ability to learn semantic context and only implicitly encodes the relative position of the extracted features. Meanwhile, the security features that hinder character segmentation and recognition greatly increase the complexity of CAPTCHAs, and the performance of such CNN-RNN models is sensitive to the CAPTCHA scheme. In this paper, we analyze the properties of text-based CAPTCHAs and accordingly treat solving them as a highly position-dependent character sequence recognition task. We propose a network named PosConv that leverages the position information in the character sequence without an RNN. PosConv uses a novel padding strategy and a modified convolution to explicitly encode relative position into the local features of characters. This mechanism makes the features extracted from CAPTCHAs more informative and robust. We validate PosConv on six text-based CAPTCHA schemes, where it achieves state-of-the-art or competitive recognition accuracy with significantly fewer parameters and faster convergence.
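
As a rough illustration of what "explicitly encoding relative position into local features" can mean, the sketch below adds a normalized coordinate channel to a 1D feature row before convolving it. This is the generic coordinate-channel idea (often called CoordConv), used here as a stand-in; it is not the PosConv padding strategy or modified convolution, whose details the abstract does not give. All function names, kernels, and sizes are assumptions chosen for the example.

```python
def relative_positions(width):
    """Normalized horizontal coordinate in [-1, 1] for each column."""
    if width == 1:
        return [0.0]
    return [2.0 * i / (width - 1) - 1.0 for i in range(width)]


def conv1d_same(channels, kernels):
    """Zero-padded 'same' 1D cross-correlation summed over input channels.

    channels: list of C sequences, each of length W
    kernels:  list of C kernels, each of the same odd length K
    """
    width = len(channels[0])
    k = len(kernels[0])
    half = k // 2
    out = []
    for x in range(width):
        acc = 0.0
        for channel, kernel in zip(channels, kernels):
            for j in range(k):
                idx = x + j - half
                if 0 <= idx < width:  # positions outside the row count as zero
                    acc += channel[idx] * kernel[j]
        out.append(acc)
    return out


# The same one-column "stroke" placed at two different horizontal positions.
left = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

stroke_kernel = [1.0, 2.0, 1.0]    # hypothetical feature detector
position_kernel = [0.0, 1.0, 0.0]  # hypothetical weight on the coordinate channel

# Plain convolution: the response at the stroke is identical in both cases.
plain_left = conv1d_same([left], [stroke_kernel])
plain_right = conv1d_same([right], [stroke_kernel])

# With a coordinate channel: the response also depends on where the stroke is.
coords = relative_positions(len(left))
pos_left = conv1d_same([left, coords], [stroke_kernel, position_kernel])
pos_right = conv1d_same([right, coords], [stroke_kernel, position_kernel])
```

In this toy setting the plain convolution cannot tell the two stroke positions apart from the local response alone, whereas the position-augmented version can, which is the kind of information an RNN would otherwise carry only implicitly.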
