Research on Text Detection and Recognition Algorithm Based on Improved MTSv2
Affiliation:

School of Internet of Things Engineering, Jiangnan University

Fund Project:

National Natural Science Foundation of China, Youth Program (6170185); National Natural Science Foundation of China (61901206)


    Abstract:

    Text in natural scene images carries information that is important for fully understanding the scene. To address the complex backgrounds, touching text, and multi-oriented text found in natural scene images, a text detection and recognition algorithm based on an improved MTSv2 is proposed. The algorithm uses MTSv2 as its base network. First, the Convolutional Block Attention Module (CBAM) is used to increase the weight of small text in the feature maps, so that key image features are captured better. Second, the Channel Enhancement Feature Pyramid Network (CE-FPN) structure is incorporated to alleviate the feature aliasing caused by multi-scale fusion. Finally, the focal loss function is introduced to reduce the effect of imbalanced positive and negative samples on recognition accuracy, so that the network pays more attention to hard-to-classify samples and generalizes better. The model was trained on multiple text datasets and validated on the ICDAR2015 dataset. For scene text detection and recognition, the improved model reaches a precision of 89.3%, a recall of 87.6%, and an F1 score of 88.5%, all higher than the original model.
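The abstract states that focal loss makes the network focus on hard-to-classify samples. A minimal sketch of the standard binary focal loss shows how that down-weighting works. The function names and the values `alpha=0.25` and `gamma=2.0` are assumptions for illustration; the abstract does not give the paper's settings or implementation.

```python
import math


def cross_entropy(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction (p = predicted P(positive), y in {0, 1})."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha and gamma are illustrative defaults, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confident, correct) versus a hard positive (low confidence).
easy = binary_focal_loss(0.95, 1)
hard = binary_focal_loss(0.30, 1)

# Fraction of the plain cross-entropy that each sample keeps after modulation.
easy_ratio = easy / cross_entropy(0.95, 1)
hard_ratio = hard / cross_entropy(0.30, 1)
print(f"easy keeps {easy_ratio:.6f} of its cross-entropy, hard keeps {hard_ratio:.4f}")
```

With these assumed settings the easy sample keeps a far smaller share of its cross-entropy than the hard one, which is the mechanism behind the abstract's claim about sample imbalance.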
    Keywords: Scene Text; Text Detection; Text Recognition; CBAM; CE-FPN; Attention Mechanism
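The F1 value in the abstract is the harmonic mean of precision and recall. The sketch below recomputes it, taking the reported 89.3% figure as precision (an assumption) and 87.6% as recall. Because these inputs are already rounded, the result can differ in the last digit from the reported 88.5%. The function name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Rounded values from the abstract (89.3% is assumed here to be precision).
reported_precision = 0.893
reported_recall = 0.876

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from rounded inputs: {f1:.3f}")
```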

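Cite this article

Wang Yanyuan, Mao Zhengchong, Yang Yuhan. Research on Text Detection and Recognition Algorithm Based on Improved MTSv2[J]. Computer Measurement & Control, 2024, 32(9): 256-261.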

History
  • Received: 2023-09-05
  • Last revised: 2023-10-14
  • Accepted: 2023-10-16
  • Published online: 2024-10-08