E-SWAN: Efficient Sliding Window Analysis Network for Real-Time Speech Steganography Detection

Authors: Kening Wang, Feipeng Gao, Jie Yang, Hao Zhang

Affiliations: [1] School of Engineering and Technology, Jiyang College of Zhejiang A & F University, Zhuji, 311800, China; [2] College of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou, 311300, China

Source: Computers, Materials & Continua (English-language journal), 2025, Issue 3, pp. 4797-4820 (24 pages)

Funding: Supported in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LQ20F020004, and in part by the National College Student Innovation and Research Training Program under Grant 202313283002.

Abstract: With the rapid advancement of Voice over Internet Protocol (VoIP) technology, speech steganography techniques such as Quantization Index Modulation (QIM) and Pitch Modulation Steganography (PMS) have emerged as significant challenges to information security. These techniques embed hidden information into speech streams, making detection increasingly difficult, particularly at low embedding rates and short speech durations. Existing steganalysis methods often struggle to balance detection accuracy and computational efficiency because of their limited ability to capture both temporal and spatial features of speech signals. To address these challenges, this paper proposes the Efficient Sliding Window Analysis Network (E-SWAN), a novel deep learning model designed specifically for real-time speech steganalysis. E-SWAN integrates two core modules: the LSTM Temporal Feature Miner (LTFM) and the Convolutional Key Feature Miner (CKFM). LTFM captures long-range temporal dependencies using Long Short-Term Memory networks, while CKFM identifies local spatial variations caused by steganographic embedding through convolutional operations. These modules operate within a sliding window framework, enabling efficient extraction of temporal and spatial features. Experimental results on the Chinese CNV and PMS datasets demonstrate the superior performance of E-SWAN. With a ten-second sample duration and an embedding rate of 10%, E-SWAN achieves a detection accuracy of 62.09% on the PMS dataset, surpassing existing methods by 4.57%, and an accuracy of 82.28% on the CNV dataset, outperforming state-of-the-art methods by 7.29%. These findings validate the robustness and efficiency of E-SWAN under low embedding rates and short durations, offering a promising solution for real-time VoIP steganalysis. This work contributes to enhancing information security in digital communications.
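The abstract says that LTFM and CKFM operate within a sliding window framework but does not give the window length or step. As a rough illustration only, the windowing step might look like the sketch below. The function name `sliding_windows`, the parameters `window_size` and `stride`, and the toy integer frames are all assumptions made for this example and are not taken from the paper; the feature-mining modules themselves are not modeled.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(
    frames: Sequence[T], window_size: int, stride: int
) -> List[Sequence[T]]:
    """Split a frame sequence into overlapping fixed-length windows.

    window_size and stride are hypothetical parameters; the abstract
    does not state the values E-SWAN uses.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    # Only full windows are kept; a trailing partial window is dropped.
    last_start = len(frames) - window_size
    return [
        frames[start:start + window_size]
        for start in range(0, last_start + 1, stride)
    ]


# Toy stand-in for the per-frame parameters of a speech stream.
frames = list(range(10))
windows = sliding_windows(frames, window_size=4, stride=2)
```

In a detector of the kind the abstract describes, each window would presumably be passed to the temporal and convolutional feature extractors in turn; a stride smaller than the window size makes consecutive windows overlap, so an embedding artifact near a window boundary still appears whole in a neighboring window.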

Keywords: steganalysis; speech; convolutional; sliding window; deep learning

Classification code: TN9 [Electronics and Telecommunications: Information and Communication Engineering]

 
