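Audio capture refers to intelligence collection in which an attacker uses software or hardware means to illicitly obtain acoustic information from the surroundings of a target device, typically by accessing audio input devices such as microphones through operating system APIs or application interfaces. Defenses rely mainly on detecting anomalous processes that acquire recording permissions, monitoring for the creation of suspicious audio files, and analyzing unusual API call patterns in applications. Because legitimate voice applications commonly need continuous access to audio devices, accurately distinguishing malicious behavior carries a high false-positive risk.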
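As one illustration of monitoring for suspicious audio file creation, the sketch below identifies audio content by its leading bytes rather than by file extension, since a dropped recording may be renamed. The function name `sniff_audio_format` is hypothetical, and the signature list is deliberately small: it covers RIFF/WAVE, ID3-tagged MP3, Ogg, and FLAC only, so untagged MP3 data and encrypted recordings would not be recognized.

```python
from typing import Optional


def sniff_audio_format(header: bytes) -> Optional[str]:
    """Guess an audio container from the first bytes of a file.

    Hypothetical helper for illustration; returns None when no known
    signature matches.
    """
    # WAV: "RIFF" at offset 0 and "WAVE" at offset 8 (other RIFF files, such as AVI, differ here).
    if len(header) >= 12 and header[0:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # MP3 with an ID3v2 tag starts with "ID3"; untagged MP3 is not detected by this sketch.
    if header.startswith(b"ID3"):
        return "mp3"
    # Ogg pages start with the capture pattern "OggS".
    if header.startswith(b"OggS"):
        return "ogg"
    # FLAC streams start with the marker "fLaC".
    if header.startswith(b"fLaC"):
        return "flac"
    return None


if __name__ == "__main__":
    sample = b"RIFF" + b"\x24\x00\x00\x00" + b"WAVEfmt "
    print(sniff_audio_format(sample))
```

A file-creation monitor could pass the first 12 bytes of each newly written file to this function and record which process produced audio content under a non-audio extension.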
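To evade traditional detection mechanisms, attackers have developed stealthy audio capture techniques. By restructuring the data storage medium, parasitizing legitimate application workflows, blending with ambient acoustic characteristics, and infiltrating encrypted channels, they embed the theft deep within normal voice interaction scenarios and significantly reduce observability at the device, application, and network layers.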
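What current stealthy audio capture techniques have in common is an invisible reconstruction of every stage of the attack chain. Attackers move beyond the traditional model of controlling physical devices and build theft paths in memory, encrypted channels, and the application logic layer: memory-resident techniques eliminate disk-write traces, so the data's entire life cycle stays in volatile storage; legitimate-service parasitism implants malicious code into trusted application processes, which system security mechanisms allow by default; environment-adaptive techniques use context awareness to adjust attack intensity dynamically, keeping the theft in step with the characteristics of the physical environment; and encrypted-channel infiltration exploits the trust relationship between communication endpoints to intercept data inside the security boundary. All of these techniques adopt a "logic-layer parasitism" strategy that binds the attack tightly to the implementation flow of legitimate functionality, which defeats detection based on single-point anomaly indicators (such as microphone activation duration) or static signature matching.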
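The evolution of these stealth techniques seriously challenges traditional defenses built on device usage monitoring and file signature detection. Defenders need to build multidimensional behavioral baseline models, monitor the full chain from physical sensors to the application logic layer, and introduce voiceprint biometric analysis in order to identify and block covert audio theft accurately.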
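A multidimensional behavioral baseline can be sketched roughly as follows. This is a minimal illustration, not a production model: the three features (`mic_seconds`, `bytes_sent`, `background_ratio`), the `Observation` type, and the root-mean-square scoring are all assumptions chosen for the example, and a real deployment would need far more history and better statistics.

```python
import math
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Observation:
    """One monitoring window for one application (hypothetical schema)."""
    mic_seconds: float       # seconds the microphone was open
    bytes_sent: float        # outbound bytes while the microphone was open
    background_ratio: float  # share of microphone time with no visible window


FEATURES = ("mic_seconds", "bytes_sent", "background_ratio")


def build_baseline(history):
    """Return {feature: (mean, population std dev)} from past observations."""
    return {
        name: (mean(getattr(o, name) for o in history),
               pstdev([getattr(o, name) for o in history]))
        for name in FEATURES
    }


def feature_z_scores(baseline, obs, min_sigma=1e-9):
    """Absolute z-score of each feature relative to the baseline."""
    scores = {}
    for name in FEATURES:
        mu, sigma = baseline[name]
        scores[name] = abs(getattr(obs, name) - mu) / max(sigma, min_sigma)
    return scores


def deviation_score(baseline, obs):
    """Root mean square of the per-feature z-scores."""
    z = feature_z_scores(baseline, obs)
    return math.sqrt(sum(v * v for v in z.values()) / len(z))


history = [
    Observation(600, 5e6, 0.00),
    Observation(900, 8e6, 0.10),
    Observation(750, 6e6, 0.05),
    Observation(800, 7e6, 0.00),
]
baseline = build_baseline(history)
typical = Observation(780, 6.8e6, 0.05)
suspicious = Observation(820, 7e6, 0.90)  # normal duration, mostly in the background

if __name__ == "__main__":
    print(round(deviation_score(baseline, typical), 2))
    print(round(deviation_score(baseline, suspicious), 2))
```

In this toy data the suspicious window has an unremarkable microphone duration, so a single-indicator check on activation time would pass it, while the combined score rises because the recording happened almost entirely without a visible window.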
| Effect type | Present |
|---|---|
| Feature camouflage | ✅ |
| Behavioral transparency | ❌ |
| Data obscuring | ✅ |
| Spatiotemporal trace dilution | ✅ |
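By hijacking the recording function of a legitimate voice application, the attacker makes malicious audio capture identical to normal business operations in dimensions such as process signature and API call chain. For example, when the capture is parasitic on video conferencing software, the recording module's thread scheduling and memory access patterns all match the application's normal behavior, which effectively evades detection systems based on process behavior analysis.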
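With end-to-end encrypted channel infiltration, the attacker steals data from memory at the moment the encrypted traffic is decrypted, so the network transport layer remains encrypted throughout. The stolen content is also sent over a custom encryption protocol, so network traffic analysis cannot directly recover the payload, giving a double data-obscuring effect.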
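Environment-adaptive recording dynamically adjusts capture frequency and duration so that device activation appears intermittent and low in intensity. Combined with the natural usage cycles of legitimate voice services, this dilutes the temporal signature of the theft within the user's normal interactions and significantly lowers the chance of discovery by time-series anomaly detection.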
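One way a defender might counter short, intermittent activations is to stop judging each activation on its own and instead accumulate, over a long window, the microphone time that no user-initiated session explains. The sketch below assumes two hypothetical inputs, microphone activation intervals and user session intervals, both given as `(start, end)` pairs in seconds. It does not address capture that takes place inside a legitimate session.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def unexplained_seconds(mic_intervals, session_intervals):
    """Total microphone time not covered by any user-initiated session."""
    sessions = merge_intervals(session_intervals)
    total = 0.0
    for start, end in merge_intervals(mic_intervals):
        covered = 0.0
        # Sessions are disjoint after merging, so overlaps can be summed directly.
        for s_start, s_end in sessions:
            overlap = min(end, s_end) - max(start, s_start)
            if overlap > 0:
                covered += overlap
        total += (end - start) - covered
    return total


# One call-length activation inside a session, then three short bursts outside it.
mic = [(0, 100), (500, 520), (900, 915), (1300, 1320)]
sessions = [(0, 120)]

if __name__ == "__main__":
    print(unexplained_seconds(mic, sessions))
```

Each burst in the example lasts 20 seconds or less and would look harmless alone; the cumulative figure over an hour or a day is what a budget-style rule would compare against a threshold.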
| ID | Name | Description |
|---|---|---|
| G0067 | APT37 | APT37 has used an audio capturing utility known as SOUNDWAVE that captures microphone input.[1] |
| S0438 | Attor | Attor has a plugin that is capable of recording audio using available input sound devices.[2] |
| S0234 | Bandook | |
| S0454 | Cadelspy | Cadelspy has the ability to record audio from the compromised host.[4] |
| S0338 | Cobian RAT | Cobian RAT has a feature to perform voice recording on the victim’s machine.[5] |
| S0115 | Crimson | Crimson can perform audio surveillance using microphones.[6] |
| S0334 | DarkComet | DarkComet can listen in to victims' conversations through the system’s microphone.[7][8] |
| S0021 | Derusbi | |
| S0213 | DOGCALL | DOGCALL can capture microphone data from the victim's machine.[10] |
| S0152 | EvilGrab | EvilGrab has the capability to capture audio from a victim machine.[11] |
| S0143 | Flame | Flame can record audio using any existing hardware recording devices.[12][13] |
| S0434 | Imminent Monitor | Imminent Monitor has a remote microphone monitoring capability.[14][15] |
| S0260 | InvisiMole | InvisiMole can record sound using input audio devices.[16][17] |
| S0163 | Janicab | Janicab captured audio and sent it out to a C2 server.[18][19] |
| S0283 | jRAT | |
| S0409 | Machete | Machete captures audio from the computer’s microphone.[21][22][23] |
| S1016 | MacMa | |
| S0282 | MacSpy | MacSpy can record the sounds from microphones on a computer.[25] |
| S1146 | MgBot | MgBot can capture input and output audio streams from infected devices.[26][27] |
| S0339 | Micropsia | |
| S0336 | NanoCore | |
| S1090 | NightClub | NightClub can load a module to leverage the LAME encoder and |
| S0194 | PowerSploit | PowerSploit's |
| S0192 | Pupy | |
| S0332 | Remcos | |
| S0379 | Revenge RAT | Revenge RAT has a plugin for microphone interception.[36][37] |
| S0240 | ROKRAT | |
| S0098 | T9000 | T9000 uses the Skype API to record audio and video calls. It writes encrypted data to |
| S0467 | TajMahal | TajMahal has the ability to capture VoiceIP application audio on an infected host.[40] |
| S0257 | VERMIN | |
This type of attack technique cannot be easily mitigated with preventive controls since it is based on the abuse of system features.
| ID | Data Source | Data Component | Detects |
|---|---|---|---|
| DS0017 | Command | Command Execution | Monitor executed commands and arguments for actions that can leverage a computer’s peripheral devices (e.g., microphones and webcams) or applications (e.g., voice and video call services) to capture audio recordings for the purpose of listening into sensitive conversations to gather information. |
| DS0009 | Process | OS API Execution | Monitor for API calls associated with leveraging a computer's peripheral devices (e.g., microphones and webcams) or applications (e.g., voice and video call services) to capture audio recordings for the purpose of listening into sensitive conversations to gather information. |
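The two data sources above could be combined in a simple triage rule along the following lines. This is only a sketch under stated assumptions: the event schema (`process`, `command_line`, `api_calls`) and the function `flag_event` are hypothetical, and the indicator lists are illustrative rather than exhaustive. The API names are Windows multimedia functions (`waveInOpen`, `waveInStart`, `mciSendStringA`/`mciSendStringW`); the command patterns cover `arecord`, `ffmpeg` capture inputs, and the PowerSploit `Get-MicrophoneAudio` cmdlet.

```python
import re

# Illustrative command-line indicators (assumption: not a complete list).
COMMAND_PATTERNS = {
    "arecord (ALSA recorder)": re.compile(r"\barecord\b", re.IGNORECASE),
    "ffmpeg capture input": re.compile(
        r"\bffmpeg\b.*\s-f\s+(dshow|avfoundation|alsa|pulse)\b", re.IGNORECASE
    ),
    "PowerSploit Get-MicrophoneAudio": re.compile(r"Get-MicrophoneAudio", re.IGNORECASE),
}

# Windows audio-input APIs commonly involved in recording.
AUDIO_CAPTURE_APIS = {"waveInOpen", "waveInStart", "mciSendStringA", "mciSendStringW"}


def flag_event(event, allowlist):
    """Return the reasons an event looks like audio capture (empty list if none).

    `event` is a dict with hypothetical keys: process, command_line, api_calls.
    `allowlist` holds lowercase process names expected to use the microphone.
    """
    reasons = []
    command_line = event.get("command_line", "")
    for label, pattern in COMMAND_PATTERNS.items():
        if pattern.search(command_line):
            reasons.append(f"command: {label}")
    # API use is only reported for processes not expected to record audio.
    if event.get("process", "").lower() not in allowlist:
        seen = AUDIO_CAPTURE_APIS.intersection(event.get("api_calls", []))
        reasons.extend(f"api: {name}" for name in sorted(seen))
    return reasons


events = [
    {"process": "zoom.exe", "command_line": "zoom.exe --join", "api_calls": ["waveInOpen"]},
    {"process": "powershell.exe",
     "command_line": "powershell -c Get-MicrophoneAudio -Path out.wav", "api_calls": []},
    {"process": "updater.exe", "command_line": "updater.exe /silent",
     "api_calls": ["waveInOpen", "waveInStart"]},
]

if __name__ == "__main__":
    for ev in events:
        print(ev["process"], flag_event(ev, allowlist={"zoom.exe"}))
```

The allowlist keeps noise down for expected voice applications, but it also shows the limit of this approach: capture that runs inside an allowlisted process produces no API finding here, so such a rule would need to be paired with behavioral baselining.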