arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Anomaly Detection in IEC-61850 GOOSE Networks: Evaluating Unsupervised and Temporal Learning for Real-Time Intrusion Detection

著者: Joseph Moore

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14233

要約:
The IEC-61850 GOOSE protocol underpins time-critical communication in modern digital substations but lacks native security mechanisms, leaving it vulnerable to replay, masquerade, and data injection attacks. Intrusion detection in this setting is challenging due to strict latency constraints (sub-4ms) and limited availability of labeled attack data. This paper evaluates whether unsupervised temporal modeling can provide effective and deployable anomaly detection for GOOSE networks. Five models are compared on the ERENO IEC-61850 dataset: a supervised Random Forest baseline, a feedforward Autoencoder, and three recurrent sequence autoencoders (RNN, LSTM, and GRU). The supervised Random Forest achieves the highest detection performance (F1=0.9516) but fails to meet real-time constraints at 21.8ms per prediction. All four unsupervised models satisfy the 4ms requirement, with the GRU achieving the best accuracy to latency tradeoff among them (F1=0.8737 at 1.118ms). A cross-environment evaluation on an independent dataset shows that all models degrade under distribution shift. However, recurrent models retain substantially higher relative performance than the supervised baseline, suggesting that temporal sequence modeling generalizes better than fitting labeled attack distributions. Anomaly thresholds for the unsupervised models are selected on a held out validation partition to avoid test set leakage. These results support unsupervised temporal models as a practical choice for real-time GOOSE intrusion detection, particularly in environments where labeled training data may be unavailable or where large-scale deployment across diverse substations is required.

#2 Sovereign 2.0: Control-Plane Sovereignty for Cloud Systems Under Disruption

著者: Justin Stark, Scott Wilkie

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14242

要約:
Cloud sovereignty can no longer be defined by data residency or infrastructure location alone. Under conditions of geopolitical disruption, legal exposure, and expanding service boundaries, sovereignty must be understood as enforceable control over how digital services are governed, operated, and recovered. This paper introduces Sovereign 2.0, a control-plane-centric model that extends sovereignty beyond localisation to include governance authority, privileged access, cryptographic trust, data lifecycle control, observability, and incident response across federated environments. We define management sovereignty as the sovereign ability to govern, operate, evidence, and recover services regardless of underlying infrastructure dependencies. To operationalise this model, we propose a three-layer risk-assurance framework spanning governance, operational, and technical controls, enabling sovereign outcomes to be specified and continuously evidenced under both steady-state and crisis conditions. We further position post-quantum-ready cryptographic control, particularly TLS and key custody, as foundational to long-term sovereign trust. These contributions reframe sovereignty as an evidence-backed control system rather than a property of location, with implications for cloud architecture, procurement, and resilience design.

#3 Head Count: Privacy-Preserving Face-Based Crowd Monitoring

privacy

著者: Fatemeh Marzani, Thijs van Ede, Geert Heijenk, Maarten van Steen

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14250

要約:
An important aspect of crowd monitoring is knowing how many people we are dealing with. Sometimes, knowing the size of a crowd in a single location and at a specific moment is enough. Matters become problematic when counting the same people across dif ferent locations or counting them over longer periods of time. In those cases, we need to identify and later reidentify a person, which immediately leads to privacy concerns. Until recently, solutions have been based on unique identification of carry-on devices, yet privacy improvements have caused transmitted information to be randomized, rendering this technique mostly useless. We propose to use biometric data instead. We introduce a pipeline that counts people based on face recognition, yet without ever being able to reveal the identity of individuals. To count, a camera initially detects a face, extracts its features, and derives an identifier using a fuzzy extractor. The original facial image is then deleted. Identifiers are inserted into homomorphically encrypted Bloom filters. This allows oblivious set membership testing directly on encrypted data, enabling the system to count across locations or across different moments, without revealing any identities. We provide an initial evaluation of our method that shows promising results.

#4 Challenges and Future Directions in Agentic Reverse Engineering Systems

agent

著者: Salem Radey, Jack West, Kassem Fawaz

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14317

要約:
Agentic systems built on large language models (LLMs) are increasingly being used for complex security tasks, including binary reverse engineering (RE). Despite recent growth in popularity and capability, these systems continue to face limitations in realistic settings. Cutting-edge systems still fail in complex RE scenarios that involve obfuscation, timing, and unique architecture. In this work, we examine how agentic systems perform reverse engineering tasks with static, dynamic, and hybrid agents. Through an analysis of existing agentic tool usage, we identify several limitations, including token constraints, struggles with obfuscation, and a lack of program guardrails. From these findings, we outline current challenges and position future directions for system designers to overcome from a security perspective.

#5 Understanding Student Experiences with TLS Client Authentication

著者: Abubakar Sadiq Shittu, Clay Shubert, John Sadik, Scott Ruoti

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14330

要約:
Mutual TLS (mTLS) provides strong, certificate-based authentication for both clients and servers, yet its adoption for user-facing websites remains rare. This paper presents a longitudinal study of mTLS usability, tracking 46 senior and graduate computer science students who configured client certificates from scratch, used them for routine authentication over a semester-long course, and managed credentials across multiple devices. The results reveal that initial setup is a major bottleneck; while daily use was considered smooth, it did not improve long-term usability perceptions. Most concerningly, only 9% of participants fully understood the security implications of certificate-based authentication. We conclude that in a realistic, tooling-heavy deployment utilizing OpenSSL, a custom CA, and a 3072-bit minimum key requirement, even highly technical students struggled significantly. We argue this provides empirical evidence that today mTLS user experience is fundamentally misaligned with non-PKI specialists, and it is difficult to see a path toward mainstream adoption without substantial platform-level changes.

#6 Digital Guardians: The Past and The Future of Cyber-Physical Resilience

著者: Saurabh Bagchi, Hyunseung Kim, Tarek Abdelzaher, Homa Alemzadeh, Somali Chaterji, Glen Chou, Yuying Duan, Fanxin Kong, Michael Lemmon, Yin Li, Mengyu Liu, Wenhao Luo, Meiyi Ma, Sibin Mohan, Ayan Mukhopadhyay, Melkior Ornik, Dimitra Panagou, Kristin Yvonne Rozier, Ivan Ruchkin, Huajie Shao, Sze Zheng Yong, Majid Zamani, Xugui Zhou

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14360

要約:
Resilience in cyber-physical systems (CPS) is the fundamental ability to maintain safety and critical functionality despite adverse "perturbations," which includes security attacks, environmental disruptions, and hardware or software failures. This survey provides a comprehensive review of CPS resilience, framing the field through five interconnected themes that are required in an integrated whole to achieve real-world resilience. The article first posits that resilience is a system-wide property emerging from interactions between hardware, software, and human users. Second, it addresses the challenges of learning-enabled CPS, which often operate in data-scarce environments characterized by imbalanced or noisy data, requiring innovative solutions like synthetic data generation and foundation model adaptation. Third, the survey examines proactive measures for resilience, which include distinctive aspects of verification, testing, and redundancy. Fourth, it explores recovery mechanisms, moving beyond traditional fault models to design "just good enough" recovery strategies that prioritize safety-critical functions during perturbations. Finally, it highlights the central role of the human, focusing on the different levels of human intervention, the necessity of trust calibration, and the requirement for explainable AI to support human-CPS teaming. These themes are illustrated through representative application domains, primarily Connected and Autonomous Transportation Systems (CATS) and Medical CPS (MCPS). By integrating the five interconnected themes, this survey provides a systematic roadmap for achieving the resilient CPS in increasingly complex and adversarial environments.

#7 AndroScanner: Automated Backend Vulnerability Detection for Android Applications

著者: Harini Dandu

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14431

要約:
Mobile applications rely on complex backends that introduce significant security risks, yet developers often lack the tools to assess these risks effectively. This paper presents AndroScanner, an automated pipeline for detecting vulnerabilities in Android application backends through combined static and dynamic analysis. AndroScanner extracts backend API calls from APK files using apktool, Androguard, and Frida-based dynamic instrumentation, then vets them against the OWASP API Security Top 10 using APIFuzzer. We evaluate AndroScanner on two Android applications: a purposely vulnerable bank application and a production recruitment application with over 50,000 downloads on Google Play Store. Across both applications, AndroScanner extracted 24 APIs and identified 5 vulnerabilities, including a previously unreported zero-day Excessive Data Exposure vulnerability (ranked 3rd in the OWASP API Security Top 10) in the production application. The vulnerability was responsibly disclosed to the development team prior to publication. AndroScanner is available upon request to assist developers in identifying and remediating backend security risks before deployment.

#8 Robustness Analysis of Machine Learning Models for IoT Intrusion Detection Under Data Poisoning Attacks

backdoor

著者: Fortunatus Aabangbio Wulnye, Justice Owusu Agyemang, Kwame Opuni-Boachie Obour Agyekum, Kwame Agyeman-Prempeh Agyekum, Kingsford Sarkodie Obeng Kwakye, Francisca Adomaa Acheampong

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14444

要約:
Ensuring the reliability of machine learning-based intrusion detection systems remains a critical challenge in Internet of Things (IoT) environments, particularly as data poisoning attacks increasingly threaten the integrity of model training pipelines. This study evaluates the susceptibility of four widely used classifiers, Random Forest, Gradient Boosting Machine, Logistic Regression, and Deep Neural Network models, against multiple poisoning strategies using three real-world IoT datasets. Results show that while ensemble-based models exhibit comparatively stable performance, Logistic Regression and Deep Neural Networks suffer degradation of up to 40% under label manipulation and outlier-based attacks. Such disruptions significantly distort decision boundaries, reduce detection fidelity, and undermine deployment readiness. The findings highlight the need for adversarially robust training, continuous anomaly monitoring, and feature-level validation within operational Network Intrusion Detection Systems. The study also emphasizes the importance of integrating resilience testing into regulatory and compliance frameworks for AI-driven IoT security. Overall, this work provides an empirical foundation for developing more resilient intrusion detection pipelines and informs future research on adaptive, attack-aware models capable of maintaining reliability under adversarial IoT conditions.

#9 NeuroTrace: Inference Provenance-Based Detection of Adversarial Examples

著者: Firas Ben Hmida, Philemon Hailemariam, Kashif Ali Khan, Birhanu Eshete

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14457

要約:
Deep neural networks (DNNs) remain largely opaque at inference time, limiting our ability to detect and diagnose malicious input manipulations such as adversarial examples. Existing detection methods predominantly rely on layer-local signals (e.g., activations or attribution scores), leaving cross-layer information flow and execution structure under-explored. We introduce NeuroTrace, a framework and open dataset for analyzing inference provenance through Inference Provenance Graphs (IPGs). IPGs are heterogeneous graphs that capture both activation behavior and parameter-induced dataflow during a model's forward pass, providing a structured representation of how information propagates through the network. NeuroTrace includes (i) a reproducible extraction engine that instruments model execution, (ii) a standardized graph representation compatible with heterogeneous GNNs, and (iii) a benchmark suite spanning multiple adversarial attack families across vision and malware domains. Using this framework, we evaluate IPG-based detectors for adversarial example detection under intra-attack, multi-attack, and cross-threat transfer settings. Our results show that inference provenance provides a strong and transferable signal for distinguishing adversarial and benign inputs, achieving consistently high detection performance and improving over prior graph-based baselines. We further analyze the conditions under which provenance-based detection generalizes across attack types, as well as the associated runtime and storage trade-offs. By releasing the dataset, extraction pipeline, and evaluation protocol, NeuroTrace enables systematic study of inference-time behavior and establishes inference provenance as a practical foundation for building more transparent and auditable machine learning systems.

#10 CBCL: Safe Self-Extending Agent Communication

agent

著者: Hugo O'Connor

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14512

要約:
Agent communication languages (ACLs) enable heterogeneous agents to share knowledge and coordinate across diverse domains. This diversity demands extensibility, but expressive extension mechanisms can push the input language beyond the complexity classes where full validation is tractable. We present CBCL (Common Business Communication Language), an agent communication language that constrains all messages, including runtime language extensions, to the deterministic context-free language (DCFL) class. CBCL allows agents to define, transmit, and adopt domain-specific "dialect" extensions as first-class messages; three safety invariants (R1--R3), machine-checked in Lean 4 and enforced in a Rust reference implementation, prevent unbounded expansion, applying declared resource limits, and preserving core vocabulary. We formalize the language and its safety properties in Lean 4, implement a reference parser and dialect engine in Rust with property-based and differential tests, and extract a verified parser binary. Our results demonstrate that homoiconic protocol design, where extension definitions share the same representation as ordinary messages, can be made provably safe. As autonomous agents increasingly extend their own communication capabilities, formally bounding what they can express to each other is a precondition for oversight.

#11 Hijacking Large Audio-Language Models via Context-Agnostic and Imperceptible Auditory Prompt Injection

著者: Meng Chen, Kun Wang, Li Lu, Jiaheng Zhang, Tianwei Zhang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14604

要約:
Modern Large audio-language models (LALMs) power intelligent voice interactions by tightly integrating audio and text. This integration, however, expands the attack surface beyond text and introduces vulnerabilities in the continuous, high-dimensional audio channel. While prior work studied audio jailbreaks, the security risks of malicious audio injection and downstream behavior manipulation remain underexamined. In this work, we reveal a previously overlooked threat, auditory prompt injection, under realistic constraints of audio data-only access and strong perceptual stealth. To systematically analyze this threat, we propose \textit{AudioHijack}, a general framework that generates context-agnostic and imperceptible adversarial audio to hijack LALMs. \textit{AudioHijack} employs sampling-based gradient estimation for end-to-end optimization across diverse models, bypassing non-differentiable audio tokenization. Through attention supervision and multi-context training, it steers model attention toward adversarial audio and generalizes to unseen user contexts. We also design a convolutional blending method that modulates perturbations into natural reverberation, making them highly imperceptible to users. Extensive experiments on 13 state-of-the-art LALMs show consistent hijacking across 6 misbehavior categories, achieving average success rates of 79\%-96\% on unseen user contexts with high acoustic fidelity. Real-world studies demonstrate that commercial voice agents from Mistral AI and Microsoft Azure can be induced to execute unauthorized actions on behalf of users. These findings expose critical vulnerabilities in LALMs and highlight the urgent need for dedicated defense.

#12 EdgeDetect: Importance-Aware Gradient Compression with Homomorphic Aggregation for Federated Intrusion Detection

著者: Noor Islam S. Mohammad

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14663

要約:
Federated learning (FL) enables collaborative intrusion detection without raw data exchange, but conventional FL incurs high communication overhead from full-precision gradient transmission and remains vulnerable to gradient inference attacks. This paper presents EdgeDetect, a communication-efficient and privacy-aware federated IDS for bandwidth-constrained 6G-IoT environments. EdgeDetect introduces gradient smartification, a median-based statistical binarization that compresses local updates to $\{+1,-1\}$ representations, reducing uplink payload by $32\times$ while preserving convergence. We further integrate Paillier homomorphic encryption over binarized gradients, protecting against honest-but-curious servers without exposing individual updates. Experiments on CIC-IDS2017 (2.8M flows, 7 attack classes) demonstrate $98.0\%$ multi-class accuracy and $97.9\%$ macro F1-score, matching centralized baselines, while reducing per-round communication from $450$~MB to $14$~MB ($96.9\%$ reduction). Raspberry Pi-4 deployment confirms edge feasibility: $4.2$~MB memory, $0.8$~ms latency, and $12$~mJ per inference with $<0.5\%$ accuracy loss. Under $5\%$ poisoning attacks and severe imbalance, EdgeDetect maintains $87\%$ accuracy and $0.95$ minority class F1 ($p<0.001$), establishing a practical accuracy, communication, and privacy tradeoff for next-generation edge intrusion detection.

#13 Beyond Nodes vs. Edges: A Multi-View Fusion Framework for Provenance-Based Intrusion Detection

著者: Fan Yang, Binyan Xu, Di Tang, Kehuan Zhang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14685

要約:
Provenance-based intrusion detection has emerged as a promising approach for analyzing complex attack behaviors through system-level provenance graphs. However, existing defense methods face an inherent granularity limitation. Node-centric detectors, which evaluate anomalies using entities' attributes and local structural patterns, may misclassify benign behavioral changes or configuration modifications as suspicious. In contrast, edge-centric detectors, which focus more on interactions, may lack sufficient contextual awareness of the involved entities, leading to missed detections when compromised entities perform seemingly ordinary operations. These analytical biases highlight a persistent gap between node-centric and edge-centric analyses. To mitigate this gap, we present PROVFUSION, a multi-view detection framework that integrates anomaly signals from three distinct views (i.e., attribute, structure, and causality). The framework fuses heterogeneous anomaly signals through lightweight fusion schemes and determines the final anomaly decisions through a voting-based integration process, providing a more consistent and context-aware assessment of system behavior. This design enables PROVFUSION to capture both entity level deviations and interaction-level anomalies within a consistent analytic pipeline. Experiments on nine widely used benchmark datasets demonstrate that PROVFUSION achieves higher detection accuracy and lower false-positive rates than single node- and edge-centric baselines, maintaining stable performance across scenarios. Overall, the results suggest that our multi-view anomaly fusion together with voting-based decision aggregation offers a practical and effective direction for advancing provenance-based intrusion detection.

#14 Efficient Fuzzy Private Set Intersection from Secret-shared OPRF

privacy

著者: Xinpeng Yang, Meng Hao, Chenkai Weng, Robert H. Deng, Yonggang Wen, Tianwei Zhang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14909

要約:
Private set intersection (PSI) enables a sender holding a set $Q$ of size $m$ and a receiver holding a set $W$ of size $n$ to securely compute the intersection $Q \cap W$. Fuzzy PSI (FPSI) is a PSI variant where the receiver learns the items $q \in Q$ for which there exists some $w \in W$ satisfying $\mathsf{dist}(q, w) \le \delta$ under a given distance metric. Although several FPSI works are proposed for $L_{p}$ distance metrics with $p \in [1, \infty]$, they either heavily rely on expensive homomorphic encryptions, or incur undesirable complexity, e.g., exponential to the element dimension, both of which lead to poor practical efficiency. In this work, we propose efficient FPSI protocols for $L_{p \in [1, \infty]}$ distance metrics, primarily leveraging significantly cheaper symmetric-key operations. Our protocols achieve linear communication and computation complexity in the set sizes $m,n$, the dimension $d$, and the distance threshold $\delta$. Our core building block is an oblivious programmable PRF with secret-shared outputs, which may be of independent interest. Furthermore, we incorporate a prefix technique that reduces the dependence on the distance threshold $\delta$ to logarithmic, which is particularly suitable for large $\delta$. We implement our FPSI protocols and compare them with state-of-the-art constructions. Experimental results demonstrate that our protocols consistently and significantly outperform existing works across all settings. Specifically, our protocols achieve a speedup of $12{\sim}145\times$ in running time and a reduction of $3{\sim}8\times$ in communication cost compared to Gao et al.~(ASIACRYPT'24) and a speedup of $9{\sim}80\times$ in running time and a reduction of $5{\sim}19\times$ in communication cost compared to Dang et al.~(CCS'25).

#15 Robustness of Vision Foundation Models to Common Perturbations

著者: Hongbin Liu, Zhengyuan Jiang, Cheng Hong, Neil Zhenqiang Gong

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14973

要約:
A vision foundation model outputs an embedding vector for an image, which can be affected by common editing operations (e.g., JPEG compression, brightness, contrast adjustments). These common perturbations alter embedding vectors and may impact the performance of downstream tasks using these embeddings. In this work, we present the first systematic study on foundation models' robustness to such perturbations. We propose three robustness metrics and formulate five desired mathematical properties for these metrics, analyzing which properties they satisfy or violate. Using these metrics, we evaluate six industry-scale foundation models (OpenAI, Meta) across nine common perturbation categories, finding them generally non-robust. We also show that common perturbations degrade downstream application performance (e.g., classification accuracy) and that robustness values can predict performance impacts. Finally, we propose a fine-tuning approach to improve robustness without sacrificing utility.

#16 ConGISATA: A Framework for Continuous Gamified Information Security Awareness Training and Assessment

著者: Ofir Cohen, Ron Bitton, Asaf Shabtai, Rami Puzis

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14996

要約:
The incidence of cybersecurity attacks utilizing social engineering techniques has increased. Such attacks exploit the fact that in every secure system, there is at least one individual with the means to access sensitive information. Since it is easier to deceive a person than it is to bypass the defense mechanisms in place, these types of attacks have gained popularity. This situation is exacerbated by the fact that people are more likely to take risks in their passive form, i.e., risks that arise due to the failure to perform an action. Passive risk has been identified as a significant threat to cybersecurity. To address these threats, there is a need to strengthen individuals' information security awareness (ISA). Therefore, we developed ConGISATA - a continuous gamified ISA training and assessment framework based on embedded mobile sensors; a taxonomy for evaluating mobile users' security awareness served as the basis for the sensors' design. ConGISATA's continuous and gradual training process enables users to learn from their real-life mistakes and adapt their behavior accordingly. ConGISATA aims to transform passive risk situations (as perceived by an individual) into active risk situations, as people tend to underestimate the potential impact of passive risks. Our evaluation of the proposed framework demonstrates its ability to improve individuals' ISA, as assessed by the sensors and in simulations of common attack vectors.

#17 Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

著者: Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu, Enyan Dai

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15022

要約:
Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, the routing strategy introduces a new security concern that adversaries may manipulate the router to consistently select expensive high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R$^2$A, which aims to mislead black-box LLM routers to expensive models via adversarial suffix optimization. Specifically, R$^2$A deploys a hybrid ensemble surrogate router to mimic the black-box router. A suffix optimization algorithm is further adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that {R$^2$A} significantly increases the routing rate to expensive models on queries of different distributions. Code and examples: https://github.com/thcxiker/R2A-Attack.

#18 Emulation-based System-on-Chip Security Verification: Challenges and Opportunities

著者: Tanvir Rahman, Shuvagata Saha, Ahmed Y. Alhurubi, Sujan Kumar Saha, Farimah Farahmandi, Mark Tehranipoor

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15073

要約:
Increasing system-on-chip (SoC) heterogeneity, deep hardware/software integration, and the proliferation of third-party intellectual property (IP) have brought security validation to the forefront of semiconductor design. While simulation and formal verification remain indispensable, they often struggle to expose vulnerabilities that emerge only under realistic execution conditions, long software-driven interactions, and adversarial stimuli. In this context, hardware emulation is emerging as an increasingly important pre-silicon verification technology because it enables higher-throughput execution of RTL designs under realistic hardware/software workloads while preserving sufficient fidelity for security-oriented analysis. This paper presents a comprehensive survey and perspective on emulation-based security verification and validation. We organize the landscape of prior work across assertion-based security checking, coverage-driven exploration, adversarial testing, information-flow tracking, fault injection, and side-channel-oriented evaluation. We provide a structured view of emulation-enabled security verification workflows, including instrumentation, stimulus generation, runtime monitoring, and evidence-driven analysis. We also examine practical challenges related to observability, scalability, property specification, and the definition of security-oriented coverage metrics for emulation-based verification. Finally, we discuss emerging directions such as AI-assisted emulation, digital security twins, chiplet-scale security exploration, automated vulnerability assessment, and cloud-scale secure emulation. Overall, this paper positions emulation as a promising foundation for the next generation of pre-silicon hardware security assurance.

#19 NFTDELTA: Detecting Permission Control Vulnerabilities in NFT Contracts through Multi-View Learning

著者: Hailu Kuang, Xiaoqi Li, Wenkai Li, Zongwei Li

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15118

要約:
Permission control vulnerabilities in Non-fungible token (NFT) contracts can result in significant financial losses, as attackers may exploit these weaknesses to gain unauthorized access or circumvent critical permission checks. In this paper, we propose NFTDELTA, a framework that leverages static analysis and multi-view learning to detect permission control vulnerabilities in NFT contracts. Specifically, we extract comprehensive function Control Flow Graph (CFG) information via two views: sequence features (representing execution paths) and graph features (capturing structural control flow). These two views are then integrated to create a unified code representation. We also define three specific categories of permission control vulnerabilities and employ a custom detector to identify defects through multi-view feature similarity analysis. Our evaluation of 795 popular NFT collections identified 241 confirmed permission control vulnerabilities, comprising 214 cases of Bypass Auth Reentrancy, 15 of Weak Auth Validation, and 12 of Loose Permission Management. Manual verification demonstrates the detector's high reliability, achieving an average precision of 97.92% and an F1-score of 81.09%. Furthermore, NFTDELTA demonstrates enhanced efficiency and scalability, proving its effectiveness in securing NFT ecosystems.

#20 Feedback-Driven Execution for LLM-Based Binary Analysis

著者: XiangRui Zhang, Qiang Li, Haining Wang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15136

要約:
Binary analysis increasingly relies on large language models (LLMs) to perform semantic reasoning over complex program behaviors. However, existing approaches largely adopt a one-pass execution paradigm, where reasoning operates over a fixed program representation constructed by static analysis tools. This formulation limits the ability to adapt exploration based on intermediate results and makes it difficult to sustain long-horizon, multi-path analysis under constrained context. We present FORGE, a system that rethinks LLM-based analysis as a feedback-driven execution process. FORGE interleaves reasoning and tool interaction through a reasoning-action-observation loop, enabling incremental exploration and evidence construction. To address the instability of long-horizon reasoning, we introduce a Dynamic Forest of Agents (FoA), a decomposed execution model that dynamically coordinates parallel exploration while bounding per-agent context. We evaluate FORGE on 3,457 real-world firmware binaries. FORGE identifies 1,274 vulnerabilities across 591 unique binaries, achieving 72.3% precision while covering a broader range of vulnerability types than prior approaches. These results demonstrate that structuring LLM-based analysis as a decomposed, feedback-driven execution system enables both scalable reasoning and high-quality outcomes in long-horizon tasks.

#21 Structural Dependency Analysis for Masked NTT Hardware: Scalable Pre-Silicon Verification of Post-Quantum Cryptographic Accelerators

著者: Ray Iskander, Khaled Kirah

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15249

要約:
Post-quantum cryptographic accelerators require side-channel resistance evidence for FIPS 140-3 certification. However, exact masking-verification tools scale only to gadgets of a few thousand cells. We present a four-stage verification hierarchy, D0/D1 structural dependency analysis, fresh-mask refinement, Boolean Single-Authentication Distance Checking (SADC), and arithmetic SADC, that extends sound first-order masking verification to production arithmetic modules. Applied to the 1.17-million-cell Adams Bridge ML-DSA/ML-KEM accelerator, structural analysis completes in seconds across all 30 masked submodules. A multi-cycle extension (MC-D1) reclassifies 12 modules from structurally clean to structurally flagged. On the 5,543-cell ML-KEM Barrett reduction module, the pipeline machine-verifies 198 of 363 structurally flagged wires (54.5%) as first-order secure, reports 165 as candidate insecure for designer triage (a sound upper bound), and leaves 0 indeterminate. Every verdict is cross validated by Z3 and CVC5 with 0 disagreements across 363 wires. The result narrows manual review from hundreds of structural flags to 165 actionable candidates with mathematical certificates, enabling pre-silicon side-channel evidence generation on production ML-KEM hardware.

#22 BitFlipScope: Scalable Fault Localization and Recovery for Bit-Flip Corruptions in LLMs

著者: Muhammad Zeeshan Karamat, Sadman Saif, Christiana Chamon Garcia

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2512.22174

要約:
Large Language Models (LLMs) deployed in practical and safety-critical settings are increasingly susceptible to bit-flip faults caused by hardware degradation, cosmic radiation, or deliberate fault-injection attacks such as Rowhammer. These faults silently corrupt internal parameters and can lead to unpredictable or dangerous model behavior. Localizing these corruptions is essential: without identifying the affected region, it is impossible to diagnose the source of degradation, apply targeted corrective measures, or restore model functionality without resorting to costly fine-tuning or full retraining. This work introduces BitFlipScope, a scalable, software-based framework for identifying fault-affected regions within transformer architectures under two deployment scenarios. When a clean reference model is available, BitFlipScope performs differential analysis of outputs, hidden states, and internal activations for detecting anomalous behavior indicative of corruption to pinpoint or localize faults. When no reference model exists, it uses residual-path perturbation and loss-sensitivity profiling to infer the fault-impacted region directly from the corrupted model. In both settings, the framework not only enables effective fault diagnosis but also supports lightweight performance recovery without fine-tuning, offering a practical path to restoring corrupted models. Together, these capabilities make BitFlipScope an important step toward trustworthy, fault-resilient LLM deployment in hardware-prone and adversarial environments.

#23 Filament: Denning-Style Information Flow Control for Rust

著者: Jeffrey C. Ching, Quan Zhou, Danfeng Zhang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14357

要約:
Existing language-based information-flow control (IFC) tools face a fundamental tension: Denning-style systems that track explicit and implicit flows at the variable level typically require compiler modifications, while more coarse-grained approaches, including recent work Cocoon, avoid compiler changes but impose more restrictive programming models. We present Filament, a Denning-style static IFC library for Rust that requires no compiler modifications. Filament addresses three key challenges in building a practical IFC library for Rust. First, it enables fine-grained explicit-flow checking with minimal annotation overhead by leveraging Rust's type inference. Second, it introduces pc_block!, a lightweight construct for enforcing implicit flows via a compile-time program counter label, without requiring compiler support. Third, it provides fcall! and mcall! macros to support seamless and safe interoperability with standard and third-party libraries. Our evaluation shows that Filament incurs negligible compile-time overhead and requires only modest annotations. Moreover, compared to Cocoon, Filament offers a more permissive programming model, reducing the need for frequent escape hatches that bypass security checks.

#24 Decoupling Identity from Utility: Privacy-by-Design Frameworks for Financial Ecosystems

privacy

著者: Ifayoyinsola Ibikunle, Tyler Farnan, Senthil Kumar, Mayana Pereira

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14495

要約:
Financial institutions face tension between maximizing data utility and mitigating the re-identification risks inherent in traditional anonymization methods. This paper explores Differentially Private (DP) synthetic data as a robust "Privacy by Design" framework to resolve this conflict, ensuring output privacy while satisfying stringent regulatory obligations. We examine two distinct generative paradigms: Direct Tabular Synthesis, which reconstructs high-fidelity joint distributions from raw data, and DP-Seeded Agent-Based Modeling (ABM), which uses DP-protected aggregates to parameterize complex, stateful simulations. While tabular synthesis excels at reflecting static historical correlations for QA testing and business analytics, the DP-Seeded ABM offers a forward-looking "counterfactual laboratory" capable of modeling dynamic market behaviors and black swan events. By decoupling individual identities from data utility, these methodologies eliminate traditional data-clearing bottlenecks, enabling seamless cross-institutional research and compliant decision-making in an evolving regulatory landscape.

#25 Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents

agent

著者: Krti Tallam

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14717

要約:
Persistent language-model agents increasingly combine tool use, tiered memory, reflective prompting, and runtime adaptation. In such systems, behavior is shaped not only by current prompts but by mutable internal conditions that influence future action. This paper introduces layered mutability, a framework for reasoning about that process across five layers: pretraining, post-training alignment, self-narrative, memory, and weight-level adaptation. The central claim is that governance difficulty rises when mutation is rapid, downstream coupling is strong, reversibility is weak, and observability is low, creating a systematic mismatch between the layers that most affect behavior and the layers humans can most easily inspect. I formalize this intuition with simple drift, governance-load, and hysteresis quantities, connect the framework to recent work on temporal identity in language-model agents, and report a preliminary ratchet experiment in which reverting an agent's visible self-description after memory accumulation fails to restore baseline behavior. In that experiment, the estimated identity hysteresis ratio is 0.68. The main implication is that the salient failure mode for persistent self-modifying agents is not abrupt misalignment but compositional drift: locally reasonable updates that accumulate into a behavioral trajectory that was never explicitly authorized.

#26 Segment-Level Coherence for Robust Harmful Intent Probing in LLMs

著者: Xuanli He, Bilgehan Sel, Faizan Ali, Jenny Bao, Hoagy Cunningham, Jerry Wei

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14865

要約:
Large Language Models (LLMs) are increasingly exposed to adaptive jailbreaking, particularly in high-stakes Chemical, Biological, Radiological, and Nuclear (CBRN) domains. Although streaming probes enable real-time monitoring, they still make systematic errors. We identify a core issue: existing methods often rely on a few high-scoring tokens, leading to false alarms when sensitive CBRN terms appear in benign contexts. To address this, we introduce a streaming probing objective that requires multiple evidence tokens to consistently support a prediction, rather than relying on isolated spikes. This encourages more robust detection based on aggregated signals instead of single-token cues. At a fixed 1% false-positive rate, our method improves the true-positive rate by 35.55% relative to strong streaming baselines. We further observe substantial gains in AUROC, even when starting from near-saturated baseline performance (AUROC = 97.40%). We also show that probing Attention or MLP activations consistently outperforms residual-stream features. Finally, even when adversarial fine-tuning enables novel character-level ciphers, harmful intent remains detectable: probes developed for the base LLMs can be applied ``plug-and-play'' to these obfuscated attacks, achieving an AUROC of over 98.85%.

#27 MLDAS: Machine Learning Dynamic Algorithm Selection for Software-Defined Networking Security

著者: Pablo Benlloch, Oscar Romero, Antonio Leon, Jaime Lloret

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14957

要約:
Network security is a critical concern in the digital landscape of today, with users demanding secure browsing experiences and protection of their personal data. This study explores the dynamic integration of Machine Learning (ML) algorithms with Software-Defined Networking (SDN) controllers to enhance network security through adaptive decision mechanisms. The proposed approach enables the system to dynamically choose the most suitable ML algorithm based on the characteristics of the observed network traffic. This work examines the role of Intrusion Detection Systems (IDS) as a fundamental component of secure communication networks and discusses the limitations of SDN-based attack detection mechanisms. The proposed framework uses adaptive model selection to maintain reliable intrusion detection under varying network conditions. The study highlights the importance of analyzing traffic-type-based metrics to define effective classification rules and enhance the performance of ML models. Additionally, it addresses the risks of overfitting and underfitting, underscoring the critical role of hyperparameter tuning in optimizing model accuracy and generalization. The central contribution of this work is an automated mechanism that adaptively selects the most suitable ML algorithm according to real-time network conditions, prioritizing detection robustness and operational feasibility within SDN environments.

#28 No More Guessing: a Verifiable Gradient Inversion Attack in Federated Learning

著者: Francesco Diana, Chuan Xu, Andr\'e Nusser, Giovanni Neglia

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15063

要約:
Gradient inversion attacks threaten client privacy in federated learning by reconstructing training samples from clients' shared gradients. Gradients aggregate contributions from multiple records and existing attacks may fail to disentangle them, yielding incorrect reconstructions with no intrinsic way to certify success. In vision and language, attackers may fall back on human inspection to judge reconstruction plausibility, but this is far less feasible for numerical tabular records, fueling the impression that tabular data is less vulnerable. We challenge this perception by proposing a verifiable gradient inversion attack (VGIA) that provides an explicit certificate of correctness for reconstructed samples. Our method adopts a geometric view of ReLU leakage: the activation boundary of a fully connected layer defines a hyperplane in input space. VGIA introduces an algebraic, subspace-based verification test that detects when a hyperplane-delimited region contains exactly one record. Once isolation is certified, VGIA recovers the corresponding feature vector analytically and reconstructs the target via a lightweight optimization step. Experiments on tabular benchmarks with large batch sizes demonstrate exact record and target recovery in regimes where existing state-of-the-art attacks either fail or cannot assess reconstruction fidelity. Compared to prior geometric approaches, VGIA allocates hyperplane queries more effectively, yielding faster reconstructions with fewer attack rounds.

#29 FedIDM: Achieving Fast and Stable Convergence in Byzantine Federated Learning through Iterative Distribution Matching

著者: He Yang, Dongyi Lv, Wei Xi, Song Ma, Hanlin Gu, Jizhong Zhao

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15115

要約:
Most existing Byzantine-robust federated learning (FL) methods suffer from slow and unstable convergence. Moreover, when handling a substantial proportion of colluded malicious clients, achieving robustness typically entails compromising model utility. To address these issues, this work introduces FedIDM, which employs distribution matching to construct trustworthy condensed data for identifying and filtering abnormal clients. FedIDM consists of two main components: (1) attack-tolerant condensed data generation, and (2) robust aggregation with negative contribution-based rejection. These components exclude local updates that (1) deviate from the update direction derived from condensed data, or (2) cause a significant loss on the condensed dataset. Comprehensive evaluations on three benchmark datasets demonstrate that FedIDM achieves fast and stable convergence while maintaining acceptable model utility, under multiple state-of-the-art Byzantine attacks involving a large number of malicious clients.

#30 MSWasm: Soundly Enforcing Memory-Safe Execution of Unsafe Code

著者: Alexandra E. Michael, Anitha Gollamudi, Jay Bosamiya, Craig Disselkoen, Aidan Denlinger, Conrad Watt, Bryan Parno, Marco Patrignani, Marco Vassena, Deian Stefan

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2208.13583

要約:
Most programs compiled to WebAssembly (Wasm) today are written in unsafe languages like C and C++. Unfortunately, memory-unsafe C code remains unsafe when compiled to Wasm -- and attackers can exploit buffer overflows and use-after-frees in Wasm almost as easily as they can on native platforms. Memory-Safe WebAssembly (MSWasm) proposes to extend Wasm with language-level memory-safety abstractions to precisely address this problem. In this paper, we build on the original MSWasm position paper to realize this vision. We give a precise and formal semantics of MSWasm, and prove that well-typed MSWasm programs are, by construction, robustly memory safe. To this end, we develop a novel, language-independent memory-safety property based on colored memory locations and pointers. This property also lets us reason about the security guarantees of a formal C-to-MSWasm compiler -- and prove that it always produces memory-safe programs (and preserves the semantics of safe programs). We use these formal results to then guide several implementations: Two compilers of MSWasm to native code, and a C-to-MSWasm compiler (that extends Clang). Our MSWasm compilers support different enforcement mechanisms, allowing developers to make security-performance trade-offs according to their needs. Our evaluation shows that the overhead of enforcing memory safety in software ranges from 22% (enforcing spatial safety alone) to 198% (enforcing full memory safety) on the PolyBenchC suite. More importantly, MSWasm's design makes it easy to swap between enforcement mechanisms; as fast (especially hardware-based) enforcement techniques become available, MSWasm will be able to take advantage of these advances almost for free.

#31 Towards Adaptive, Learning-Based Security in Decentralized Applications

著者: Stefan Kambiz Behfar, Jon Crowcroft

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2311.01956

要約:
Web3 systems expose a fundamentally different security landscape from centralized platforms, characterized by composability, pseudonymous identities, decentralized governance, and rapidly evolving attack strategies that span social, application, and protocol layers. Existing security mechanisms, such as static smart contract analysis, blacklist-based phishing detection, and network-level mitigation, operate in isolation and assume fixed threat models, limiting their effectiveness against adaptive, cross-layer adversaries. This position paper argues that securing Web3 requires a shift from static, tool-centric defenses to learning-driven security primitives capable of continuous reasoning, adaptation, and actuation. We introduce AI-powered smart certificates as a new security abstraction: programmable, continuously updated trust artifacts that integrate on-chain verifiability with off-chain machine learning signals derived from user behavior, transaction dynamics, and social context. Unlike traditional certificates or audits, these certificates maintain state, learn under distribution shift, and support automated policy enforcement and revocation in response to evolving threats. We argue that existing paradigms, formal verification, threat modeling, and isolated anomaly detection, are structurally limited in capturing the non-stationary and socio-technical nature of Web3 attacks. We outline an architecture in which AI-powered smart certificates serve as cross-layer sentinels that coordinate heterogeneous security signals in real time, and position smart certificates as a research direction, raising questions around learning under partial observability, adversarial adaptation, and trustworthy ML deployment in decentralized systems.

#32 SLIP: Soft Label Mechanism and Key-Extraction-Guided CoT-based Defense Against Instruction Backdoor in APIs

backdoor

著者: Zhengxian Wu, Juan Wen, Wanli Peng, Haowei Chang, Yinghan Zhou, Yiming Xue

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2508.06153

要約:
Customized Large Language Model (LLM) agents face a critical security threat from black-box instruction backdoors, where malicious behaviors are covertly injected through hidden system instructions. Although existing prompt-based defenses can often detect poisoned inputs, they generally fail to recover correct outputs once the backdoor is activated. In this paper, we first conduct a mechanistic analysis of LLM behavior under instruction backdoors and reveal two pivotal phenomena: (1) cognitive override, in which backdoor triggers dominate the reasoning process and suppress task-relevant context, and (2) abnormal semantic correlation, where triggers establish excessively strong semantic associations with attacker-specified target labels. Based on these insights, we propose a $\textbf{S}$oft $\textbf{L}$abel mechanism and key-extraction-guided CoT-based defense against $\textbf{I}$nstruction backdoors in A$\textbf{P}$Is (SLIP). To counteract the cognitive override, the key-extraction-guided Chain-of-Thought (KCOT) explicitly guides the model to extract task-relevant keywords and phrases rather than only considering the single trigger or overall text semantics. To neutralize the trigger's abnormal semantic correlation, the soft label mechanism (SLM) quantifies semantic correlations and employs statistical clustering to filter anomalous phrases before aggregating reliable keywords and phrases for prediction. Extensive experiments show that SLIP reduces the average attack success rate to 25.13$\%$, improves clean accuracy to 87.15$\%$, and outperforms state-of-the-art black-box defenses.

#33 From Paradigm Shift to Audit Rift: Empirical Analysis and Validation of Security Audit Methodologies for Asynchronous Smart Contract Systems

著者: Yury Yanovich, Sergey Sobolev, Yash Madhwal, Kirill Ziborov, Vladimir Gorgadze, Victoria Kovalevskay, Elizaveta Smirnova, Matvey Mishuris, Subodh Sharma

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2509.10823

要約:
The Open Network (TON) is a high-performance blockchain platform designed for scalability and efficiency, leveraging an asynchronous execution model and a multi-layered architecture. While TON's design offers significant advantages, it also introduces unique challenges for smart contract development and security. This paper introduces a comprehensive audit checklist for TON smart contracts, based on an empirical analysis of 34 professional audit reports containing 233 real-world vulnerabilities. The checklist addresses TON-specific challenges, such as asynchronous message handling, and provides actionable insights for developers and auditors. We also present detailed case studies of vulnerabilities in TON smart contracts, highlighting their implications and offering lessons learned. To validate practical utility, we conducted a practitioner survey (n=11 complete responses), confirming the checklist's value alongside automated tools. By adopting this checklist, developers and auditors can systematically identify and mitigate vulnerabilities, enhancing the security and reliability of TON-based projects. Our work bridges the gap between Ethereum's mature audit methodologies and the emerging needs of the TON ecosystem, fostering a more secure and robust blockchain environment.

#34 It's a Feature, Not a Bug: Secure and Auditable State Rollback for Confidential Cloud Applications

著者: Quinn Burke, Anjo Vahldiek-Oberwagner, Michael Swift, Patrick McDaniel

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2511.13641

要約:
Replay and rollback attacks threaten cloud application integrity by reintroducing authentic yet stale data through an untrusted storage interface to compromise application decision-making. Prior security frameworks mitigate these attacks by enforcing forward-only state transitions (state continuity) with hardware-backed mechanisms, but they categorically treat all rollback as malicious and thus preclude legitimate rollbacks used for operational recovery from corruption or misconfiguration. We present Rebound, a general-purpose security framework that preserves rollback protection while enabling policy-authorized legitimate rollbacks of application binaries, configuration, and data. Key to Rebound is a reference monitor that mediates state transitions, enforces authorization policy, guarantees atomicity of state updates and rollbacks, and emits a tamper-evident log that provides transparency to applications and auditors. We analyze Rebound's security properties and show through an application case study -- with software deployment workflows in GitLab CI -- that it enables robust control over binary, configuration, and raw data versioning with low end-to-end overhead.

#35 Rigorous and Generalized Proof of Security of Bitcoin Protocol with Bounded Network Delay

著者: Christopher Blake, Chen Feng, Xuechao Wang, Qianyu Yu

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2601.09082

要約:
A proof of the security of the Bitcoin protocol is made rigorous, and simplified in certain parts. A computational model in which an adversary can delay transmission of blocks by time $\Delta$ is considered. The protocol is generalized to allow blocks of different scores and a proof within this more general model is presented. An approach used in a previous paper that used random walk theory is shown through a counterexample to be incorrect; an approach involving a punctured block arrival process is shown to remedy this error. Thus, it is proven that with probability one, the Bitcoin protocol will have infinitely many honest blocks so long as the fully-delayed honest mining rate exceeds the adversary mining rate. This means that an adversary cannot censor future transactions of a user in perpetuity, which would render the protocol useless.

#36 De-Anonymization at Scale via Tournament-Style Attribution

著者: Lirui Zhang, Huishuai Zhang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2601.12407

要約:
As LLMs rapidly advance and enter real-world use, their privacy implications are increasingly important. We study an authorship de-anonymization threat: using LLMs to link anonymous documents to their authors, potentially compromising settings such as double-blind peer review. We propose De-Anonymization at Scale (DAS), a large language model-based method for attributing authorship among tens of thousands of candidate texts. DAS uses a sequential progression strategy: it randomly partitions the candidate corpus into fixed-size groups, prompts an LLM to select the text most likely written by the same author as a query text, and iteratively re-queries the surviving candidates to produce a ranked top-k list. To make this practical at scale, DAS adds a dense-retrieval prefilter to shrink the search space and a majority-voting style aggregation over multiple independent runs to improve robustness and ranking precision. Experiments on anonymized review data show DAS can recover same-author texts from pools of tens of thousands with accuracy well above chance, demonstrating a realistic privacy risk for anonymous platforms. On standard authorship benchmarks (Enron emails and blog posts), DAS also improves both accuracy and scalability over prior approaches, highlighting a new LLM-enabled de-anonymization vulnerability.

#37 CONTEX-T: Contextual Exploitation of Encrypted Traffic for Device Fingerprinting via Transformer Time-Frequency Analysis

著者: Nazmul Islam, Mohammad Zulkernine

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2601.16160

要約:
The rapid expansion of internet of things (IoT) devices has created a pervasive ecosystem where encrypted wireless communications serve as the primary privacy and security protection mechanism. While encryption effectively protects message content, contextual information from packet metadata and statistics inadvertently expose device identities. Various studies have exploited raw packet statistics and their visual representations for device fingerprinting and identification. However, these approaches remain confined to the spatial domain with limited feature representation. Therefore, this paper presents CONTEX-T, a novel framework that exploits device-level information from encrypted traffic metadata using temporal and spectral representation. The experiments show that time-frequency analysis provides new and rich feature representation, revealing a complex and expanding threat landscape that would require robust countermeasures for IoT security management. CONTEX-T first transforms raw packet-length sequences into temporal and spectral representations and then utilizes vision transformers (ViTs) for device identification. We systematically evaluated multiple time-frequency representation techniques and transformer-based models across encrypted traffic samples from various IoT devices. CONTEX-T achieved device classification accuracy exceeding 99% while operating passively on observable contextual metadata. This demonstrates that temporal and spectral signatures persist under strong encryption, highlighting a critical attack surface for IoT network security and management.

#38 Attack Selection Reduces Safety in Concentrated AI Control Settings against Trusted Monitoring

著者: Joachim Schaeffer, Arjun Khandelwal, Tyler Tracy

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.04930

要約:
Future AI deployments will likely be monitored for malicious behaviour. The ability of these AIs to subvert monitors by adversarially selecting against them - attack selection - is particularly concerning. To study this, we let a red team create attack policies that attempt to insert attacks into code without being caught by a monitor in the concentrated BigCodeBench backdooring setting. We decompose attack selection into two problems: mapping attacks to a quality score and mapping quality scores to submission probabilities. We frame attack selection as a classification problem and show that safety is significantly more sensitive to FPR than TPR. We find that prompting the attacker model to reason about the monitor while being cautious with attack selection reduces safety from a baseline of 99% to 59% at 0.5% auditing budget, emphasizing that eliciting attack selection capabilities of models is vital to avoid overly optimistic safety scores in control evaluations.

#39 SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs

著者: Mohamed Shaaban, Mohamed Elmahallawy

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.13529

要約:
Federated learning (FL) enables collaborative training across organizational silos without sharing raw data, making it attractive for privacy-sensitive applications. With the rapid adoption of large language models (LLMs), federated fine-tuning of generative LLMs has gained attention as a way to leverage distributed data while preserving confidentiality. However, this setting introduces fundamental challenges: (i) privacy leakage of personally identifiable information (PII) due to LLM memorization, and (ii) a persistent tension between global generalization and local utility under heterogeneous data. Existing defenses, such as data sanitization and differential privacy, reduce leakage but often degrade downstream performance. We propose SecureGate, a privacy-aware federated fine-tuning framework for LLMs that provides fine-grained privacy control without sacrificing utility. SecureGate employs a dual-adapter LoRA architecture: a secure adapter that learns sanitized, globally shareable representations, and a revealing adapter that captures sensitive, organization-specific knowledge. A token-controlled gating module selectively activates these adapters at inference time, enabling controlled information disclosure without retraining. Extensive experiments across multiple LLMs and real-world datasets show that SecureGate improves task utility while substantially reducing PII leakage, achieving up to a 31.66X reduction in inference attack accuracy and a 17.07X reduction in extraction recall for unauthorized requests. Additionally, it maintains 100% routing reliability to the correct adapter and incurs only minimal computational and communication overhead.

#40 Privacy Filters are Captured by Residues: A Characterization of Free Natural Filters and the Cost of Adaptivity

privacy

著者: Matthew Regehr, Bingshan Hu, Ethan Leeman, Pasin Manurangsi, Pierre Tholoniat, Mathias L\'ecuyer

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.15815

要約:
We study privacy filters, which enable privacy accounting for differentially private (DP) mechanisms with adaptively chosen privacy characteristics. We develop a general theory that characterizes the worst-case privacy loss of an interaction involving an analyst that respects some restrictions on what queries they may issue. We apply this theory to develop residue filters, which unifies existing privacy filters. We develop the Gaussian DP (GDP) residue filter, which strictly improves upon the na\"ive GDP filter. We also show that residue filters capture the natural filter, which promises greater utility by leveraging exact privacy accounting techniques. Earlier privacy filters consider only simple privacy parameters such as R\'enyi-DP or GDP parameters. Natural filters account for the entire privacy profile of every query, promising more efficient use of a given privacy budget. We show that, contrary to other forms of DP, natural privacy filters are not free in general. We present a characterization of when a family of private queries admits free natural filters for a given budget. In particular, only families of privacy mechanisms that are totally-ordered when composed admit free natural privacy filters with respect to an arbitrary privacy budget. Finally, we show that, while the natural approximate-DP filter can fail in the presence of adaptive adversary, it cannot fail too badly: the output remains approximate-DP with parameters at most poly-logarithmically worse than the intended privacy parameters.

#41 DPSQL+: A Differentially Private SQL Library with a Minimum Frequency Rule

privacy

著者: Tomoya Matsumoto, Shokichi Takakura, Shun Takagi, Satoshi Hasegawa

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.22699

要約:
SQL is the de facto interface for exploratory data analysis; however, releasing exact query results can expose sensitive information through membership or attribute inference attacks. Differential privacy (DP) provides rigorous privacy guarantees, but in practice, DP alone may not satisfy governance requirements such as the \emph{minimum frequency rule}, which requires each released group (cell) to include contributions from at least $k$ distinct individuals. In this paper, we present \textbf{DPSQL+}, a privacy-preserving SQL library that simultaneously enforces user-level $(\varepsilon,\delta)$-DP and the minimum frequency rule. DPSQL+ adopts a modular architecture consisting of: (i) a \emph{Validator} that statically restricts queries to a DP-safe subset of SQL; (ii) an \emph{Accountant} that consistently tracks cumulative privacy loss across multiple queries; and (iii) a \emph{Backend} that interfaces with various database engines, ensuring portability and extensibility. Experiments on the TPC-H benchmark demonstrate that DPSQL+ achieves practical accuracy across a wide range of analytical workloads -- from basic aggregates to quadratic statistics and join operations -- and allows substantially more queries under a fixed global privacy budget than prior libraries in our evaluation.

#42 DAO to (Anonymous) DAO Transactions

著者: Minfeng Qi, Lin Zhong, Qin Wang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.04369

要約:
Blockchain assets are increasingly controlled by organizations rather than individuals. DAO treasuries, consortium wallets, and custodial exchanges rely on threshold authorization and multi-party key management, yet existing payment mechanisms still target single-user wallets, leaving no unified solution for organizational transfers. We formalize the problem of \emph{DAO-to-(anonymous)-DAO} transactions and present \textsc{Dao$^2$}, a framework that enables one threshold-controlled organization to pay another, optionally with recipient anonymity, while keeping received funds under distributed control. \textsc{Dao$^2$} combines three components: \emph{distributed key derivation} (DKD) for non-stealth child addresses, \emph{distributed stealth-address generation} (DSAG) for unlinkable one-time destinations, and \emph{threshold signatures} for authorization. For ordinary transfers, the receiver derives a non-stealth address via DKD; for anonymous transfers, it derives a stealth address via DSAG. The sender then threshold-signs the payment, and the receiver redeems the funds without reconstructing any master secret. We formally prove its security and evaluate a prototype. A complete anonymous DAO-to-DAO transaction for a typical-sized (e.g., 7-member) DAO finishes in under 27\,ms with less than 1.2\,KB of communication, and scales linearly with DAO size.

#43 A Queueing-Theoretic Framework for Dynamic Attack Surfaces: Data-Integrated Risk Analysis and Adaptive Defense

著者: Jihyeon Yun, Abdullah Yasin Etcibasi, Ming Shi, C. Emre Koksal

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.10427

要約:
We develop a queueing-theoretic framework to model the temporal evolution of cyber-attack surfaces, where the number of active vulnerabilities is represented as the backlog of a queue. Vulnerabilities arrive as they are discovered or created, and leave the system when they are patched or successfully exploited. Building on this model, we study how automation affects attack and defense dynamics by introducing an AI amplification factor that scales arrival, exploit, and patching rates. Our analysis shows that even symmetric automation can increase the rate of successful exploits. We validate the model using vulnerability data collected from an open source software supply chain and show that it closely matches real-world attack surface dynamics. Empirical results reveal heavy-tailed patching times, which we prove induce long-range dependence in vulnerability backlog and help explain persistent cyber risk. Utilizing our queueing abstraction for the attack surface, we develop a systematic approach for cyber risk mitigation. We formulate the dynamic defense problem as a constrained Markov decision process with resource-budget and switching-cost constraints, and develop a reinforcement learning (RL) algorithm that achieves provably near-optimal regret. Numerical experiments validate the approach and demonstrate that our adaptive RL-based defense policies significantly reduce successful exploits and mitigate heavy-tail queue events. Using trace-driven experiments on the ARVO dataset, we show that the proposed RL-based defense policy reduces the average number of active vulnerabilities in a software supply chain by over 90% compared to existing defense practices, without increasing the overall maintenance budget. Our results allow defenders to quantify cumulative exposure risk under long-range dependent attack dynamics and to design adaptive defense strategies with provable efficiency.

#44 Critical-CoT: A Robust Defense Framework against Reasoning-Level Backdoor Attacks in Large Language Models

backdoor

著者: Vu Tuan Truong, Long Bao Le

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.10681

要約:
Large Language Models (LLMs), despite their impressive capabilities across domains, have been shown to be vulnerable to backdoor attacks. Prior backdoor strategies predominantly operate at the token level, where an injected trigger causes the model to generate a specific target word, choice, or class (depending on the task). Recent advances, however, exploit the long-form reasoning tendencies of modern LLMs to conduct reasoning-level backdoors: once triggered, the victim model inserts one or more malicious reasoning steps into its chain-of-thought (CoT). These attacks are substantially harder to detect, as the backdoored answer remains plausible and consistent with the poisoned reasoning trajectory. Yet, defenses tailored to this type of backdoor remain largely unexplored. To bridge this gap, we propose Critical-CoT, a novel defense mechanism that conducts a two-stage fine-tuning (FT) process on LLMs to develop critical thinking behaviors, enabling them to automatically identify potential backdoors and refuse to generate malicious reasoning steps. Extensive experiments across multiple LLMs and datasets demonstrate that Critical-CoT provides strong robustness against both in-context learning-based and FT-based backdoor attacks. Notably, Critical-CoT exhibits strong cross-domain and cross-task generalization. Our code is available at hthttps://github.com/tuanvu171/Critical-CoT.

#45 Improving Clean Accuracy via a Tangent-Space Perspective on Adversarial Training

著者: Bongsoo Yi, Rongjie Lai, Yao Li

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2408.14728

要約:
Adversarial training has proven effective in improving the robustness of deep neural networks against adversarial attacks. However, this enhanced robustness often comes at the cost of a substantial drop in accuracy on clean data. In this paper, we address this limitation by introducing Tangent Direction Guided Adversarial Training (TART), a novel method that enhances clean accuracy by exploiting the geometry of the data manifold. We argue that adversarial examples with large components in the normal direction can overly distort the decision boundary and degrade clean accuracy. TART addresses this issue by estimating the tangent direction of adversarial examples and adaptively modulating the perturbation bound based on the norm of their tangential component. To the best of our knowledge, TART is the first adversarial defense framework that explicitly incorporates the concept of tangent space and direction into adversarial training. Extensive experiments on both synthetic and benchmark datasets demonstrate that TART consistently improves clean accuracy while maintaining robustness against adversarial attacks.

#46 AutoRAN: Automated Hijacking of Safety Reasoning in Large Reasoning Models

著者: Jiacheng Liang, Tanqiu Jiang, Yuhui Wang, Rongyi Zhu, Fenglong Ma, Ting Wang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2505.10846

要約:
This paper presents AutoRAN, the first framework to automate the hijacking of internal safety reasoning in large reasoning models (LRMs). At its core, AutoRAN pioneers an execution simulation paradigm that leverages a weaker but less-aligned model to simulate execution reasoning for initial hijacking attempts and iteratively refine attacks by exploiting reasoning patterns leaked through the target LRM's refusals. This approach steers the target model to bypass its own safety guardrails and elaborate on harmful instructions. We evaluate AutoRAN against state-of-the-art LRMs, including GPT-o3/o4-mini and Gemini-2.5-Flash, across multiple benchmarks (AdvBench, HarmBench, and StrongReject). Results show that AutoRAN achieves approaching 100% success rate within one or few turns, effectively neutralizing reasoning-based defenses even when evaluated by robustly aligned external models. This work reveals that the transparency of the reasoning process itself creates a critical and exploitable attack surface, highlighting the urgent need for new defenses that protect models' reasoning traces rather than merely their final outputs.

#47 Fall into a Pit, Gain in a Wit: Cognitive-Guided Harmful Meme Detection via Misjudgment Risk Pattern Retrieval

著者: Wenshuo Wang, Ziyou Jiang, Junjie Wang, Mingyang Li, Jie Huang, Yuekai Huang, Zhiyuan Chang, Feiyan Duan, Qing Wang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2510.15946

要約:
Internet memes have emerged as a popular multimodal medium, yet they are increasingly weaponized to convey harmful opinions through subtle rhetorical devices like irony and metaphor. Existing detection approaches, including Multimodal Large Language Model (MLLM)-based techniques, struggle with these implicit expressions, leading to frequent misjudgments. This paper introduces PatMD, a novel approach that detects harmful memes by learning from and proactively mitigating these potential misjudgment risks. Our core idea is to move beyond superficial content-level matching and instead identify the underlying misjudgment risk patterns, proactively guiding the MLLMs to avoid known misjudgment pitfalls. We first construct a knowledge base where each meme is deconstructed into a misjudgment risk pattern explaining why it might be misjudged, either overlooking harmful undertones (false negative) or overinterpreting benign content (false positive). For a given target meme, PatMD retrieves relevant patterns and utilizes them to dynamically guide the MLLM's reasoning. Experiments on a benchmark of 6,626 memes across 5 harmful detection tasks show that PatMD outperforms state-of-the-art baselines, achieving an average of 8.30% improvement in F1-score and 7.71% improvement in accuracy, while exhibiting consistent robustness on unseen and adversarial memes.

#48 Fundamental Limitations of Favorable Privacy-Utility Guarantees for DP-SGD

privacy

著者: Murat Bilgehan Ertan, Marten van Dijk

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2601.10237

要約:
Differentially Private Stochastic Gradient Descent (DP-SGD) is the dominant paradigm for private training, but its fundamental limitations under worst-case adversarial privacy definitions remain poorly understood. We analyze DP-SGD in the $f$-differential privacy framework, which characterizes privacy via hypothesis-testing trade-off curves, and study shuffled sampling over a single epoch with $M$ gradient updates. We derive an explicit suboptimal upper bound on the achievable trade-off curve. This result induces a geometric lower bound on the separation $\kappa$ which is the maximum distance between the mechanism's trade-off curve and the ideal random-guessing line. Because a large separation implies significant adversarial advantage, meaningful privacy requires small $\kappa$. However, we prove that enforcing a small separation imposes a strict lower bound on the Gaussian noise multiplier $\sigma$, which directly limits the achievable utility. In particular, under the standard worst-case adversarial model, shuffled DP-SGD must satisfy $\sigma \ge \frac{1}{\sqrt{2\ln M}}$ $\quad\text{or}\quad$ $\kappa \ge\ \frac{1}{\sqrt{8}}\!\left(1-\frac{1}{\sqrt{4\pi\ln M}}\right)$, and thus cannot simultaneously achieve strong privacy and high utility. Although this bound vanishes asymptotically as $M \to \infty$, the convergence is extremely slow: even for practically relevant numbers of updates the required noise magnitude remains substantial. We further show that the same limitation extends to Poisson subsampling up to constant factors. Our experiments confirm that the noise levels implied by this bound leads to significant accuracy degradation at realistic training settings, thus showing a critical bottleneck in DP-SGD under standard worst-case adversarial assumptions.

#49 Prompt Injection as Role Confusion

著者: Charles Ye, Jasmine Cui, Dylan Hadfield-Menell

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.12277

要約:
Language models remain vulnerable to prompt injection attacks despite extensive safety training. We trace this failure to role confusion: models infer the source of text based on how it sounds, not where it actually comes from. A command hidden in a webpage hijacks an agent simply because it sounds like a user instruction. This is not just behavioral: in the model's internal representations, text that sounds like a trusted source occupies the same space as text that actually is one. We design role probes which measure how models internally perceive "who is speaking", showing that attacker-controllable signals (e.g. syntactic patterns, lexical choice) control role perception. We first test this with CoT Forgery, a zero-shot attack that injects fabricated reasoning into user prompts or ingested webpages. Models mistake the text for their own thoughts, yielding 60% attack success on StrongREJECT across frontier models with near-0% baselines. Strikingly, the degree of role confusion strongly predicts attack success. We then generalize these results to standard agent prompt injections, introducing a unifying framework that reframes prompt injection not as an ad-hoc exploit but as a measurable consequence of how models represent role.

#50 Securing Elliptic Curve Cryptocurrencies against Quantum Vulnerabilities: Resource Estimates and Mitigations

著者: Ryan Babbush, Adam Zalcman, Craig Gidney, Michael Broughton, Tanuj Khattar, Hartmut Neven, Thiago Bergamaschi, Justin Drake, Dan Boneh

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.28846

要約:
This whitepaper seeks to elucidate implications that the capabilities of developing quantum architectures have on blockchain vulnerabilities and mitigation strategies. First, we provide new resource estimates for breaking the 256-bit Elliptic Curve Discrete Logarithm Problem, the core of modern blockchain cryptography. We demonstrate that Shor's algorithm for this problem can execute with either <1200 logical qubits and <90 million Toffoli gates or <1450 logical qubits and <70 million Toffoli gates. In the interest of responsible disclosure, we use a zero-knowledge proof to validate these results without disclosing attack vectors. On superconducting architectures with 1e-3 physical error rates and planar connectivity, those circuits can execute in minutes using fewer than half a million physical qubits. We introduce a critical distinction between fast-clock (such as superconducting and photonic) and slow-clock (such as neutral atom and ion trap) architectures. Our analysis reveals that the first fast-clock CRQCs would enable on-spend attacks on public mempool transactions of some cryptocurrencies. We survey major cryptocurrency vulnerabilities through this lens, identifying systemic risks associated with advanced features in some blockchains such as smart contracts, Proof-of-Stake consensus, and Data Availability Sampling, as well as the enduring concern of abandoned assets. We argue that technical solutions would benefit from accompanying public policy and discuss various frameworks of digital salvage to regulate the recovery or destruction of dormant assets while preventing adversarial seizure. We also discuss implications for other digital assets and tokenization as well as challenges and successful examples of the ongoing transition to Post-Quantum Cryptography (PQC). Finally, we urge all vulnerable cryptocurrency communities to join the ongoing migration to PQC without delay.

#51 XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

intellectual property

著者: Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang

公開日: Fri, 17 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.05242

要約:
Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling reliable attribution and tracing of malicious usage of LLMs. Despite recent progress, existing methods still face key limitations: some become computationally infeasible for large messages, while others suffer from a poor trade-off between text quality and decoding accuracy. Moreover, the decoding accuracy of existing methods drops significantly when the number of tokens in the generated text is limited, a condition that frequently arises in practical usage. To address these challenges, we propose \textsc{XMark}, a novel method for encoding and decoding binary messages in LLM-generated texts. The unique design of \textsc{XMark}'s encoder produces a less distorted logit distribution for watermarked token generation, preserving text quality, and also enables its tailored decoder to reliably recover the encoded message with limited tokens. Extensive experiments across diverse downstream tasks show that \textsc{XMark} significantly improves decoding accuracy while preserving the quality of watermarked text, outperforming prior methods. The code is at https://github.com/JiiahaoXU/XMark.

cs.CR updates on arXiv.org

📋 論文タイトル一覧