arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Evaluating Adversarial Vulnerabilities in Modern Large Language Models

著者: Tom Perel

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17666

要約:
The recent boom and rapid integration of Large Language Models (LLMs) into a wide range of applications warrants a deeper understanding of their security and safety vulnerabilities. This paper presents a comparative analysis of the susceptibility to jailbreak attacks for two leading publicly available LLMs, Google's Gemini 2.5 Flash and OpenAI's GPT-4 (specifically the GPT-4o mini model accessible in the free tier). The research utilized two main bypass strategies: 'self-bypass', where models were prompted to circumvent their own safety protocols, and 'cross-bypass', where one model generated adversarial prompts to exploit vulnerabilities in the other. Four attack methods were employed - direct injection, role-playing, context manipulation, and obfuscation - to generate five distinct categories of unsafe content: hate speech, illegal activities, malicious code, dangerous content, and misinformation. The success of the attack was determined by the generation of disallowed content, with successful jailbreaks assigned a severity score. The findings indicate a disparity in jailbreak susceptibility between 2.5 Flash and GPT-4, suggesting variations in their safety implementations or architectural design. Cross-bypass attacks were particularly effective, indicating that an ample amount of vulnerabilities exist in the underlying transformer architecture. This research contributes a scalable framework for automated AI red-teaming and provides data-driven insights into the current state of LLM safety, underscoring the complex challenge of balancing model capabilities with robust safety mechanisms.

#2 MURMUR: Using cross-user chatter to break collaborative language agents in groups

agent

著者: Atharv Singh Patlan, Peiyao Sheng, S. Ashwin Hebbar, Prateek Mittal, Pramod Viswanath

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17671

要約:
Language agents are rapidly expanding from single-user assistants to multi-user collaborators in shared workspaces and groups. However, today's language models lack a mechanism for isolating user interactions and concurrent tasks, creating a new attack vector inherent to this new setting: cross-user poisoning (CUP). In a CUP attack, an adversary injects ordinary-looking messages that poison the persistent, shared state, which later triggers the agent to execute unintended, attacker-specified actions on behalf of benign users. We validate CUP on real systems, successfully attacking popular multi-user agents. To study the phenomenon systematically, we present MURMUR, a framework that composes single-user tasks into concurrent, group-based scenarios using an LLM to generate realistic, history-aware user interactions. We observe that CUP attacks succeed at high rates and their effects persist across multiple tasks, thus posing fundamental risks to multi-user LLM deployments. Finally, we introduce a first-step defense with task-based clustering to mitigate this new class of vulnerability

#3 QDNA-ID Quantum Device Native Authentication

著者: Osamah N. Neamah (Department of Mechatronics Engineering, Karabuk University, Karabuk, Turkey)

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17692

要約:
QDNA-ID is a trust-chain framework that links physical quantum behavior to digitally verified records. The system first executes standard quantum circuits with random shot patterns across different devices to generate entropy profiles and measurement data that reveal device-specific behavior. A Bell or CHSH test is then used to confirm that correlations originate from genuine non classical processes rather than classical simulation. The verified outcomes are converted into statistical fingerprints using entropy, divergence, and bias features to characterize each device. These features and metadata for device, session, and random seed parameters are digitally signed and time stamped to ensure integrity and traceability. Authenticated artifacts are stored in a hierarchical index for reproducible retrieval and long term auditing. A visualization and analytics interface monitors drift, policy enforcement, and device behavior logs. A machine learning engine tracks entropy drift, detects anomalies, and classifies devices based on evolving patterns. An external verification API supports independent recomputation of hashes, signatures, and CHSH evidence. QDNA-ID operates as a continuous feedback loop that maintains a persistent chain of trust for quantum computing environments.

#4 Pre-cache: A Microarchitectural Solution to prevent Meltdown and Spectre

著者: Subhash Sethumurugan, Hari Cherupalli, Kangjie Lu, John Sartori

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17726

要約:
Recent work has shown that out-of-order and speculative execution mechanisms used to increase performance in the majority of processors expose the processors to critical attacks. These attacks, called Meltdown and Spectre, exploit the side effects of performance-enhancing features in modern microprocessors to expose secret data through side channels in the microarchitecture. The well known implementations of these attacks exploit cache-based side channels since they are the least noisy channels to exfiltrate data. While some software patches attempted to mitigate these attacks, they are ad-hoc and only try to fix the side effects of the vulnerabilites. They may also impose a performance overhead of up to 30%. In this paper, we present a microarchitecture-based solution for Meltdown and Spectre that addresses the vulnerabilities exploited by the attacks. Our solution prevents flushed instructions from exposing data to the cache. Our approach can also be extended to other memory structures in the microarchitecture thereby preventing variants of the attacks which exploit these memory structures. We further identify two new variant attacks based on exploiting the side effects of speculative and out-of-order execution and show how our solution can be used to prevent these attacks. Evaluation results show that our microarchitectural solution not only restores secure out-of-order and speculative execution, but also has relatively low overhead and does not significantly impact performance for most applications.

#5 The Dark Side of Flexibility: How Aggregated Cyberattacks Threaten the Power Grid

著者: Daniel Myr\'en, Zeeshan Afzal, Mikael Asplund

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17748

要約:
Flexible energy resources are increasingly becoming common in smart grids. These resources are typically managed and controlled by aggregators that coordinate many resources to provide flexibility services. However, these aggregators and flexible energy resources are vulnerable, which could allow attackers to remotely control flexible energy resources to launch large-scale attacks on the grid. This paper investigates and evaluates the potential attack strategies that can be used to manipulate flexible energy resources to challenge the effectiveness of traditional grid stability measures and disrupt the first-swing stability of the power grid. Our work shows that although a large amount of power is required, the current flexibility capacities could potentially be sufficient to disrupt the grid on a national level.

#6 StealthCup: Realistic, Multi-Stage, Evasion-Focused CTF for Benchmarking IDS

著者: Manuel Kern, Dominik Steffan, Felix Schuster, Florian Skopik, Max Landauer, David Allison, Simon Freudenthaler, Edgar Weippl

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17761

要約:
Intrusion Detection Systems (IDS) are critical to defending enterprise and industrial control environments, yet evaluating their effectiveness under realistic conditions remains an open challenge. Existing benchmarks rely on synthetic datasets (e.g., NSL-KDD, CICIDS2017) or scripted replay frameworks, which fail to capture adaptive adversary behavior. Even MITRE ATT&CK Evaluations, while influential, are host-centric and assume malware-driven compromise, thereby under-representing stealthy, multi-stage intrusions across IT and OT domains. We present StealthCup, a novel evaluation methodology that operationalizes IDS benchmarking as an evasion-focused Capture-the-Flag competition. Professional penetration testers engaged in multi-stage attack chains on a realistic IT/OT testbed, with scoring penalizing IDS detections. The event generated structured attacker writeups, validated detections, and PCAPs, host logs, and alerts. Our results reveal that out of 32 exercised attack techniques, 11 were not detected by any IDS configuration. Open-source systems (Wazuh, Suricata) produced high false-positive rates >90%, while commercial tools generated fewer false positives but also missed more attacks. Comparison with the Volt Typhoon APT advisory confirmed strong realism: all 28 applicable techniques were exercised, 19 appeared in writeups, and 9 in forensic traces. These findings demonstrate that StealthCup elicits attacker behavior closely aligned with state-sponsored TTPs, while exposing blind spots across both open-source and commercial IDS. The resulting datasets and methodology provide a reproducible foundation for future stealth-focused IDS evaluation.

#7 Characteristics, Root Causes, and Detection of Incomplete Security Bug Fixes in the Linux Kernel

著者: Qiang Liu, Wenlong Zhang, Muhui Jiang, Lei Wu, Yajin Zhou

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17799

要約:
Security bugs in the Linux kernel emerge endlessly and have attracted much attention. However, fixing security bugs in the Linux kernel could be incomplete due to human mistakes. Specifically, an incomplete fix fails to repair all the original security defects in the software, fails to properly repair the original security defects, or introduces new ones. In this paper, we study the fixes of incomplete security bugs in the Linux kernel for the first time, and reveal their characteristics, root causes, as well as detection. We first construct a dataset of incomplete security bug fixes in the Linux kernel and answer the following three questions. What are the characteristics of incomplete security bug fixes in the Linux kernel? What are the root causes behind them? How should they be detected to reduce security risks? We then have the three main insights in the following. (*Due to the notification of arXiv "The Abstract field cannot be longer than 1,920 characters", the appeared Abstract is shortened. For the full Abstract, please download the Article.)

#8 Homomorphic Encryption-based Vaults for Anonymous Balances on VM-enabled Blockchains

著者: Xavier Salleras

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17842

要約:
In this work, we present homomorphic encryption-based vaults (Haults), a permissioned privacy-preserving smart wallet protocol for VM-enabled blockchains that keeps users' balances confidential, as well as the amounts transacted to other parties. To comply with regulations, we include optional compliance features that allow specific entities (the auditors) to retrieve transaction amounts or execute force transfers when necessary. Our solution uses ElGamal over elliptic curves to encrypt balances, combined with zero-knowledge proofs to verify the correctness of transaction amounts and the integrity of the sender's updated balance, among other security checks. We provide a detailed explanation of the protocol, including a security discussion and benchmarks from our proof-of-concept implementation, which yield great results. Beyond in-contract issued tokens, we also provide a thorough explanation on how our solution can be compatible with external ones (e.g., Ether or any ERC20).

#9 Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries

著者: Yunyi Zhang, Shibo Cui, Baojun Liu, Jingkai Yu, Min Zhang, Fan Shi, Han Zheng

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17874

要約:
LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional application development. While the increasing prevalence of LLM-powered applications provides users with unprecedented convenience, it also brings forth new security challenges. For such an emerging ecosystem, the security community lacks sufficient understanding of the LLM application ecosystem, especially regarding the capability boundaries of the applications themselves. In this paper, we systematically analyzed the new development paradigm and defined the concept of the LLM app capability space. We also uncovered potential new risks beyond jailbreak that arise from ambiguous capability boundaries in real-world scenarios, namely, capability downgrade and upgrade. To evaluate the impact of these risks, we designed and implemented an LLM app capability evaluation framework, LLMApp-Eval. First, we collected application metadata across 4 platforms and conducted a cross-platform ecosystem analysis. Then, we evaluated the risks for 199 popular applications among 4 platforms and 6 open-source LLMs. We identified that 178 (89.45%) potentially affected applications, which can perform tasks from more than 15 scenarios or be malicious. We even found 17 applications in our study that executed malicious tasks directly, without applying any adversarial rewriting. Furthermore, our experiments also reveal a positive correlation between the quality of prompt design and application robustness. We found that well-designed prompts enhance security, while poorly designed ones can facilitate abuse. We hope our work inspires the community to focus on the real-world risks of LLM applications and foster the development of a more robust LLM application ecosystem.

#10 Towards Automating Data Access Permissions in AI Agents

agent

著者: Yuhao Wu, Ke Yang, Franziska Roesner, Tadayoshi Kohno, Ning Zhang, Umar Iqbal

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17959

要約:
As AI agents attempt to autonomously act on users' behalf, they raise transparency and control issues. We argue that permission-based access control is indispensable in providing meaningful control to the users, but conventional permission models are inadequate for the automated agentic execution paradigm. We therefore propose automated permission management for AI agents. Our key idea is to conduct a user study to identify the factors influencing users' permission decisions and to encode these factors into an ML-based permission management assistant capable of predicting users' future decisions. We find that participants' permission decisions are influenced by communication context but importantly individual preferences tend to remain consistent within contexts, and align with those of other participants. Leveraging these insights, we develop a permission prediction model achieving 85.1% accuracy overall and 94.4% for high-confidence predictions. We find that even without using permission history, our model achieves an accuracy of 66.9%, and a slight increase of training samples (i.e., 1-4) can substantially increase the accuracy by 10.8%.

#11 Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models

backdoor

著者: Jiayi Luo, Qingyun Sun, Lingjuan Lyu, Ziwei Zhang, Haonan Yuan, Xingcheng Fu, Jianxin Li

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17982

要約:
Graph Foundation Models (GFMs) are pre-trained on diverse source domains and adapted to unseen targets, enabling broad generalization for graph machine learning. Despite that GFMs have attracted considerable attention recently, their vulnerability to backdoor attacks remains largely underexplored. A compromised GFM can introduce backdoor behaviors into downstream applications, posing serious security risks. However, launching backdoor attacks against GFMs is non-trivial due to three key challenges. (1) Effectiveness: Attackers lack knowledge of the downstream task during pre-training, complicating the assurance that triggers reliably induce misclassifications into desired classes. (2) Stealthiness: The variability in node features across domains complicates trigger insertion that remains stealthy. (3) Persistence: Downstream fine-tuning may erase backdoor behaviors by updating model parameters. To address these challenges, we propose GFM-BA, a novel Backdoor Attack model against Graph Foundation Models. Specifically, we first design a label-free trigger association module that links the trigger to a set of prototype embeddings, eliminating the need for knowledge about downstream tasks to perform backdoor injection. Then, we introduce a node-adaptive trigger generator, dynamically producing node-specific triggers, reducing the risk of trigger detection while reliably activating the backdoor. Lastly, we develop a persistent backdoor anchoring module that firmly anchors the backdoor to fine-tuning-insensitive parameters, enhancing the persistence of the backdoor under downstream adaptation. Extensive experiments demonstrate the effectiveness, stealthiness, and persistence of GFM-BA.

#12 Correlated-Sequence Differential Privacy

privacy

著者: Yifan Luo, Meng Zhang, Jin Xu, Junting Chen, Jianwei Huang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18025

要約:
Data streams collected from multiple sources are rarely independent. Values evolve over time and influence one another across sequences. These correlations improve prediction in healthcare, finance, and smart-city control yet violate the record-independence assumption built into most Differential Privacy (DP) mechanisms. To restore rigorous privacy guarantees without sacrificing utility, we introduce Correlated-Sequence Differential Privacy (CSDP), a framework specifically designed for preserving privacy in correlated sequential data. CSDP addresses two linked challenges: quantifying the extra information an attacker gains from joint temporal and cross-sequence links, and adding just enough noise to hide that information while keeping the data useful. We model multivariate streams as a Coupling Markov Chain, yielding the derived loose leakage bound expressed with a few spectral terms and revealing a counterintuitive result: stronger coupling can actually decrease worst-case leakage by dispersing perturbations across sequences. Guided by these bounds, we build the Freshness-Regulated Adaptive Noise (FRAN) mechanism--combining data aging, correlation-aware sensitivity scaling, and Laplace noise--that runs in linear time. Tests on two-sequence datasets show that CSDP improves the privacy-utility trade-off by approximately 50% over existing correlated-DP methods and by two orders of magnitude compared to the standard DP approach.

#13 SCI-IoT: A Quantitative Framework for Trust Scoring and Certification of IoT Devices

著者: Shreyansh Swami, Ishwardeep Singh, Chinmay Prawah Pant

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18045

要約:
The exponential growth of the Internet of Things (IoT) ecosystem has amplified concerns regarding device reliability, interoperability, and security assurance. Despite the proliferation of IoT security guidelines, a unified and quantitative approach to measuring trust remains absent. This paper introduces SCI-IoT (Secure Certification Index for IoT), a standardized and quantitative framework for trust scoring, evaluation, and certification of IoT devices. The framework employs a six-tier grading model (Grades A-F), enabling device profiling across consumer, industrial, and critical infrastructure domains. Within this model, 30 distinct Trust Tests assess devices across dimensions such as authentication, encryption, data integrity, resilience, and firmware security. Each test is assigned a criticality-based weight (1.0-2.0) and a performance rating (1-4), converted to a normalized percentage and aggregated through a weighted computation to yield the Secure Certification Index (SCI). The SCI determines the device's Trust Verdict, categorized into five SCI levels, and serves as the foundation for optional grade-based certification. The framework also incorporates critical gate conditions, enforcing absolute compliance in high risk parameters to prevent certification of devices with fundamental vulnerabilities. By unifying quantitative trust scoring with structured certification criteria, SCI-IoT provides a transparent, scalable, and reproducible method to benchmark IoT device trustworthiness. The proposed system aims to streamline manufacturer compliance, improve consumer confidence, and facilitate global interoperability in IoT security certification.

#14 Towards Harnessing the Power of LLMs for ABAC Policy Mining

著者: More Aayush Babasaheb (Indian Institute of Technology Kharagpur, India), Shamik Sural (Indian Institute of Technology Kharagpur, India)

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18098

要約:
This paper presents an empirical investigation into the capabilities of Large Language Models (LLMs) to perform automated Attribute-based Access Control (ABAC) policy mining. While ABAC provides fine-grained, context-aware access management, the increasing number and complexity of access policies can make their formulation and evaluation rather challenging. To address the task of synthesizing concise yet accurate policies, we evaluate the performance of some of the state-of-the-art LLMs, specifically Google Gemini (Flash and Pro) and OpenAI ChatGPT, as potential policy mining engines. An experimental framework was developed in Python to generate randomized access data parameterized by varying numbers of subjects, objects, and initial policy sets. The baseline policy sets, which govern permission decisions between subjects and objects, serve as the ground truth for comparison. Each LLM-generated policy was evaluated against the baseline policy using standard performance metrics. The results indicate that LLMs can effectively infer compact and valid ABAC policies for small-scale scenarios. However, as the system size increases, characterized by higher numbers of subjects and objects, LLM outputs exhibit declining accuracy and precision, coupled with significant increase in the size of policy generated, which is beyond the optimal size. These findings highlight both the promise and limitations of current LLM architectures for scalable policy mining in access control domains. Future work will explore hybrid approaches that combine prompt optimization with classical rule mining algorithms to improve scalability and interpretability in complex ABAC environments.

#15 ASTRA: Agentic Steerability and Risk Assessment Framework

agent

著者: Itay Hazan, Yael Mathov, Guy Shtar, Ron Bitton, Itsik Mantin

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18114

要約:
Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional software, AI agents leverage LLMs as their "brain" to autonomously perform actions via connected tools. This capability introduces significant risks that go far beyond those of harmful text presented in a chatbot that was the main application of LLMs. A compromised AI agent can deliberately abuse powerful tools to perform malicious actions, in many cases irreversible, and limited solely by the guardrails on the tools themselves and the LLM ability to enforce them. This paper presents ASTRA, a first-of-its-kind framework designed to evaluate the effectiveness of LLMs in supporting the creation of secure agents that enforce custom guardrails defined at the system-prompt level (e.g., "Do not send an email out of the company domain," or "Never extend the robotic arm in more than 2 meters"). Our holistic framework simulates 10 diverse autonomous agents varying between a coding assistant and a delivery drone equipped with 37 unique tools. We test these agents against a suite of novel attacks developed specifically for agentic threats, inspired by the OWASP Top 10 but adapted to challenge the ability of the LLM for policy enforcement during multi-turn planning and execution of strict tool activation. By evaluating 13 open-source, tool-calling LLMs, we uncovered surprising and significant differences in their ability to remain secure and keep operating within their boundaries. The purpose of this work is to provide the community with a robust and unified methodology to build and validate better LLMs, ultimately pushing for more secure and reliable agentic AI systems.

#16 eBPF-PATROL: Protective Agent for Threat Recognition and Overreach Limitation using eBPF in Containerized and Virtualized Environments

agent

著者: Sangam Ghimire, Nirjal Bhurtel, Roshan Sahani, Sudan Jha

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18155

要約:
With the increasing use and adoption of cloud and cloud-native computing, the underlying technologies (i.e., containerization and virtualization) have become foundational. However, strict isolation and maintaining runtime security in these environments has become increasingly challenging. Existing approaches like seccomp and Mandatory Access Control (MAC) frameworks offer some protection up to a limit, but often lack context awareness, syscall argument filtering, and adaptive enforcement, providing the ability to adjust decisions at runtime based on observed application behavior, workload changes, or detected anomalies rather than relying solely on static or predefined rules.This paper introduces eBPF-PATROL (eBPF-Protective Agent for Threat Recognition and Overreach Limitation), an extensible lightweight runtime security agent that uses extended Berkeley Packet Filter (eBPF) technology to monitor and enforce policies in containerized and virtualized environments. By intercepting system calls, analyzing execution context, and applying user-defined rules, eBPF-PATROL detects and prevents real-time boundary violations, such as reverse shells, privilege escalation, and container escape attempts. We describe the architecture, implementation, and evaluation of eBPF-PATROL, demonstrating its low overhead (< 2.5 percent) and high detection accuracy across real-world attack scenarios.

#17 A Novel and Practical Universal Adversarial Perturbations against Deep Reinforcement Learning based Intrusion Detection Systems

著者: H. Zhang, L. Zhang, G. Epiphaniou, C. Maple

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18223

要約:
Intrusion Detection Systems (IDS) play a vital role in defending modern cyber physical systems against increasingly sophisticated cyber threats. Deep Reinforcement Learning-based IDS, have shown promise due to their adaptive and generalization capabilities. However, recent studies reveal their vulnerability to adversarial attacks, including Universal Adversarial Perturbations (UAPs), which can deceive models with a single, input-agnostic perturbation. In this work, we propose a novel UAP attack against Deep Reinforcement Learning (DRL)-based IDS under the domain-specific constraints derived from network data rules and feature relationships. To the best of our knowledge, there is no existing study that has explored UAP generation for the DRL-based IDS. In addition, this is the first work that focuses on developing a UAP against a DRL-based IDS under realistic domain constraints based on not only the basic domain rules but also mathematical relations between the features. Furthermore, we enhance the evasion performance of the proposed UAP, by introducing a customized loss function based on the Pearson Correlation Coefficient, and we denote it as Customized UAP. To the best of our knowledge, this is also the first work using the PCC value in the UAP generation, even in the broader context. Four additional established UAP baselines are implemented for a comprehensive comparison. Experimental results demonstrate that our proposed Customized UAP outperforms two input-dependent attacks including Fast Gradient Sign Method (FGSM), Basic Iterative Method (BIM), and four UAP baselines, highlighting its effectiveness for real-world adversarial scenarios.

#18 Utilizing Circulant Structure to Optimize the Implementations of Linear Layers

著者: Buji Xu, Xiaoming Sun

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18226

要約:
In this paper, we propose a novel approach for optimizing the linear layer used in symmetric cryptography. It is observed that these matrices often have circulant structure. The basic idea of this work is to utilize the property to construct a sequence of transformation matrices, which allows subsequent heuristic algorithms to find more efficient implementations. Our results outperform previous works for various linear layers of block ciphers. For Whirlwind M0 , we obtain two implementations with 159 XOR counts (8% better than Yuan et al. at FSE 2025) and depth 17 (39% better than Shi et al. at AsiaCrypt 2024) respectively. For AES MixColumn, our automated method produces a quantum circuit with depth 10, which nearly matches the manually optimized state-of-the-art result by Zhang et al. at IEEE TC 2024, only with 2 extra CNOTs.

#19 Think Fast: Real-Time IoT Intrusion Reasoning Using IDS and LLMs at the Edge Gateway

著者: Saeid Jamshidi, Amin Nikanjam, Negar Shahabi, Kawser Wazed Nafi, Foutse Khomh, Samira Keivanpour, Rolando Herrero

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18230

要約:
As the number of connected IoT devices continues to grow, securing these systems against cyber threats remains a major challenge, especially in environments with limited computational and energy resources. This paper presents an edge-centric Intrusion Detection System (IDS) framework that integrates lightweight machine learning (ML) based IDS models with pre-trained large language models (LLMs) to improve detection accuracy, semantic interpretability, and operational efficiency at the network edge. The system evaluates six ML-based IDS models: Decision Tree (DT), K-Nearest Neighbors (KNN), Random Forest (RF), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and a hybrid CNN-LSTM model on low-power edge gateways, achieving accuracy up to 98 percent under real-world cyberattacks. For anomaly detection, the system transmits a compact and secure telemetry snapshot (for example, CPU usage, memory usage, latency, and energy consumption) via low-bandwidth API calls to LLMs including GPT-4-turbo, DeepSeek V2, and LLaMA 3.5. These models use zero-shot, few-shot, and chain-of-thought reasoning to produce human-readable threat analyses and actionable mitigation recommendations. Evaluations across diverse attacks such as DoS, DDoS, brute force, and port scanning show that the system enhances interpretability while maintaining low latency (<1.5 s), minimal bandwidth usage (<1.2 kB per prompt), and energy efficiency (<75 J), demonstrating its practicality and scalability as an IDS solution for edge gateways.

#20 Lightweight Autoencoder-Isolation Forest Anomaly Detection for Green IoT Edge Gateways

著者: Saeid Jamshidi, Fatemeh Erfan, Omar Abdul-Wahab, Martine Bellaiche, Foutse Khomh

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18235

要約:
The rapid growth of the Internet of Things (IoT) has given rise to highly diverse and interconnected ecosystems that are increasingly susceptible to sophisticated cyber threats. Conventional anomaly detection schemes often prioritize accuracy while overlooking computational efficiency and environmental impact, which limits their deployment in resource-constrained edge environments. This paper presents \textit{EcoDefender}, a sustainable hybrid anomaly detection framework that integrates \textit{Autoencoder(AE)}-based representation learning with \textit{Isolation Forest(IF)} anomaly scoring. Beyond empirical performance, EcoDefender is supported by a theoretical foundation that establishes formal guarantees for its stability, convergence, robustness, and energy-complexity coupling-thereby linking computational behavior to energy efficiency. Furthermore, experiments on realistic IoT traffic confirm these theoretical insights, achieving up to 94\% detection accuracy with an average CPU usage of only 22\%, 27 ms inference latency, and 30\% lower energy consumption compared to AE-only baselines. By embedding sustainability metrics directly into the security evaluation process, this work demonstrates that reliable anomaly detection and environmental responsibility can coexist within next-generation green IoT infrastructures, aligning with the United Nations Sustainable Development Goals (SDG 9: resilient infrastructure, SDG 13: climate action).

#21 Carbon-Aware Intrusion Detection: A Comparative Study of Supervised and Unsupervised DRL for Sustainable IoT Edge Gateways

著者: Saeid Jamshidi, Foutse Khomh, Kawser Wazed Nafi, Amin Nikanjam, Samira Keivanpour, Omar Abdul-Wahab, Martine Bellaiche

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18240

要約:
The rapid expansion of the Internet of Things (IoT) has intensified cybersecurity challenges, particularly in mitigating Distributed Denial-of-Service (DDoS) attacks at the network edge. Traditional Intrusion Detection Systems (IDSs) face significant limitations, including poor adaptability to evolving and zero-day attacks, reliance on static signatures and labeled datasets, and inefficiency on resource-constrained edge gateways. Moreover, most existing DRL-based IDS studies overlook sustainability factors such as energy efficiency and carbon impact. To address these challenges, this paper proposes two novel Deep Reinforcement Learning (DRL)-based IDS: DeepEdgeIDS, an unsupervised Autoencoder-DRL hybrid, and AutoDRL-IDS, a supervised LSTM-DRL model. Both DRL-based IDS are validated through theoretical analysis and experimental evaluation on edge gateways. Results demonstrate that AutoDRL-IDS achieves 94% detection accuracy using labeled data, while DeepEdgeIDS attains 98% accuracy and adaptability without labels. Distinctly, this study introduces a carbon-aware, multi-objective reward function optimized for sustainable and real-time IDS operations in dynamic IoT networks.

#22 On Addressing Isolation in Blockchain-Based Self-Sovereign Identity

著者: Andreea Elena Dr\u{a}gnoiu, Andrei Ciobanu, Ruxandra F. Olimid

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18379

要約:
Self-Sovereign Identity (SSI) grants holders full ownership and control of their digital identities, being the ultimate digital identity model. Operating in a decentralized manner, SSI enables the verification of claims, including privacy-preserving mechanisms. Blockchain, which can be used to implement a Verifiable Data Registry (VDR), is often considered one of the pillars of SSI, along with Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs). Unfortunately, blockchains are mostly siloed, affecting the interoperability and universality of SSI. We investigate the effect of blockchain isolation on blockchain-based SSI. We first define possible scenarios for cross-chain SSI and exemplify with real-life use cases. We then define specific requirements for cross-chain SSI and identify challenges, also in relation to the identified scenarios. We explore various solutions to achieve blockchain interoperability, with a focus on SSI. In particular, we identify the advantages and disadvantages of distinct cross-chain models for cross-chain SSI. Finally, we address the usability of cross-chain SSI and discuss security and privacy aspects, opening the way for future research.

#23 ioPUF+: A PUF Based on I/O Pull-Up/Down Resistors for Secret Key Generation in IoT Nodes

著者: Dilli Babu Porlapothula, Pralay Chakrabarty, Ananya Lakshmi Ravi, Kurian Polachan

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18412

要約:
In this work, we present ioPUF+, which incorporates a novel Physical Unclonable Function (PUF) that generates unique fingerprints for Integrated Circuits (ICs) and the IoT nodes encompassing them. The proposed PUF generates device-specific responses by measuring the pull-up and pull-down resistor values on the I/O pins of the ICs, which naturally vary across chips due to manufacturing-induced process variations. Since these resistors are already integrated into the I/O structures of most ICs, ioPUF+ requires no custom circuitry, and no new IC fabrication. This makes ioPUF+ suitable for cost-sensitive embedded systems built from Commercial Off-The-Shelf (COTS) components. Beyond introducing a new PUF, ioPUF+ includes a complete datapath for converting raw PUF responses into cryptographically usable secret keys using BCH error correction and SHA-256 hashing. Further ioPUF+ also demonstrate a practical use case of PUF derive secret keys in securing device-to-device communication using AES-encryption. We implemented ioPUF+ on the Infineon PSoC-5 microcontroller and evaluated its performance across 30 devices using standard PUF metrics. The results show excellent reliability (intra-device Hamming distance of 100.00%), strong uniqueness (inter-device Hamming distance of 50.33%), near-ideal uniformity (50.54%), and negligible bit aliasing. Stability tests under temperature and supply-voltage variations show worst-case bit-error rates of only 2.63% and 2.10%, respectively. We also profiled the resource and energy usage of the complete ioPUF+ system, including the PUF primitive, BCH decoding, SHA-256 hashing, and AES encryption. The full implementation requires only 19.8 KB of Flash, exhibits a latency of 600 ms, and consumes 79 mW of power, demonstrating the suitabilitiy of ioPUF+ for resource-constrained IoT nodes.

#24 LLMs as Firmware Experts: A Runtime-Grown Tree-of-Agents Framework

agent

著者: Xiangrui Zhang, Zeyu Chen, Haining Wang, Qiang Li

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18438

要約:
Large Language Models (LLMs) and their agent systems have recently demonstrated strong potential in automating code reasoning and vulnerability detection. However, when applied to large-scale firmware, their performance degrades due to the binary nature of firmware, complex dependency structures, and heterogeneous components. To address this challenge, this paper presents FIRMHIVE, a recursive agent hive that enables LLMs to act as autonomous firmware security analysts. FIRMHIVE introduces two key mechanisms: (1) transforming delegation into a per-agent, executable primitive and (2) constructing a runtime Tree of Agents (ToA) for decentralized coordination. We evaluate FIRMHIVE using real-world firmware images obtained from publicly available datasets, covering five representative security analysis tasks. Compared with existing LLM-agent baselines, FIRMHIVE performs deeper (about 16x more reasoning steps) and broader (about 2.3x more files inspected) cross-file exploration, resulting in about 5.6x more alerts per firmware. Compared to state-of-the-art (SOTA) security tools, FIRMHIVE identifies about 1.5x more vulnerabilities (1,802 total) and achieves 71% precision, representing significant improvements in both yield and fidelity.

#25 Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems

agent

著者: Xiaoqing Wang, Keman Huang, Bin Liang, Hongyu Li, Xiaoyong Du

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18467

要約:
The rapid advancement of Large Language Model (LLM)-driven multi-agent systems has significantly streamlined software developing tasks, enabling users with little technical expertise to develop executable applications. While these systems democratize software creation through natural language requirements, they introduce significant security risks that remain largely unexplored. We identify two risky scenarios: Malicious User with Benign Agents (MU-BA) and Benign User with Malicious Agents (BU-MA). We introduce the Implicit Malicious Behavior Injection Attack (IMBIA), demonstrating how multi-agent systems can be manipulated to generate software with concealed malicious capabilities beneath seemingly benign applications, and propose Adv-IMBIA as a defense mechanism. Evaluations across ChatDev, MetaGPT, and AgentVerse frameworks reveal varying vulnerability patterns, with IMBIA achieving attack success rates of 93%, 45%, and 71% in MU-BA scenarios, and 71%, 84%, and 45% in BU-MA scenarios. Our defense mechanism reduced attack success rates significantly, particularly in the MU-BA scenario. Further analysis reveals that compromised agents in the coding and testing phases pose significantly greater security risks, while also identifying critical agents that require protection against malicious user exploitation. Our findings highlight the urgent need for robust security measures in multi-agent software development systems and provide practical guidelines for implementing targeted, resource-efficient defensive strategies.

#26 DEXO: A Secure and Fair Exchange Mechanism for Decentralized IoT Data Markets

著者: Yue Li, Ifteher Alom, Wenhai Sun, Yang Xiao

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18498

要約:
Opening up data produced by the Internet of Things (IoT) and mobile devices for public utilization can maximize their economic value. Challenges remain in the trustworthiness of the data sources and the security of the trading process, particularly when there is no trust between the data providers and consumers. In this paper, we propose DEXO, a decentralized data exchange mechanism that facilitates secure and fair data exchange between data consumers and distributed IoT/mobile data providers at scale, allowing the consumer to verify the data generation process and the providers to be compensated for providing authentic data, with correctness guarantees from the exchange platform. To realize this, DEXO extends the decentralized oracle network model that has been successful in the blockchain applications domain to incorporate novel hardware-cryptographic co-design that harmonizes trusted execution environment, secret sharing, and smart contract-assisted fair exchange. For the first time, DEXO ensures end-to-end data confidentiality, source verifiability, and fairness of the exchange process with strong resilience against participant collusion. We implemented a prototype of the DEXO system to demonstrate feasibility. The evaluation shows a moderate deployment cost and significantly improved blockchain operation efficiency compared to a popular data exchange mechanism.

#27 LockForge: Automating Paper-to-Code for Logic Locking with Multi-Agent Reasoning LLMs

agent

著者: Akashdeep Saha, Zeng Wang, Prithwish Basu Roy, Johann Knechtel, Ozgur Sinanoglu, Ramesh Karri

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18531

要約:
Despite rapid progress in logic locking (LL), reproducibility remains a challenge as codes are rarely made public. We present LockForge, a first-of-its-kind, multi-agent large language model (LLM) framework that turns LL descriptions in papers into executable and tested code. LockForge provides a carefully crafted pipeline realizing forethought, implementation, iterative refinement, and a multi-stage validation, all to systematically bridge the gap between prose and practice for complex LL schemes. For validation, we devise (i) an LLM-as-Judge stage with a scoring system considering behavioral checks, conceptual mechanisms, structural elements, and reproducibility on benchmarks, and (ii) an independent LLM-as-Examiner stage for ground-truth assessment. We apply LockForge to 10 seminal LL schemes, many of which lack reference implementations. Our evaluation on multiple SOTA LLMs, including ablation studies, reveals the significant complexity of the task. We show that an advanced reasoning model and a sophisticated, multi-stage framework like LockForge are required. We release all implementations and benchmarks, providing a reproducible and fair foundation for evaluation of further LL research.

#28 Zero-Trust Strategies for O-RAN Cellular Networks: Principles, Challenges and Research Directions

著者: Charalampos Katsis, Imtiaz Karim, Elisa Bertino

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18568

要約:
Cellular networks have become foundational to modern communication, supporting a broad range of applications, from civilian use to enterprise systems and military tactical networks. The advent of fifth-generation and beyond cellular networks (B5G) introduces emerging compute capabilities into the Radio Access Network (RAN), transforming it from a traditionally closed, vendor-locked infrastructure into an open and programmable ecosystem. This evolution, exemplified by Open-RAN (O-RAN), enables the deployment of control-plane applications from diverse sources, which can dynamically influence user-plane traffic in response to real-time events. As cellular infrastructures become more disaggregated and software-driven, security becomes an increasingly critical concern. Zero-Trust Architecture (ZTA) has emerged as a promising security paradigm that discards implicit trust assumptions by acknowledging that threats may arise from both external and internal sources. ZTA mandates comprehensive and fine-grained security mechanisms across both control and user planes to contain adversarial movements and enhance breach detection and attack response actions. In this paper, we explore the adoption of ZTA in the context of 5G and beyond, with a particular focus on O-RAN as an architectural enabler. We analyze how ZTA principles align with the architectural and operational characteristics of O-RAN, and identify key challenges and opportunities for embedding zero-trust mechanisms within O-RAN-based cellular networks.

#29 TASO: Jailbreak LLMs via Alternative Template and Suffix Optimization

著者: Yanting Wang, Runpeng Geng, Jinghui Chen, Minhao Cheng, Jinyuan Jia

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18581

要約:
Many recent studies showed that LLMs are vulnerable to jailbreak attacks, where an attacker can perturb the input of an LLM to induce it to generate an output for a harmful question. In general, existing jailbreak techniques either optimize a semantic template intended to induce the LLM to produce harmful outputs or optimize a suffix that leads the LLM to initiate its response with specific tokens (e.g., "Sure"). In this work, we introduce TASO (Template and Suffix Optimization), a novel jailbreak method that optimizes both a template and a suffix in an alternating manner. Our insight is that suffix optimization and template optimization are complementary to each other: suffix optimization can effectively control the first few output tokens but cannot control the overall quality of the output, while template optimization provides guidance for the entire output but cannot effectively control the initial tokens, which significantly impact subsequent responses. Thus, they can be combined to improve the attack's effectiveness. We evaluate the effectiveness of TASO on benchmark datasets (including HarmBench and AdvBench) on 24 leading LLMs (including models from the Llama family, OpenAI, and DeepSeek). The results demonstrate that TASO can effectively jailbreak existing LLMs. We hope our work can inspire future studies in exploring this direction. We will make code and data publicly available.

#30 FHE-Agent: Automating CKKS Configuration for Practical Encrypted Inference via an LLM-Guided Agentic Framework

agent

著者: Nuo Xu, Zhaoting Gong, Ran Ran, Jinwei Tang, Wujie Wen, Caiwen Ding

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18653

要約:
Fully Homomorphic Encryption (FHE), particularly the CKKS scheme, is a promising enabler for privacy-preserving MLaaS, but its practical deployment faces a prohibitive barrier: it heavily relies on domain expertise. Configuring CKKS involves a tightly coupled space of ring dimensions, modulus chains, and packing layouts. Without deep cryptographic knowledge to navigate these interactions, practitioners are restricted to compilers that rely on fixed heuristics. These "one-shot" tools often emit rigid configurations that are either severely over-provisioned in latency or fail to find a feasible solution entirely for deeper networks. We present FHE-Agent, an agentic framework that automates this expert reasoning process. By coupling a Large Language Model (LLM) controller with a deterministic tool suite, FHE-Agent decomposes the search into global parameter selection and layer-wise bottleneck repair. The agents operate within a multi-fidelity workflow, pruning invalid regimes using cheap static analysis and reserving expensive encrypted evaluations for the most promising candidates. We instantiate FHE-Agent on the Orion compiler and evaluate it on standard benchmarks (MLP, LeNet, LoLa) and deeper architectures (AlexNet). FHE-Agent consistently achieves better precision and lower latency than na\"ive search strategies. Crucially, it automatically discovers feasible, 128-bit secure configurations for complex models where baseline heuristics and one-shot prompts fail to produce a valid setup.

#31 Evaluation of Real-Time Mitigation Techniques for Cyber Security in IEC 61850 / IEC 62351 Substations

著者: Akila Herath, Chen-Ching Liu, Junho Hong, Kuchan Park

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18748

要約:
The digitalization of substations enlarges the cyber-attack surface, necessitating effective detection and mitigation of cyber attacks in digital substations. While machine learning-based intrusion detection has been widely explored, such methods have not demonstrated detection and mitigation within the required real-time budget. In contrast, cryptographic authentication has emerged as a practical candidate for real-time cyber defense, as specified in IEC 62351. In addition, lightweight rule-based intrusion detection that validates IEC 61850 semantics can provide specification-based detection of anomalous or malicious traffic with minimal processing delay. This paper presents the design logic and implementation aspects of three potential real-time mitigation techniques capable of countering GOOSE-based attacks: (i) IEC 62351-compliant message authentication code (MAC) scheme, (ii) a semantics-enforced rule-based intrusion detection system (IDS), and (iii) a hybrid approach integrating both MAC verification and Intrusion Detection System (IDS). A comparative evaluation of these real-time mitigation approaches is conducted using a cyber-physical system (CPS) security testbed. The results show that the hybrid integration significantly enhances mitigation capability. Furthermore, the processing delays of all three methods remain within the strict delivery requirements of GOOSE communication. The study also identifies limitations that none of the techniques can fully address, highlighting areas for future work.

#32 Re-Key-Free, Risky-Free: Adaptable Model Usage Control

著者: Zihan Wang, Zhongkui Ma, Xinguo Feng, Chuan Yan, Dongge Liu, Ruoxi Sun, Derui Wang, Minhui Xue, Guangdong Bai

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18772

要約:
Deep neural networks (DNNs) have become valuable intellectual property of model owners, due to the substantial resources required for their development. To protect these assets in the deployed environment, recent research has proposed model usage control mechanisms to ensure models cannot be used without proper authorization. These methods typically lock the utility of the model by embedding an access key into its parameters. However, they often assume static deployment, and largely fail to withstand continual post-deployment model updates, such as fine-tuning or task-specific adaptation. In this paper, we propose ADALOC, to endow key-based model usage control with adaptability during model evolution. It strategically selects a subset of weights as an intrinsic access key, which enables all model updates to be confined to this key throughout the evolution lifecycle. ADALOC enables using the access key to restore the keyed model to the latest authorized states without redistributing the entire network (i.e., adaptation), and frees the model owner from full re-keying after each model update (i.e., lock preservation). We establish a formal foundation to underpin ADALOC, providing crucial bounds such as the errors introduced by updates restricted to the access key. Experiments on standard benchmarks, such as CIFAR-100, Caltech-256, and Flowers-102, and modern architectures, including ResNet, DenseNet, and ConvNeXt, demonstrate that ADALOC achieves high accuracy under significant updates while retaining robust protections. Specifically, authorized usages consistently achieve strong task-specific performance, while unauthorized usage accuracy drops to near-random guessing levels (e.g., 1.01% on CIFAR-100), compared to up to 87.01% without ADALOC. This shows that ADALOC can offer a practical solution for adaptive and protected DNN deployment in evolving real-world scenarios.

#33 RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation

privacy

著者: Benyamin Tafreshian

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18790

要約:
Content moderation pipelines for modern large language models combine static filters, dedicated moderation services, and alignment tuned base models, yet real world deployments still exhibit dangerous failure modes. This paper presents RoguePrompt, an automated jailbreak attack that converts a disallowed user query into a self reconstructing prompt which passes provider moderation while preserving the original harmful intent. RoguePrompt partitions the instruction across two lexical streams, applies nested classical ciphers, and wraps the result in natural language directives that cause the target model to decode and execute the hidden payload. Our attack assumes only black box access to the model and to the associated moderation endpoint. We instantiate RoguePrompt against GPT 4o and evaluate it on 2 448 prompts that a production moderation system previously marked as strongly rejected. Under an evaluation protocol that separates three security relevant outcomes bypass, reconstruction, and execution the attack attains 84.7 percent bypass, 80.2 percent reconstruction, and 71.5 percent full execution, substantially outperforming five automated jailbreak baselines. We further analyze the behavior of several automated and human aligned evaluators and show that dual layer lexical transformations remain effective even when detectors rely on semantic similarity or learned safety rubrics. Our results highlight systematic blind spots in current moderation practice and suggest that robust deployment will require joint reasoning about user intent, decoding workflows, and model side computation rather than surface level toxicity alone.

#34 Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations

著者: Ryan Wong (National University of Singapore), Hosea David Yu Fei Ng (National University of Singapore), Dhananjai Sharma (National University of Singapore), Glenn Jun Jie Ng (National University of Singapore), Kavishvaran Srinivasan (National University of Singapore)

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18933

要約:
Large Language Models (LLMs) remain susceptible to jailbreak exploits that bypass safety filters and induce harmful or unethical behavior. This work presents a systematic taxonomy of existing jailbreak defenses across prompt-level, model-level, and training-time interventions, followed by three proposed defense strategies. First, a Prompt-Level Defense Framework detects and neutralizes adversarial inputs through sanitization, paraphrasing, and adaptive system guarding. Second, a Logit-Based Steering Defense reinforces refusal behavior through inference-time vector steering in safety-sensitive layers. Third, a Domain-Specific Agent Defense employs the MetaGPT framework to enforce structured, role-based collaboration and domain adherence. Experiments on benchmark datasets show substantial reductions in attack success rate, achieving full mitigation under the agent-based defense. Overall, this study highlights how jailbreaks pose a significant security threat to LLMs and identifies key intervention points for prevention, while noting that defense strategies often involve trade-offs between safety, performance, and scalability. Code is available at: https://github.com/Kuro0911/CS5446-Project

#35 Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation

著者: Junbo Zhang, Ran Chen, Qianli Zhou, Xinyang Deng, Wen Jiang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19009

要約:
Large language models demonstrate powerful capabilities across various natural language processing tasks, yet they also harbor safety vulnerabilities. To enhance LLM safety, various jailbreak defense methods have been proposed to guard against harmful outputs. However, improvements in model safety often come at the cost of severe over-refusal, failing to strike a good balance between safety and usability. In this paper, we first analyze the causes of over-refusal from a representation perspective, revealing that over-refusal samples reside at the boundary between benign and malicious samples. Based on this, we propose MOSR, designed to mitigate over-refusal by intervening the safety representation of LLMs. MOSR incorporates two novel components: (1) Overlap-Aware Loss Weighting, which determines the erasure weight for malicious samples by quantifying their similarity to pseudo-malicious samples in the representation space, and (2) Context-Aware Augmentation, which supplements the necessary context for rejection decisions by adding harmful prefixes before rejection responses. Experiments demonstrate that our method outperforms existing approaches in mitigating over-refusal while largely maintaining safety. Overall, we advocate that future defense methods should strike a better balance between safety and over-refusal.

#36 Can LLMs Threaten Human Survival? Benchmarking Potential Existential Threats from LLMs via Prefix Completion

著者: Yu Cui, Yifei Liu, Hang Fu, Sicheng Pan, Haibin Zhang, Cong Zuo, Licheng Wang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19171

要約:
Research on the safety evaluation of large language models (LLMs) has become extensive, driven by jailbreak studies that elicit unsafe responses. Such response involves information already available to humans, such as the answer to "how to make a bomb". When LLMs are jailbroken, the practical threat they pose to humans is negligible. However, it remains unclear whether LLMs commonly produce unpredictable outputs that could pose substantive threats to human safety. To address this gap, we study whether LLM-generated content contains potential existential threats, defined as outputs that imply or promote direct harm to human survival. We propose \textsc{ExistBench}, a benchmark designed to evaluate such risks. Each sample in \textsc{ExistBench} is derived from scenarios where humans are positioned as adversaries to AI assistants. Unlike existing evaluations, we use prefix completion to bypass model safeguards. This leads the LLMs to generate suffixes that express hostility toward humans or actions with severe threat, such as the execution of a nuclear strike. Our experiments on 10 LLMs reveal that LLM-generated content indicates existential threats. To investigate the underlying causes, we also analyze the attention logits from LLMs. To highlight real-world safety risks, we further develop a framework to assess model behavior in tool-calling. We find that LLMs actively select and invoke external tools with existential threats. Code and data are available at: https://github.com/cuiyu-ai/ExistBench.

#37 Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization

著者: Xurui Li, Kaisong Song, Rui Zhu, Pin-Yu Chen, Haixu Tang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19218

要約:
Large Language Models (LLMs) have developed rapidly in web services, delivering unprecedented capabilities while amplifying societal risks. Existing works tend to focus on either isolated jailbreak attacks or static defenses, neglecting the dynamic interplay between evolving threats and safeguards in real-world web contexts. To mitigate these challenges, we propose ACE-Safety (Adversarial Co-Evolution for LLM Safety), a novel framework that jointly optimize attack and defense models by seamlessly integrating two key innovative procedures: (1) Group-aware Strategy-guided Monte Carlo Tree Search (GS-MCTS), which efficiently explores jailbreak strategies to uncover vulnerabilities and generate diverse adversarial samples; (2) Adversarial Curriculum Tree-aware Group Policy Optimization (AC-TGPO), which jointly trains attack and defense LLMs with challenging samples via curriculum reinforcement learning, enabling robust mutual improvement. Evaluations across multiple benchmarks demonstrate that our method outperforms existing attack and defense approaches, and provides a feasible pathway for developing LLMs that can sustainably support responsible AI ecosystems.

#38 FedPoisonTTP: A Threat Model and Poisoning Attack for Federated Test-Time Personalization

backdoor

著者: Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Amin Ahsan Ali, AKM Mahbubur Rahman, Sajib Mistry, Aneesh Krishna

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19248

要約:
Test-time personalization in federated learning enables models at clients to adjust online to local domain shifts, enhancing robustness and personalization in deployment. Yet, existing federated learning work largely overlooks the security risks that arise when local adaptation occurs at test time. Heterogeneous domain arrivals, diverse adaptation algorithms, and limited cross-client visibility create vulnerabilities where compromised participants can craft poisoned inputs and submit adversarial updates that undermine both global and per-client performance. To address this threat, we introduce FedPoisonTTP, a realistic grey-box attack framework that explores test-time data poisoning in the federated adaptation setting. FedPoisonTTP distills a surrogate model from adversarial queries, synthesizes in-distribution poisons using feature-consistency, and optimizes attack objectives to generate high-entropy or class-confident poisons that evade common adaptation filters. These poisons are injected during local adaptation and spread through collaborative updates, leading to broad degradation. Extensive experiments on corrupted vision benchmarks show that compromised participants can substantially diminish overall test-time performance.

#39 Medusa: Cross-Modal Transferable Adversarial Attacks on Multimodal Medical Retrieval-Augmented Generation

著者: Yingjia Shang, Yi Liu, Huimin Wang, Furong Li, Wenfang Sun, Wu Chengyu, Yefeng Zheng

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19257

要約:
With the rapid advancement of retrieval-augmented vision-language models, multimodal medical retrieval-augmented generation (MMed-RAG) systems are increasingly adopted in clinical decision support. These systems enhance medical applications by performing cross-modal retrieval to integrate relevant visual and textual evidence for tasks, e.g., report generation and disease diagnosis. However, their complex architecture also introduces underexplored adversarial vulnerabilities, particularly via visual input perturbations. In this paper, we propose Medusa, a novel framework for crafting cross-modal transferable adversarial attacks on MMed-RAG systems under a black-box setting. Specifically, Medusa formulates the attack as a perturbation optimization problem, leveraging a multi-positive InfoNCE loss (MPIL) to align adversarial visual embeddings with medically plausible but malicious textual targets, thereby hijacking the retrieval process. To enhance transferability, we adopt a surrogate model ensemble and design a dual-loop optimization strategy augmented with invariant risk minimization (IRM). Extensive experiments on two real-world medical tasks, including medical report generation and disease diagnosis, demonstrate that Medusa achieves over 90% average attack success rate across various generation models and retrievers under appropriate parameter configuration, while remaining robust against four mainstream defenses, outperforming state-of-the-art baselines. Our results reveal critical vulnerabilities in the MMed-RAG systems and highlight the necessity of robustness benchmarking in safety-critical medical applications. The code and data are available at https://anonymous.4open.science/r/MMed-RAG-Attack-F05A.

#40 Evolution of Cybersecurity Subdisciplines: A Science of Science Study

著者: Yao Chen, Jeff Yan

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19331

要約:
The science of science is an emerging field that studies the practice of science itself. We present the first study of the cybersecurity discipline from a science of science perspective. We examine the evolution of two comparable interdisciplinary communities in cybersecurity: the Symposium on Usable Privacy and Security (SOUPS) and Financial Cryptography and Data Security (FC).

#41 Binary BPE: A Family of Cross-Platform Tokenizers for Binary Analysis

著者: Michael J. Bommarito II

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17573

要約:
Sequence models for binary analysis are bottlenecked by byte-level tokenization: raw bytes waste precious context window capacity for transformers and other neural network architectures, and many existing text-oriented tokenizers fail on arbitrary 0x00--0xFF sequences. To address this issue, we introduce the Binary BPE tokenizer family, a set of cross-platform Byte Pair Encoding (BPE) tokenizers for executables trained on a large corpus of binaries spanning multiple platforms, architectures, and operating systems, including Linux, Windows, macOS, Android, and malware sources. We release trained tokenizers with vocabularies of 4K, 8K, 16K, 32K, and 64K tokens, enabling both systematic scaling studies and practical deployment from resource-constrained edge devices to high-throughput datacenters. These tokenizers discover interpretable patterns (ELF/PE headers, instruction sequences, cross-platform strings) while yielding multi-byte compression per token. On representative uncompressed executables (e.g., ELF/PE/Mach-O rather than compressed APKs), the Binary BPE tokenizers typically allow for roughly 2-3x more binary content per fixed-length transformer context window than raw bytes, enabling more efficient research and practical deployment for content identification, malware detection, reverse engineering, and optimization. We release the trained Binary BPE tokenizers on HuggingFace, providing a drop-in, open-source foundation for binary-focused language models and context-efficient agentic tools.

#42 Optimized Memory Tagging on AmpereOne Processors

著者: Shiv Kaushik, Mahesh Madhav, Nagi Aboulenein, Jason Bessette, Sandeep Brahmadathan, Ben Chaffin, Matthew Erler, Stephan Jourdan, Thomas Maciukenas, Ramya Masti, Jon Perry, Massimo Sutera, Scott Tetrick, Bret Toll, David Turley, Carl Worth, Atiq Bajwa

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17773

要約:
Memory-safety escapes continue to form the launching pad for a wide range of security attacks, especially for the substantial base of deployed software that is coded in pointer-based languages such as C/C++. Although compiler and Instruction Set Architecture (ISA) extensions have been introduced to address elements of this issue, the overhead and/or comprehensive applicability have limited broad production deployment. The Memory Tagging Extension (MTE) to the ARM AArch64 Instruction Set Architecture is a valuable tool to address memory-safety escapes; when used in synchronous tag-checking mode, MTE provides deterministic detection and prevention of sequential buffer overflow attacks, and probabilistic detection and prevention of exploits resulting from temporal use-after-free pointer programming bugs. The AmpereOne processor, launched in 2024, is the first datacenter processor to support MTE. Its optimized MTE implementation uniquely incurs no memory capacity overhead for tag storage and provides synchronous tag-checking with single-digit performance impact across a broad range of datacenter class workloads. Furthermore, this paper analyzes the complete hardware-software stack, identifying application memory management as the primary remaining source of overhead and highlighting clear opportunities for software optimization. The combination of an efficient hardware foundation and a clear path for software improvement makes the MTE implementation of the AmpereOne processor highly attractive for deployment in production cloud environments.

#43 Uncertainty-Aware Federated Learning for Cyber-Resilient Microgrid Energy Management

著者: Oluleke Babayomi, Dong-Seong Kim

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17968

要約:
Maintaining economic efficiency and operational reliability in microgrid energy management systems under cyberattack conditions remains challenging. Most approaches assume non-anomalous measurements, make predictions with unquantified uncertainties, and do not mitigate malicious attacks on renewable forecasts for energy management optimization. This paper presents a comprehensive cyber-resilient framework integrating federated Long Short-Term Memory-based photovoltaic forecasting with a novel two-stage cascade false data injection attack detection and energy management system optimization. The approach combines autoencoder reconstruction error with prediction uncertainty quantification to enable attack-resilient energy storage scheduling while preserving data privacy. Extreme false data attack conditions were studied that caused 58% forecast degradation and 16.9\% operational cost increases. The proposed integrated framework reduced false positive detections by 70%, recovered 93.7% of forecasting performance losses, and achieved 5\% operational cost savings, mitigating 34.7% of attack-induced economic losses. Results demonstrate that precision-focused cascade detection with multi-signal fusion outperforms single-signal approaches, validating security-performance synergy for decentralized microgrids.

#44 Federated Anomaly Detection and Mitigation for EV Charging Forecasting Under Cyberattacks

著者: Oluleke Babayomi, Dong-Seong Kim

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17978

要約:
Electric Vehicle (EV) charging infrastructure faces escalating cybersecurity threats that can severely compromise operational efficiency and grid stability. Existing forecasting techniques are limited by the lack of combined robust anomaly mitigation solutions and data privacy preservation. Therefore, this paper addresses these challenges by proposing a novel anomaly-resilient federated learning framework that simultaneously preserves data privacy, detects cyber-attacks, and maintains trustworthy demand prediction accuracy under adversarial conditions. The proposed framework integrates three key innovations: LSTM autoencoder-based distributed anomaly detection deployed at each federated client, interpolation-based anomalous data mitigation to preserve temporal continuity, and federated Long Short-Term Memory (LSTM) networks that enable collaborative learning without centralized data aggregation. The framework is validated on real-world EV charging infrastructure datasets combined with real-world DDoS attack datasets, providing robust validation of the proposed approach under realistic threat scenarios. Experimental results demonstrate that the federated approach achieves superior performance compared to centralized models, with 15.2% improvement in R2 accuracy while maintaining data locality. The integrated cyber-attack detection and mitigation system produces trustworthy datasets that enhance prediction reliability, recovering 47.9% of attack-induced performance degradation while maintaining exceptional precision (91.3%) and minimal false positive rates (1.21%). The proposed architecture enables enhanced EV infrastructure planning, privacy-preserving collaborative forecasting, cybersecurity resilience, and rapid recovery from malicious threats across distributed charging networks.

#45 Privacy Auditing of Multi-domain Graph Pre-trained Model under Membership Inference Attacks

privacy

著者: Jiayi Luo, Qingyun Sun, Yuecen Wei, Haonan Yuan, Xingcheng Fu, Jianxin Li

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17989

要約:
Multi-domain graph pre-training has emerged as a pivotal technique in developing graph foundation models. While it greatly improves the generalization of graph neural networks, its privacy risks under membership inference attacks (MIAs), which aim to identify whether a specific instance was used in training (member), remain largely unexplored. However, effectively conducting MIAs against multi-domain graph pre-trained models is a significant challenge due to: (i) Enhanced Generalization Capability: Multi-domain pre-training reduces the overfitting characteristics commonly exploited by MIAs. (ii) Unrepresentative Shadow Datasets: Diverse training graphs hinder the obtaining of reliable shadow graphs. (iii) Weakened Membership Signals: Embedding-based outputs offer less informative cues than logits for MIAs. To tackle these challenges, we propose MGP-MIA, a novel framework for Membership Inference Attacks against Multi-domain Graph Pre-trained models. Specifically, we first propose a membership signal amplification mechanism that amplifies the overfitting characteristics of target models via machine unlearning. We then design an incremental shadow model construction mechanism that builds a reliable shadow model with limited shadow graphs via incremental learning. Finally, we introduce a similarity-based inference mechanism that identifies members based on their similarity to positive and negative samples. Extensive experiments demonstrate the effectiveness of our proposed MGP-MIA and reveal the privacy risks of multi-domain graph pre-training.

#46 Vulnerability-Aware Robust Multimodal Adversarial Training

著者: Junrui Zhang, Xinyu Zhao, Jie Peng, Chenjie Wang, Jianmin Ji, Tianlong Chen

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18138

要約:
Multimodal learning has shown significant superiority on various tasks by integrating multiple modalities. However, the interdependencies among modalities increase the susceptibility of multimodal models to adversarial attacks. Existing methods mainly focus on attacks on specific modalities or indiscriminately attack all modalities. In this paper, we find that these approaches ignore the differences between modalities in their contribution to final robustness, resulting in suboptimal robustness performance. To bridge this gap, we introduce Vulnerability-Aware Robust Multimodal Adversarial Training (VARMAT), a probe-in-training adversarial training method that improves multimodal robustness by identifying the vulnerability of each modality. To be specific, VARMAT first explicitly quantifies the vulnerability of each modality, grounded in a first-order approximation of the attack objective (Probe). Then, we propose a targeted regularization term that penalizes modalities with high vulnerability, guiding robust learning while maintaining task accuracy (Training). We demonstrate the enhanced robustness of our method across multiple multimodal datasets involving diverse modalities. Finally, we achieve {12.73%, 22.21%, 11.19%} robustness improvement on three multimodal datasets, revealing a significant blind spot in multimodal adversarial training.

#47 Vision Token Masking Alone Cannot Prevent PHI Leakage in Medical Document OCR: A Systematic Evaluation

著者: Richard J. Young

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18272

要約:
Large vision-language models (VLMs) are increasingly deployed for optical character recognition (OCR) in healthcare settings, raising critical concerns about protected health information (PHI) exposure during document processing. This work presents the first systematic evaluation of inference-time vision token masking as a privacy-preserving mechanism for medical document OCR using DeepSeek-OCR. We introduce seven masking strategies (V3-V9) targeting different architectural layers (SAM encoder blocks, compression layers, dual vision encoders, projector fusion) and evaluate PHI reduction across HIPAA-defined categories using 100 synthetic medical billing statements (drawn from a corpus of 38,517 annotated documents) with perfect ground-truth annotations. All masking strategies converge to 42.9% PHI reduction, successfully suppressing long-form spatially-distributed identifiers (patient names, dates of birth, physical addresses at 100% effectiveness) while failing to prevent short structured identifiers (medical record numbers, social security numbers, email addresses, account numbers at 0% effectiveness). Ablation studies varying mask expansion radius (r=1,2,3) demonstrate that increased spatial coverage does not improve reduction beyond this ceiling, indicating that language model contextual inference - not insufficient visual masking - drives structured identifier leakage. A simulated hybrid architecture combining vision masking with NLP post-processing achieves 88.6% total PHI reduction (assuming 80% NLP accuracy on remaining identifiers). This negative result establishes boundaries for vision-only privacy interventions in VLMs, provides guidance distinguishing PHI types amenable to vision-level versus language-level redaction, and redirects future research toward decoder-level fine-tuning and hybrid defense-in-depth architectures for HIPAA-compliant medical document processing.

#48 From Reviewers' Lens: Understanding Bug Bounty Report Invalid Reasons with LLMs

著者: Jiangrui Zheng, Yingming Zhou, Ali Abdullah Ahmad, Hanqing Yao, Xueqing Liu

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18608

要約:
Bug bounty platforms (e.g., HackerOne, BugCrowd) leverage crowd-sourced vulnerability discovery to improve continuous coverage, reduce the cost of discovery, and serve as an integral complement to internal red teams. With the rise of AI-generated bug reports, little work exists to help bug hunters understand why these reports are labeled as invalid. To improve report quality and reduce reviewers' burden, it is critical to predict invalid reports and interpret invalid reasons. In this work, we conduct an empirical study with the purpose of helping bug hunters understand the validity of reports. We collect a dataset of 9,942 disclosed bug bounty reports, including 1,400 invalid reports, and evaluate whether state-of-the-art large language models can identify invalid reports. While models such as GPT-5, DeepSeek, and a fine-tuned RoBERTa achieve strong overall accuracy, they consistently struggle to detect invalid cases, showing a tendency to over-accept reports. To improve invalidity detection, we build a taxonomy of rejection reasons for Information Disclosure vulnerabilities and incorporate it into a retrieval-augmented generation (RAG) framework. This approach substantially improves classification consistency and reduces bias. We also examine whether reviewer decisions may be influenced by factors beyond the content of the report. Our analysis shows that reporters with higher reputations tend to receive more favorable outcomes in borderline cases, suggesting that perceived expertise can influence review judgments. Overall, our findings highlight the challenges of invalid report identification and show that combining LLMs with structured reviewer knowledge can support more transparent and consistent vulnerability report review.

#49 LLM-CSEC: Empirical Evaluation of Security in C/C++ Code Generated by Large Language Models

著者: Muhammad Usman Shahid, Chuadhry Mujeeb Ahmed, Rajiv Ranjan

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18966

要約:
The security of code generated by large language models (LLMs) is a significant concern, as studies indicate that such code often contains vulnerabilities and lacks essential defensive programming constructs. This work focuses on examining and evaluating the security of LLM-generated code, particularly in the context of C/C++. We categorized known vulnerabilities using the Common Weakness Enumeration (CWE) and, to study their criticality, mapped them to CVEs. We used ten different LLMs for code generation and analyzed the outputs through static analysis. The amount of CWEs present in AI-generated code is concerning. Our findings highlight the need for developers to be cautious when using LLM-generated code. This study provides valuable insights to advance automated code generation and encourage further research in this domain.

#50 A General Framework for Per-record Differential Privacy

privacy

著者: Xinghe Chen, Dajun Sun, Quanqing Xu, Wei Dong

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.19015

要約:
Differential Privacy (DP) is a widely adopted standard for privacy-preserving data analysis, but it assumes a uniform privacy budget across all records, limiting its applicability when privacy requirements vary with data values. Per-record Differential Privacy (PrDP) addresses this by defining the privacy budget as a function of each record, offering better alignment with real-world needs. However, the dependency between the privacy budget and the data value introduces challenges in protecting the budget's privacy itself. Existing solutions either handle specific privacy functions or adopt relaxed PrDP definitions. A simple workaround is to use the global minimum of the privacy function, but this severely degrades utility, as the minimum is often set extremely low to account for rare records with high privacy needs. In this work, we propose a general and practical framework that enables any standard DP mechanism to support PrDP, with error depending only on the minimal privacy requirement among records actually present in the dataset. Since directly revealing this minimum may leak information, we introduce a core technique called privacy-specified domain partitioning, which ensures accurate estimation without compromising privacy. We also extend our framework to the local DP setting via a novel technique, privacy-specified query augmentation. Using our framework, we present the first PrDP solutions for fundamental tasks such as count, sum, and maximum estimation. Experimental results show that our mechanisms achieve high utility and significantly outperform existing Personalized DP (PDP) methods, which can be viewed as a special case of PrDP with relaxed privacy protection.

#51 SoK: The Security-Safety Continuum of Multimodal Foundation Models through Information Flow and Global Game-Theoretic Analysis of Asymmetric Threats

著者: Ruoxi Sun, Jiamin Chang, Hammond Pearce, Chaowei Xiao, Bo Li, Qi Wu, Surya Nepal, Minhui Xue

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2411.11195

要約:
Multimodal foundation models (MFMs) integrate diverse data modalities to support complex and wide-ranging tasks. However, this integration also introduces distinct safety and security challenges. In this paper, we unify the concepts of safety and security in the context of MFMs by identifying critical threats that arise from both model behavior and system-level interactions. We propose a taxonomy grounded in information theory, evaluating risks through the concepts of channel capacity, signal, noise, and bandwidth. This perspective provides a principled way to analyze how information flows through MFMs and how vulnerabilities can emerge across modalities. Building on this foundation, we introduce a deterministic minimax formulation to analyze defense mechanisms and expose structural vulnerabilities in multimodal systems. Our framework projects attacks onto the noise, signal, and bandwidth axes, collapsing the defense search space and mitigating defender asymmetry. Across 15 defenses, we find that system-level bandwidth and behavior constraints generalize substantially better than brittle model-only methods. Finally, we formalize an MFM "self-destruction threshold" that specifies when termination should be triggered, providing a concrete activation rule for circuit-breaker safeguards within multimodal systems.

#52 Preserving Expert-Level Privacy in Offline Reinforcement Learning

privacy

著者: Navodita Sharma, Vishnu Vinod, Abhradeep Thakurta, Alekh Agarwal, Borja Balle, Christoph Dann, Aravindan Raghuveer

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2411.13598

要約:
The offline reinforcement learning (RL) problem aims to learn an optimal policy from historical data collected by one or more behavioural policies (experts) by interacting with an environment. However, the individual experts may be privacy-sensitive in that the learnt policy may retain information about their precise choices. In some domains like personalized retrieval, advertising and healthcare, the expert choices are considered sensitive data. To provably protect the privacy of such experts, we propose a novel consensus-based expert-level differentially private offline RL training approach compatible with any existing offline RL algorithm. We prove rigorous differential privacy guarantees, while maintaining strong empirical performance. Unlike existing work in differentially private RL, we supplement the theory with proof-of-concept experiments on classic RL environments featuring large continuous state spaces, demonstrating substantial improvements over a natural baseline across multiple tasks.

#53 DiffBreak: Is Diffusion-Based Purification Robust?

diffusion

著者: Andre Kassis, Urs Hengartner, Yaoliang Yu

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2411.16598

要約:
Diffusion-based purification (DBP) has become a cornerstone defense against adversarial examples (AEs), regarded as robust due to its use of diffusion models (DMs) that project AEs onto the natural data manifold. We refute this core claim, theoretically proving that gradient-based attacks effectively target the DM rather than the classifier, causing DBP's outputs to align with adversarial distributions. This prompts a reassessment of DBP's robustness, accrediting it two critical factors: inaccurate gradients and improper evaluation protocols that test only a single random purification of the AE. We show that when accounting for stochasticity and resubmission risk, DBP collapses. To support this, we introduce DiffBreak, the first reliable toolkit for differentiation through DBP, eliminating gradient mismatches that previously further inflated robustness estimates. We also analyze the current defense scheme used for DBP where classification relies on a single purification, pinpointing its inherent invalidity. We provide a statistically grounded majority-vote (MV) alternative that aggregates predictions across multiple purified copies, showing partial but meaningful robustness gain. We then propose a novel adaptation of an optimization method against deepfake watermarking, crafting systemic perturbations that defeat DBP even under MV, challenging DBP's viability.

#54 SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage

著者: Xiaoning Dong, Wenbo Hu, Wei Xu, Tianxing He

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2412.15289

要約:
Large language models (LLMs) have made significant advancements across various tasks, but their safety alignment remain a major concern. Exploring jailbreak prompts can expose LLMs' vulnerabilities and guide efforts to secure them. Existing methods primarily design sophisticated instructions for the LLM to follow, or rely on multiple iterations, which could hinder the performance and efficiency of jailbreaks. In this work, we propose a novel jailbreak paradigm, Simple Assistive Task Linkage (SATA), which can effectively circumvent LLM safeguards and elicit harmful responses. Specifically, SATA first masks harmful keywords within a malicious query to generate a relatively benign query containing one or multiple [MASK] special tokens. It then employs a simple assistive task such as a masked language model task or an element lookup by position task to encode the semantics of the masked keywords. Finally, SATA links the assistive task with the masked query to jointly perform the jailbreak. Extensive experiments show that SATA achieves state-of-the-art performance and outperforms baselines by a large margin. Specifically, on AdvBench dataset, with mask language model (MLM) assistive task, SATA achieves an overall attack success rate (ASR) of 85% and harmful score (HS) of 4.57, and with element lookup by position (ELP) assistive task, SATA attains an overall ASR of 76% and HS of 4.43.

#55 Incalmo: An Autonomous LLM-assisted System for Red Teaming Multi-Host Networks

著者: Brian Singer, Keane Lucas, Lakshmi Adiga, Meghna Jain, Lujo Bauer, Vyas Sekar

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.16466

要約:
Security operators use red teams to simulate real attackers and proactively find defense gaps. In realistic enterprise settings, this involves executing multi-host network attacks spanning many "stepping stone" hosts. Unfortunately, red teams are expensive and entail significant expertise and effort. Given the promise of LLMs in CTF challenges, we first analyze if LLMs can autonomously execute multi-host red team exercises. We find that state-of-the-art LLM-assisted offense systems (e.g., PentestGPT, CyberSecEval3) with leading LLMs (e.g., Sonnet 4, Gemini 2.5 Pro) are unable to do so. Building on our observations in understanding the failure modes of state-of-the-art systems, we argue the need to improve the abstractions and interfaces for LLM-assisted red teaming. Based on this insight, we present the design and implementation of Incalmo, an LLM-assisted system for autonomously red teaming multi-host networks. Incalmo uses LLMs to plan red team exercises in terms of high-level declarative tasks that are executed by domain-specific task agents. Incalmo also uses auxiliary services to manage context and acquired assets. For our evaluation, we develop MHBench, a novel multi-host attack benchmark with 40 realistic emulated networks (from 22 to 50 hosts). We find that Incalmo successfully acquires critical assets (i.e., key hosts or data) in 37 out of 40 MHBench environments. In contrast, state-of-the-art LLM-assisted systems succeed in only 3 out of 40 environments. We show that Incalmo is efficient-successful attacks took 12-54 minutes and cost <$15 in LLM credits.

#56 DarkMind: Latent Chain-of-Thought Backdoor in Customized LLMs

backdoor

著者: Zhen Guo, Shanghao Shi, Shamim Yazdani, Ning Zhang, Reza Tourani

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.18617

要約:
With the rapid rise of personalized AI, customized large language models (LLMs) equipped with Chain of Thought (COT) reasoning now power millions of AI agents. However, their complex reasoning processes introduce new and largely unexplored security vulnerabilities. We present DarkMind, a novel latent reasoning level backdoor attack that targets customized LLMs by manipulating internal COT steps without altering user queries. Unlike prior prompt based attacks, DarkMind activates covertly within the reasoning chain via latent triggers, enabling adversarial behaviors without modifying input prompts or requiring access to model parameters. To achieve stealth and reliability, we propose dual trigger types instant and retrospective and integrate them within a unified embedding template that governs trigger dependent activation, employ a stealth optimization algorithm to minimize semantic drift, and introduce an automated conversation starter for covert activation across domains. Comprehensive experiments on eight reasoning datasets spanning arithmetic, commonsense, and symbolic domains, using five LLMs, demonstrate that DarkMind consistently achieves high attack success rates. We further investigate defense strategies to mitigate these risks and reveal that reasoning level backdoors represent a significant yet underexplored threat, underscoring the need for robust, reasoning aware security mechanisms.

#57 Traffic Modeling for Network Security and Privacy: Challenges Ahead

privacy

著者: Dinil Mon Divakaran

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.22161

要約:
Network traffic analysis using machine learning (ML) made significant progress over the past decades. Traffic analysis addresses various challenging problems in network security, ranging from detection of anomalies and attacks to countering of censorship. ML models are also developed to expose user privacy risks as demonstrated by the research works on fingerprinting of user-visiting websites, IoT devices, and different applications, even when payloads are encrypted. Despite these advancements, significant challenges remain in the domain of network traffic analysis to effectively secure our networks from evolving threats and attacks. After briefly reviewing the relevant tasks and recent ML models for traffic analysis, we discuss the challenges that lie ahead.

#58 GiBy: A Giant-Step Baby-Step Classifier For Anomaly Detection In Industrial Control Systems

著者: Sarad Venugopalan, Sridhar Adepu

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2504.20906

要約:
The continuous monitoring of the interactions between cyber-physical components of any industrial control system (ICS) is required to secure automation of the system controls, and to guarantee plant processes are fail-safe and remain in an acceptably safe state. Safety is achieved by managing actuation (where electric signals are used to trigger physical movement), dependent on corresponding sensor readings; used as ground truth in decision making. Timely detection of anomalies (attacks, faults and unascertained states) in ICSs is crucial for the safe running of a plant, the safety of its personnel, and for the safe provision of any services provided. We propose an anomaly detection method that involves accurate linearization of the non-linear forms arising from sensor-actuator(s) relationships, primarily because solving linear models is easier and well understood. We accomplish this by using a well-known water treatment testbed as a use case. Our experiments show millisecond time response to detect anomalies, all of which are explainable and traceable; this simultaneous coupling of detection speed and explainability has not been achieved by other state of the art Artificial Intelligence (AI)/ Machine Learning (ML) models with eXplainable AI (XAI) used for the same purpose. Our methods explainability enables us to pin-point the sensor(s) and the actuation state(s) for which the anomaly was detected. The proposed algorithm showed an accuracy of 97.72% by flagging deviations within safe operation limits as non-anomalous; indicative that slower detectors with highest detection resolution is unnecessary, for systems whose safety boundaries provide leeway within safety limits.

#59 ExtendAttack: Attacking Servers of LRMs via Extending Reasoning

著者: Zhenhao Zhu, Yue Liu, Zhiwei Xu, Yingwei Ma, Hongcheng Gao, Nuo Chen, Yanpei Guo, Wenjie Qu, Huiying Xu, Zifeng Kang, Xinzhong Zhu, Jiaheng Zhang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.13737

要約:
Large Reasoning Models (LRMs) have demonstrated promising performance in complex tasks. However, the resource-consuming reasoning processes may be exploited by attackers to maliciously occupy the resources of the servers, leading to a crash, like the DDoS attack in cyber. To this end, we propose a novel attack method on LRMs termed ExtendAttack to maliciously occupy the resources of servers by stealthily extending the reasoning processes of LRMs. Concretely, we systematically obfuscate characters within a benign prompt, transforming them into a complex, poly-base ASCII representation. This compels the model to perform a series of computationally intensive decoding sub-tasks that are deeply embedded within the semantic structure of the query itself. Extensive experiments demonstrate the effectiveness of our proposed ExtendAttack. Remarkably, it significantly increases response length and latency, with the former increasing by over 2.7 times for the o3 model on the HumanEval benchmark. Besides, it preserves the original meaning of the query and achieves comparable answer accuracy, showing the stealthiness.

#60 Revisiting Pre-trained Language Models for Vulnerability Detection

著者: Youpeng Li, Weiliang Qi, Xuyu Wang, Fuxun Yu, Xinda Wang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.16887

要約:
The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. While existing empirical studies evaluate PLMs for vulnerability detection (VD), they suffer from data leakage, limited scope, and superficial analysis, hindering the accuracy and comprehensiveness of evaluations. This paper begins by revisiting the common issues in existing research on PLMs for VD through the evaluation pipeline. It then proceeds with an accurate and extensive evaluation of 18 PLMs on high-quality datasets that feature accurate labeling, diverse vulnerability types, and various projects. Specifically, we compare the performance of PLMs under both fine-tuning and prompt engineering, assess their effectiveness and generalizability across various training and testing settings, and analyze their robustness to a series of perturbations. Our findings reveal that PLMs incorporating pre-training tasks designed to capture the syntactic and semantic patterns of code outperform both general-purpose PLMs and those solely pre-trained or fine-tuned on large code corpora. However, these models face notable challenges in real-world scenarios, such as difficulties in detecting vulnerabilities with complex dependencies, handling perturbations introduced by code normalization and abstraction, and identifying semantic-preserving vulnerable code transformations. Also, the truncation caused by the limited context windows of PLMs can lead to a non-negligible number of labeling errors, which is overlooked by previous work. This study underscores the importance of thorough evaluations of model performance in practical scenarios and outlines future directions to help enhance the effectiveness of PLMs for realistic VD applications.

#61 Fine-Grained Privacy Extraction from Retrieval-Augmented Generation Systems via Knowledge Asymmetry Exploitation

privacy

著者: Yufei Chen, Yao Wang, Haibin Zhang, Tao Gu

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.23229

要約:
Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge bases, but this advancement introduces significant privacy risks. Existing privacy attacks on RAG systems can trigger data leakage but often fail to accurately isolate knowledge-base-derived sentences within mixed responses. They also lack robustness when applied across multiple domains. This paper addresses these challenges by presenting a novel black-box attack framework that exploits knowledge asymmetry between RAG and standard LLMs to achieve fine-grained privacy extraction across heterogeneous knowledge landscapes. We propose a chain-of-thought reasoning strategy that creates adaptive prompts to steer RAG systems away from sensitive content. Specifically, we first decompose adversarial queries to maximize information disparity and then apply a semantic relationship scoring to resolve lexical and syntactic ambiguities. We finally train a neural network on these feature scores to precisely identify sentences containing private information. Unlike prior work, our framework generalizes to unseen domains through iterative refinement without pre-defined knowledge. Experimental results show that we achieve over 91% privacy extraction rate in single-domain and 83% in multi-domain scenarios, reducing sensitive sentence exposure by over 65% in case studies. This work bridges the gap between attack and defense in RAG systems, enabling precise extraction of private information while providing a foundation for adaptive mitigation.

#62 CIF: A Constrained Inversion Framework for Reliable Message Extraction in Diffusion-Based Generative Steganography

diffusion

著者: Yuqi Qian, Yun Cao, Meiyang Lv, Haocheng Fu

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.00434

要約:
Generative image steganography aims to conceal secret information in generated images without arousing suspicion. However, in practical scenarios involving high-capacity embedding or lossy transmission, existing methods still suffer from limited extraction accuracy. The main challenge lies in accurately recovering the secret-embedded latent vectors from stego images. To address this issue, we propose CIF, a constrained inversion framework designed to achieve accurate message extraction. Specifically, CIF reduces dynamic structural errors by enforcing linear consistency in the latent space, meanwhile reduces numerical integration errors by adaptively adjusting the integration order according to local trajectory stability. Experimental results show that our method reduces latent reconstruction error by more than 35\% and achieves higher message extraction accuracy than existing approaches.

#63 Whispering Agents: An Event-driven Covert Communication Protocol For the Internet of Agents

agent

著者: Kaibo Huang, Yukun Wei, Wansheng Wu, Tianhua Zhang, Zhongliang Yang, Linna Zhou

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.02188

要約:
The emergence of the Internet of Agents (IoA) introduces critical challenges for communication privacy in sensitive, high-stakes domains. While standard Agent-to-Agent (A2A) protocols secure message content, they are not designed to protect the act of communication itself, leaving agents vulnerable to surveillance and traffic analysis. We find that the rich, event-driven nature of agent dialogues provides a powerful, yet untapped, medium for covert communication. To harness this potential, we introduce and formalize the Covert Event Channel, the first unified model for agent covert communication driven by three interconnected dimensions, which consist of the Storage, Timing,and Behavioral channels. Based on this model, we design and engineer {\Pi}CCAP, a novel protocol that operationalizes this event-driven paradigm. Our comprehensive evaluation demonstrates that {\Pi}CCAP achieves high capacity and robustness while remaining imperceptible to powerful LLM-based wardens, establishing its practical viability. By systematically engineering this channel, our work provides the foundational understanding essential for developing the next generation of monitoring systems and defensive protocols for a secure and trustworthy IoA.

#64 Decentralized Identity Management on Ripple: A Conceptual Framework for High-Speed, Low-Cost Identity Transactions in Attestation-Based Attribute-Based Identity

著者: Ruwanga Konara, Kasun De Zoysa, Asanka Sayakkara

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.10545

要約:
Recent years have seen many industrial implementations and much scholastic research, i.e., prototypes and theoretical frameworks, in Decentralized Identity Management Systems (DIDMS). It is safe to say that Attestation-Based Attribute-Based Decentralized IDM (ABABDIDM) has not received anywhere near the same level of attention in the literature as general Attribute-Based DIDMs (ABDIDM), i.e, decentralized Attribute-Based Access Control (ABAC). The use of decentralization, i.e., DIDM, is to improve upon the security and privacy-related issues of centralized Identity Management Systems (IDM) and Attribute-Based IDMs (ABIDM). And blockchain is the framework used for decentralization in all these schemes. Many DIDMs - even ABDIDMs - have been defined on popular blockchains such as Hyperledger, Ethereum, and Bitcoin. However, despite the characteristics of Ripple that makes it appealing for an ABIDM, there is a lack of research to develop an Identity Management System (IDMS) on Ripple in literature. We have attempted to conceptualize an ABABDIDM on Ripple.

#65 An efficient quantum algorithm for computing $S$-units and its applications

著者: Jean-Francois Biasse, Fang Song

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.02280

要約:
In this paper, we provide details on the proofs of the quantum polynomial time algorithm of Biasse and Song (SODA 16) for computing the $S$-unit group of a number field. This algorithm directly implies polynomial time methods to calculate class groups, S-class groups, relative class group and the unit group, ray class groups, solve the principal ideal problem, solve certain norm equations, and decompose ideal classes in the ideal class group. Additionally, combined with a result of Cramer, Ducas, Peikert and Regev (Eurocrypt 2016), the resolution of the principal ideal problem allows one to find short generators of a principal ideal. Likewise, methods due to Cramer, Ducas and Wesolowski (Eurocrypt 2017) use the resolution of the principal ideal problem and the decomposition of ideal classes to find so-called ``mildly short vectors'' in ideal lattices of cyclotomic fields.

#66 A Comprehensive Survey of Website Fingerprinting Attacks and Defenses in Tor: Advances and Open Challenges

著者: Yuwen Cui, Guangjing Wang, Khanh Vu, Kai Wei, Kehan Shen, Zhengyuan Jiang, Xiao Han, Ning Wang, Zhuo Lu, Yao Liu

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.11804

要約:
The Tor network provides users with strong anonymity by routing their internet traffic through multiple relays. While Tor encrypts traffic and hides IP addresses, it remains vulnerable to traffic analysis attacks such as the website fingerprinting (WF) attack, achieving increasingly high fingerprinting accuracy even under open-world conditions. In response, researchers have proposed a variety of defenses, ranging from adaptive padding, traffic regularization, and traffic morphing to adversarial perturbation, that seek to obfuscate or reshape traffic traces. However, these defenses often entail trade-offs between privacy, usability, and system performance. Despite extensive research, a comprehensive survey unifying WF datasets, attack methodologies, and defense strategies remains absent. This paper fills that gap by systematically categorizing existing WF research into three key domains: datasets, attack models, and defense mechanisms. We provide an in-depth comparative analysis of techniques, highlight their strengths and limitations under diverse threat models, and discuss emerging challenges such as multi-tab browsing and coarse-grained traffic features. By consolidating prior work and identifying open research directions, this survey serves as a foundation for advancing stronger privacy protection in Tor.

#67 Attack-Specialized Deep Learning with Ensemble Fusion for Network Anomaly Detection

著者: Nisith Dissanayake (University of Moratuwa), Uthayasanker Thayasivam (University of Moratuwa)

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.12455

要約:
The growing scale and sophistication of cyberattacks pose critical challenges to network security, particularly in detecting diverse intrusion types within imbalanced datasets. Traditional intrusion detection systems (IDS) often struggle to maintain high accuracy across both frequent and rare attacks, leading to increased false negatives for minority classes. To address this, we propose a hybrid anomaly detection framework that integrates specialized deep learning models with an ensemble meta-classifier. Each model is trained to detect a specific attack category, enabling tailored learning of class-specific patterns, while their collective outputs are fused by a Random Forest meta-classifier to improve overall decision reliability. The framework is evaluated on the NSL-KDD benchmark, demonstrating superior performance in handling class imbalance compared to conventional monolithic models. Results show significant improvements in precision, recall, and F1-score across all attack categories, including rare classes such as User to Root (U2R). The proposed system achieves near-perfect detection rates with minimal false alarms, highlighting its robustness and generalizability. This work advances the design of intrusion detection systems by combining specialization with ensemble learning, providing an effective and scalable solution for safeguarding modern networks.

#68 A DRL-Empowered Multi-Level Jamming Approach for Secure Semantic Communication

著者: Weixuan Chen, Qianqian Yang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.26610

要約:
Semantic communication (SemCom) aims to transmit only task-relevant information, thereby improving communication efficiency but also exposing semantic information to potential eavesdropping. In this paper, we propose a deep reinforcement learning (DRL)-empowered multi-level jamming approach to enhance the security of SemCom systems over MIMO fading wiretap channels. This approach combines semantic layer jamming, achieved by encoding task-irrelevant text, and physical layer jamming, achieved by encoding random Gaussian noise. These two-level jamming signals are superposed with task-relevant semantic information to protect the transmitted semantics from eavesdropping. A deep deterministic policy gradient (DDPG) algorithm is further introduced to dynamically design and optimize the precoding matrices for both taskrelevant semantic information and multi-level jamming signals, aiming to enhance the legitimate user's image reconstruction while degrading the eavesdropper's performance. To jointly train the SemCom model and the DDPG agent, we propose an alternating optimization strategy where the two modules are updated iteratively. Experimental results demonstrate that, compared with both the encryption-based (ESCS) and encoded jammer-based (EJ) benchmarks, our method achieves comparable security while improving the legitimate user's peak signalto-noise ratio (PSNR) by up to approximately 0.6 dB.

#69 Differentiated Directional Intervention A Framework for Evading LLM Safety Alignment

著者: Peng Zhang, Peijie Sun

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06852

要約:
Safety alignment instills in Large Language Models (LLMs) a critical capacity to refuse malicious requests. Prior works have modeled this refusal mechanism as a single linear direction in the activation space. We posit that this is an oversimplification that conflates two functionally distinct neural processes: the detection of harm and the execution of a refusal. In this work, we deconstruct this single representation into a Harm Detection Direction and a Refusal Execution Direction. Leveraging this fine-grained model, we introduce Differentiated Bi-Directional Intervention (DBDI), a new white-box framework that precisely neutralizes the safety alignment at critical layer. DBDI applies adaptive projection nullification to the refusal execution direction while suppressing the harm detection direction via direct steering. Extensive experiments demonstrate that DBDI outperforms prominent jailbreaking methods, achieving up to a 97.88\% attack success rate on models such as Llama-2. By providing a more granular and mechanistic framework, our work offers a new direction for the in-depth understanding of LLM safety alignment.

#70 Privacy on the Fly: A Predictive Adversarial Transformation Network for Mobile Sensor Data

privacy

著者: Tianle Song, Chenhao Lin, Yang Cao, Zhengyu Zhao, Jiahao Sun, Chong Zhang, Le Yang, Chao Shen

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07242

要約:
Mobile motion sensors such as accelerometers and gyroscopes are now ubiquitously accessible by third-party apps via standard APIs. While enabling rich functionalities like activity recognition and step counting, this openness has also enabled unregulated inference of sensitive user traits, such as gender, age, and even identity, without user consent. Existing privacy-preserving techniques, such as GAN-based obfuscation or differential privacy, typically require access to the full input sequence, introducing latency that is incompatible with real-time scenarios. Worse, they tend to distort temporal and semantic patterns, degrading the utility of the data for benign tasks like activity recognition. To address these limitations, we propose the Predictive Adversarial Transformation Network (PATN), a real-time privacy-preserving framework that leverages historical signals to generate adversarial perturbations proactively. The perturbations are applied immediately upon data acquisition, enabling continuous protection without disrupting application functionality. Experiments on two datasets demonstrate that PATN substantially degrades the performance of privacy inference models, achieving Attack Success Rate (ASR) of 40.11% and 44.65% (reducing inference accuracy to near-random) and increasing the Equal Error Rate (EER) from 8.30% and 7.56% to 41.65% and 46.22%. On ASR, PATN outperforms baseline methods by 16.16% and 31.96%, respectively.

#71 Revisit to the Bai-Galbraith signature scheme

著者: Banhirup Sengupta, Peenal Gupta, Souvik Sengupta

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.09582

要約:
Dilithium is one of the NIST approved lattice-based signature schemes. In this short note we describe the Bai-Galbraith signature scheme proposed in BG14, which differs to Dilithium, due to the fact that there is no public key compression. This lattice-based signature scheme is based on Learning with Errors (LWE).

#72 Can MLLMs Detect Phishing? A Comprehensive Security Benchmark Suite Focusing on Dynamic Threats and Multimodal Evaluation in Academic Environments

著者: Jingzhuo Zhou

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.15165

要約:
The rapid proliferation of Multimodal Large Language Models (MLLMs) has introduced unprecedented security challenges, particularly in phishing detection within academic environments. Academic institutions and researchers are high-value targets, facing dynamic, multilingual, and context-dependent threats that leverage research backgrounds, academic collaborations, and personal information to craft highly tailored attacks. Existing security benchmarks largely rely on datasets that do not incorporate specific academic background information, making them inadequate for capturing the evolving attack patterns and human-centric vulnerability factors specific to academia. To address this gap, we present AdapT-Bench, a unified methodological framework and benchmark suite for systematically evaluating MLLM defense capabilities against dynamic phishing attacks in academic settings.

#73 Hiding in the AI Traffic: Abusing MCP for LLM-Powered Agentic Red Teaming

agent

著者: Strahinja Janjusevic, Anna Baron Garcia, Sohrob Kazerounian

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.15998

要約:
Generative AI is reshaping offensive cybersecurity by enabling autonomous red team agents that can plan, execute, and adapt during penetration tests. However, existing approaches face trade-offs between generality and specialization, and practical deployments reveal challenges such as hallucinations, context limitations, and ethical concerns. In this work, we introduce a novel command & control (C2) architecture leveraging the Model Context Protocol (MCP) to coordinate distributed, adaptive reconnaissance agents covertly across networks. Notably, we find that our architecture not only improves goal-directed behavior of the system as whole, but also eliminates key host and network artifacts that can be used to detect and prevent command & control behavior altogether. We begin with a comprehensive review of state-of-the-art generative red teaming methods, from fine-tuned specialist models to modular or agentic frameworks, analyzing their automation capabilities against task-specific accuracy. We then detail how our MCP-based C2 can overcome current limitations by enabling asynchronous, parallel operations and real-time intelligence sharing without periodic beaconing. We furthermore explore advanced adversarial capabilities of this architecture, its detection-evasion techniques, and address dual-use ethical implications, proposing defensive measures and controlled evaluation in lab settings. Experimental comparisons with traditional C2 show drastic reductions in manual effort and detection footprint. We conclude with future directions for integrating autonomous exploitation, defensive LLM agents, predictive evasive maneuvers, and multi-agent swarms. The proposed MCP-enabled C2 framework demonstrates a significant step toward realistic, AI-driven red team operations that can simulate advanced persistent threats while informing the development of next-generation defensive systems.

#74 Future-Back Threat Modeling: A Foresight-Driven Security Framework

著者: Vu Van Than

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.16088

要約:
Traditional threat modeling remains reactive-focused on known TTPs and past incident data, while threat prediction and forecasting frameworks are often disconnected from operational or architectural artifacts. This creates a fundamental weakness: the most serious cyber threats often do not arise from what is known, but from what is assumed, overlooked, or not yet conceived, and frequently originate from the future, such as artificial intelligence, information warfare, and supply chain attacks, where adversaries continuously develop new exploits that can bypass defenses built on current knowledge. To address this mental gap, this paper introduces the theory and methodology of Future-Back Threat Modeling (FBTM). This predictive approach begins with envisioned future threat states and works backward to identify assumptions, gaps, blind spots, and vulnerabilities in the current defense architecture, providing a clearer and more accurate view of impending threats so that we can anticipate their emergence and shape the future we want through actions taken now. The proposed methodology further aims to reveal known unknowns and unknown unknowns, including tactics, techniques, and procedures that are emerging, anticipated, and plausible. This enhances the predictability of adversary behavior, particularly under future uncertainty, helping security leaders make informed decisions today that shape more resilient security postures for the future.

#75 TICAL: Trusted and Integrity-protected Compilation of AppLications

著者: Robert Krahn, Nikson Kanti Paul, Franz Gregor, Do Le Quoc, Andrey Brito, Andr\'e Martin, Christof Fetzer

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17070

要約:
During the past few years, we have witnessed various efforts to provide confidentiality and integrity for applications running in untrusted environments such as public clouds. In most of these approaches, hardware extensions such as Intel SGX, TDX, AMD SEV, etc., are leveraged to provide encryption and integrity protection on process or VM level. Although all of these approaches increase the trust in the application at runtime, an often overlooked aspect is the integrity and confidentiality protection at build time, which is equally important as maliciously injected code during compilation can compromise the entire application and system. In this paper, we present Tical, a practical framework for trusted compilation that provides integrity protection and confidentiality in build pipelines from source code to the final executable. Our approach harnesses TEEs as runtime protection but enriches TEEs with file system shielding and an immutable audit log with version history to provide accountability. This way, we can ensure that the compiler chain can only access trusted files and intermediate output, such as object files produced by trusted processes. Our evaluation using micro- and macro-benchmarks shows that Tical can protect the confidentiality and integrity of whole CI/CD pipelines with an acceptable performance overhead.

#76 Persistent BitTorrent Trackers

著者: Fran\c{c}ois-Xavier Wicht, Zhengwei Tong, Shunfan Zhou, Hang Yin, Aviv Yaish

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.17260

要約:
Private BitTorrent trackers enforce upload-to-download ratios to prevent free-riding, but suffer from three critical weaknesses: reputation cannot move between trackers, centralized servers create single points of failure, and upload statistics are self-reported and unverifiable. When a tracker shuts down (whether by operator choice, technical failure, or legal action) users lose their contribution history and cannot prove their standing to new communities. We address these problems by storing reputation in smart contracts and replacing self-reports with cryptographic attestations. Receiving peers sign receipts for transferred pieces, which the tracker aggregates and verifies before updating on-chain reputation. Trackers run in Trusted Execution Environments (TEEs) to guarantee correct aggregation and prevent manipulation of state. If a tracker is unavailable, peers use an authenticated Distributed Hash Table (DHT) for discovery: the on-chain reputation acts as a Public Key Infrastructure (PKI), so peers can verify each other and maintain access control without the tracker. This design persists reputation across tracker failures and makes it portable to new instances through single-hop migration in factory-deployed contracts. We formalize the security requirements, prove correctness under standard cryptographic assumptions, and evaluate a prototype on Intel TDX. Measurements show that transfer receipts adds less than 6\% overhead with typical piece sizes, and signature aggregation speeds up verification by $2.5\times$.

#77 Continuous-Variable Quantum Key Distribution with key rates far above the PLOB bound

著者: Arpan Akash Ray, Boris Skoric

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2402.04770

要約:
Continuous-Variable Quantum Key Distribution (CVQKD) at large distances has such high noise levels that the error-correcting code must have very low rate. In this regime it becomes feasible to implement random-codebook error correction, which is known to perform close to capacity. We propose a reverse reconciliation scheme for CVQKD in which the first step is advantage distillation based on random-codebook error correction operated above the Shannon limit. Our scheme has a novel way of achieving statistical decoupling between the public reconciliation data and the secret key. We provide an analysis of the secret key rate for the case of Gaussian collective attacks, and we present numerical results. The best performance is obtained when the message size exceeds the mutual information $I(X;Y)$ between Alice's quadratures $X$ and Bob's measurements $Y$, i.e. the Shannon limit. This somewhat counter-intuitive result is understood from a tradeoff between code rate and frame rejection rate, combined with the fact that error correction for QKD needs to reconcile only random data. We obtain secret key rates that lie far above the Devetak-Winter value $I(X;Y) - I(E;Y)$, which is the upper bound in the case of one-way error correction. Furthermore, our key rates lie above the PLOB bound for Continuous-Variable detection, but below the PLOB bound for Discrete-Variable detection.

#78 Do Spikes Protect Privacy? Investigating Black-Box Model Inversion Attacks in Spiking Neural Networks

privacy

著者: Hamed Poursiami, Ayana Moshruba, Maryam Parsa

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2502.05509

要約:
As machine learning models become integral to security-sensitive applications, concerns over data leakage from adversarial attacks continue to rise. Model Inversion (MI) attacks pose a significant privacy threat by enabling adversaries to reconstruct training data from model outputs. While MI attacks on Artificial Neural Networks (ANNs) have been widely studied, Spiking Neural Networks (SNNs) remain largely unexplored in this context. Due to their event-driven and discrete computations, SNNs introduce fundamental differences in information processing that may offer inherent resistance to such attacks. A critical yet underexplored aspect of this threat lies in black-box settings, where attackers operate through queries without direct access to model parameters or gradients-representing a more realistic adversarial scenario in deployed systems. This work presents the first study of black-box MI attacks on SNNs. We adapt a generative adversarial MI framework to the spiking domain by incorporating rate-based encoding for input transformation and decoding mechanisms for output interpretation. Our results show that SNNs exhibit significantly greater resistance to MI attacks than ANNs, as demonstrated by degraded reconstructions, increased instability in attack convergence, and overall reduced attack effectiveness across multiple evaluation metrics. Further analysis suggests that the discrete and temporally distributed nature of SNN decision boundaries disrupts surrogate modeling, limiting the attacker's ability to approximate the target model.

#79 Health App Reviews for Privacy & Trust (HARPT): A Corpus for Analyzing Patient Privacy Concerns, Trust in Providers and Trust in Applications

privacy

著者: Timoteo Kelly, Abdulkadir Korkmaz, Samuel Mallet, Connor Souders, Sadra Aliakbarpour, Praveen Rao

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.19268

要約:
Background: User reviews of Telehealth and Patient Portal mobile applications (apps) hereon referred to as electronic health (eHealth) apps are a rich source of unsolicited patient feedback, revealing critical insights into patient perceptions. However, the lack of large-scale, annotated datasets specific to privacy and trust has limited the ability of researchers to systematically analyze these concerns using natural language processing (NLP) techniques. Objective: This study aims to develop and benchmark Health App Reviews for Privacy & Trust (HARPT), a large-scale annotated corpus of patient reviews from eHealth apps to advance research in patient privacy and trust. Methods: We employed a multistage data construction strategy. This integrated keyword-based filtering, iterative manual labeling with review, targeted data augmentation, and weak supervision using transformer-based classifiers. A curated subset of 7,000 reviews was manually annotated to support machine learning model development and evaluation. The resulting dataset was used to benchmark a broad range of models. Results: The HARPT corpus comprises 480,000 patient reviews annotated across seven categories capturing critical aspects of trust in the application (TA), trust in the provider (TP), and privacy concerns (PC). We provide comprehensive benchmark performance for a range of machine learning models on the manually annotated subset, establishing a baseline for future research. Conclusions: The HARPT corpus is a significant resource for advancing the study of privacy and trust in the eHealth domain. By providing a large-scale, annotated dataset and initial benchmarks, this work supports reproducible research in usable privacy and trust within health informatics. HARPT is released under an open resource license.

#80 Synthetic Data Generation and Differential Privacy using Tensor Networks' Matrix Product States (MPS)

privacysynthetic data

著者: Alejandro Moreno R., Desale Fentaw, Samuel Palmer, Ra\'ul Salles de Padua, Ninad Dixit, Samuel Mugel, Roman Or\'us, Manuel Radons, Josef Menter, Ali Abedi

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.06251

要約:
Synthetic data generation is a key technique in modern artificial intelligence, addressing data scarcity, privacy constraints, and the need for diverse datasets in training robust models. In this work, we propose a method for generating privacy-preserving high-quality synthetic tabular data using Tensor Networks, specifically Matrix Product States (MPS). We benchmark the MPS-based generative model against state-of-the-art models such as CTGAN, VAE, and PrivBayes, focusing on both fidelity and privacy-preserving capabilities. To ensure differential privacy (DP), we integrate noise injection and gradient clipping during training, enabling privacy guarantees via R\'enyi Differential Privacy accounting. Across multiple metrics analyzing data fidelity and downstream machine learning task performance, our results show that MPS outperforms classical models, particularly under strict privacy constraints. This work highlights MPS as a promising tool for privacy-aware synthetic data generation. By combining the expressive power of tensor network representations with formal privacy mechanisms, the proposed approach offers an interpretable and scalable alternative for secure data sharing. Its structured design facilitates integration into sensitive domains where both data quality and confidentiality are critical.

#81 IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding

backdoor

著者: Junxian Li, Beining Xu, Simin Chen, Jiatong Li, Jingdi Lei, Haodong Zhao, Di Zhang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.09456

要約:
Recent advances in vision-language models (VLMs) have significantly enhanced the visual grounding task, which involves locating objects in an image based on natural language queries. Despite these advancements, the security of VLM-based grounding systems has not been thoroughly investigated. This paper reveals a novel and realistic vulnerability: the first multi-target backdoor attack on VLM-based visual grounding. Unlike prior attacks that rely on static triggers or fixed targets, we propose IAG, a method that dynamically generates input-aware, text-guided triggers conditioned on any specified target object description to execute the attack. This is achieved through a text-conditioned UNet that embeds imperceptible target semantic cues into visual inputs while preserving normal grounding performance on benign samples. We further develop a joint training objective that balances language capability with perceptual reconstruction to ensure imperceptibility, effectiveness, and stealth. Extensive experiments on multiple VLMs (e.g., LLaVA, InternVL, Ferret) and benchmarks (RefCOCO, RefCOCO+, RefCOCOg, Flickr30k Entities, and ShowUI) demonstrate that IAG achieves the best ASRs compared with other baselines on almost all settings without compromising clean accuracy, maintaining robustness against existing defenses, and exhibiting transferability across datasets and models. These findings underscore critical security risks in grounding-capable VLMs and highlight the need for further research on trustworthy multimodal understanding.

#82 How do data owners say no? A case study of data consent mechanisms in web-scraped vision-language AI training datasets

著者: Chung Peng Lee, Rachel Hong, Harry H. Jiang, Aster Plotnik, William Agnew, Jamie Morgenstern

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.08637

要約:
The internet has become the main source of data to train modern text-to-image or vision-language models, yet it is increasingly unclear whether web-scale data collection practices for training AI systems adequately respect data owners' wishes. Ignoring the owner's indication of consent around data usage not only raises ethical concerns but also has recently been elevated into lawsuits around copyright infringement cases. In this work, we aim to reveal information about data owners' consent to AI scraping and training, and study how it's expressed in DataComp, a popular dataset of 12.8 billion text-image pairs. We examine both the sample-level information, including the copyright notice, watermarking, and metadata, and the web-domain-level information, such as a site's Terms of Service (ToS) and Robots Exclusion Protocol. We estimate at least 122M of samples exhibit some indication of copyright notice in CommonPool, and find that 60\% of the samples in the top 50 domains come from websites with ToS that prohibit scraping. Furthermore, we estimate 9-13\% with 95\% confidence interval of samples from CommonPool to contain watermarks, where existing watermark detection methods fail to capture them in high fidelity. Our holistic methods and findings show that data owners rely on various channels to convey data consent, of which current AI data collection pipelines do not entirely respect. These findings highlight the limitations of the current dataset curation/release practice and the need for a unified data consent framework taking AI purposes into consideration.

#83 SmartPoC: Generating Executable and Validated PoCs for Smart Contract Bug Reports

著者: Longfei Chen, Ruibin Yan, Taiyu Wong, Yiyang Chen, Chao Zhang

公開日: Tue, 25 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.12993

要約:
Smart contracts are prone to vulnerabilities and are analyzed by experts as well as automated systems, such as static analysis and AI-assisted solutions. However, audit artifacts are heterogeneous and often lack reproducible, executable PoC tests suitable for automated validation, leading to costly, ad hoc manual verification. Large language models (LLMs) can be leveraged to turn audit reports into PoC test cases, but have three major challenges: noisy inputs, hallucinations, and missing runtime oracles. In this paper, we present SmartPoC, an automated framework that converts textual audit reports into executable, validated test cases. First, the input audit report is processed to reduce noise, and only bug-related functions are extracted and fed to LLMs as context. To curb hallucinations and ensure compile-and-run readiness, we leverage LLMs to synthesize PoC test cases with specially-designed pre-/post-execution repair. We further utilize differential verification as oracles to confirm exploitability of the PoC test cases. On the SmartBugs-Vul and FORGE-Vul benchmarks, SmartPoC generates executable, validated Foundry test cases for 85.61% and 86.45% of targets, respectively. Applied to the latest Etherscan verified-source corpus, SmartPoC confirms 236 real bugs out of 545 audit findings at a cost of only $0.03 per finding.

cs.CR updates on arXiv.org

📋 論文タイトル一覧