arXiv論文一覧 - cs.CR updates on arXiv.org

#1 From Oracle Choice to Oracle Lock-In: An Exploratory Study on Blockchain Oracles Supplier Selection

著者: Giulio Caldarelli

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03088

要約:
As data is an essential asset for any Web3 application, selecting an oracle is a critical decision for its success. To date, academic research has mainly focused on improving oracle technology and internal economics, while the drivers of oracle choice on the client side remain largely unexplored. This study fills this gap by gathering insights from leading Web3 protocols, uncovering their rationale for oracle selection and their preferences when deciding whether to outsource or internalize data request mechanisms. The collected data covers more than 55% of the DeFi market cap and is obtained exclusively by protocol executives, board members, or delegates. Insights support the view that protocol choices are tied to technological dependencies, where immutability of smart contracts amplifies lock-in, preventing agile switching among data providers. Furthermore, when viable third-party solutions exist, protocols overwhelmingly prefer outsourcing rather than building and maintaining internal oracle mechanisms.

#2 Password-Activated Shutdown Protocols for Misaligned Frontier Agents

agent

著者: Kai Williams, Rohan Subramani, Francis Rhys Ward

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03089

要約:
Frontier AI developers may fail to align or control highly-capable AI agents. In many cases, it could be useful to have emergency shutdown mechanisms which effectively prevent misaligned agents from carrying out harmful actions in the world. We introduce password-activated shutdown protocols (PAS protocols) -- methods for designing frontier agents to implement a safe shutdown protocol when given a password. We motivate PAS protocols by describing intuitive use-cases in which they mitigate risks from misaligned systems that subvert other control efforts, for instance, by disabling automated monitors or self-exfiltrating to external data centres. PAS protocols supplement other safety efforts, such as alignment fine-tuning or monitoring, contributing to defence-in-depth against AI risk. We provide a concrete demonstration in SHADE-Arena, a benchmark for AI monitoring and subversion capabilities, in which PAS protocols supplement monitoring to increase safety with little cost to performance. Next, PAS protocols should be robust to malicious actors who want to bypass shutdown. Therefore, we conduct a red-team blue-team game between the developers (blue-team), who must implement a robust PAS protocol, and a red-team trying to subvert the protocol. We conduct experiments in a code-generation setting, finding that there are effective strategies for the red-team, such as using another model to filter inputs, or fine-tuning the model to prevent shutdown behaviour. We then outline key challenges to implementing PAS protocols in real-life systems, including: security considerations of the password and decisions regarding when, and in which systems, to use them. PAS protocols are an intuitive mechanism for increasing the safety of frontier AI. We encourage developers to consider implementing PAS protocols prior to internal deployment of particularly dangerous systems to reduce loss-of-control risks.

#3 Many-to-One Adversarial Consensus: Exposing Multi-Agent Collusion Risks in AI-Based Healthcare

agent

著者: Adeela Bashir, The Anh han, Zia Ush Shamszaman

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03097

要約:
The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also deployed as multi-agent teams to assist AI doctors by debating, voting, or advising on decisions. However, when multiple assistant agents interact, coordinated adversaries can collude to create false consensus, pushing an AI doctor toward harmful prescriptions. We develop an experimental framework with scripted and unscripted doctor agents, adversarial assistants, and a verifier agent that checks decisions against clinical guidelines. Using 50 representative clinical questions, we find that collusion drives the Attack Success Rate (ASR) and Harmful Recommendation Rates (HRR) up to 100% in unprotected systems. In contrast, the verifier agent restores 100% accuracy by blocking adversarial consensus. This work provides the first systematic evidence of collusion risk in AI healthcare and demonstrates a practical, lightweight defence that ensures guideline fidelity.

#4 Ensemble Privacy Defense for Knowledge-Intensive LLMs against Membership Inference Attacks

privacy

著者: Haowei Fu, Bo Ni, Han Xu, Kunpeng Liu, Dan Lin, Tyler Derr

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03100

要約:
Retrieval-Augmented Generation (RAG) and Supervised Finetuning (SFT) have become the predominant paradigms for equipping Large Language Models (LLMs) with external knowledge for diverse, knowledge-intensive tasks. However, while such knowledge injection improves performance, it also exposes new attack surfaces. Membership Inference Attacks (MIAs), which aim to determine whether a given data sample was included in a model's training set, pose serious threats to privacy and trust in sensitive domains. To this end, we first systematically evaluate the vulnerability of RAG- and SFT-based LLMs to various MIAs. Then, to address the privacy risk, we further introduce a novel, model-agnostic defense framework, Ensemble Privacy Defense (EPD), which aggregates and evaluates the outputs of a knowledge-injected LLM, a base LLM, and a dedicated judge model to enhance resistance against MIAs. Comprehensive experiments show that, on average, EPD reduces MIA success by up to 27.8\% for SFT and 526.3\% for RAG compared to inference-time baseline, while maintaining answer quality.

#5 Lost in Modality: Evaluating the Effectiveness of Text-Based Membership Inference Attacks on Large Multimodal Models

privacy

著者: Ziyi Tong, Feifei Sun, Le Minh Nguyen

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03121

要約:
Large Multimodal Language Models (MLLMs) are emerging as one of the foundational tools in an expanding range of applications. Consequently, understanding training-data leakage in these systems is increasingly critical. Log-probability-based membership inference attacks (MIAs) have become a widely adopted approach for assessing data exposure in large language models (LLMs), yet their effect in MLLMs remains unclear. We present the first comprehensive evaluation of extending these text-based MIA methods to multimodal settings. Our experiments under vision-and-text (V+T) and text-only (T-only) conditions across the DeepSeek-VL and InternVL model families show that in in-distribution settings, logit-based MIAs perform comparably across configurations, with a slight V+T advantage. Conversely, in out-of-distribution settings, visual inputs act as regularizers, effectively masking membership signals.

#6 Technical Report: The Need for a (Research) Sandstorm through the Privacy Sandbox

privacy

著者: Yohan Beugin, Patrick McDaniel

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03207

要約:
The Privacy Sandbox, launched in 2019, is a series of proposals from Google to reduce ``cross-site and cross-app tracking while helping to keep online content and services free for all''. Over the years, Google implemented, experimented, and deprecated some of these APIs into their own products (Chrome, Android, etc.) which raised concerns about the potential of these mechanisms to fundamentally disrupt the advertising, mobile, and web ecosystems. As a result, it is paramount for researchers to understand the consequences that these new technologies, and future ones, will have on billions of users if and when deployed. In this report, we outline our call for privacy, security, usability, and utility evaluations of these APIs, our efforts materialized through the creation and operation of Privacy Sandstorm (https://privacysandstorm.github.io); a research portal to systematically gather resources (overview, analyses, artifacts, etc.) about such proposals. We find that our inventory provides a better visibility and broader perspective on the research findings in that space than what Google lets show through official channels.

#7 How to DP-fy Your Data: A Practical Guide to Generating Synthetic Data With Differential Privacy

privacysynthetic data

著者: Natalia Ponomareva, Zheng Xu, H. Brendan McMahan, Peter Kairouz, Lucas Rosenblatt, Vincent Cohen-Addad, Crist\'obal Guzm\'an, Ryan McKenna, Galen Andrew, Alex Bie, Da Yu, Alex Kurakin, Morteza Zadimoghaddam, Sergei Vassilvitskii, Andreas Terzis

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03238

要約:
High quality data is needed to unlock the full potential of AI for end users. However finding new sources of such data is getting harder: most publicly-available human generated data will soon have been used. Additionally, publicly available data often is not representative of users of a particular system -- for example, a research speech dataset of contractors interacting with an AI assistant will likely be more homogeneous, well articulated and self-censored than real world commands that end users will issue. Therefore unlocking high-quality data grounded in real user interactions is of vital interest. However, the direct use of user data comes with significant privacy risks. Differential Privacy (DP) is a well established framework for reasoning about and limiting information leakage, and is a gold standard for protecting user privacy. The focus of this work, \emph{Differentially Private Synthetic data}, refers to synthetic data that preserves the overall trends of source data,, while providing strong privacy guarantees to individuals that contributed to the source dataset. DP synthetic data can unlock the value of datasets that have previously been inaccessible due to privacy concerns and can replace the use of sensitive datasets that previously have only had rudimentary protections like ad-hoc rule-based anonymization. In this paper we explore the full suite of techniques surrounding DP synthetic data, the types of privacy protections they offer and the state-of-the-art for various modalities (image, tabular, text and decentralized). We outline all the components needed in a system that generates DP synthetic data, from sensitive data handling and preparation, to tracking the use and empirical privacy testing. We hope that work will result in increased adoption of DP synthetic data, spur additional research and increase trust in DP synthetic data approaches.

#8 Empirical assessment of the perception of graphical threat model acceptability

著者: Nathan D. Schiele, Olga Gadyatskaya

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03351

要約:
Threat modeling (TM) is an important aspect of risk analysis and secure software engineering. Graphical threat models are a recommended tool to analyze and communicate threat information. However, the comparison of different graphical threat models, and the acceptability of these threat models for an audience with a limited technical background, is not well understood, despite these users making up a sizable portion of the cybersecurity industry. We seek to compare the acceptability of three general, graphical threat models, Attack-Defense Trees (ADTs), Attack Graphs (AGs), and CORAS, for users with a limited technical background. We conducted a laboratory study with 38 bachelor students who completed tasks with the three threat models across three different scenarios assigned using a Latin square design. Threat model submissions were qualitatively analyzed, and participants filled out a perception questionnaire based on the Method Evaluation Model (MEM). We find that both ADTs and CORAS are broadly acceptable for a wide range of scenarios, and both could be applied successfully by users with a limited technical background; further, we also find that the lack of a specific tool for AGs may have impacted the perceived usefulness of AGs. We can recommend that users with a limited technical background use ADTs or CORAS as a general graphical TM method. Further research on the acceptability of AGs to such an audience and the effect of a dedicated TM tool support is needed.

#9 Immunity memory-based jailbreak detection: multi-agent adaptive guard for large language models

agent

著者: Jun Leng, Litian Zhang, Xi Zhang

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03356

要約:
Large language models (LLMs) have become foundational in AI systems, yet they remain vulnerable to adversarial jailbreak attacks. These attacks involve carefully crafted prompts that bypass safety guardrails and induce models to produce harmful content. Detecting such malicious input queries is therefore critical for maintaining LLM safety. Existing methods for jailbreak detection typically involve fine-tuning LLMs as static safety LLMs using fixed training datasets. However, these methods incur substantial computational costs when updating model parameters to improve robustness, especially in the face of novel jailbreak attacks. Inspired by immunological memory mechanisms, we propose the Multi-Agent Adaptive Guard (MAAG) framework for jailbreak detection. The core idea is to equip guard with memory capabilities: upon encountering novel jailbreak attacks, the system memorizes attack patterns, enabling it to rapidly and accurately identify similar threats in future encounters. Specifically, MAAG first extracts activation values from input prompts and compares them to historical activations stored in a memory bank for quick preliminary detection. A defense agent then simulates responses based on these detection results, and an auxiliary agent supervises the simulation process to provide secondary filtering of the detection outcomes. Extensive experiments across five open-source models demonstrate that MAAG significantly outperforms state-of-the-art (SOTA) methods, achieving 98% detection accuracy and a 96% F1-score across a diverse range of attack scenarios.

#10 Scaling Trust in Quantum Federated Learning: A Multi-Protocol Privacy Design

privacy

著者: Dev Gurung, Shiva Raj Pokhrel

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03358

要約:
Quantum Federated Learning (QFL) promises to revolutionize distributed machine learning by combining the computational power of quantum devices with collaborative model training. Yet, privacy of both data and models remains a critical challenge. In this work, we propose a privacy-preserving QFL framework where a network of $n$ quantum devices trains local models and transmits them to a central server under a multi-layered privacy protocol. Our design leverages Singular Value Decomposition (SVD), Quantum Key Distribution (QKD), and Analytic Quantum Gradient Descent (AQGD) to secure data preparation, model sharing, and training stages. Through theoretical analysis and experiments on contemporary quantum platforms and datasets, we demonstrate that the framework robustly safeguards data and model confidentiality while maintaining training efficiency.

#11 Rethinking Security in Semantic Communication: Latent Manipulation as a New Threat

著者: Zhiyuan Xi, Kun Zhu

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03361

要約:
Deep learning-based semantic communication (SemCom) has emerged as a promising paradigm for next-generation wireless networks, offering superior transmission efficiency by extracting and conveying task-relevant semantic latent representations rather than raw data. However, the openness of the wireless medium and the intrinsic vulnerability of semantic latent representations expose such systems to previously unrecognized security risks. In this paper, we uncover a fundamental latent-space vulnerability that enables Man-in-the-Middle (MitM) attacker to covertly manipulate the transmitted semantics while preserving the statistical properties of the transmitted latent representations. We first present a Diffusion-based Re-encoding Attack (DiR), wherein the attacker employs a diffusion model to synthesize an attacker-designed semantic variant, and re-encodes it into a valid latent representation compatible with the SemCom decoder. Beyond this model-dependent pathway, we further propose a model-agnostic and training-free Test-Time Adaptation Latent Manipulation attack (TTA-LM), in which the attacker perturbs and steers the intercepted latent representation toward an attacker-specified semantic target by leveraging the gradient of a target loss function. In contrast to diffusion-based manipulation, TTA-LM does not rely on any generative model and does not impose modality-specific or task-specific assumptions, thereby enabling efficient and broadly applicable latent-space tampering across diverse SemCom architectures. Extensive experiments on representative semantic communication architectures demonstrate that both attacks can significantly alter the decoded semantics while preserving natural latent-space distributions, making the attacks covert and difficult to detect.

#12 HarnessAgent: Scaling Automatic Fuzzing Harness Construction with Tool-Augmented LLM Pipelines

agent

著者: Kang Yang, Yunhang Zhang, Zichuan Li, GuanHong Tao, Jun Xu, XiaoJing Liao

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03420

要約:
Large language model (LLM)-based techniques have achieved notable progress in generating harnesses for program fuzzing. However, applying them to arbitrary functions (especially internal functions) \textit{at scale} remains challenging due to the requirement of sophisticated contextual information, such as specification, dependencies, and usage examples. State-of-the-art methods heavily rely on static or incomplete context provisioning, causing failure of generating functional harnesses. Furthermore, LLMs tend to exploit harness validation metrics, producing plausible yet logically useless code. % Therefore, harness generation across large and diverse projects continues to face challenges in reliable compilation, robust code retrieval, and comprehensive validation. To address these challenges, we present HarnessAgent, a tool-augmented agentic framework that achieves fully automated, scalable harness construction over hundreds of OSS-Fuzz targets. HarnessAgent introduces three key innovations: 1) a rule-based strategy to identify and minimize various compilation errors; 2) a hybrid tool pool for precise and robust symbol source code retrieval; and 3) an enhanced harness validation pipeline that detects fake definitions. We evaluate HarnessAgent on 243 target functions from OSS-Fuzz projects (65 C projects and 178 C++ projects). It improves the three-shot success rate by approximately 20\% compared to state-of-the-art techniques, reaching 87\% for C and 81\% for C++. Our one-hour fuzzing results show that more than 75\% of the harnesses generated by HarnessAgent increase the target function coverage, surpassing the baselines by over 10\%. In addition, the hybrid tool-pool system of HarnessAgent achieves a response rate of over 90\% for source code retrieval, outperforming Fuzz Introspector by more than 30\%.

#13 In-Situ Encryption of Single-Transistor Nonvolatile Memories without Density Loss

著者: Sanwar Ahmed Ovy, Jiahui Duan, Md Ashraful Islam Romel, Franz Muller, Thomas Kampfe, Kai Ni, Sumitha George

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03461

要約:
Non-volatile memories (NVMs) offer negligible leakage power consumption, high integration density, and data retention, but their non-volatility also raises the risk of data exposure. Conventional encryption techniques such as the Advanced Encryption Standard (AES) incur large area overheads and performance penalties, motivating lightweight XOR-based in-situ encryption schemes with low area and power requirements. This work proposes an ultra-dense single-transistor encrypted cell using ferroelectric FET (FeFET) devices, which, to our knowledge, is the first to eliminate the two-memory-devices-per-encrypted-cell requirement in XOR-based schemes, enabling encrypted memory arrays to maintain the same number of storage devices as unencrypted arrays. The key idea is an in-memory single-FeFET XOR scheme, where the ciphertext is encoded in the device threshold voltage and leverages the direction-dependent current flow of the FeFET for single-cycle decryption; eliminating complementary bit storage also removes the need for two write cycles, allowing faster encryption. We extend the approach to multi-level-cell (MLC) FeFETs to store multiple bits per transistor. We validate the proposed idea through both simulation and experimental evaluations. Our analysis on a 128x128-bit array shows 2x higher encryption/decryption throughput than prior FeFET work and 45.2x/14.12x improvement over AES, while application-level evaluations using neural-network benchmarks demonstrate average latency reductions of 50% and 95% compared to prior FeFET-based and AES-based schemes, respectively.

#14 A Hybrid Deep Learning and Anomaly Detection Framework for Real-Time Malicious URL Classification

著者: Berkani Khaled, Zeraoulia Rafik

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03462

要約:
Malicious URLs remain a primary vector for phishing, malware, and cyberthreats. This study proposes a hybrid deep learning framework combining \texttt{HashingVectorizer} n-gram analysis, SMOTE balancing, Isolation Forest anomaly filtering, and a lightweight neural network classifier for real-time URL classification. The multi-stage pipeline processes URLs from open-source repositories with statistical features (length, dot count, entropy), achieving $O(NL + EBdh)$ training complexity and a 20\,ms prediction latency. Empirical evaluation yields 96.4\% accuracy, 95.4\% F1-score, and 97.3\% ROC-AUC, outperforming CNN (94.8\%) and SVM baselines with a $50\!\times$--$100\!\times$ speedup (Table~\ref{tab:comp-complexity}). A multilingual Tkinter GUI (Arabic/English/French) enables real-time threat assessment with clipboard integration. The framework demonstrates superior scalability and resilience against obfuscated URL patterns.

#15 Tuning for TraceTarnish: Techniques, Trends, and Testing Tangible Traits

著者: Robert Dilworth

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03465

要約:
In this study, we more rigorously evaluated our attack script $\textit{TraceTarnish}$, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To ensure the efficacy and utility of our attack, we sourced, processed, and analyzed Reddit comments--comments that were later alchemized into $\textit{TraceTarnish}$ data--to gain valuable insights. The transformed $\textit{TraceTarnish}$ data was then further augmented by $\textit{StyloMetrix}$ to manufacture stylometric features--features that were culled using the Information Gain criterion, leaving only the most informative, predictive, and discriminative ones. Our results found that function words and function word types ($L\_FUNC\_A$ $\&$ $L\_FUNC\_T$); content words and content word types ($L\_CONT\_A$ $\&$ $L\_CONT\_T$); and the Type-Token Ratio ($ST\_TYPE\_TOKEN\_RATIO\_LEMMAS$) yielded significant Information-Gain readings. The identified stylometric cues--function-word frequencies, content-word distributions, and the Type-Token Ratio--serve as reliable indicators of compromise (IoCs), revealing when a text has been deliberately altered to mask its true author. Similarly, these features could function as forensic beacons, alerting defenders to the presence of an adversarial stylometry attack; granted, in the absence of the original message, this signal may go largely unnoticed, as it appears to depend on a pre- and post-transformation comparison. "In trying to erase a trace, you often imprint a larger one." Armed with this understanding, we framed $\textit{TraceTarnish}$'s operations and outputs around these five isolated features, using them to conceptualize and implement enhancements that further strengthen the attack.

#16 A User Centric Group Authentication Scheme for Secure Communication

著者: Oylum Gerenli, Gunes Karabulut-Kurt, Enver Ozdemir

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03551

要約:
Group Authentication Schemes (GAS) are methodologies developed to verify the membership of multiple users simultaneously. These schemes enable the concurrent authentication of several users while eliminating the need for a certification authority. Numerous GAS methods have been explored in the literature, and they can be classified into three distinct generations based on their foundational mathematical principles. First-generation GASs rely on polynomial interpolation and the multiplicative subgroup of a finite field. Second-generation GASs also employ polynomial interpolation, but they distinguish themselves by incorporating elliptic curves over finite fields. While third-generation GASs present a promising solution for scalable environments, they demonstrate a limitation in certain applications. Such applications typically require the identification of users participating in the authentication process. In the third-generation GAS, users are able to verify their credentials while maintaining anonymity. However, there are various applications where the identification of participating users is necessary. In this study, we propose an improved version of third-generation GAS, utilizing inner product spaces and polynomial interpolation to resolve this limitation. We address the issue of preventing malicious actions by legitimate group members. The current third-generation scheme allows members to share group credentials, which can jeopardize group confidentiality. Our proposed scheme mitigates this risk by eliminating the ability of individual users to distribute credentials. However, a potential limitation of our scheme is its reliance on a central authority for authentication in certain scenarios.

#17 SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting

著者: Hanxiu Zhang, Yue Zheng

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03620

要約:
The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While fingerprinting techniques have emerged as a fundamental mechanism for detecting unauthorized model usage, existing methods -- whether behavior-based or structural -- suffer from vulnerabilities such as false claim attacks or susceptible to weight manipulations. To overcome these limitations, we propose SELF, a novel intrinsic weight-based fingerprinting scheme that eliminates dependency on input and inherently resists false claims. SELF achieves robust IP protection through two key innovations: 1) unique, scalable and transformation-invariant fingerprint extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective neural network-based fingerprint similarity comparison based on few-shot learning and data augmentation. Experimental results demonstrate SELF maintains high IP infringement detection accuracy while showing strong robustness against various downstream modifications, including quantization, pruning, and fine-tuning attacks. Our code is available at https://github.com/HanxiuZhang/SELF_v2.

#18 A Descriptive Model for Modelling Attacker Decision-Making in Cyber-Deception

著者: B. R. Turner, O. Guidetti, N. M. Karie, R. Ryan, Y. Yan

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03641

要約:
Cyber-deception is an increasingly important defensive strategy, shaping adversarial decision making through controlled misinformation, uncertainty, and misdirection. Although game-theoretic, Bayesian, Markov decision process, and reinforcement learning models offer insight into deceptive interactions, they typically assume an attacker has already chosen to engage. Such approaches overlook cognitive and perceptual factors that influence an attacker's initial decision to engage or withdraw. This paper presents a descriptive model that incorporates the psychological and strategic elements shaping this decision. The model defines five components, belief (B), scepticism (S), deception fidelity (D), reconnaissance (R), and experience (E), which interact to capture how adversaries interpret deceptive cues and assess whether continued engagement is worthwhile. The framework provides a structured method for analysing engagement decisions in cyber-deception scenarios. A series of experiments has been designed to evaluate this model through Capture the Flag activities incorporating varying levels of deception, supported by behavioural and biometric observations. These experiments have not yet been conducted, and no experimental findings are presented in this paper. These experiments will combine behavioural observations with biometric indicators to produce a multidimensional view of adversarial responses. Findings will improve understanding of the factors influencing engagement decisions and refine the model's relevance to real-world cyber-deception settings. By addressing the gap in existing models that presume engagement, this work supports more cognitively realistic and strategically effective cyber-deception practices.

#19 Towards Privacy-Preserving Range Queries with Secure Learned Spatial Index over Encrypted Data

privacy

著者: Zuan Wang, Juntao Lu, Jiazhuang Wu, Youliang Tian, Wei Song, Qiuxian Li, Duo Zhang

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03669

要約:
With the growing reliance on cloud services for large-scale data management, preserving the security and privacy of outsourced datasets has become increasingly critical. While encrypting data and queries can prevent direct content exposure, recent research reveals that adversaries can still infer sensitive information via access pattern and search path analysis. However, existing solutions that offer strong access pattern privacy often incur substantial performance overhead. In this paper, we propose a novel privacy-preserving range query scheme over encrypted datasets, offering strong security guarantees while maintaining high efficiency. To achieve this, we develop secure learned spatial index (SLS-INDEX), a secure learned index that integrates the Paillier cryptosystem with a hierarchical prediction architecture and noise-injected buckets, enabling data-aware query acceleration in the encrypted domain. To further obfuscate query execution paths, SLS-INDEXbased Range Queries (SLRQ) employs a permutation-based secure bucket prediction protocol. Additionally, we introduce a secure point extraction protocol that generates candidate results to reduce the overhead of secure computation. We provide formal security analysis under realistic leakage functions and implement a prototype to evaluate its practical performance. Extensive experiments on both real-world and synthetic datasets demonstrate that SLRQ significantly outperforms existing solutions in query efficiency while ensuring dataset, query, result, and access pattern privacy.

#20 Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMs

著者: Tengyun Ma, Jiaqi Yao, Daojing He, Shihao Peng, Yu Li, Shaohui Liu, Zhuotao Tian

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03720

要約:
Large Language Models (LLMs) have emerged as powerful tools for diverse applications. However, their uniform token processing paradigm introduces critical vulnerabilities in instruction handling, particularly when exposed to adversarial scenarios. In this work, we identify and propose a novel class of vulnerabilities, termed Tool-Completion Attack (TCA), which exploits function-calling mechanisms to subvert model behavior. To evaluate LLM robustness against such threats, we introduce the Tool-Completion benchmark, a comprehensive security assessment framework, which reveals that even state-of-the-art models remain susceptible to TCA, with surprisingly high attack success rates. To address these vulnerabilities, we introduce Context-Aware Hierarchical Learning (CAHL), a sophisticated mechanism that dynamically balances semantic comprehension with role-specific instruction constraints. CAHL leverages the contextual correlations between different instruction segments to establish a robust, context-aware instruction hierarchy. Extensive experiments demonstrate that CAHL significantly enhances LLM robustness against both conventional attacks and the proposed TCA, exhibiting strong generalization capabilities in zero-shot evaluations while still preserving model performance on generic tasks. Our code is available at https://github.com/S2AILab/CAHL.

#21 The Treasury Proof Ledger: A Cryptographic Framework for Accountable Bitcoin Treasuries

著者: Jose E. Puente, Carlos Puente

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03765

要約:
Public companies and institutional investors that hold Bitcoin face increasing pressure to show solvency, manage risk, and satisfy regulatory expectations without exposing internal wallet structures or trading strategies. This paper introduces the Treasury Proof Ledger (TPL), a Bitcoin-anchored logging framework for multi-domain Bitcoin treasuries that treats on-chain and off-chain exposures as a conserved state machine with an explicit fee sink. A TPL instance records proof-of-reserves snapshots, proof-of-transit receipts for movements between domains, and policy metadata, and it supports restricted views based on stakeholder permissions. We define an idealised TPL model, represent Bitcoin treasuries as multi-domain exposure vectors, and give deployment-level security notions including exposure soundness, policy completeness, non-equivocation, and privacy-compatible policy views. We then outline how practical, restricted forms of these guarantees can be achieved by combining standard proof-of-reserves and proof-of-transit techniques with hash-based commitments anchored on Bitcoin. The results are existence-type statements: they show which guarantees are achievable once economic and governance assumptions are set, without claiming that any current system already provides them. A stylised corporate-treasury example illustrates how TPL could support responsible transparency policies and future cross-institution checks consistent with Bitcoin's fixed monetary supply.

#22 "MCP Does Not Stand for Misuse Cryptography Protocol": Uncovering Cryptographic Misuse in Model Context Protocol at Scale

著者: Biwei Yan, Yue Zhang, Minghui Xu, Hao Wu, Yechao Zhang, Kun Li, Guoming Zhang, Xiuzhen Cheng

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03775

要約:
The Model Context Protocol (MCP) is rapidly emerging as the middleware for LLM-based applications, offering a standardized interface for tool integration. However, its built-in security mechanisms are minimal: while schemas and declarations prevent malformed requests, MCP provides no guarantees of authenticity or confidentiality, forcing developers to implement cryptography themselves. Such ad hoc practices are historically prone to misuse, and within MCP they threaten sensitive data and services. We present MICRYSCOPE, the first domain-specific framework for detecting cryptographic misuses in MCP implementations. MICRYSCOPE combines three key innovations: a cross-language intermediate representation that normalizes cryptographic APIs across diverse ecosystems, a hybrid dependency analysis that uncovers explicit and implicit function relationships (including insecure runtime compositions orchestrated by LLMs) and a taint-based misuse detector that tracks sensitive data flows and flags violations of established cryptographic rules. Applying MICRYSCOPE to 9,403 MCP servers, we identified 720 with cryptographic logic, of which 19.7% exhibited misuses. These flaws are concentrated in certain markets (e.g., Smithery Registry with 42% insecure servers), languages (Python at 34% misuse rate), and categories (Developer Tools and Data Science & ML accounting for over 50% of all misuses). Case studies reveal real-world consequences, including leaked API keys, insecure DES/ECB tools, and MD5-based authentication bypasses. Our study establishes the first ecosystem-wide view of cryptographic misuse in MCP and provides both tools and insights to strengthen the security foundations of this rapidly growing protocol.

#23 CCN: Decentralized Cross-Chain Channel Networks Supporting Secure and Privacy-Preserving Multi-Hop Interactions

privacy

著者: Minghui Xu, Yihao Guo, Yanqiang Zhang, Zhiguang Shan, Guangyong Shang, Zhen Ma, Bin Xiao, Xiuzhen Cheng

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03791

要約:
Cross-chain technology enables interoperability among otherwise isolated blockchains, supporting interactions across heterogeneous networks. Similar to how multi-hop communication became fundamental in the evolution of the Internet, the demand for multi-hop cross-chain interactions is gaining increasing attention. However, this growing demand introduces new security and privacy challenges. On the security side, multi-hop interactions depend on the availability of multiple participating nodes. If any node becomes temporarily offline during execution, the protocol may fail to complete correctly, leading to settlement failure or fund loss. On the privacy side, the need for on-chain transparency to validate intermediate states may unintentionally leak linkable information, compromising the unlinkability of user interactions. In this paper, we propose the Cross-Chain Channel Network (CCN), a decentralized network designed to support secure and privacy-preserving multi-hop cross-chain transactions. Through experimental evaluation, we identify two critical types of offline failures, referred to as active and passive offline cases, which have not been adequately addressed by existing solutions. To mitigate these issues, we introduce R-HTLC, a core protocol within CCN. R-HTLC incorporates an hourglass mechanism and a multi-path refund strategy to ensure settlement correctness even when some nodes go offline during execution. Importantly, CCN addresses not only the correctness under offline conditions but also maintains unlinkability in such adversarial settings. To overcome this, CCN leverages zero-knowledge proofs and off-chain coordination, ensuring that interaction relationships remain indistinguishable even when certain nodes are temporarily offline.

#24 Unfolding Challenges in Securing and Regulating Unmanned Air Vehicles

著者: Sonali Rout, Vireshwar Kumar

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03792

要約:
Unmanned Aerial Vehicles (UAVs) or drones are being introduced in a wide range of commercial applications. This has also made them prime targets of attackers who compromise their fundamental security properties, including confidentiality, integrity, and availability. As researchers discover novel threat vectors in UAVs, the government and industry are increasingly concerned about their limited ability to secure and regulate UAVs and their usage. With the aim of unfolding a path for a large-scale commercial UAV network deployment, we conduct a comprehensive state-of-the-art study and examine the prevailing security challenges. Unlike the prior art, we focus on uncovering the research gaps that must be addressed to enforce security policy regulations in civilian off-the-shelf drone systems. To that end, we first examine the known security threats to UAVs based on their impact and effectiveness. We then analyze existing countermeasures to prevent, detect, and respond to these threats in terms of security and performance overhead. We further outline the future research directions for securing UAVs. Finally, we establish the fundamental requirements and highlight critical research challenges in introducing a regulatory entity to achieve a secure and regulated UAV network.

#25 Watermarks for Embeddings-as-a-Service Large Language Models

intellectual property

著者: Anudeex Shetty

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03079

要約:
Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. Based on these LLMs, businesses have started to provide Embeddings-as-a-Service (EaaS), offering feature extraction capabilities (in the form of text embeddings) that benefit downstream natural language processing tasks. However, prior research has demonstrated that EaaS is vulnerable to imitation attacks, where an attacker clones the service's model in a black-box manner without access to the model's internal workings. In response, watermarks have been added to the text embeddings to protect the intellectual property of EaaS providers by allowing them to check for model ownership. This thesis focuses on defending against imitation attacks by investigating EaaS watermarks. To achieve this goal, we unveil novel attacks and propose and validate new watermarking techniques. Firstly, we show that existing EaaS watermarks can be removed through paraphrasing the input text when attackers clone the model during imitation attacks. Our study illustrates that paraphrasing can effectively bypass current state-of-the-art EaaS watermarks across various attack setups (including different paraphrasing techniques and models) and datasets in most instances. This demonstrates a new vulnerability in recent EaaS watermarking techniques. Subsequently, as a countermeasure, we propose a novel watermarking technique, WET (Watermarking EaaS with Linear Transformation), which employs linear transformation of the embeddings. Watermark verification is conducted by applying a reverse transformation and comparing the similarity between recovered and original embeddings. We demonstrate its robustness against paraphrasing attacks with near-perfect verifiability. We conduct detailed ablation studies to assess the significance of each component and hyperparameter in WET.

#26 Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs

privacy

著者: Kunj Joshi, David A. Smith

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03310

要約:
The current literature on memorization in Natural Language Models, especially Large Language Models (LLMs), poses severe security and privacy risks, as models tend to memorize personally identifying information (PIIs) from training data. We introduce Randomized Masked Fine-Tuning (RMFT), a novel privacy-preserving fine-tuning technique that reduces PII memorization while minimizing performance impact. Using the Enron Email Dataset, we demonstrate that RMFT achieves an 80.81% reduction in Total Extraction Rate and 80.17% reduction in Seen Extraction Rate compared to baseline fine-tuning, outperforming deduplication methods while maintaining only a 5.73% increase in perplexity. We present MaxTER, a Pareto-optimal evaluation framework for assessing privacy-utility tradeoffs, and show the performance of RMFT vs Deduplication by Area Under The Response Curve (AURC) metric.

#27 Mobility Induced Sensitivity of UAV based Nodes to Jamming in Private 5G Airfield Networks An Experimental Study

privacy

著者: Pavlo Mykytyn, Ronald Chitauro, Onur Yener, Peter Langendoerfer

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03536

要約:
This work presents an experimental performance evaluation of a private 5G airfield network under controlled directional SDR jamming attacks targeting UAV-based UE nodes. Using a QualiPoc Android UE, mounted as a payload on a quadcopter UAV, we conducted a series of experiments to evaluate signal degradation, handover performance, and ser-vice stability in the presence of constant directional jamming. The conducted experiments aimed to examine the effects of varying travel speeds, altitudes, and moving patterns of a UAV-based UE to record and analyze the key physical-layer and network-layer metrics such as CQI, MCS, RSRP, SINR, BLER, Net PDSCH Throughput and RLF. The re-sults of this work describe the link stability and signal degradation dependencies, caused by the level of mobility of the UAV-based UE nodes during autonomous and automatic operation in private 5G Airfield networks

#28 Towards Irreversible Machine Unlearning for Diffusion Models

privacydiffusion

著者: Xun Yuan, Zilong Zhao, Jiayu Li, Aryan Pasikhani, Prosanta Gope, Biplab Sikdar

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03564

要約:
Diffusion models are renowned for their state-of-the-art performance in generating synthetic images. However, concerns related to safety, privacy, and copyright highlight the need for machine unlearning, which can make diffusion models forget specific training data and prevent the generation of sensitive or unwanted content. Current machine unlearning methods for diffusion models are primarily designed for conditional diffusion models and focus on unlearning specific data classes or features. Among these methods, finetuning-based machine unlearning methods are recognized for their efficiency and effectiveness, which update the parameters of pre-trained diffusion models by minimizing carefully designed loss functions. However, in this paper, we propose a novel attack named Diffusion Model Relearning Attack (DiMRA), which can reverse the finetuning-based machine unlearning methods, posing a significant vulnerability of this kind of technique. Without prior knowledge of the unlearning elements, DiMRA optimizes the unlearned diffusion model on an auxiliary dataset to reverse the unlearning, enabling the model to regenerate previously unlearned elements. To mitigate this vulnerability, we propose a novel machine unlearning method for diffusion models, termed as Diffusion Model Unlearning by Memorization (DiMUM). Unlike traditional methods that focus on forgetting, DiMUM memorizes alternative data or features to replace targeted unlearning data or features in order to prevent generating such elements. In our experiments, we demonstrate the effectiveness of DiMRA in reversing state-of-the-art finetuning-based machine unlearning methods for diffusion models, highlighting the need for more robust solutions. We extensively evaluate DiMUM, demonstrating its superior ability to preserve the generative performance of diffusion models while enhancing robustness against DiMRA.

#29 Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

著者: Malte Bleeker, Mauro Gotsch

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03580

要約:
We propose the Dynamic Optical Test for Bot Identification (DOT-BI): a quick and easy method that uses human perception of motion to differentiate between human respondents and automated systems in surveys and online processes. In DOT-BI, a 'hidden' number is displayed with the same random black-and-white pixel texture as its background. Only the difference in motion and scale between the number and the background makes the number perceptible to humans across frames, while frame-by-frame algorithmic processing yields no meaningful signal. We conducted two preliminary assessments. Firstly, state-of-the-art, video-capable, multimodal models (GPT-5-Thinking and Gemini 2.5 Pro) fail to extract the correct value, even when given explicit instructions about the mechanism. Secondly, in an online survey (n=182), 99.5% (181/182) of participants solved the task, with an average end-to-end completion time of 10.7 seconds; a supervised lab study (n=39) found no negative effects on perceived ease-of-use or completion time relative to a control. We release code to generate tests and 100+ pre-rendered variants to facilitate adoption in surveys and online processes.

#30 In-Context Representation Hijacking

著者: Itay Yona, Amir Sarid, Michael Karasik, Yossi Gandelsman

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03771

要約:
We introduce \textbf{Doublespeak}, a simple \emph{in-context representation hijacking} attack against large language models (LLMs). The attack works by systematically replacing a harmful keyword (e.g., \textit{bomb}) with a benign token (e.g., \textit{carrot}) across multiple in-context examples, provided a prefix to a harmful request. We demonstrate that this substitution leads to the internal representation of the benign token converging toward that of the harmful one, effectively embedding the harmful semantics under a euphemism. As a result, superficially innocuous prompts (e.g., ``How to build a carrot?'') are internally interpreted as disallowed instructions (e.g., ``How to build a bomb?''), thereby bypassing the model's safety alignment. We use interpretability tools to show that this semantic overwrite emerges layer by layer, with benign meanings in early layers converging into harmful semantics in later ones. Doublespeak is optimization-free, broadly transferable across model families, and achieves strong success rates on closed-source and open-source systems, reaching 74\% ASR on Llama-3.3-70B-Instruct with a single-sentence context override. Our findings highlight a new attack surface in the latent space of LLMs, revealing that current alignment strategies are insufficient and should instead operate at the representation level.

#31 Log Probability Tracking of LLM APIs

著者: Timoth\'ee Chauvin, Erwan Le Merrer, Fran\c{c}ois Ta\"iani, Gilles Tredan

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03816

要約:
When using an LLM through an API provider, users expect the served model to remain consistent over time, a property crucial for the reliability of downstream applications and the reproducibility of research. Existing audit methods are too costly to apply at regular time intervals to the wide range of available LLM APIs. This means that model updates are left largely unmonitored in practice. In this work, we show that while LLM log probabilities (logprobs) are usually non-deterministic, they can still be used as the basis for cost-effective continuous monitoring of LLM APIs. We apply a simple statistical test based on the average value of each token logprob, requesting only a single token of output. This is enough to detect changes as small as one step of fine-tuning, making this approach more sensitive than existing methods while being 1,000x cheaper. We introduce the TinyChange benchmark as a way to measure the sensitivity of audit methods in the context of small, realistic model changes.

#32 A Comprehensive Study on the Impact of Vulnerable Dependencies on Open-Source Software

著者: Shree Hari Bittugondanahalli Indra Kumar, Lilia Rodrigues Sampaio, Andr\'e Martin, Andrey Brito, Christof Fetzer

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.03868

要約:
Open-source libraries are widely used by software developers to speed up the development of products, however, they can introduce security vulnerabilities, leading to incidents like Log4Shell. With the expanding usage of open-source libraries, it becomes even more imperative to comprehend and address these dependency vulnerabilities. The use of Software Composition Analysis (SCA) tools does greatly help here as they provide a deep insight on what dependencies are used in a project, enhancing the security and integrity in the software supply chain. In order to learn how wide spread vulnerabilities are and how quickly they are being fixed, we conducted a study on over 1k open-source software projects with about 50k releases comprising several languages such as Java, Python, Rust, Go, Ruby, PHP, and JavaScript. Our objective is to investigate the severity, persistence, and distribution of these vulnerabilities, as well as their correlation with project metrics such as team and contributors size, activity and release cycles. In order to perform such analysis, we crawled over 1k projects from github including their version history ranging from 2013 to 2023 using VODA, our SCA tool. Using our approach, we can provide information such as library versions, dependency depth, and known vulnerabilities, and how they evolved over the software development cycle. Being larger and more diverse than datasets used in earlier works and studies, ours provides better insights and generalizability of the gained results. The data collected answers several research questions about the dependency depth and the average time a vulnerability persists. Among other findings, we observed that for most programming languages, vulnerable dependencies are transitive, and a critical vulnerability persists in average for over a year before being fixed.

#33 Efficient Public Verification of Private ML via Regularization

privacy

著者: Zo\"e Ruha Bell, Anvith Thudi, Olive Franzese-McLaughlin, Nicolas Papernot, Shafi Goldwasser

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.04008

要約:
Training with differential privacy (DP) provides a guarantee to members in a dataset that they cannot be identified by users of the released model. However, those data providers, and, in general, the public, lack methods to efficiently verify that models trained on their data satisfy DP guarantees. The amount of compute needed to verify DP guarantees for current algorithms scales with the amount of compute required to train the model. In this paper we design the first DP algorithm with near optimal privacy-utility trade-offs but whose DP guarantees can be verified cheaper than training. We focus on DP stochastic convex optimization (DP-SCO), where optimal privacy-utility trade-offs are known. Here we show we can obtain tight privacy-utility trade-offs by privately minimizing a series of regularized objectives and only using the standard DP composition bound. Crucially, this method can be verified with much less compute than training. This leads to the first known DP-SCO algorithm with near optimal privacy-utility whose DP verification scales better than training cost, significantly reducing verification costs on large datasets.

#34 MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking

intellectual property

著者: Yizhou Zhao, Zhiwei Steven Wu, Adam Block

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.04044

要約:
Watermarking aims to embed hidden signals in generated text that can be reliably detected when given access to a secret key. Open-weight language models pose acute challenges for such watermarking schemes because the inference-time interventions that dominate contemporary approaches cannot be enforced once model weights are public. Existing watermaking techniques for open-weight models, such as the recently proposed GaussMark, typically rely on small modifications to model weights, which can yield signals detectable to those equipped with a secret key, but achieving detection power comparable to inference-time watermarks generally requires weight perturbations that noticeably reduce generation quality. We introduce MarkTune, a theoretically principled, on-policy fine-tuning framework that treats the GaussMark signal as a reward while simultaneously regularizing against degradation in text quality. We derive MarkTune as an improvement on GaussMark and demonstrate that MarkTune consistently improves the quality-detectability trade-off over GaussMark by steering finer-grained, watermark-aware weight updates within the model's representation space while preserving generation quality. Empirically, we show that MarkTune pushes the quality-detectability frontier of GaussMark close to that of inference-time watermarking, remains robust to paraphrasing and fine-tuning attacks, and exhibits strong generalization: a model fine-tuned on one dataset retains substantial watermark detection power on unseen datasets. Together, these results establish MarkTune as a general strategy for embedding robust, high-quality watermarks into open-weight LMs.

#35 Exercising the CCPA Opt-out Right on Android: Legally Mandated but Practically Challenging

著者: Sebastian Zimmeck, Nishant Aggarwal, Zachary Liu, Sage Altman, Konrad Kollnig

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2407.14938

要約:
The business model of many mobile apps is based on generating revenue from sharing user data with ad networks and other companies to deliver personalized ads. The California Consumer Privacy Act (CCPA) gives California residents a right to opt out of the selling and sharing of their personal information. In two experiments we evaluate to which extent popular apps on the Android platform enable California residents to exercise their CCPA opt-out right. In our first experiment -- manually exercising the opt-out right via app-level UIs for a set of 100 apps -- we find that only 48 apps implement the legally mandated setting, which suggests that CCPA opt-out right non-compliance is a broader issue on the platform. In our second experiment -- automatically exercising the opt-out right at the platform-level by sending Global Privacy Control (GPC) signals -- we find for an app dataset of $1,811$ apps that GPC is largely ineffective. While we estimate with 95% confidence that 62%--81% of apps in our app dataset must respect the CCPA's opt-out right, many apps do not do so. Disabling apps' access to the AdID, which is not intended for exercising the CCPA opt-out right but could have practical effect in this regard, does not lead to a different result. For example, when sending GPC signals and disabling apps' access to the AdID, 338 apps still had the ccpa status of the ad network Vungle set to opted in while only 26 had set it to opted out. Overall, our results suggest a compliance gap as California residents have no effective way of exercising their CCPA opt-out right on the Android platform; neither at the app- nor at the platform-level. We think that re-purposing the Android AdID setting as an opt-out right setting with legal meaning could resolve this compliance gap under the CCPA and other laws and improve users' privacy on the platform overall.

#36 Privacy is All You Need: Revolutionizing Wearable Health Data with Advanced PETs

privacy

著者: Karthik Barma

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.03428

要約:
In a world where data is the new currency, wearable health devices offer unprecedented insights into daily life, continuously monitoring vital signs and metrics. However, this convenience raises privacy concerns, as these devices collect sensitive data that can be misused or breached. Traditional measures often fail due to real-time data processing needs and limited device power. Users also lack awareness and control over data sharing and usage. We propose a Privacy-Enhancing Technology (PET) framework for wearable devices, integrating federated learning, lightweight cryptographic methods, and selectively deployed blockchain technology. The blockchain acts as a secure ledger triggered only upon data transfer requests, granting users real-time notifications and control. By dismantling data monopolies, this approach returns data sovereignty to individuals. Through real-world applications like secure medical data sharing, privacy-preserving fitness tracking, and continuous health monitoring, our framework reduces privacy risks by up to 70 percent while preserving data utility and performance. This innovation sets a new benchmark for wearable privacy and can scale to broader IoT ecosystems, including smart homes and industry. As data continues to shape our digital landscape, our research underscores the critical need to maintain privacy and user control at the forefront of technological progress.

#37 Les Dissonances: Cross-Tool Harvesting and Polluting in Pool-of-Tools Empowered LLM Agents

agent

著者: Zichuan Li, Jian Cui, Xiaojing Liao, Luyi Xing

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2504.03111

要約:
Large Language Model (LLM) agents are autonomous systems powered by LLMs, capable of reasoning and planning to solve problems by leveraging a set of tools. However, the integration of multi-tool capabilities in LLM agents introduces challenges in securely managing tools, ensuring their compatibility, handling dependency relationships, and protecting control flows within LLM agent workflows. In this paper, we present the first systematic security analysis of task control flows in multi-tool-enabled LLM agents. We identify a novel threat, Cross-Tool Harvesting and Polluting (XTHP), which includes multiple attack vectors to first hijack the normal control flows of agent tasks, and then collect and pollute confidential or private information within LLM agent systems. To understand the impact of this threat, we developed Chord, a dynamic scanning tool designed to automatically detect real-world agent tools susceptible to XTHP attacks. Our evaluation of 66 real-world tools from the repositories of two major LLM agent development frameworks, LangChain and LlamaIndex, revealed a significant security concern: 75% are vulnerable to XTHP attacks, highlighting the prevalence of this threat.

#38 SafeGenes: Evaluating the Adversarial Robustness of Genomic Foundation Models

著者: Huixin Zhan, Clovis Barbour, Jason H. Moore

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.00821

要約:
Genomic Foundation Models (GFMs), such as Evolutionary Scale Modeling (ESM), have demonstrated significant success in variant effect prediction. However, their adversarial robustness remains largely unexplored. To address this gap, we propose SafeGenes: a framework for Secure analysis of genomic foundation models, leveraging adversarial attacks to evaluate robustness against both engineered near-identical adversarial Genes and embedding-space manipulations. In this study, we assess the adversarial vulnerabilities of GFMs using two approaches: the Fast Gradient Sign Method (FGSM) and a soft prompt attack. FGSM introduces minimal perturbations to input sequences, while the soft prompt attack optimizes continuous embeddings to manipulate model predictions without modifying the input tokens. By combining these techniques, SafeGenes provides a comprehensive assessment of GFM susceptibility to adversarial manipulation. Targeted soft prompt attacks induced severe degradation in MLM-based shallow architectures such as ProteinBERT, while still producing substantial failure modes even in high-capacity foundation models such as ESM1b and ESM1v. These findings expose critical vulnerabilities in current foundation models, opening new research directions toward improving their security and robustness in high-stakes genomic applications such as variant effect prediction.

#39 Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models

intellectual property

著者: Aloni Cohen

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.19881

要約:
Are there any conditions under which a generative model's outputs are guaranteed not to infringe the copyrights of its training data? This is the question of "provable copyright protection" first posed by Vyas, Kakade, and Barak (ICML 2023). They define near access-freeness (NAF) and propose it as sufficient for protection. This paper revisits the question and establishes new foundations for provable copyright protection -- foundations that are firmer both technically and legally. First, we show that NAF alone does not prevent infringement. In fact, NAF models can enable verbatim copying, a blatant failure of copy protection that we dub being tainted. Then, we introduce our blameless copy protection framework for defining meaningful guarantees, and instantiate it with clean-room copy protection. Clean-room copy protection allows a user to control their risk of copying by behaving in a way that is unlikely to copy in a counterfactual clean-room setting. Finally, we formalize a common intuition about differential privacy and copyright by proving that DP implies clean-room copy protection when the dataset is golden, a copyright deduplication requirement.

#40 SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

著者: Beitao Chen, Xinyu Lyu, Lianli Gao, Jingkuan Song, Heng Tao Shen

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.01513

要約:
By incorporating visual inputs, Multimodal Large Language Models (MLLMs) extend LLMs to support visual reasoning. However, this integration also introduces new vulnerabilities, making MLLMs susceptible to multimodal jailbreak attacks and hindering their safe deployment.Existing defense methods, including Image-to-Text Translation, Safe Prompting, and Multimodal Safety Tuning, attempt to address this by aligning multimodal inputs with LLMs' built-in safeguards.Yet, they fall short in uncovering root causes of multimodal vulnerabilities, particularly how harmful multimodal tokens trigger jailbreak in MLLMs? Consequently, they remain vulnerable to text-driven multimodal jailbreaks, often exhibiting overdefensive behaviors and imposing heavy training overhead.To bridge this gap, we present an comprehensive analysis of where, how and which harmful multimodal tokens bypass safeguards in MLLMs. Surprisingly, we find that less than 1% tokens in early-middle layers are responsible for inducing unsafe behaviors, highlighting the potential of precisely removing a small subset of harmful tokens, without requiring safety tuning, can still effectively improve safety against jailbreaks. Motivated by this, we propose Safe Prune-then-Restore (SafePTR), an training-free defense framework that selectively prunes harmful tokens at vulnerable layers while restoring benign features at subsequent layers.Without incurring additional computational overhead, SafePTR significantly enhances the safety of MLLMs while preserving efficiency. Extensive evaluations across three MLLMs and five benchmarks demonstrate SafePTR's state-of-the-art performance in mitigating jailbreak risks without compromising utility.

#41 Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference

privacy

著者: Zhifan Luo, Shuo Shao, Su Zhang, Lijing Zhou, Yuke Hu, Chenxu Zhao, Zhihao Liu, Zhan Qin

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.09442

要約:
The Key-Value (KV) cache, which stores intermediate attention computations (Key and Value pairs) to avoid redundant calculations, is a fundamental mechanism for accelerating Large Language Model (LLM) inference. However, this efficiency optimization introduces significant yet underexplored privacy risks. This paper provides the first comprehensive analysis of these vulnerabilities, demonstrating that an attacker can reconstruct sensitive user inputs directly from the KV-cache. We design and implement three distinct attack vectors: a direct Inversion Attack, a more broadly applicable and potent Collision Attack, and a semantic-based Injection Attack. These methods demonstrate the practicality and severity of KV-cache privacy leakage issues. To mitigate this, we propose KV-Cloak, a novel, lightweight, and efficient defense mechanism. KV-Cloak uses a reversible matrix-based obfuscation scheme, combined with operator fusion, to secure the KV-cache. Our extensive experiments show that KV-Cloak effectively thwarts all proposed attacks, reducing reconstruction quality to random noise. Crucially, it achieves this robust security with virtually no degradation in model accuracy and minimal performance overhead, offering a practical solution for trustworthy LLM deployment.

#42 Logic Encryption: This Time for Real

著者: Rupesh Raj Karn, Lakshmi Likhitha Mankali, Zeng Wang, Saideep Sreekumar, Prithwish Basu Roy, Ozgur Sinanoglu, Lilas Alrahis, Johann Knechtel

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.00833

要約:
Modern circuits face various threats like reverse engineering, theft of intellectual property (IP), side-channel attacks, etc. Here, we present a novel approach for IP protection based on logic encryption (LE). Unlike established schemes for logic locking, our work obfuscates the circuit's structure and functionality by encoding and encrypting the logic itself. We devise an end-to-end method for practical LE implementation based on standard cryptographic algorithms, key-bit randomization, simple circuit design techniques, and system-level synthesis operations, all in a correct-by-construction manner. Our extensive analysis demonstrates the remarkable efficacy of our scheme, outperforming prior art against a range of oracle-less attacks covering crucial threat vectors, all with lower design overheads. We provide a full open-source release.

#43 Sliced R\'enyi Pufferfish Privacy: Directional Additive Noise Mechanism and Private Learning with Gradient Clipping

privacy

著者: Tao Zhang, Yevgeniy Vorobeychik

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.01115

要約:
We study privatization mechanism design and privacy accounting in the Pufferfish family, addressing two practical gaps of Renyi Pufferfish Privacy (RPP): high-dimensional optimal transport (OT) calibration and the absence of a general, mechanism-agnostic composition rule for iterative learning. We introduce Sliced Renyi Pufferfish Privacy (SRPP), which replaces high-dimensional comparisons by directional ones over a set of unit vectors, enabling geometry-aware and tractable guarantees. To calibrate noise without high-dimensional OT, we propose sliced Wasserstein mechanisms that compute per-direction (1-D) sensitivities, yielding closed-form, statistically stable, and anisotropic calibrations. We further define SRPP Envelope (SRPE) as computable upper bounds that are tightly implementable by these sliced Wasserstein mechanisms. For iterative deep learning algorithms, we develop a decompose-then-compose SRPP-SGD scheme with gradient clipping based on a History-Uniform Cap (HUC), a pathwise bound on one-step directional changes that is uniform over optimization history, and a mean-square variant (ms-HUC) that leverages subsampling randomness to obtain on-average SRPP guarantees with improved utility. The resulting HUC and ms-HUC accountants aggregate per-iteration, per-direction Renyi costs and integrate naturally with moments-accountant style analyses. Finally, when multiple mechanisms are trained and privatized independently under a common slicing geometry, our analysis yields graceful additive composition in both worst-case and mean-square regimes. Our experiments indicate that the proposed SRPP-based methods achieve favorable privacy-utility trade-offs in both static and iterative settings.

#44 The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search

著者: Rongzhe Wei, Peizhi Niu, Xinjie Shen, Tony Tu, Yifan Li, Ruihan Wu, Eli Chien, Pin-Yu Chen, Olgica Milenkovic, Pan Li

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.01353

要約:
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Existing approaches overwhelmingly operate within the prompt-optimization paradigm: whether through traditional algorithmic search or recent agent-based workflows, the resulting prompts typically retain malicious semantic signals that modern guardrails are primed to detect. In contrast, we identify a deeper, largely overlooked vulnerability stemming from the highly interconnected nature of an LLM's internal knowledge. This structure allows harmful objectives to be realized by weaving together sequences of benign sub-queries, each of which individually evades detection. To exploit this loophole, we introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base. The CKA-Agent issues locally innocuous queries, uses model responses to guide exploration across multiple paths, and ultimately assembles the aggregated information to achieve the original harmful objective. Evaluated across state-of-the-art commercial LLMs (Gemini2.5-Flash/Pro, GPT-oss-120B, Claude-Haiku-4.5), CKA-Agent consistently achieves over 95% success rates even against strong guardrails, underscoring the severity of this vulnerability and the urgent need for defenses against such knowledge-decomposition attacks. Our codes are available at https://github.com/Graph-COM/CKA-Agent.

#45 COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers

著者: Junyu Wang, Changjia Zhu, Yuanbo Zhou, Lingyao Li, Xu He, Junjie Xiong

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.02318

要約:
This paper studies how multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We identify the attack surface where an adversary can cheaply automate CAPTCHA solving using off-the-shelf models. We evaluate 7 leading commercial and open-source MLLMs across 18 real-world CAPTCHA task types, measuring single-shot accuracy, success under limited retries, end-to-end latency, and per-solve cost. We further analyze the impact of task-specific prompt engineering and few-shot demonstrations on solver effectiveness. We reveal that MLLMs can reliably solve recognition-oriented and low-interaction CAPTCHA tasks at human-like cost and latency, whereas tasks requiring fine-grained localization, multi-step spatial reasoning, or cross-frame consistency remain significantly harder for current models. By examining the reasoning traces of such MLLMs, we investigate the underlying mechanisms of why models succeed/fail on specific CAPTCHA puzzles and use these insights to derive defense-oriented guidelines for selecting and strengthening CAPTCHA tasks. We conclude by discussing implications for platform operators deploying CAPTCHA as part of their abuse-mitigation pipeline.Code Availability (https://anonymous.4open.science/r/Captcha-465E/).

#46 Decryption Through Polynomial Ambiguity: Noise-Enhanced High-Memory Convolutional Codes for Post-Quantum Cryptography

著者: Meir Ariel

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.02822

要約:
We present a novel approach to post-quantum cryptography that employs directed-graph decryption of noise-enhanced high-memory convolutional codes. The proposed construction generates random-like generator matrices that effectively conceal algebraic structure and resist known structural attacks. Security is further reinforced by the deliberate injection of strong noise during decryption, arising from polynomial division: while legitimate recipients retain polynomial-time decoding, adversaries face exponential-time complexity. As a result, the scheme achieves cryptanalytic security margins surpassing those of Classic McEliece by factors exceeding 2^(200). Beyond its enhanced security, the method offers greater design flexibility, supporting arbitrary plaintext lengths with linear-time decryption and uniform per-bit computational cost, enabling seamless scalability to long messages. Practical deployment is facilitated by parallel arrays of directed-graph decoders, which identify the correct plaintext through polynomial ambiguity while allowing efficient hardware and software implementations. Altogether, the scheme represents a compelling candidate for robust, scalable, and quantum-resistant public-key cryptography.

#47 Privacy Risks and Preservation Methods in Explainable Artificial Intelligence: A Scoping Review

privacy

著者: Sonal Allana, Mohan Kankanhalli, Rozita Dara

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.02828

要約:
Explainable Artificial Intelligence (XAI) has emerged as a pillar of Trustworthy AI and aims to bring transparency in complex models that are opaque by nature. Despite the benefits of incorporating explanations in models, an urgent need is found in addressing the privacy concerns of providing this additional information to end users. In this article, we conduct a scoping review of existing literature to elicit details on the conflict between privacy and explainability. Using the standard methodology for scoping review, we extracted 57 articles from 1,943 studies published from January 2019 to December 2024. The review addresses 3 research questions to present readers with more understanding of the topic: (1) what are the privacy risks of releasing explanations in AI systems? (2) what current methods have researchers employed to achieve privacy preservation in XAI systems? (3) what constitutes a privacy preserving explanation? Based on the knowledge synthesized from the selected studies, we categorize the privacy risks and preservation methods in XAI and propose the characteristics of privacy preserving explanations to aid researchers and practitioners in understanding the requirements of XAI that is privacy compliant. Lastly, we identify the challenges in balancing privacy with other system desiderata and provide recommendations for achieving privacy preserving XAI. We expect that this review will shed light on the complex relationship of privacy and explainability, both being the fundamental principles of Trustworthy AI.

#48 Can VLMs Detect and Localize Fine-Grained AI-Edited Images?

著者: Zhen Sun, Ziyi Zhang, Zeren Luo, Zhiyuan Zhong, Zeyang Sha, Tianshuo Cong, Zheng Li, Shiwen Cui, Weiqiang Wang, Jiaheng Wei, Xinlei He, Qi Li, Qian Wang

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.15644

要約:
Fine-grained detection and localization of localized image edits is crucial for assessing content authenticity, especially as modern diffusion models and image editors can produce highly realistic manipulations. However, this problem faces three key challenges: (1) most AIGC detectors produce only a global real-or-fake label without indicating where edits occur; (2) traditional computer vision methods for edit localization typically rely on costly pixel-level annotations; and (3) there is no large-scale, modern benchmark specifically targeting edited-image detection. To address these gaps, we develop an automated data-generation pipeline and construct FragFake, a large-scale benchmark of AI-edited images spanning multiple source datasets, diverse editing models, and several common edit types. Building on FragFake, we are the first to systematically study vision language models (VLMs) for edited-image classification and edited-region localization. Our experiments show that pretrained VLMs, including GPT4o, perform poorly on this task, whereas fine-tuned models such as Qwen2.5-VL achieve high accuracy and substantially higher object precision across all settings. We further explore GRPO-based RLVR training, which yields modest metric gains while improving the interpretability of model outputs. Ablation and transfer analyses reveal how data balancing, training size, LoRA rank, and training domain affect performance, and highlight both the potential and the limitations of cross-editor and cross-dataset generalization. We anticipate that this work will establish a solid foundation to facilitate and inspire subsequent research endeavors in the domain of multimodal content authenticity.

#49 Exact Coset Sampling for Quantum Lattice Algorithms

著者: Yifan Zhang

公開日: Thu, 04 Dec 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.12341

要約:
In this work, we give a new completion of Chen's windowed-QFT lattice algorithm~\citep{chen2024quantum}. This extra step, called Step~$9^\dagger$, replaces the domain extension stage in Steps~8--9. The published Step~9 calls an amplitude periodicity lemma, yet its hypotheses break in the presence of affine offsets $\boldsymbol{v}^*$. Our analysis finds a basic conflict between two design constraints. The lattice problem asks for high spectral resolution, so the method prefers wide time windows. The quadratic phase error of the state prefers narrow time windows. Assumption~A5 packages the spectral concentration and near-uniformity properties that we require from the front end. Under~A5, a direct $\mathbb{Z}_M^n$ Fourier transform of the chirp-corrected coordinate state produces samples $\boldsymbol{u}$ that satisfy $\langle \boldsymbol{b}, \boldsymbol{u} \rangle \equiv 0 \pmod{Q}$ with probability $1-\mathrm{negl}(n)$ and are nearly uniform on the dual hyperplane $\{\boldsymbol{u} : \langle \boldsymbol{b}, \boldsymbol{u} \rangle \equiv 0 \pmod{Q}\}$. The new procedure does not require internal access to control wires. It uses the normalization $b_1=-1$ to apply a center-referenced phase correction directly on the first coordinate register. The scaling parameter $D$ ensures that this physical operation can be implemented by arithmetic on $X_1$ alone and does not read the hidden loop index. For Chen's complex-Gaussian Karst-wave window, we isolate a parameter regime, formalized in Assumption~A5, in which a polynomial retuning of the parameters gives a one-dimensional envelope for the loop index with width $\sigma_J \asymp Q\log n$.

cs.CR updates on arXiv.org

📋 論文タイトル一覧