arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Diffusion-Based Image Editing: An Unforeseen Adversary to Robust Invisible Watermarks

intellectual propertydiffusion

著者: Wenkai Fu, Finn Carter, Yue Wang, Emily Davis, Bo Zhang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05598

要約:
Robust invisible watermarking aims to embed hidden messages into images such that they survive various manipulations while remaining imperceptible. However, powerful diffusion-based image generation and editing models now enable realistic content-preserving transformations that can inadvertently remove or distort embedded watermarks. In this paper, we present a theoretical and empirical analysis demonstrating that diffusion-based image editing can effectively break state-of-the-art robust watermarks designed to withstand conventional distortions. We analyze how the iterative noising and denoising process of diffusion models degrades embedded watermark signals, and provide formal proofs that under certain conditions a diffusion model's regenerated image retains virtually no detectable watermark information. Building on this insight, we propose a diffusion-driven attack that uses generative image regeneration to erase watermarks from a given image. Furthermore, we introduce an enhanced \emph{guided diffusion} attack that explicitly targets the watermark during generation by integrating the watermark decoder into the sampling loop. We evaluate our approaches on multiple recent deep learning watermarking schemes (e.g., StegaStamp, TrustMark, and VINE) and demonstrate that diffusion-based editing can reduce watermark decoding accuracy to near-zero levels while preserving high visual fidelity of the images. Our findings reveal a fundamental vulnerability in current robust watermarking techniques against generative model-based edits, underscoring the need for new watermarking strategies in the era of generative AI.

#2 Securing UAV Communications by Fusing Cross-Layer Fingerprints

著者: Yong Huang, Ruihao Li, Mingyang Chen, Feiyang Zhao, Dalong Zhang, Wanqing Tu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05796

要約:
The open nature of wireless communications renders unmanned aerial vehicle (UAV) communications vulnerable to impersonation attacks, under which malicious UAVs can impersonate authorized ones with stolen digital certificates. Traditional fingerprint-based UAV authentication approaches rely on a single modality of sensory data gathered from a single layer of the network model, resulting in unreliable authentication experiences, particularly when UAVs are mobile and in an open-world environment. To transcend these limitations, this paper proposes SecureLink, a UAV authentication system that is among the first to employ cross-layer information for enhancing the efficiency and reliability of UAV authentication. Instead of using single modalities, SecureLink fuses physical-layer radio frequency (RF) fingerprints and application-layer micro-electromechanical system (MEMS) fingerprints into reliable UAV identifiers via multimodal fusion. SecureLink first aligns fingerprints from channel state information measurements and telemetry data, such as feedback readings of onboard accelerometers, gyroscopes, and barometers. Then, an attention-based neural network is devised for in-depth feature fusion. Next, the fused features are trained by a multi-similarity loss and fed into a one-class support vector machine for open-world authentication. We extensively implement our SecureLink using three different types of UAVs and evaluate it in different environments. With only six additional data frames, SecureLink achieves a closed-world accuracy of 98.61% and an open-world accuracy of 97.54% with two impersonating UAVs, outperforming the existing approaches in authentication robustness and communication overheads. Finally, our datasets collected from these experiments are available on GitHub: https://github.com/PhyGroup/SecureLink\_data.

#3 When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins

著者: Yigitcan Kaya, Anton Landerer, Stijn Pletinckx, Michelle Zimmermann, Christopher Kruegel, Giovanni Vigna

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05797

要約:
Prompt injection attacks pose a critical threat to large language models (LLMs), with prior work focusing on cutting-edge LLM applications like personal copilots. In contrast, simpler LLM applications, such as customer service chatbots, are widespread on the web, yet their security posture and exposure to such attacks remain poorly understood. These applications often rely on third-party chatbot plugins that act as intermediaries to commercial LLM APIs, offering non-expert website builders intuitive ways to customize chatbot behaviors. To bridge this gap, we present the first large-scale study of 17 third-party chatbot plugins used by over 10,000 public websites, uncovering previously unknown prompt injection risks in practice. First, 8 of these plugins (used by 8,000 websites) fail to enforce the integrity of the conversation history transmitted in network requests between the website visitor and the chatbot. This oversight amplifies the impact of direct prompt injection attacks by allowing adversaries to forge conversation histories (including fake system messages), boosting their ability to elicit unintended behavior (e.g., code generation) by 3 to 8x. Second, 15 plugins offer tools, such as web-scraping, to enrich the chatbot's context with website-specific content. However, these tools do not distinguish the website's trusted content (e.g., product descriptions) from untrusted, third-party content (e.g., customer reviews), introducing a risk of indirect prompt injection. Notably, we found that ~13% of e-commerce websites have already exposed their chatbots to third-party content. We systematically evaluate both vulnerabilities through controlled experiments grounded in real-world observations, focusing on factors such as system prompt design and the underlying LLM. Our findings show that many plugins adopt insecure practices that undermine the built-in LLM safeguards.

#4 IndirectAD: Practical Data Poisoning Attacks against Recommender Systems for Item Promotion

backdoor

著者: Zihao Wang, Tianhao Mao, XiaoFeng Wang, Di Tang, Xiaozhong Liu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05845

要約:
Recommender systems play a central role in digital platforms by providing personalized content. They often use methods such as collaborative filtering and machine learning to accurately predict user preferences. Although these systems offer substantial benefits, they are vulnerable to security and privacy threats, especially data poisoning attacks. By inserting misleading data, attackers can manipulate recommendations for purposes ranging from boosting product visibility to shaping public opinion. Despite these risks, concerns are often downplayed because such attacks typically require controlling at least 1% of the platform's user base, a difficult task on large platforms. We tackle this issue by introducing the IndirectAD attack, inspired by Trojan attacks on machine learning. IndirectAD reduces the need for a high poisoning ratio through a trigger item that is easier to recommend to the target users. Rather than directly promoting a target item that does not match a user's interests, IndirectAD first promotes the trigger item, then transfers that advantage to the target item by creating co-occurrence data between them. This indirect strategy delivers a stronger promotion effect while using fewer controlled user accounts. Our extensive experiments on multiple datasets and recommender systems show that IndirectAD can cause noticeable impact with only 0.05% of the platform's user base. Even in large-scale settings, IndirectAD remains effective, highlighting a more serious and realistic threat to today's recommender systems.

#5 MCP-RiskCue: Can LLM infer risk information from MCP server System Logs?

著者: Jiayi Fu, Qiyao Sun

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05867

要約:
Large language models (LLMs) demonstrate strong capabilities in solving complex tasks when integrated with external tools. The Model Context Protocol (MCP) has become a standard interface for enabling such tool-based interactions. However, these interactions introduce substantial security concerns, particularly when the MCP server is compromised or untrustworthy. While prior benchmarks primarily focus on prompt injection attacks or analyze the vulnerabilities of LLM MCP interaction trajectories, limited attention has been given to the underlying system logs associated with malicious MCP servers. To address this gap, we present the first synthetic benchmark for evaluating LLMs ability to identify security risks from system logs. We define nine categories of MCP server risks and generate 1,800 synthetic system logs using ten state-of-the-art LLMs. These logs are embedded in the return values of 243 curated MCP servers, yielding a dataset of 2,421 chat histories for training and 471 queries for evaluation. Our pilot experiments reveal that smaller models often fail to detect risky system logs, leading to high false negatives. While models trained with supervised fine-tuning (SFT) tend to over-flag benign logs, resulting in elevated false positives, Reinforcement Learning from Verifiable Reward (RLVR) offers a better precision-recall balance. In particular, after training with Group Relative Policy Optimization (GRPO), Llama3.1-8B-Instruct achieves 83% accuracy, surpassing the best-performing large remote model by 9 percentage points. Fine-grained, per-category analysis further underscores the effectiveness of reinforcement learning in enhancing LLM safety within the MCP framework. Code and data are available at: https://github.com/PorUna-byte/MCP-Guard/tree/master

#6 Injecting Falsehoods: Adversarial Man-in-the-Middle Attacks Undermining Factual Recall in LLMs

著者: Alina Fastowski, Bardh Prenkaj, Yuxiao Li, Gjergji Kasneci

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05919

要約:
LLMs are now an integral part of information retrieval. As such, their role as question answering chatbots raises significant concerns due to their shown vulnerability to adversarial man-in-the-middle (MitM) attacks. Here, we propose the first principled attack evaluation on LLM factual memory under prompt injection via Xmera, our novel, theory-grounded MitM framework. By perturbing the input given to "victim" LLMs in three closed-book and fact-based QA settings, we undermine the correctness of the responses and assess the uncertainty of their generation process. Surprisingly, trivial instruction-based attacks report the highest success rate (up to ~85.3%) while simultaneously having a high uncertainty for incorrectly answered questions. To provide a simple defense mechanism against Xmera, we train Random Forest classifiers on the response uncertainty levels to distinguish between attacked and unattacked queries (average AUC of up to ~96%). We believe that signaling users to be cautious about the answers they receive from black-box and potentially corrupt LLMs is a first checkpoint toward user cyberspace safety.

#7 Cryptographic Binding Should Not Be Optional: A Formal-Methods Analysis of FIDO UAF Channel Binding

著者: Enis Golaszewski, Alan T. Sherman, Edward Zieglar, Jonathan D. Fuchs, Sophia Hamer

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06028

要約:
As a case study in cryptographic binding, we present a formal-methods analysis of the cryptographic channel binding mechanisms in the Fast IDentity Online (FIDO) Universal Authentication Framework (UAF) authentication protocol, which seeks to reduce the use of traditional passwords in favor of authentication devices. First, we show that UAF's channel bindings fail to mitigate protocol interaction by a Dolev-Yao adversary, enabling the adversary to transfer the server's authentication challenge to alternate sessions of the protocol. As a result, in some contexts, the adversary can masquerade as a client and establish an authenticated session with a server (e.g., possibly a bank server). Second, we implement a proof-of-concept man-in-the-middle attack against eBay's open source FIDO UAF implementation. Third, we propose and formally verify improvements to UAF. The weakness we analyze is similar to the vulnerability discovered in the Needham-Schroeder protocol over 25 years ago. That this vulnerability appears in the FIDO UAF standard highlights the strong need for protocol designers to bind messages properly and to analyze their designs with formal-methods tools. To our knowledge, we are first to carry out a formal-methods analysis of channel binding in UAF and first to exhibit details of an attack on UAF that exploits the weaknesses of UAF's channel binding. Our case study illustrates the importance of cryptographically binding context to protocol messages to prevent an adversary from misusing messages out of context.

#8 Identity Card Presentation Attack Detection: A Systematic Review

著者: Esteban M. Ruiz, Juan E. Tapia, Reinel T. Soto, Christoph Busch

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06056

要約:
Remote identity verification is essential for modern digital security; however, it remains highly vulnerable to sophisticated Presentation Attacks (PAs) that utilise forged or manipulated identity documents. Although Deep Learning (DL) has driven advances in Presentation Attack Detection (PAD), the field is fundamentally limited by a lack of data and the poor generalisation of models across various document types and new attack methods. This article presents a systematic literature review (SLR) conducted in accordance with the PRISMA methodology, aiming to analyse and synthesise the current state of AI-based PAD for identity documents from 2020 to 2025 comprehensively. Our analysis reveals a significant methodological evolution: a transition from standard Convolutional Neural Networks (CNNs) to specialised forensic micro-artefact analysis, and more recently, the adoption of large-scale Foundation Models (FMs), marking a substantial shift in the field. We identify a central paradox that hinders progress: a critical "Reality Gap" exists between models validated on extensive, private datasets and those assessed using limited public datasets, which typically consist of mock-ups or synthetic data. This gap limits the reproducibility of research results. Additionally, we highlight a "Synthetic Utility Gap," where synthetic data generation the primary academic response to data scarcity often fails to predict forensic utility. This can lead to model overfitting to generation artefacts instead of the actual attack. This review consolidates our findings, identifies critical research gaps, and provides a definitive reference framework that outlines a prescriptive roadmap for future research aimed at developing secure, robust, and globally generalizable PAD systems.

#9 A Privacy-Preserving Federated Learning Method with Homomorphic Encryption in Omics Data

privacy

著者: Yusaku Negoya, Feifei Cui, Zilong Zhang, Miao Pan, Tomoaki Ohtsuki, Aohan Li

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06064

要約:
Omics data is widely employed in medical research to identify disease mechanisms and contains highly sensitive personal information. Federated Learning (FL) with Differential Privacy (DP) can ensure the protection of omics data privacy against malicious user attacks. However, FL with the DP method faces an inherent trade-off: stronger privacy protection degrades predictive accuracy due to injected noise. On the other hand, Homomorphic Encryption (HE) allows computations on encrypted data and enables aggregation of encrypted gradients without DP-induced noise can increase the predictive accuracy. However, it may increase the computation cost. To improve the predictive accuracy while considering the computational ability of heterogeneous clients, we propose a Privacy-Preserving Machine Learning (PPML)-Hybrid method by introducing HE. In the proposed PPML-Hybrid method, clients distributed select either HE or DP based on their computational resources, so that HE clients contribute noise-free updates while DP clients reduce computational overhead. Meanwhile, clients with high computational resources clients can flexibly adopt HE or DP according to their privacy needs. Performance evaluation on omics datasets show that our proposed method achieves comparable predictive accuracy while significantly reducing computation time relative to HE-only. Additionally, it outperforms DP-only methods under equivalent or stricter privacy budgets.

#10 PraxiMLP: A Threshold-based Framework for Efficient Three-Party MLP with Practical Security

著者: Tianle Tao, Shizhao Peng, Haogang Zhu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06104

要約:
Efficiency and communication cost remain critical bottlenecks for practical Privacy-Preserving Machine Learning (PPML). Most existing frameworks rely on fixed-point arithmetic for strong security, which introduces significant precision loss and requires expensive cross-domain conversions (e.g., Arithmetic-to-Boolean) for non-linear operations. To address this, we propose PraxiMLP, a highly efficient three-party MLP framework grounded in practical security. The core of our work is a pair of novel additive-to-multiplicative conversion protocols that operate entirely within the arithmetic domain, thus avoiding expensive cross-domain conversions. By natively supporting loating-point numbers, PraxiMLP precisely handles non-linear functions, dramatically improving both efficiency and precision. Experimental results confirm that, compared to mainstream PPML frameworks, PraxiMLP delivers an average 8 orders of magnitude precision improvement on basic protocols and a 5x average model training speedup in a WAN environment.

#11 Reliablocks: Developing Reliability Scores for Optimistic Rollups

著者: Souradeep Das, Ethan Lam, Varun Vaidya, Sanjay Amirthraj

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06130

要約:
Introducing Reliablocks, an on-chain reliability index for non-finalized blocks in Optimistic Rollups. This was built during the EigenLayer Infinite Hackathon at the Infinite Hacker House at DevCon 2024. As part of this research, we delivered a working Layer AVS WASMI component, a working Eigen Layer AVS component, EigenLayer Solidity smart contracts that work with the AVS component, a UI dashboard illustrating the reliability score and a derived interest rate for further utilization.

#12 SoK: Systematizing a Decade of Architectural RowHammer Defenses Through the Lens of Streaming Algorithms

著者: Michael Jaemin Kim, Seungmin Baek, Jumin Kim, Hwayong Nam, Nam Sung Kim, Jung Ho Ahn

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06192

要約:
A decade after its academic introduction, RowHammer (RH) remains a moving target that continues to challenge both the industry and academia. With its potential to serve as a critical attack vector, the ever-decreasing RH threshold now threatens DRAM process technology scaling, with a superlinearly increasing cost of RH protection solutions. Due to their generality and relatively lower performance costs, architectural RH solutions are the first line of defense against RH. However, the field is fragmented with varying views of the problem, terminologies, and even threat models. In this paper, we systematize architectural RH defenses from the last decade through the lens of streaming algorithms. We provide a taxonomy that encompasses 48 different works. We map multiple architectural RH defenses to the classical streaming algorithms, which extends to multiple proposals that did not identify this link. We also provide two practitioner guides. The first guide analyzes which algorithm best fits a given RHTH, location, process technology, storage type, and mitigative action. The second guide encourages future research to consult existing algorithms when architecting RH defenses. We illustrate this by demonstrating how Reservoir-Sampling can improve related RH defenses, and also introduce StickySampling that can provide mathematical security that related studies do not guarantee.

#13 Enhancing Adversarial Robustness of IoT Intrusion Detection via SHAP-Based Attribution Fingerprinting

著者: Dilli Prasad Sharma, Liang Xue, Xiaowei Sun, Xiaodong Lin, Pulei Xiong

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06197

要約:
The rapid proliferation of Internet of Things (IoT) devices has transformed numerous industries by enabling seamless connectivity and data-driven automation. However, this expansion has also exposed IoT networks to increasingly sophisticated security threats, including adversarial attacks targeting artificial intelligence (AI) and machine learning (ML)-based intrusion detection systems (IDS) to deliberately evade detection, induce misclassification, and systematically undermine the reliability and integrity of security defenses. To address these challenges, we propose a novel adversarial detection model that enhances the robustness of IoT IDS against adversarial attacks through SHapley Additive exPlanations (SHAP)-based fingerprinting. Using SHAP's DeepExplainer, we extract attribution fingerprints from network traffic features, enabling the IDS to reliably distinguish between clean and adversarially perturbed inputs. By capturing subtle attribution patterns, the model becomes more resilient to evasion attempts and adversarial manipulations. We evaluated the model on a standard IoT benchmark dataset, where it significantly outperformed a state-of-the-art method in detecting adversarial attacks. In addition to enhanced robustness, this approach improves model transparency and interpretability, thereby increasing trust in the IDS through explainable AI.

#14 RAG-targeted Adversarial Attack on LLM-based Threat Detection and Mitigation Framework

著者: Seif Ikbarieh, Kshitiz Aryal, Maanak Gupta

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06212

要約:
The rapid expansion of the Internet of Things (IoT) is reshaping communication and operational practices across industries, but it also broadens the attack surface and increases susceptibility to security breaches. Artificial Intelligence has become a valuable solution in securing IoT networks, with Large Language Models (LLMs) enabling automated attack behavior analysis and mitigation suggestion in Network Intrusion Detection Systems (NIDS). Despite advancements, the use of LLMs in such systems further expands the attack surface, putting entire networks at risk by introducing vulnerabilities such as prompt injection and data poisoning. In this work, we attack an LLM-based IoT attack analysis and mitigation framework to test its adversarial robustness. We construct an attack description dataset and use it in a targeted data poisoning attack that applies word-level, meaning-preserving perturbations to corrupt the Retrieval-Augmented Generation (RAG) knowledge base of the framework. We then compare pre-attack and post-attack mitigation responses from the target model, ChatGPT-5 Thinking, to measure the impact of the attack on model performance, using an established evaluation rubric designed for human experts and judge LLMs. Our results show that small perturbations degrade LLM performance by weakening the linkage between observed network traffic features and attack behavior, and by reducing the specificity and practicality of recommended mitigations for resource-constrained devices.

#15 HYDRA: A Hybrid Heuristic-Guided Deep Representation Architecture for Predicting Latent Zero-Day Vulnerabilities in Patched Functions

著者: Mohammad Farhad, Sabbir Rahman, Shuvalaxmi Dass

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06220

要約:
Software security testing, particularly when enhanced with deep learning models, has become a powerful approach for improving software quality, enabling faster detection of known flaws in source code. However, many approaches miss post-fix latent vulnerabilities that remain even after patches typically due to incomplete fixes or overlooked issues may later lead to zero-day exploits. In this paper, we propose $HYDRA$, a $Hy$brid heuristic-guided $D$eep $R$epresentation $A$rchitecture for predicting latent zero-day vulnerabilities in patched functions that combines rule-based heuristics with deep representation learning to detect latent risky code patterns that may persist after patches. It integrates static vulnerability rules, GraphCodeBERT embeddings, and a Variational Autoencoder (VAE) to uncover anomalies often missed by symbolic or neural models alone. We evaluate HYDRA in an unsupervised setting on patched functions from three diverse real-world software projects: Chrome, Android, and ImageMagick. Our results show HYDRA predicts 13.7%, 20.6%, and 24% of functions from Chrome, Android, and ImageMagick respectively as containing latent risks, including both heuristic matches and cases without heuristic matches ($None$) that may lead to zero-day vulnerabilities. It outperforms baseline models that rely solely on regex-derived features or their combination with embeddings, uncovering truly risky code variants that largely align with known heuristic patterns. These results demonstrate HYDRA's capability to surface hidden, previously undetected risks, advancing software security validation and supporting proactive zero-day vulnerabilities discovery.

#16 Setting $\varepsilon$ is not the Issue in Differential Privacy

privacy

著者: Edwige Cyffers

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06305

要約:
This position paper argues that setting the privacy budget in differential privacy should not be viewed as an important limitation of differential privacy compared to alternative methods for privacy-preserving machine learning. The so-called problem of interpreting the privacy budget is often presented as a major hindrance to the wider adoption of differential privacy in real-world deployments and is sometimes used to promote alternative mitigation techniques for data protection. We believe this misleads decision-makers into choosing unsafe methods. We argue that the difficulty in interpreting privacy budgets does not stem from the definition of differential privacy itself, but from the intrinsic difficulty of estimating privacy risks in context, a challenge that any rigorous method for privacy risk assessment face. Moreover, we claim that any sound method for estimating privacy risks should, given the current state of research, be expressible within the differential privacy framework or justify why it cannot.

#17 Enhancing Deep Learning-Based Rotational-XOR Attacks on Lightweight Block Ciphers Simon32/64 and Simeck32/64

著者: Chengcai Liu, Siwei Chen, Zejun Xiang, Shasha Zhang, Xiangyong Zeng

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06336

要約:
At CRYPTO 2019, Gohr pioneered neural cryptanalysis by introducing differential-based neural distinguishers to attack Speck32/64, establishing a novel paradigm combining deep learning with differential cryptanalysis.Since then, constructing neural distinguishers has become a significant approach to achieving the deep learning-based cryptanalysis for block ciphers.This paper advances rotational-XOR (RX) attacks through neural networks, focusing on optimizing distinguishers and presenting key-recovery attacks for the lightweight block ciphers Simon32/64 and Simeck32/64.In particular, we first construct the fundamental data formats specially designed for training RX-neural distinguishers by refining the existing data formats for differential-neural distinguishers. Based on these data formats, we systematically identify optimal RX-differences with Hamming weights 1 and 2 that develop high-accuracy RX-neural distinguishers. Then, through innovative application of the bit sensitivity test, we achieve significant compression of data format without sacrificing the distinguisher accuracy. This optimization enables us to add more multi-ciphertext pairs into the data formats, further strengthening the performance of RX-neural distinguishers. As an application, we obtain 14- and 17-round RX-neural distinguishers for Simon32/64 and Simeck32/64, which improves the previous ones by 3 and 2 rounds, respectively.In addition, we propose two novel techniques, key bit sensitivity test and the joint wrong key response, to tackle the challenge of applying Bayesian's key-recovery strategy to the target cipher that adopts nonlinear key schedule in the related-key setting without considering of weak-key space. By this, we can straightforwardly mount a 17-round key-recovery attack on Simeck32/64 based on the improved 16-round RX-nerual distinguisher. To the best of our knowledge, the presented RX-neural......

#18 Ghost in the Transformer: Tracing LLM Lineage with SVD-Fingerprint

著者: Suqing Wang, Ziyang Ma, Xinyi Li, Zuchao Li

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06390

要約:
Large Language Models (LLMs) have rapidly advanced and are widely adopted across diverse fields. Due to the substantial computational cost and data requirements of training from scratch, many developers choose to fine-tune or modify existing open-source models. While most adhere to open-source licenses, some falsely claim original training despite clear derivation from public models. This raises pressing concerns about intellectual property protection and highlights the need for reliable methods to verify model provenance. In this paper, we propose GhostSpec, a lightweight yet effective method for verifying LLM lineage without access to training data or modification of model behavior. Our approach constructs compact and robust fingerprints by applying singular value decomposition (SVD) to invariant products of internal attention weight matrices, effectively capturing the structural identity of a model. Unlike watermarking or output-based methods, GhostSpec is fully data-free, non-invasive, and computationally efficient. It demonstrates strong robustness to sequential fine-tuning, pruning, block expansion, and even adversarial transformations. Extensive experiments show that GhostSpec can reliably trace the lineage of transformed models with minimal overhead. By offering a practical solution for model verification and reuse tracking, our method contributes to the protection of intellectual property and fosters a transparent, trustworthy ecosystem for large-scale language models.

#19 Inside LockBit: Technical, Behavioral, and Financial Anatomy of a Ransomware Empire

著者: Felipe Casta\~no, Constantinos Patsakis, Francesco Zola, Fran Casino

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06429

要約:
LockBit has evolved from an obscure Ransomware-as-a-Service newcomer in 2019 to the most prolific ransomware franchise of 2024. Leveraging a recently leaked MySQL dump of the gang's management panel, this study offers an end-to-end reconstruction of LockBit's technical, behavioral, and financial apparatus. We recall the family's version timeline and map its tactics, techniques, and procedures to MITRE ATT&CK, highlighting the incremental hardening that distinguishes LockBit 3.0 from its predecessors. We then analyze 51 negotiation chat logs using natural-language embeddings and clustering to infer a canonical interaction playbook, revealing recurrent rhetorical stages that underpin the double-extortion strategy. Finally, we trace 19 Bitcoin addresses related to ransom payment chains, revealing two distinct patterns based on different laundering phases. In both cases, a small portion of the ransom is immediately split into long-lived addresses (presumably retained by the group as profit and to finance further operations) while the remainder is ultimately aggregated into two high-volume addresses before likely being sent to the affiliate. These two collector addresses appear to belong to distinct exchanges, each processing over 200k BTC. The combined evidence portrays LockBit as a tightly integrated criminal service whose resilience rests on rapid code iteration, script-driven social engineering, and industrial-scale cash-out pipelines.

#20 EASE: Practical and Efficient Safety Alignment for Small Language Models

著者: Haonan Shi, Guoli Wang, Tu Ouyang, An Wang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06512

要約:
Small language models (SLMs) are increasingly deployed on edge devices, making their safety alignment crucial yet challenging. Current shallow alignment methods that rely on direct refusal of malicious queries fail to provide robust protection, particularly against adversarial jailbreaks. While deliberative safety reasoning alignment offers deeper alignment for defending against sophisticated attacks, effectively implanting such reasoning capability in SLMs with limited capabilities remains an open challenge. Moreover, safety reasoning incurs significant computational overhead as models apply reasoning to nearly all queries, making it impractical for resource-constrained edge deployment scenarios that demand rapid responses. We propose EASE, a novel framework that enables practical and Efficient safety Alignment for Small languagE models. Our approach first identifies the optimal safety reasoning teacher that can effectively distill safety reasoning capabilities to SLMs. We then align models to selectively activate safety reasoning for dangerous adversarial jailbreak queries while providing direct responses to straightforward malicious queries and general helpful tasks. This selective mechanism enables small models to maintain robust safety guarantees against sophisticated attacks while preserving computational efficiency for benign interactions. Experimental results demonstrate that EASE reduces jailbreak attack success rates by up to 17% compared to shallow alignment methods while reducing inference overhead by up to 90% compared to deliberative safety reasoning alignment, making it practical for SLMs real-world edge deployments.

#21 CYPRESS: Transferring Secrets in the Shadow of Visible Packets

著者: Sirus Shahini, Robert Ricci

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06540

要約:
Network steganography and covert communication channels have been studied extensively in the past. However, prior works offer minimal practical use for their proposed techniques and are limited to specific use cases and network protocols. In this paper, we show that covert channels in networking have a much greater potential for practical secret communication than what has been discussed before. We present a covert channel framework, CYPRESS, that creates a reliable hidden communication channel by mounting packets from secret network entities on regular packets that flow through the network, effectively transmitting a separate network traffic without generating new packets for it. CYPRESS establishes a consolidated decentralized framework in which different covert channels for various protocols are defined with their custom handler code that are plugged into the system and updated on-demand to evade detection. CYPRESS then chooses at run-time how and in what order the covert channels should be used for fragmentation and hidden transmission of data. We can reach up to 1.6MB/s of secret bandwidth in a network of ten users connected to the Internet. We demonstrate the robustness and reliability of our approach in secret communication through various security-sensitive real-world experiments. Our evaluations show that network protocols provide a notable opportunity for unconventional storage and hidden transmission of data to bypass different types of security measures and to hide the source of various cyber attacks.

#22 SteganoSNN: SNN-Based Audio-in-Image Steganography with Encryption

著者: Biswajit Kumar Sahoo, Pedro Machado, Isibor Kennedy Ihianle, Andreas Oikonomou, Srinivas Boppu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06573

要約:
Secure data hiding remains a fundamental challenge in digital communication, requiring a careful balance between computational efficiency and perceptual transparency. The balance between security and performance is increasingly fragile with the emergence of generative AI systems capable of autonomously generating and optimising sophisticated cryptanalysis and steganalysis algorithms, thereby accelerating the exposure of vulnerabilities in conventional data-hiding schemes. This work introduces SteganoSNN, a neuromorphic steganographic framework that exploits spiking neural networks (SNNs) to achieve secure, low-power, and high-capacity multimedia data hiding. Digitised audio samples are converted into spike trains using leaky integrate-and-fire (LIF) neurons, encrypted via a modulo-based mapping scheme, and embedded into the least significant bits of RGBA image channels using a dithering mechanism to minimise perceptual distortion. Implemented in Python using NEST and realised on a PYNQ-Z2 FPGA, SteganoSNN attains real-time operation with an embedding capacity of 8 bits per pixel. Experimental evaluations on the DIV2K 2017 dataset demonstrate image fidelity between 40.4 dB and 41.35 dB in PSNR and SSIM values consistently above 0.97, surpassing SteganoGAN in computational efficiency and robustness. SteganoSNN establishes a foundation for neuromorphic steganography, enabling secure, energy-efficient communication for Edge-AI, IoT, and biomedical applications.

#23 Secure Low-altitude Maritime Communications via Intelligent Jamming

著者: Jiawei Huang, Aimin Wang, Geng Sun, Jiahui Li, Jiacheng Wang, Weijie Yuan, Dusit Niyato, Xianbin Wang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06659

要約:
Low-altitude wireless networks (LAWNs) have emerged as a viable solution for maritime communications. In these maritime LAWNs, unmanned aerial vehicles (UAVs) serve as practical low-altitude platforms for wireless communications due to their flexibility and ease of deployment. However, the open and clear UAV communication channels make maritime LAWNs vulnerable to eavesdropping attacks. Existing security approaches often assume eavesdroppers follow predefined trajectories, which fails to capture the dynamic movement patterns of eavesdroppers in realistic maritime environments. To address this challenge, we consider a low-altitude maritime communication system that employs intelligent jamming to counter dynamic eavesdroppers with uncertain positioning to enhance the physical layer security. Since such a system requires balancing the conflicting performance metrics of the secrecy rate and energy consumption of UAVs, we formulate a secure and energy-efficient maritime communication multi-objective optimization problem (SEMCMOP). To solve this dynamic and long-term optimization problem, we first reformulate it as a partially observable Markov decision process (POMDP). We then propose a novel soft actor-critic with conditional variational autoencoder (SAC-CVAE) algorithm, which is a deep reinforcement learning algorithm improved by generative artificial intelligence. Specifically, the SAC-CVAE algorithm employs advantage-conditioned latent representations to disentangle and optimize policies, while enhancing computational efficiency by reducing the state space dimension. Simulation results demonstrate that our proposed intelligent jamming approach achieves secure and energy-efficient maritime communications.

#24 Adversarial Node Placement in Decentralized Federated Learning: Maximum Spanning-Centrality Strategy and Performance Analysis

著者: Adam Piaseczny, Eric Ruzomberka, Rohit Parasnis, Christopher G. Brinton

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06742

要約:
As Federated Learning (FL) becomes more widespread, there is growing interest in its decentralized variants. Decentralized FL leverages the benefits of fast and energy-efficient device-to-device communications to obviate the need for a central server. However, this opens the door to new security vulnerabilities as well. While FL security has been a popular research topic, the role of adversarial node placement in decentralized FL remains largely unexplored. This paper addresses this gap by evaluating the impact of various coordinated adversarial node placement strategies on decentralized FL's model training performance. We adapt two threads of placement strategies to this context: maximum span-based algorithms, and network centrality-based approaches. Building on them, we propose a novel attack strategy, MaxSpAN-FL, which is a hybrid between these paradigms that adjusts node placement probabilistically based on network topology characteristics. Numerical experiments demonstrate that our attack consistently induces the largest degradation in decentralized FL models compared with baseline schemes across various network configurations and numbers of coordinating adversaries. We also provide theoretical support for why eigenvector centrality-based attacks are suboptimal in decentralized FL. Overall, our findings provide valuable insights into the vulnerabilities of decentralized FL systems, setting the stage for future research aimed at developing more secure and robust decentralized FL frameworks.

#25 Differentiated Directional Intervention A Framework for Evading LLM Safety Alignment

著者: Peng Zhang, peijie sun

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06852

要約:
Safety alignment instills in Large Language Models (LLMs) a critical capacity to refuse malicious requests. Prior works have modeled this refusal mechanism as a single linear direction in the activation space. We posit that this is an oversimplification that conflates two functionally distinct neural processes: the detection of harm and the execution of a refusal. In this work, we deconstruct this single representation into a Harm Detection Direction and a Refusal Execution Direction. Leveraging this fine-grained model, we introduce Differentiated Bi-Directional Intervention (DBDI), a new white-box framework that precisely neutralizes the safety alignment at critical layer. DBDI applies adaptive projection nullification to the refusal execution direction while suppressing the harm detection direction via direct steering. Extensive experiments demonstrate that DBDI outperforms prominent jailbreaking methods, achieving up to a 97.88\% attack success rate on models such as Llama-2. By providing a more granular and mechanistic framework, our work offers a new direction for the in-depth understanding of LLM safety alignment.

#26 Nearly-Optimal Private Selection via Gaussian Mechanism

privacy

著者: Ethan Leeman, Pasin Manurangsi

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06871

要約:
Steinke (2025) recently asked the following intriguing open question: Can we solve the differentially private selection problem with nearly-optimal error by only (adaptively) invoking Gaussian mechanism on low-sensitivity queries? We resolve this question positively. In particular, for a candidate set $\mathcal{Y}$, we achieve error guarantee of $\tilde{O}(\log |\mathcal{Y}|)$, which is within a factor of $(\log \log |\mathcal{Y}|)^{O(1)}$ of the exponential mechanism (McSherry and Talwar, 2007). This improves on Steinke's mechanism which achieves an error of $O(\log^{3/2} |\mathcal{Y}|)$.

#27 Uncovering Pretraining Code in LLMs: A Syntax-Aware Attribution Approach

著者: Yuanheng Li, Zhuoyang Chen, Xiaoyun Liu, Yuhao Wang, Mingwei Liu, Yang Shi, Kaifeng Huang, Shengjie Zhao

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07033

要約:
As large language models (LLMs) become increasingly capable, concerns over the unauthorized use of copyrighted and licensed content in their training data have grown, especially in the context of code. Open-source code, often protected by open source licenses (e.g, GPL), poses legal and ethical challenges when used in pretraining. Detecting whether specific code samples were included in LLM training data is thus critical for transparency, accountability, and copyright compliance. We propose SynPrune, a syntax-pruned membership inference attack method tailored for code. Unlike prior MIA approaches that treat code as plain text, SynPrune leverages the structured and rule-governed nature of programming languages. Specifically, it identifies and excludes consequent tokens that are syntactically required and not reflective of authorship, from attribution when computing membership scores. Experimental results show that SynPrune consistently outperforms the state-of-the-arts. Our method is also robust across varying function lengths and syntax categories.

#28 Harnessing Sparsification in Federated Learning: A Secure, Efficient, and Differentially Private Realization

privacy

著者: Shuangqing Xu, Yifeng Zheng, Zhongyun Hua

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07123

要約:
Federated learning (FL) enables multiple clients to jointly train a model by sharing only gradient updates for aggregation instead of raw data. Due to the transmission of very high-dimensional gradient updates from many clients, FL is known to suffer from a communication bottleneck. Meanwhile, the gradients shared by clients as well as the trained model may also be exploited for inferring private local datasets, making privacy still a critical concern in FL. We present Clover, a novel system framework for communication-efficient, secure, and differentially private FL. To tackle the communication bottleneck in FL, Clover follows a standard and commonly used approach-top-k gradient sparsification, where each client sparsifies its gradient update such that only k largest gradients (measured by magnitude) are preserved for aggregation. Clover provides a tailored mechanism built out of a trending distributed trust setting involving three servers, which allows to efficiently aggregate multiple sparse vectors (top-k sparsified gradient updates) into a dense vector while hiding the values and indices of non-zero elements in each sparse vector. This mechanism outperforms a baseline built on the general distributed ORAM technique by several orders of magnitude in server-side communication and runtime, with also smaller client communication cost. We further integrate this mechanism with a lightweight distributed noise generation mechanism to offer differential privacy (DP) guarantees on the trained model. To harden Clover with security against a malicious server, we devise a series of lightweight mechanisms for integrity checks on the server-side computation. Extensive experiments show that Clover can achieve utility comparable to vanilla FL with central DP, with promising performance.

#29 Privacy on the Fly: A Predictive Adversarial Transformation Network for Mobile Sensor Data

privacy

著者: Tianle Song, Chenhao Lin, Yang Cao, Zhengyu Zhao, Jiahao Sun, Chong Zhang, Le Yang, Chao Shen

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07242

要約:
Mobile motion sensors such as accelerometers and gyroscopes are now ubiquitously accessible by third-party apps via standard APIs. While enabling rich functionalities like activity recognition and step counting, this openness has also enabled unregulated inference of sensitive user traits, such as gender, age, and even identity, without user consent. Existing privacy-preserving techniques, such as GAN-based obfuscation or differential privacy, typically require access to the full input sequence, introducing latency that is incompatible with real-time scenarios. Worse, they tend to distort temporal and semantic patterns, degrading the utility of the data for benign tasks like activity recognition. To address these limitations, we propose the Predictive Adversarial Transformation Network (PATN), a real-time privacy-preserving framework that leverages historical signals to generate adversarial perturbations proactively. The perturbations are applied immediately upon data acquisition, enabling continuous protection without disrupting application functionality. Experiments on two datasets demonstrate that PATN substantially degrades the performance of privacy inference models, achieving Attack Success Rate (ASR) of 40.11% and 44.65% (reducing inference accuracy to near-random) and increasing the Equal Error Rate (EER) from 8.30% and 7.56% to 41.65% and 46.22%. On ASR, PATN outperforms baseline methods by 16.16% and 31.96%, respectively.

#30 JPRO: Automated Multimodal Jailbreaking via Multi-Agent Collaboration Framework

agent

著者: Yuxuan Zhou, Yang Bai, Kuofeng Gao, Tao Dai, Shu-Tao Xia

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07315

要約:
The widespread application of large VLMs makes ensuring their secure deployment critical. While recent studies have demonstrated jailbreak attacks on VLMs, existing approaches are limited: they require either white-box access, restricting practicality, or rely on manually crafted patterns, leading to poor sample diversity and scalability. To address these gaps, we propose JPRO, a novel multi-agent collaborative framework designed for automated VLM jailbreaking. It effectively overcomes the shortcomings of prior methods in attack diversity and scalability. Through the coordinated action of four specialized agents and its two core modules: Tactic-Driven Seed Generation and Adaptive Optimization Loop, JPRO generates effective and diverse attack samples. Experimental results show that JPRO achieves over a 60\% attack success rate on multiple advanced VLMs, including GPT-4o, significantly outperforming existing methods. As a black-box attack approach, JPRO not only uncovers critical security vulnerabilities in multimodal models but also offers valuable insights for evaluating and enhancing VLM robustness.

#31 AgriTrust: a Federated Semantic Governance Framework for Trusted Agricultural Data Sharing

著者: Ivan Bergier

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05572

要約:
The potential of agricultural data (AgData) to drive efficiency and sustainability is stifled by the "AgData Paradox": a pervasive lack of trust and interoperability that locks data in silos, despite its recognized value. This paper introduces AgriTrust, a federated semantic governance framework designed to resolve this paradox. AgriTrust integrates a multi-stakeholder governance model, built on pillars of Data Sovereignty, Transparent Data Contracts, Equitable Value Sharing, and Regulatory Compliance, with a semantic digital layer. This layer is realized through the AgriTrust Core Ontology, a formal OWL ontology that provides a shared vocabulary for tokenization, traceability, and certification, enabling true semantic interoperability across independent platforms. A key innovation is a blockchain-agnostic, multi-provider architecture that prevents vendor lock-in. The framework's viability is demonstrated through case studies across three critical Brazilian supply chains: coffee (for EUDR compliance), soy (for mass balance), and beef (for animal tracking). The results show that AgriTrust successfully enables verifiable provenance, automates compliance, and creates new revenue streams for data producers, thereby transforming data sharing from a trust-based dilemma into a governed, automated operation. This work provides a foundational blueprint for a more transparent, efficient, and equitable agricultural data economy.

#32 Preserving security in a world with powerful AI Considerations for the future Defense Architecture

著者: Nicholas Generous, Brian Cook, Jason Pruet

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05714

要約:
Advances in AI threaten to invalidate assumptions underpinning today's defense architecture. We argue that the current U.S. defense program of record, designed in an era before capable machine intelligence, cannot by itself preserve national security against rapidly emerging AI enabled threats. Instead, shoring up legacy systems must be coupled with entirely new elements of a defense architecture. We outline immediate steps to adapt the Department of Energy National Nuclear Security Administration National Laboratories to ensure agility and resilience in an era of powerful AI.

#33 CGCE: Classifier-Guided Concept Erasure in Generative Models

著者: Viet Nguyen, Vishal M. Patel

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.05865

要約:
Recent advancements in large-scale generative models have enabled the creation of high-quality images and videos, but have also raised significant safety concerns regarding the generation of unsafe content. To mitigate this, concept erasure methods have been developed to remove undesirable concepts from pre-trained models. However, existing methods remain vulnerable to adversarial attacks that can regenerate the erased content. Moreover, achieving robust erasure often degrades the model's generative quality for safe, unrelated concepts, creating a difficult trade-off between safety and performance. To address this challenge, we introduce Classifier-Guided Concept Erasure (CGCE), an efficient plug-and-play framework that provides robust concept erasure for diverse generative models without altering their original weights. CGCE uses a lightweight classifier operating on text embeddings to first detect and then refine prompts containing undesired concepts. This approach is highly scalable, allowing for multi-concept erasure by aggregating guidance from several classifiers. By modifying only unsafe embeddings at inference time, our method prevents harmful content generation while preserving the model's original quality on benign prompts. Extensive experiments show that CGCE achieves state-of-the-art robustness against a wide range of red-teaming attacks. Our approach also maintains high generative utility, demonstrating a superior balance between safety and performance. We showcase the versatility of CGCE through its successful application to various modern T2I and T2V models, establishing it as a practical and effective solution for safe generative AI.

#34 CatBack: Universal Backdoor Attacks on Tabular Data via Categorical Encoding

backdoor

著者: Behrad Tajalli, Stefanos Koffas, Stjepan Picek

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06072

要約:
Backdoor attacks in machine learning have drawn significant attention for their potential to compromise models stealthily, yet most research has focused on homogeneous data such as images. In this work, we propose a novel backdoor attack on tabular data, which is particularly challenging due to the presence of both numerical and categorical features. Our key idea is a novel technique to convert categorical values into floating-point representations. This approach preserves enough information to maintain clean-model accuracy compared to traditional methods like one-hot or ordinal encoding. By doing this, we create a gradient-based universal perturbation that applies to all features, including categorical ones. We evaluate our method on five datasets and four popular models. Our results show up to a 100% attack success rate in both white-box and black-box settings (including real-world applications like Vertex AI), revealing a severe vulnerability for tabular data. Our method is shown to surpass the previous works like Tabdoor in terms of performance, while remaining stealthy against state-of-the-art defense mechanisms. We evaluate our attack against Spectral Signatures, Neural Cleanse, Beatrix, and Fine-Pruning, all of which fail to defend successfully against it. We also verify that our attack successfully bypasses popular outlier detection mechanisms.

#35 A Visual Perception-Based Tunable Framework and Evaluation Benchmark for H.265/HEVC ROI Encryption

著者: Xiang Zhang, Geng Wu, Wenbin Huang, Daoyong Fu, Fei Peng, Zhangjie Fu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06394

要約:
ROI selective encryption, as an efficient privacy protection technique, encrypts only the key regions in the video, thereby ensuring security while minimizing the impact on coding efficiency. However, existing ROI-based video encryption methods suffer from insufficient flexibility and lack of a unified evaluation system. To address these issues, we propose a visual perception-based tunable framework and evaluation benchmark for H.265/HEVC ROI encryption. Our scheme introduces three key contributions: 1) A ROI region recognition module based on visual perception network is proposed to accurately identify the ROI region in videos. 2) A three-level tunable encryption strategy is implemented while balancing security and real-time performance. 3) A unified ROI encryption evaluation benchmark is developed to provide a standardized quantitative platform for subsequent research. This triple strategy provides new solution and significant unified performance evaluation methods for ROI selective encryption field. Experimental results indicate that the proposed benchmark can comprehensively measure the performance of the ROI selective encryption. Compared to existing ROI encryption algorithms, our proposed enhanced and advanced level encryption exhibit superior performance in multiple performance metrics. In general, the proposed framework effectively meets the privacy protection requirements in H.265/HEVC and provides a reliable solution for secure and efficient processing of sensitive video content.

#36 Efficient LLM Safety Evaluation through Multi-Agent Debate

agent

著者: Dachuan Lin, Guobin Shen, Zihao Yang, Tianrong Liu, Dongcheng Zhao, Yi Zeng

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06396

要約:
Safety evaluation of large language models (LLMs) increasingly relies on LLM-as-a-Judge frameworks, but the high cost of frontier models limits scalability. We propose a cost-efficient multi-agent judging framework that employs Small Language Models (SLMs) through structured debates among critic, defender, and judge agents. To rigorously assess safety judgments, we construct HAJailBench, a large-scale human-annotated jailbreak benchmark comprising 12,000 adversarial interactions across diverse attack methods and target models. The dataset provides fine-grained, expert-labeled ground truth for evaluating both safety robustness and judge reliability. Our SLM-based framework achieves agreement comparable to GPT-4o judges on HAJailBench while substantially reducing inference cost. Ablation results show that three rounds of debate yield the optimal balance between accuracy and efficiency. These findings demonstrate that structured, value-aligned debate enables SLMs to capture semantic nuances of jailbreak attacks and that HAJailBench offers a reliable foundation for scalable LLM safety evaluation.

#37 PhaseSeed: Precise Call Graph Construction for Split-Phase Applications using Dynamic Seeding

著者: Tapti Palit, Seyedhamed Ghavamnia, Michalis Polychronakis

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06661

要約:
Precise and sound call graph construction is crucial for many software security mechanisms. Unfortunately, traditional static pointer analysis techniques used to generate application call graphs suffer from imprecision. These techniques are agnostic to the application's architecture and are designed for broad applicability. To mitigate this precision problem, we propose PhaseSeed, a novel technique that improves the accuracy of pointer analysis for split-phase applications, which have distinct initialization and processing phases. PhaseSeed analyzes the initialization phase dynamically, collecting the points-to relationships established at runtime. At the end of the initialization phase, it then seeds this information to a static analysis stage that performs pointer analysis for all code that stays in scope during the processing phase, improving precision. Our observations show that, given the same runtime configuration options, the points-to relationships established during the initialization phase remain constant across multiple runs. Therefore, PhaseSeed is sound with respect to a given initial configuration. We apply PhaseSeed to three security mechanisms: control flow integrity (CFI), software debloating, and system call filtering. PhaseSeed provides up to 92.6% precision improvement for CFI compared to static call graph construction techniques, and filters nine additional security-critical system calls when used to generate Seccomp profiles.

#38 Generalized Security-Preserving Refinement for Concurrent Systems

著者: Huan Sun, David San\'an, Jingyi Wang, Yongwang Zhao, Jun Sun, Wenhai Wang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06862

要約:
Ensuring compliance with Information Flow Security (IFS) is known to be challenging, especially for concurrent systems with large codebases such as multicore operating system (OS) kernels. Refinement, which verifies that an implementation preserves certain properties of a more abstract specification, is promising for tackling such challenges. However, in terms of refinement-based verification of security properties, existing techniques are still restricted to sequential systems or lack the expressiveness needed to capture complex security policies for concurrent systems. In this work, we present a generalized security-preserving refinement technique, particularly for verifying the IFS of concurrent systems governed by potentially complex security policies. We formalize the IFS properties for concurrent systems and present a refinement-based compositional approach to prove that the generalized security properties (e.g., intransitive noninterference) are preserved between implementation and abstraction. The key intuition enabling such reasoning, compared to previous refinement work, is to establish a step-mapping relation between the implementation and the abstraction, which is sufficient to ensure that every paired step (in the abstraction and the implementation, respectively) is either permitted or prohibited by the security policy. We apply our approach to verify two non-trivial case studies against a collection of security policies. Our proofs are fully mechanized in Isabelle/HOL, during which we identified that two covert channels previously reported in the ARINC 653 single-core standard also exist in the ARINC 653 multicore standard. We subsequently proved the correctness of the revised mechanism, showcasing the effectiveness of our approach.

#39 HLPD: Aligning LLMs to Human Language Preference for Machine-Revised Text Detection

著者: Fangqi Dai, Xingjian Jiang, Zizhuang Deng

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.06942

要約:
To prevent misinformation and social issues arising from trustworthy-looking content generated by LLMs, it is crucial to develop efficient and reliable methods for identifying the source of texts. Previous approaches have demonstrated exceptional performance in detecting texts fully generated by LLMs. However, these methods struggle when confronting more advanced LLM output or text with adversarial multi-task machine revision, especially in the black-box setting, where the generating model is unknown. To address this challenge, grounded in the hypothesis that human writing possesses distinctive stylistic patterns, we propose Human Language Preference Detection (HLPD). HLPD employs a reward-based alignment process, Human Language Preference Optimization (HLPO), to shift the scoring model's token distribution toward human-like writing, making the model more sensitive to human writing, therefore enhancing the identification of machine-revised text. We test HLPD in an adversarial multi-task evaluation framework that leverages a five-dimensional prompt generator and multiple advanced LLMs to create diverse revision scenarios. When detecting texts revised by GPT-series models, HLPD achieves a 15.11% relative improvement in AUROC over ImBD, surpassing Fast-DetectGPT by 45.56%. When evaluated on texts generated by advanced LLMs, HLPD achieves the highest average AUROC, exceeding ImBD by 5.53% and Fast-DetectGPT by 34.14%. Code will be made available at https://github.com/dfq2021/HLPD.

#40 3D-ANC: Adaptive Neural Collapse for Robust 3D Point Cloud Recognition

著者: Yuanmin Huang, Wenxuan Li, Mi Zhang, Xiaohan Zhang, Xiaoyu You, Min Yang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07040

要約:
Deep neural networks have recently achieved notable progress in 3D point cloud recognition, yet their vulnerability to adversarial perturbations poses critical security challenges in practical deployments. Conventional defense mechanisms struggle to address the evolving landscape of multifaceted attack patterns. Through systematic analysis of existing defenses, we identify that their unsatisfactory performance primarily originates from an entangled feature space, where adversarial attacks can be performed easily. To this end, we present 3D-ANC, a novel approach that capitalizes on the Neural Collapse (NC) mechanism to orchestrate discriminative feature learning. In particular, NC depicts where last-layer features and classifier weights jointly evolve into a simplex equiangular tight frame (ETF) arrangement, establishing maximally separable class prototypes. However, leveraging this advantage in 3D recognition confronts two substantial challenges: (1) prevalent class imbalance in point cloud datasets, and (2) complex geometric similarities between object categories. To tackle these obstacles, our solution combines an ETF-aligned classification module with an adaptive training framework consisting of representation-balanced learning (RBL) and dynamic feature direction loss (FDL). 3D-ANC seamlessly empowers existing models to develop disentangled feature spaces despite the complexity in 3D data distribution. Comprehensive evaluations state that 3D-ANC significantly improves the robustness of models with various structures on two datasets. For instance, DGCNN's classification accuracy is elevated from 27.2% to 80.9% on ModelNet40 -- a 53.7% absolute gain that surpasses leading baselines by 34.0%.

#41 From Pretrain to Pain: Adversarial Vulnerability of Video Foundation Models Without Task Knowledge

著者: Hui Lu, Yi Yu, Song Xia, Yiming Yang, Deepu Rajan, Boon Poh Ng, Alex Kot, Xudong Jiang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07049

要約:
Large-scale Video Foundation Models (VFMs) has significantly advanced various video-related tasks, either through task-specific models or Multi-modal Large Language Models (MLLMs). However, the open accessibility of VFMs also introduces critical security risks, as adversaries can exploit full knowledge of the VFMs to launch potent attacks. This paper investigates a novel and practical adversarial threat scenario: attacking downstream models or MLLMs fine-tuned from open-source VFMs, without requiring access to the victim task, training data, model query, and architecture. In contrast to conventional transfer-based attacks that rely on task-aligned surrogate models, we demonstrate that adversarial vulnerabilities can be exploited directly from the VFMs. To this end, we propose the Transferable Video Attack (TVA), a temporal-aware adversarial attack method that leverages the temporal representation dynamics of VFMs to craft effective perturbations. TVA integrates a bidirectional contrastive learning mechanism to maximize the discrepancy between the clean and adversarial features, and introduces a temporal consistency loss that exploits motion cues to enhance the sequential impact of perturbations. TVA avoids the need to train expensive surrogate models or access to domain-specific data, thereby offering a more practical and efficient attack strategy. Extensive experiments across 24 video-related tasks demonstrate the efficacy of TVA against downstream models and MLLMs, revealing a previously underexplored security vulnerability in the deployment of video models.

#42 Improving Deepfake Detection with Reinforcement Learning-Based Adaptive Data Augmentation

著者: Yuxuan Zhou, Tao Yu, Wen Huang, Yuheng Zhang, Tao Dai, Shu-Tao Xia

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07051

要約:
The generalization capability of deepfake detectors is critical for real-world use. Data augmentation via synthetic fake face generation effectively enhances generalization, yet current SoTA methods rely on fixed strategies-raising a key question: Is a single static augmentation sufficient, or does the diversity of forgery features demand dynamic approaches? We argue existing methods overlook the evolving complexity of real-world forgeries (e.g., facial warping, expression manipulation), which fixed policies cannot fully simulate. To address this, we propose CRDA (Curriculum Reinforcement-Learning Data Augmentation), a novel framework guiding detectors to progressively master multi-domain forgery features from simple to complex. CRDA synthesizes augmented samples via a configurable pool of forgery operations and dynamically generates adversarial samples tailored to the detector's current learning state. Central to our approach is integrating reinforcement learning (RL) and causal inference. An RL agent dynamically selects augmentation actions based on detector performance to efficiently explore the vast augmentation space, adapting to increasingly challenging forgeries. Simultaneously, the agent introduces action space variations to generate heterogeneous forgery patterns, guided by causal inference to mitigate spurious correlations-suppressing task-irrelevant biases and focusing on causally invariant features. This integration ensures robust generalization by decoupling synthetic augmentation patterns from the model's learned representations. Extensive experiments show our method significantly improves detector generalizability, outperforming SOTA methods across multiple cross-domain datasets.

#43 E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis

著者: Zhisheng Zhang, Derui Wang, Yifan Mi, Zhiyong Wu, Jie Gao, Yuxin Cao, Kai Ye, Minhui Xue, Jie Hao

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07099

要約:
Recent advancements in speech synthesis technology have enriched our daily lives, with high-quality and human-like audio widely adopted across real-world applications. However, malicious exploitation like voice-cloning fraud poses severe security risks. Existing defense techniques struggle to address the production large language model (LLM)-based speech synthesis. While previous studies have considered the protection for fine-tuning synthesizers, they assume manually annotated transcripts. Given the labor intensity of manual annotation, end-to-end (E2E) systems leveraging automatic speech recognition (ASR) to generate transcripts are becoming increasingly prevalent, e.g., voice cloning via commercial APIs. Therefore, this E2E speech synthesis also requires new security mechanisms. To tackle these challenges, we propose E2E-VGuard, a proactive defense framework for two emerging threats: (1) production LLM-based speech synthesis, and (2) the novel attack arising from ASR-driven E2E scenarios. Specifically, we employ the encoder ensemble with a feature extractor to protect timbre, while ASR-targeted adversarial examples disrupt pronunciation. Moreover, we incorporate the psychoacoustic model to ensure perturbative imperceptibility. For a comprehensive evaluation, we test 16 open-source synthesizers and 3 commercial APIs across Chinese and English datasets, confirming E2E-VGuard's effectiveness in timbre and pronunciation protection. Real-world deployment validation is also conducted. Our code and demo page are available at https://wxzyd123.github.io/e2e-vguard/.

#44 On Stealing Graph Neural Network Models

著者: Marcin Podhajski, Jan Dubi\'nski, Franziska Boenisch, Adam Dziedzic, Agnieszka Pr\k{e}gowska, Tomasz P. Michalak

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07170

要約:
Current graph neural network (GNN) model-stealing methods rely heavily on queries to the victim model, assuming no hard query limits. However, in reality, the number of allowed queries can be severely limited. In this paper, we demonstrate how an adversary can extract the GNN with very limited interactions with the model. Our approach first enables the adversary to obtain the model backbone without making direct queries to the victim model and then to strategically utilize a fixed query limit to extract the most informative data. The experiments on eight real-world datasets demonstrate the effectiveness of the attack, even under a very restricted query limit and under defense against model extraction in place. Our findings underscore the need for robust defenses against GNN model extraction threats.

#45 LiteUpdate: A Lightweight Framework for Updating AI-Generated Image Detectors

著者: Jiajie Lu, Zhenkan Fu, Na Zhao, Long Xing, Kejiang Chen, Weiming Zhang, Nenghai Yu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07192

要約:
The rapid progress of generative AI has led to the emergence of new generative models, while existing detection methods struggle to keep pace, resulting in significant degradation in the detection performance. This highlights the urgent need for continuously updating AI-generated image detectors to adapt to new generators. To overcome low efficiency and catastrophic forgetting in detector updates, we propose LiteUpdate, a lightweight framework for updating AI-generated image detectors. LiteUpdate employs a representative sample selection module that leverages image confidence and gradient-based discriminative features to precisely select boundary samples. This approach improves learning and detection accuracy on new distributions with limited generated images, significantly enhancing detector update efficiency. Additionally, LiteUpdate incorporates a model merging module that fuses weights from multiple fine-tuning trajectories, including pre-trained, representative, and random updates. This balances the adaptability to new generators and mitigates the catastrophic forgetting of prior knowledge. Experiments demonstrate that LiteUpdate substantially boosts detection performance in various detectors. Specifically, on AIDE, the average detection accuracy on Midjourney improved from 87.63% to 93.03%, a 6.16% relative increase.

#46 Breaking the Stealth-Potency Trade-off in Clean-Image Backdoors with Generative Trigger Optimization

backdoor

著者: Binyan Xu, Fan Yang, Di Tang, Xilin Dai, Kehuan Zhang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.07210

要約:
Clean-image backdoor attacks, which use only label manipulation in training datasets to compromise deep neural networks, pose a significant threat to security-critical applications. A critical flaw in existing methods is that the poison rate required for a successful attack induces a proportional, and thus noticeable, drop in Clean Accuracy (CA), undermining their stealthiness. This paper presents a new paradigm for clean-image attacks that minimizes this accuracy degradation by optimizing the trigger itself. We introduce Generative Clean-Image Backdoors (GCB), a framework that uses a conditional InfoGAN to identify naturally occurring image features that can serve as potent and stealthy triggers. By ensuring these triggers are easily separable from benign task-related features, GCB enables a victim model to learn the backdoor from an extremely small set of poisoned examples, resulting in a CA drop of less than 1%. Our experiments demonstrate GCB's remarkable versatility, successfully adapting to six datasets, five architectures, and four tasks, including the first demonstration of clean-image backdoors in regression and segmentation. GCB also exhibits resilience against most of the existing backdoor defenses.

#47 On the success probability of the quantum algorithm for the short DLP

著者: Martin Eker{\aa}

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2309.01754

要約:
Eker{\aa} and H{\aa}stad have introduced a variation of Shor's algorithm for the discrete logarithm problem (DLP). Unlike Shor's original algorithm, Eker{\aa}-H{\aa}stad's algorithm solves the short DLP in groups of unknown order. In this work, we prove a lower bound on the probability of Eker{\aa}-H{\aa}stad's algorithm recovering the short logarithm $d$ in a single run. By our bound, the success probability can easily be pushed as high as $1 - 10^{-10}$ for any short $d$. A key to achieving such a high success probability is to efficiently perform a limited search in the classical post-processing by leveraging meet-in-the-middle techniques. Asymptotically, in the limit as the bit length $m$ of $d$ tends to infinity, the success probability tends to one if the limits on the search space are parameterized in $m$. Our results are directly applicable to Diffie-Hellman in safe-prime groups with short exponents, and to RSA via a reduction from the RSA integer factoring problem (IFP) to the short DLP.

#48 Unaware, Unfunded and Uneducated: A Systematic Review of SME Cybersecurity

著者: Carlos Rombaldo Junior, Ingolf Becker, Shane Johnson

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2309.17186

要約:
Small and Medium Enterprises (SMEs) are pivotal in the global economy, accounting for over 90% of businesses and 60% of employment worldwide. Despite their significance, SMEs are often disregarded in cybersecurity initiatives, rendering them ill-equipped to deal with the growing frequency, sophistication, and destructiveness of cyberattacks. We systematically reviewed the cybersecurity literature on SMEs published between 2017 and 2024. We focus on research discussing cyber threats, adopted controls, challenges, and constraints SMEs face in pursuing cybersecurity resilience. Our search yielded 1090 studies that we narrowed to 132 relevant papers. We identified 44 unique themes and categorised them as novel findings or established knowledge. This distinction revealed that research on SMEs is shallow and has made little progress in understanding SMEs' roles, threats, and needs. Studies often repeated early discoveries without replicating or offering new insights. Existing research indicates that the main challenges to attaining cybersecurity resilience of SMEs are a lack of awareness of cybersecurity risks, limited cybersecurity literacy, and constrained financial resources. Resource availability varied between developed and developing countries. Our analysis indicated a relationship among these themes, suggesting that limited literacy is the root cause of awareness and resource constraint issues.

#49 Seagull: Privacy preserving network verification system

privacy

著者: Jaber Daneshamooz, Melody Yu, Sucheer Maddury

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2402.08956

要約:
The Internet relies on routing protocols to direct traffic efficiently across interconnected networks, with the Border Gateway Protocol (BGP) serving as the core mechanism managing routing between autonomous systems. However, BGP configurations are largely manual, making them susceptible to human errors that can lead to outages or security vulnerabilities. Verifying the correctness and convergence of BGP configurations is therefore essential for maintaining a stable and secure Internet. Yet, this verification process faces two key challenges: preserving the privacy of proprietary routing information and ensuring scalability across large, distributed networks. This paper introduces a privacy-preserving verification framework that leverages multiparty computation (MPC) to validate BGP configurations without exposing sensitive routing data. Our approach overcomes both privacy and scalability challenges by ensuring that no information beyond the verification outcome is revealed. Through formal analysis, we show that the proposed method achieves strong privacy guarantees and practical scalability, providing a secure and efficient foundation for verifying BGP-based routing in the Internet backbone.

#50 Obfuscated Location Disclosure for Remote ID Enabled Drones

著者: Alessandro Brighente, Mauro Conti, Matthijs Schotsman, Savio Sciancalepore

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2407.14256

要約:
The Remote ID (RID) regulation recently introduced by several aviation authorities worldwide (including the US and EU) forces commercial drones to regularly (max. every second) broadcast plaintext messages on the wireless channel, providing information about the drone identifier and current location, among others. Although these regulations increase the accountability of drone operations and improve traffic management, they allow malicious users to track drones via the disclosed information, possibly leading to drone capture and severe privacy leaks. In this paper, we propose Obfuscated Location disclOsure for RID-enabled drones (OLO-RID), a solution modifying and extending the RID regulation while preserving drones' location privacy. Rather than disclosing the actual drone's location, drones equipped with OLO-RID disclose a differentially private obfuscated location in a mobile scenario. OLO-RID also extends RID messages with encrypted location information, accessible only by authorized entities and valuable to obtain the current drone's location in safety-critical use cases. We design, implement, and deploy OLO-RID on a Raspberry Pi 3 and release the code of our implementation as open-source. We also perform an extensive performance assessment of the runtime overhead of our solution in terms of processing, communication, memory, and energy consumption. We show that OLO-RID can generate RID messages on a constrained device in less than 0.16 s while also requiring a minimal energy toll on a relevant device (0.0236% of energy for a DJI Mini 2). We also evaluate the utility of the proposed approach in the context of three reference use cases involving the drones' location usage, demonstrating minimal performance degradation when trading off location privacy and utility for next-generation RID-compliant drone ecosystems.

#51 Automatically Detecting Checked-In Secrets in Android Apps: How Far Are We?

著者: Kevin Li, Lin Ling, Jinqiu Yang, Lili Wei

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2412.10922

要約:
Mobile apps are predominantly integrated with cloud services to benefit from enhanced functionalities. Adopting authentication using secrets such as API keys is crucial to ensure secure mobile-cloud interactions. However, developers often overlook the proper storage of such secrets, opting to put them directly into their projects. These secrets are checked into the projects and can be easily extracted and exploited by malicious adversaries. While many researchers investigated the issue of checked-in secret in open-source projects, there is a notable research gap concerning checked-in secrets in Android apps deployed on platforms such as Google Play Store. Unlike open-source projects, the lack of direct access to the source code and the presence of obfuscation complicates the checked-in secret detection for Android apps. This motivates us to conduct an empirical analysis to measure and compare the performance of different checked-in secret detection tools on Android apps. We first conducted a literature review to find all the checked-in secret detection tools that can be applied to Android apps. Then, we evaluate three representative tools on 5,135 Android apps, comparing their performance and analyzing their limitations. Our experiment reveals 2,142 checked-in secrets affecting 2,115 Android apps. We also disclose that the current checked-in secret detection techniques suffer from key limitations. All of the evaluated tools can miss a significant number of checked-in secrets in Android apps. Nevertheless, we observed that the tools are complimentary, suggesting the possibility of developing a more effective checked-in secret detection tool by combining their insights. Additionally, we propose that analyzing string groups within methods containing checked-in secrets may provide a more effective strategy to overcome obfuscation challenges.

#52 When Should Selfish Miners Double-Spend?

著者: Mustafa Doger, Sennur Ulukus

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.03227

要約:
Conventional double-spending attack models ignore the revenue losses stemming from the orphan blocks. On the other hand, selfish mining literature usually ignores the chance of the attacker to double-spend at no-cost in each attack cycle. In this paper, we give a rigorous stochastic analysis of an attack where the goal of the adversary is to double-spend while mining selfishly. To do so, we first combine stubborn and selfish mining attacks, i.e., construct a strategy where the attacker acts stubborn until its private branch reaches a certain length and then switches to act selfish. We provide the optimal stubbornness for each parameter regime. Next, we provide the maximum stubbornness that is still more profitable than honest mining and argue a connection between the level of stubbornness and the $k$-confirmation rule. We show that, at each attack cycle, if the level of stubbornness is higher than $k$, the adversary gets a free shot at double-spending. At each cycle, for a given stubbornness level, we rigorously formulate how great the probability of double-spending is. We further modify the attack in the stubborn regime in order to conceal the attack and increase the double-spending probability.

#53 Low-altitude UAV Friendly-Jamming for Satellite-Maritime Communications via Generative AI-enabled Deep Reinforcement Learning

著者: Jiawei Huang, Aimin Wang, Geng Sun, Jiahui Li, Jiacheng Wang, Dusit Niyato, Victor C. M. Leung

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.15468

要約:
Low Earth orbit (LEO) satellites can be used to assist maritime wireless communications for wide-area data transmission. However, the extensive coverage of LEO satellites, combined with the openness of channels, can cause the communication process to suffer from security risks. This paper presents a LEO satellite-maritime communication system assisted by low-altitude unmanned aerial vehicle (UAV) friendly-jamming to ensure data security at the physical layer. Since such a system requires balancing the conflicting performance metrics of secrecy rate and energy consumption of the UAV to meet evolving scenario demands, we formulate a secure satellite-maritime communication multi-objective optimization problem (SSMCMOP). In order to solve the dynamic and long-term optimization problem, we reformulate it into a Markov decision process. We then propose a transformer-enhanced soft actor-critic (TransSAC) algorithm, which is a generative artificial intelligence-enabled deep reinforcement learning approach to solve the reformulated problem, thus capturing strong temporal correlations and diversely exploring weights. Simulation results demonstrate that the TransSAC algorithm outperforms comparative approaches and algorithms, maximizing the secrecy rate while effectively minimizing the energy consumption of the UAV. Moreover, the results identify more suitable constraints for the system.

#54 Oblivious Digital Tokens

著者: Mihael Liskij, Xuhua Ding, Gene Tsudik, David Basin

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.03494

要約:
A computing device typically identifies itself by exhibiting unique measurable behavior or by proving its knowledge of a secret. In both cases, the identifying device must reveal information to a verifier. Considerable research has focused on protecting identifying entities (provers) and reducing the amount of leaked data. However, little has been done to conceal the fact that the verification occurred. We show how this problem naturally arises in the context of digital emblems, which were recently proposed by the International Committee of the Red Cross to protect digital resources during cyber-conflicts. To address this new and important open problem, we define a new primitive, called an Oblivious Digital Token (ODT) that can be verified obliviously. Verifiers can use this procedure to check whether a device has an ODT without revealing to any other parties (including the device itself) that this check occurred. We demonstrate the feasibility of ODTs and present a concrete construction that provably meets the ODT security requirements, even if the prover device's software is fully compromised. We also implement a prototype of the proposed construction and evaluate its performance, thereby confirming its practicality.

#55 Secret-Key Generation from Private Identifiers under Channel Uncertainty

privacy

著者: Vamoua Yachongka, R\'emi A. Chou

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.08632

要約:
This study investigates secret-key generation for device authentication using physical identifiers, such as responses from physical unclonable functions (PUFs). The system includes two legitimate terminals (encoder and decoder) and an eavesdropper (Eve), each with access to different measurements of the identifier. From the device identifier, the encoder generates a secret key, which is securely stored in a private database, along with helper data that is saved in a public database accessible by the decoder for key reconstruction. Eve, who also has access to the public database, may use both her own measurements and the helper data to attempt to estimate the secret key and identifier. Our setup focuses on authentication scenarios where channel statistics are uncertain, with the involved parties employing multiple antennas to enhance signal reception. Our contributions include deriving inner and outer bounds on the optimal trade-off among secret-key, storage, and privacy-leakage rates for general discrete sources, and showing that these bounds are tight for Gaussian sources.

#56 Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks

著者: Sizhe Chen, Arman Zharmagambetov, David Wagner, Chuan Guo

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.02735

要約:
Prompt injection attack has been listed as the top-1 security threat to LLM-integrated applications, which interact with external environment data for complex tasks. The untrusted data may contain an injected prompt trying to arbitrarily manipulate the system. Model-level prompt injection defenses have shown strong effectiveness, but are currently deployed into commercial-grade models in a closed-source manner. We believe open-source secure models are needed by the AI security community, where co-development of attacks and defenses through open research drives scientific progress in mitigating prompt injection attacks. To this end, we develop Meta SecAlign, the first fully open-source LLM with built-in model-level defense that achieves commercial-grade performance, powerful enough for complex agentic tasks. We provide complete details of our training recipe, an improved version of the SOTA SecAlign defense. We perform the most comprehensive evaluation to date on 9 utility benchmarks and 7 security benchmarks on general knowledge, instruction following, and agentic workflows. Results show that Meta SecAlign, despite being trained on generic instruction-tuning samples only, surprisingly confers security in unseen downstream tasks, including tool-calling and web-navigation, in addition to general instruction-following. Our best model -- Meta-SecAlign-70B -- establishes a new frontier of utility-security trade-off for open-source LLMs. Even compared to closed-course commercial models such as GPT-5, our model is much securer than most of them. Below are links for the code (https://github.com/facebookresearch/Meta_SecAlign), Meta-SecAlign-70B(https://huggingface.co/facebook/Meta-SecAlign-70B), and Meta-SecAlign-8B(https://huggingface.co/facebook/Meta-SecAlign-8B) models.

#57 White-Basilisk: A Hybrid Model for Code Vulnerability Detection

著者: Ioannis Lamprou, Alexander Shevtsov, Ioannis Arapakis, Sotiris Ioannidis

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.08540

要約:
The proliferation of software vulnerabilities presents a significant challenge to cybersecurity, necessitating more effective detection methodologies. We introduce White-Basilisk, a novel approach to vulnerability detection that demonstrates superior performance while challenging prevailing assumptions in AI model scaling. Utilizing an innovative architecture that integrates Mamba layers, linear self-attention, and a Mixture of Experts framework, White-Basilisk achieves state-of-the-art results in vulnerability detection tasks with a parameter count of only 200M. The model's capacity to process sequences of unprecedented length enables comprehensive analysis of extensive codebases in a single pass, surpassing the context limitations of current Large Language Models (LLMs). White-Basilisk exhibits robust performance on imbalanced, real-world datasets, while maintaining computational efficiency that facilitates deployment across diverse organizational scales. This research not only establishes new benchmarks in code security but also provides empirical evidence that compact, efficiently designed models can outperform larger counterparts in specialized tasks, potentially redefining optimization strategies in AI development for domain-specific applications.

#58 Resolving Indirect Calls in Binary Code via Cross-Reference Augmented Graph Neural Networks

著者: Haotian Zhang, Kun Liu, Cristian Garces, Chenke Luo, Yu Lei, Jiang Ming

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.18801

要約:
Binary code analysis is essential in scenarios where source code is unavailable, with extensive applications across various security domains. However, accurately resolving indirect call targets remains a longstanding challenge in maintaining the integrity of static analysis in binary code. This difficulty arises because the operand of a call instruction (e.g., call rax) remains unknown until runtime, resulting in an incomplete inter-procedural control flow graph (CFG). Previous approaches have struggled with low accuracy and limited scalability. To address these limitations, recent work has increasingly turned to machine learning (ML) to enhance analysis. However, this ML-driven approach faces two significant obstacles: low-quality callsite-callee training pairs and inadequate binary code representation, both of which undermine the accuracy of ML models. In this paper, we introduce CupidCall, a novel approach for resolving indirect calls using graph neural networks. Existing ML models in this area often overlook key elements such as data and code cross-references, which are essential for understanding a program's control flow. In contrast, CupidCall augments CFGs with cross-references, preserving rich semantic information. Additionally, we leverage advanced compiler-level type analysis to generate high-quality callsite-callee training pairs, enhancing model precision and reliability. We further design a graph neural model that leverages augmented CFGs and relational graph convolutions for accurate target prediction. Evaluated against real-world binaries from GitHub and the Arch User Repository on x86_64 architecture, CupidCall achieves an F1 score of 95.2%, outperforming state-of-the-art ML-based approaches. These results highlight CupidCall's effectiveness in building precise inter-procedural CFGs and its potential to advance downstream binary analysis and security applications.

#59 An Overview of 7726 User Reports: Uncovering SMS Scams and Scammer Strategies

著者: Sharad Agarwal, Guillermo Suarez-Tangil, Marie Vasek

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.05276

要約:
Mobile network operators implement firewalls to stop illicit messages, but scammers find ways to evade detection. Previous work has looked into SMS texts that are blocked by these firewalls. However, there is little insight into SMS texts that bypass them and reach users. To this end, we collaborate with a major mobile network operator to receive 1.35m user reports submitted over four months. We find 89.16% of user reports comprise text messages, followed by reports of suspicious calls and URLs. Using our methodological framework, we identify 35.12% of the unique text messages reported by users as spam, while 40.27% are scam text messages. This is the first paper that investigates SMS reports submitted by users and differentiates between spam and scams. Our paper classifies the identified scam text messages into 12 scam types, of which the most popular is 'wrong number' scams. We explore the various infrastructure services that scammers abuse to conduct SMS scams, including mobile network operators and hosting infrastructure, and analyze the text of the scam messages to understand how scammers lure victims into providing them with their personal or financial details.

#60 CISAF: A Framework for Estimating the Security Posture of Academic and Research Cyberinfrastructure

著者: Qishen Liang, Jelena Mirkovic, Brian Kocoloski

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.00266

要約:
Academic and research cyberinfrastructures (AR-CIs) present unique security challenges due to their collaborative nature, heterogeneous components, and the lack of practical security assessment frameworks tailored to their needs. We propose Cyber Infrastructure Security Analysis Framework (CISAF) -- a simple, systematic, mission-centric approach to analyze the security posture of a CI and prioritize mitigation actions. CISAF guides administrators through a top-down process: (1) defining unacceptable losses, (2) identifying associated system hazards and critical assets, (3) analyzing possible attack paths that target these critical assets, and (4) analyzing security mechanisms that lie on these attack paths. By combining information about the CI architecture, mission, attack vectors, and security mechanisms, CISAF provides a clear overview of potential security risks and offers valuable information to prioritize mitigation actions.

#61 Odoo-based Subcontract Inter-site Access Control Mechanism for Construction Projects

著者: Huy Hung Ho, Nhan Le Thanh, Nam Nguyen Hong, Phuong-D Nguyen

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.05149

要約:
In the era of Construction 4.0, the industry is embracing a new paradigm of labor elasticity, driven by smart and flexible outsourcing and subcontracting strategies. The increased reliance on specialized subcontractors enables companies to scale labor dynamically based on project demands. This adaptable workforce model presents challenges in managing hierarchical integration and coordinating inter-site collaboration. Our design introduces a subsystem integrated into the Odoo ERP framework, employing a modular architecture to streamline labor management, task tracking, and approval workflows. The system adopts a three-pronged approach to ensure synchronized data exchange between general contractors and subcontractors, while maintaining both security and operational independence. The system features hybrid access control, third-party integration for cross-domain communication, and role-based mapping algorithm across sites. The system supports varying degrees of customization through a unified and consolidated attribute mapping center. This center leverages a tree-like index structure and Lagrange interpolation method to enhance the efficiency of role mapping. Demonstrations highlight practical application in outsourcing, integration, and scalability scenarios, confirming the system's robustness under high user volumes and in offline conditions. Experimental results further show improvements in database performance and workflow adaptability to support a scalable, enterprise-level solution that aligns with the evolving demands of smart construction management.

#62 Security-aware Semantic-driven ISAC via Paired Adversarial Residual Networks

著者: Yu Liu, Boxiang He, Fanggang Wang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.20835

要約:
This paper proposes a novel and flexible security-aware semantic-driven integrated sensing and communication (ISAC) framework, namely security semantic ISAC (SS-ISAC). Inspired by the positive impact of the adversarial attack, a pair of pluggable encryption and decryption modules is designed in the proposed SS-ISAC framework. The encryption module is installed after the semantic transmitter, adopting a trainable adversarial residual network (ARN) to create the adversarial attack. Correspondingly, the decryption module before the semantic receiver utilizes another trainable ARN to mitigate the adversarial attack and noise. These two modules can be flexibly assembled considering the system security demands, without drastically modifying the hardware infrastructure. To ensure the sensing and communication (SAC) performance while preventing the eavesdropping threat, the above ARNs are jointly optimized by minimizing a carefully designed loss function that relates to the adversarial attack power, SAC performance, as well as the privacy leakage risk. Simulation results validate the effectiveness of the proposed SS-ISAC framework in terms of both SAC and eavesdropping prevention performance.

#63 VeriLLM: A Lightweight Framework for Publicly Verifiable Decentralized Inference

著者: Ke Wang, Zishuo Zhao, Xinyuan Song, Bill Shi, Libin Xia, Chris Tong, Lynn Ai, Felix Qu, Eric Yang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.24257

要約:
Decentralized inference provides a scalable and resilient paradigm for serving large language models (LLMs), enabling distributed resource utilization and reducing reliance on centralized providers. However, in a permissionless environment without trusted nodes, ensuring the correctness of model outputs remains a core challenge. We introduce VeriLLM, a publicly verifiable protocol for decentralized LLM inference that achieves security under a one-honest-verifier assumption while maintaining practical efficiency. VeriLLM combines lightweight empirical rerunning with cryptographic commitments, allowing verifiers to validate results at approximately 1% of the underlying inference cost. To prevent verification bottlenecks, we design an isomorphic inference-verification architecture that multiplexes both inference and verification roles across the same GPU workers. This design (i) improves GPU utilization and overall throughput, (ii) enlarges the effective validator set, enhancing robustness and liveness, and (iii) enforces task indistinguishability to prevent node-specific optimizations or selective behavior. Through theoretical analysis and system-level evaluation, we show that VeriLLM achieves reliable public verifiability with minimal overhead, offering a practical foundation for trustworthy and scalable decentralized LLM inference.

#64 Verifiable Fine-Tuning for LLMs: Zero-Knowledge Training Proofs Bound to Data Provenance and Policy

著者: Hasan Akgul, Daniel Borg, Arta Berisha, Amina Rahimova, Andrej Novak, Mila Petrov

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.16830

要約:
Large language models are often adapted through parameter efficient fine tuning, but current release practices provide weak assurances about what data were used and how updates were computed. We present Verifiable Fine Tuning, a protocol and system that produces succinct zero knowledge proofs that a released model was obtained from a public initialization under a declared training program and an auditable dataset commitment. The approach combines five elements. First, commitments that bind data sources, preprocessing, licenses, and per epoch quota counters to a manifest. Second, a verifiable sampler that supports public replayable and private index hiding batch selection. Third, update circuits restricted to parameter efficient fine tuning that enforce AdamW style optimizer semantics and proof friendly approximations with explicit error budgets. Fourth, recursive aggregation that folds per step proofs into per epoch and end to end certificates with millisecond verification. Fifth, provenance binding and optional trusted execution property cards that attest code identity and constants. On English and bilingual instruction mixtures, the method maintains utility within tight budgets while achieving practical proof performance. Policy quotas are enforced with zero violations, and private sampling windows show no measurable index leakage. Federated experiments demonstrate that the system composes with probabilistic audits and bandwidth constraints. These results indicate that end to end verifiable fine tuning is feasible today for real parameter efficient pipelines, closing a critical trust gap for regulated and decentralized deployments.

#65 Network Intrusion Detection: Evolution from Conventional Approaches to LLM Collaboration and Emerging Risks

著者: Yaokai Feng, Kouichi Sakurai

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.23313

要約:
This survey systematizes the evolution of network intrusion detection systems (NIDS), from conventional methods such as signature-based and neural network (NN)-based approaches to recent integrations with large language models (LLMs). It clearly and concisely summarizes the current status, strengths, and limitations of conventional techniques, and explores the practical benefits of integrating LLMs into NIDS. Recent research on the application of LLMs to NIDS in diverse environments is reviewed, including conventional network infrastructures, autonomous vehicle environments and IoT environments. From this survey, readers will learn that: 1) the earliest methods, signature-based IDSs, continue to make significant contributions to modern systems, despite their well-known weaknesses; 2) NN-based detection, although considered promising and under development for more than two decades, and despite numerous related approaches, still faces significant challenges in practical deployment; 3) LLMs are useful for NIDS in many cases, and a number of related approaches have been proposed; however, they still face significant challenges in practical applications. Moreover, they can even be exploited as offensive tools, such as for generating malware, crafting phishing messages, or launching cyberattacks. Recently, several studies have been proposed to address these challenges, which are also reviewed in this survey; and 4) strategies for constructing domain-specific LLMs have been proposed and are outlined in this survey, as it is nearly impossible to train a NIDS-specific LLM from scratch.

#66 Secure Retrieval-Augmented Generation against Poisoning Attacks

backdoor

著者: Zirui Cheng, Jikai Sun, Anjun Gao, Yueyang Quan, Zhuqing Liu, Xiaohua Hu, Minghong Fang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.25025

要約:
Large language models (LLMs) have transformed natural language processing (NLP), enabling applications from content generation to decision support. Retrieval-Augmented Generation (RAG) improves LLMs by incorporating external knowledge but also introduces security risks, particularly from data poisoning, where the attacker injects poisoned texts into the knowledge database to manipulate system outputs. While various defenses have been proposed, they often struggle against advanced attacks. To address this, we introduce RAGuard, a detection framework designed to identify poisoned texts. RAGuard first expands the retrieval scope to increase the proportion of clean texts, reducing the likelihood of retrieving poisoned content. It then applies chunk-wise perplexity filtering to detect abnormal variations and text similarity filtering to flag highly similar texts. This non-parametric approach enhances RAG security, and experiments on large-scale datasets demonstrate its effectiveness in detecting and mitigating poisoning attacks, including strong adaptive attacks.

#67 ZK-SenseLM: Verifiable Large-Model Wireless Sensing with Selective Abstention and Zero-Knowledge Attestation

著者: Hasan Akgul, Mari Eplik, Javier Rojas, Aina Binti Abdullah, Pieter van der Merwe

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.25677

要約:
ZK-SenseLM is a secure and auditable wireless sensing framework that pairs a large-model encoder for Wi-Fi channel state information (and optionally mmWave radar or RFID) with a policy-grounded decision layer and end-to-end zero-knowledge proofs of inference. The encoder uses masked spectral pretraining with phase-consistency regularization, plus a light cross-modal alignment that ties RF features to compact, human-interpretable policy tokens. To reduce unsafe actions under distribution shift, we add a calibrated selective-abstention head; the chosen risk-coverage operating point is registered and bound into the proof. We implement a four-stage proving pipeline: (C1) feature sanity and commitment, (C2) threshold and version binding, (C3) time-window binding, and (C4) PLONK-style proofs that the quantized network, given the committed window, produced the logged action and confidence. Micro-batched proving amortizes cost across adjacent windows, and a gateway option offloads proofs from low-power devices. The system integrates with differentially private federated learning and on-device personalization without weakening verifiability: model hashes and the registered threshold are part of each public statement. Across activity, presence or intrusion, respiratory proxy, and RF fingerprinting tasks, ZK-SenseLM improves macro-F1 and calibration, yields favorable coverage-risk curves under perturbations, and rejects tamper and replay with compact proofs and fast verification.

#68 Model Inversion Attacks Meet Cryptographic Fuzzy Extractors

privacy

著者: Mallika Prabhakar, Louise Xu, Prateek Saxena

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.25687

要約:
Model inversion attacks pose an open challenge to privacy-sensitive applications that use machine learning (ML) models. For example, face authentication systems use modern ML models to compute embedding vectors from face images of the enrolled users and store them. If leaked, inversion attacks can accurately reconstruct user faces from the leaked vectors. There is no systematic characterization of properties needed in an ideal defense against model inversion, even for the canonical example application of a face authentication system susceptible to data breaches, despite a decade of best-effort solutions. In this paper, we formalize the desired properties of a provably strong defense against model inversion and connect it, for the first time, to the cryptographic concept of fuzzy extractors. We further show that existing fuzzy extractors are insecure for use in ML-based face authentication. We do so through a new model inversion attack called PIPE, which achieves a success rate of over 89% in most cases against prior schemes. We then propose L2FE-Hash, the first candidate fuzzy extractor which supports standard Euclidean distance comparators as needed in many ML-based applications, including face authentication. We formally characterize its computational security guarantees, even in the extreme threat model of full breach of stored secrets, and empirically show its usable accuracy in face authentication for practical face distributions. It offers attack-agnostic security without requiring any re-training of the ML model it protects. Empirically, it nullifies both prior state-of-the-art inversion attacks as well as our new PIPE attack.

#69 The SDSC Satellite Reverse Proxy Service for Launching Secure Jupyter Notebooks on High-Performance Computing Systems

著者: Mary P Thomas, Martin Kandes, James McDougall, Dmitry Mishin, Scott Sakai, Subhashini Sivagnanam, Mahidhar Tatineni

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.02116

要約:
Using Jupyter notebooks in an HPC environment exposes a system and its users to several security risks. The Satellite Proxy Service, developed at SDSC, addresses many of these security concerns by providing Jupyter Notebook servers with a token-authenticated HTTPS reverse proxy through which end users can access their notebooks securely with a single URL copied and pasted into their web browser.

#70 Auditing M-LLMs for Privacy Risks: A Synthetic Benchmark and Evaluation Framework

privacy

著者: Junhao Li, Jiahao Chen, Zhou Feng, Chunyi Zhou

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.03248

要約:
Recent advances in multi-modal Large Language Models (M-LLMs) have demonstrated a powerful ability to synthesize implicit information from disparate sources, including images and text. These resourceful data from social media also introduce a significant and underexplored privacy risk: the inference of sensitive personal attributes from seemingly daily media content. However, the lack of benchmarks and comprehensive evaluations of state-of-the-art M-LLM capabilities hinders the research of private attribute profiling on social media. Accordingly, we propose (1) PRISM, the first multi-modal, multi-dimensional and fine-grained synthesized dataset incorporating a comprehensive privacy landscape and dynamic user history; (2) an Efficient evaluation framework that measures the cross-modal privacy inference capabilities of advanced M-LLM. Specifically, PRISM is a large-scale synthetic benchmark designed to evaluate cross-modal privacy risks. Its key feature is 12 sensitive attribute labels across a diverse set of multi-modal profiles, which enables targeted privacy analysis. These profiles are generated via a sophisticated LLM agentic workflow, governed by a prior distribution to ensure they realistically mimic social media users. Additionally, we propose a Multi-Agent Inference Framework that leverages a pipeline of specialized LLMs to enhance evaluation capabilities. We evaluate the inference capabilities of six leading M-LLMs (Qwen, Gemini, GPT-4o, GLM, Doubao, and Grok) on PRISM. The comparison with human performance reveals that these MLLMs significantly outperform in accuracy and efficiency, highlighting the threat of potential privacy risks and the urgent need for robust defenses. Dataset available at https://huggingface.co/datasets/xaddh/multimodal-privacy

#71 OTS-PC: OTS-based Payment Channels for the Lightning Network

著者: Sergio Demian Lerner, Ariel Futoransky

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.04021

要約:
We present a new type of bidirectional payment channel based on One-Time Signatures on state sequence numbers. This new construction is simpler than the Poon-Dryja construction, but provides a number of benefits such as $O(1)$ storage per channel, minimal information leakage, and compatibility with Lightning Network routing.

#72 Privacy in Speech Technology

privacy

著者: Tom B\"ackstr\"om

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2305.05227

要約:
Speech technology for communication, accessing information, and services has rapidly improved in quality. It is convenient and appealing because speech is the primary mode of communication for humans. Such technology, however, also presents proven threats to privacy. Speech is a tool for communication and it will thus inherently contain private information. Importantly, it however also contains a wealth of side information, such as information related to health, emotions, affiliations, and relationships, all of which are private. Exposing such private information can lead to serious threats such as price gouging, harassment, extortion, and stalking. This paper is a tutorial on privacy issues related to speech technology, modeling their threats, approaches for protecting users' privacy, measuring the performance of privacy-protecting methods, perception of privacy as well as societal and legal consequences. In addition to a tutorial overview, it also presents lines for further development where improvements are most urgently needed.

#73 JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models

著者: Haibo Jin, Leyang Hu, Xinnuo Li, Peiyan Zhang, Chonghan Chen, Jun Zhuang, Haohan Wang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2407.01599

要約:
The rapid evolution of artificial intelligence (AI) through developments in Large Language Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements across various technological domains. While these models enhance capabilities in natural language processing and visual interactive tasks, their growing adoption raises critical concerns regarding security and ethical alignment. This survey provides an extensive review of the emerging field of jailbreaking--deliberately circumventing the ethical and operational boundaries of LLMs and VLMs--and the consequent development of defense mechanisms. Our study categorizes jailbreaks into seven distinct types and elaborates on defense strategies that address these vulnerabilities. Through this comprehensive examination, we identify research gaps and propose directions for future studies to enhance the security frameworks of LLMs and VLMs. Our findings underscore the necessity for a unified perspective that integrates both jailbreak strategies and defensive solutions to foster a robust, secure, and reliable environment for the next generation of language models. More details can be found on our website: https://chonghan-chen.com/llm-jailbreak-zoo-survey/.

#74 TIMESAFE: Timing Interruption Monitoring and Security Assessment for Fronthaul Environments

著者: Joshua Groen, Simone Di Valerio, Imtiaz Karim, Davide Villa, Yiewi Zhang, Leonardo Bonati, Michele Polese, Salvatore D'Oro, Tommaso Melodia, Elisa Bertino, Francesca Cuomo, Kaushik Chowdhury

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2412.13049

要約:
5G and beyond cellular systems embrace the disaggregation of Radio Access Network (RAN) components, exemplified by the evolution of the fronthaul (FH) connection between cellular baseband and radio unit equipment. Crucially, synchronization over the FH is pivotal for reliable 5G services. In recent years, there has been a push to move these links to an Ethernet-based packet network topology, leveraging existing standards and ongoing research for Time-Sensitive Networking (TSN). However, TSN standards, such as Precision Time Protocol (PTP), focus on performance with little to no concern for security. This increases the exposure of the open FH to security risks. Attacks targeting synchronization mechanisms pose significant threats, potentially disrupting 5G networks and impairing connectivity. In this paper, we demonstrate the impact of successful spoofing and replay attacks against PTP synchronization. We show how a spoofing attack is able to cause a production-ready O-RAN and 5G-compliant private cellular base station to catastrophically fail within 2 seconds of the attack, necessitating manual intervention to restore full network operations. To counter this, we design a Machine Learning (ML)-based monitoring solution capable of detecting various malicious attacks with over 97.5% accuracy.

#75 HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor

著者: Zihui Wu, Haichang Gao, Jiacheng Luo, Zhaoxiang Liu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.13677

要約:
Large Language Models (LLMs) commonly rely on explicit refusal prefixes for safety, making them vulnerable to prefix injection attacks. We introduce HumorReject, a novel data-driven approach that reimagines LLM safety by decoupling it from refusal prefixes through humor as an indirect refusal strategy. Rather than explicitly rejecting harmful instructions, HumorReject responds with contextually appropriate humor that naturally defuses potentially dangerous requests. Our approach effectively addresses common "over-defense" issues while demonstrating superior robustness against various attack vectors. Our findings suggest that improvements in training data design can be as important as the alignment algorithm itself in achieving effective LLM safety. The code and dataset are available at https://github.com/wooozihui/HumorReject.

#76 Mitigating Sexual Content Generation via Embedding Distortion in Text-conditioned Diffusion Models

diffusion

著者: Jaesin Ahn, Heechul Jung

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.18877

要約:
Diffusion models show remarkable image generation performance following text prompts, but risk generating sexual contents. Existing approaches, such as prompt filtering, concept removal, and even sexual contents mitigation methods, struggle to defend against adversarial attacks while maintaining benign image quality. In this paper, we propose a novel approach called Distorting Embedding Space (DES), a text encoder-based defense mechanism that effectively tackles these issues through innovative embedding space control. DES transforms unsafe embeddings, extracted from a text encoder using unsafe prompts, toward carefully calculated safe embedding regions to prevent unsafe contents generation, while reproducing the original safe embeddings. DES also neutralizes the ``nudity'' embedding, by aligning it with neutral embedding to enhance robustness against adversarial attacks. As a result, extensive experiments on explicit content mitigation and adaptive attack defense show that DES achieves state-of-the-art (SOTA) defense, with attack success rate (ASR) of 9.47% on FLUX.1, a recent popular model, and 0.52% on the widely adopted Stable Diffusion v1.5. These correspond to ASR reductions of 76.5% and 63.9% compared to previous SOTA methods, EraseAnything and AdvUnlearn, respectively. Furthermore, DES maintains benign image quality, achieving Frechet Inception Distance and CLIP score comparable to those of the original FLUX.1 and Stable Diffusion v1.5.

#77 TrustChain: A Blockchain Framework for Auditing and Verifying Aggregators in Decentralized Federated Learning

著者: Ehsan Hallaji, Roozbeh Razavi-Far, Mehrdad Saif

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2502.16406

要約:
The server-less nature of Decentralized Federated Learning (DFL) requires allocating the aggregation role to specific participants in each federated round. Current DFL architectures ensure the trustworthiness of the aggregator node upon selection. However, most of these studies overlook the possibility that the aggregating node may turn rogue and act maliciously after being nominated. To address this problem, this paper proposes a DFL structure, called TrustChain, that scores the aggregators before selection based on their past behavior and additionally audits them after the aggregation. To do this, the statistical independence between the client updates and the aggregated model is continuously monitored using the Hilbert-Schmidt Independence Criterion (HSIC). The proposed method relies on several principles, including blockchain, anomaly detection, and concept drift analysis. The designed structure is evaluated on several federated datasets and attack scenarios with different numbers of Byzantine nodes.

#78 ARGO-SLSA: Software Supply Chain Security in Argo Workflows

著者: Mohomed Thariq, Indrajith Ekanayake

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.20079

要約:
Distributed systems widely adopt microservice architecture to handle growing complexity and scale. This approach breaks applications into independent, loosely coupled services. Kubernetes has become the de facto standard for managing microservices, and automating complex, multi-step workflows is a common requirement in Kubernetes. Argo Workflows is a Kubernetes-native engine for managing these workflows in an automated fashion. These workflows generate artifacts such as executables, logs, container images, and packages, which often require proper management through software supply chain security. However, Argo Workflows does not include built-in functionality for frameworks like Supply-chain Levels for Software Artifacts (SLSA), which is essential for ensuring artifact integrity, traceability, and security. This gap compels practitioners to rely on external tools to meet software supply chain security standards. In response, this paper proposes a Kubernetes-native controller built on top of existing open-source Argo Workflows to enhance artifact security. By generating cryptographic signing and provenance attestations, the controller enables Argo Workflows to comply with SLSA standards. We demonstrate that implementations can provide such cryptographic signing and provenance attestations for artifacts produced by the controller, allowing software artifacts built with Argo Workflows to adhere to SLSA requirements. The proposed validation model evaluates the proof of concept of the controller, including its ability to reconcile workflows, detect pods associated with workflow nodes, operate without disrupting existing operations, enforce integrity, and monitor software artifacts.

#79 Private Statistical Estimation via Truncation

privacy

著者: Manolis Zampetakis, Felix Zhou

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.12541

要約:
We introduce a novel framework for differentially private (DP) statistical estimation via data truncation, addressing a key challenge in DP estimation when the data support is unbounded. Traditional approaches rely on problem-specific sensitivity analysis, limiting their applicability. By leveraging techniques from truncated statistics, we develop computationally efficient DP estimators for exponential family distributions, including Gaussian mean and covariance estimation, achieving near-optimal sample complexity. Previous works on exponential families only consider bounded or one-dimensional families. Our approach mitigates sensitivity through truncation while carefully correcting for the introduced bias using maximum likelihood estimation and DP stochastic gradient descent. Along the way, we establish improved uniform convergence guarantees for the log-likelihood function of exponential families, which may be of independent interest. Our results provide a general blueprint for DP algorithm design via truncated statistics.

#80 Differentially Private Distribution Release of Gaussian Mixture Models via KL-Divergence Minimization

privacy

著者: Hang Liu, Anna Scaglione, Sean Peisert

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.03467

要約:
Gaussian Mixture Models (GMMs) are widely used statistical models for representing multi-modal data distributions, with numerous applications in data mining, pattern recognition, data simulation, and machine learning. However, recent research has shown that releasing GMM parameters poses significant privacy risks, potentially exposing sensitive information about the underlying data. In this paper, we address the challenge of releasing GMM parameters while ensuring differential privacy (DP) guarantees. Specifically, we focus on the privacy protection of mixture weights, component means, and covariance matrices. We propose to use Kullback-Leibler (KL) divergence as a utility metric to assess the accuracy of the released GMM, as it captures the joint impact of noise perturbation on all the model parameters. To achieve privacy, we introduce a DP mechanism that adds carefully calibrated random perturbations to the GMM parameters. Through theoretical analysis, we quantify the effects of privacy budget allocation and perturbation statistics on the DP guarantee, and derive a tractable expression for evaluating KL divergence. We formulate and solve an optimization problem to minimize the KL divergence between the released and original models, subject to a given $(\epsilon, \delta)$-DP constraint. Extensive experiments on both synthetic and real-world datasets demonstrate that our approach achieves strong privacy guarantees while maintaining high utility.

#81 Prompt Injection Vulnerability of Consensus Generating Applications in Digital Democracy

著者: Jairo Gudi\~no-Rosero, Cl\'ement Contet, Umberto Grandi, C\'esar A. Hidalgo

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.04281

要約:
Large Language Models (LLMs) are gaining traction as a method to generate consensus statements and aggregate preferences in digital democracy experiments. Yet, LLMs could introduce critical vulnerabilities in these systems. Here, we explore the vulnerability of some off-the-shelf LLMs to prompt-injection attacks in consensus generating systems using a four-dimensional taxonomy of attacks. In LLaMA 3.1 8B and Chat GPT 4.1 Nano, we find LLMs to be more vulnerable to attacks using disagreeable prompts and when targeting situations with unclear consensus. We also find evidence of more effective manipulation when using explicit imperatives and rational-sounding arguments compared to emotional language or fabricated statistics. To mitigate these vulnerabilities, we apply Direct Preference Optimization (DPO), an alignment method that fine-tunes LLMs to prefer unperturbed consensus statements. While DPO and additional layered defenses significantly improve robustness, it still offers limited protection against attacks targeting ambiguous consensus. These results advance our understanding of the vulnerability and robustness of consensus generating LLMs in digital democracy applications.

#82 Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models

privacy

著者: Fuyao Zhang, Xinyu Yan, Tiantong Wu, Wenjie Li, Tianxiang Chen, Yang Cao, Ran Yan, Longtao Huang, Wei Yang Bryan Lim, Qiang Yang

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.08875

要約:
Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative training without raw data sharing, they critically lack built-in mechanisms for regulatory compliance like GDPR's right to be forgotten. Integrating private data heightens concerns over data quality and long-term governance, yet existing distributed training frameworks offer no principled way to selectively remove specific client contributions post-training. Due to distributed data silos, stringent privacy constraints, and the intricacies of interdependent model aggregation, federated LLM unlearning is significantly more complex than centralized LLM unlearning. To address this gap, we introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data during federated LLM training, enhancing trustworthiness and regulatory compliance. By unifying FL and unlearning as a dual optimization objective, we incorporate 6 FL and 5 unlearning algorithms for comprehensive evaluation and comparative analysis, establishing a robust pipeline for federated LLM unlearning. Extensive experiments demonstrate that Oblivionis outperforms local training, achieving a robust balance between forgetting efficacy and model utility, with cross-algorithm comparisons providing clear directions for future LLM development.

#83 On Cryptography and Distribution Verification, with Applications to Quantum Advantage

著者: Bruno Cavalar, Eli Goldin, Matthew Gray, Taiga Hiroka, Tomoyuki Morimae

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.05028

要約:
One of the most fundamental problems in the field of hypothesis testing is the identity testing problem: whether samples from some unknown distribution $\mathcal{G}$ are actually from some explicit distribution $\mathcal{D}$. It is known that when the distribution $\mathcal{D}$ has support $[N]$, the optimal sample complexity for the identity testing problem is roughly $O(\sqrt{N})$. However, many distributions of interest, including those which can be sampled efficiently, have exponential support size, and therefore the optimal identity tester also requires exponential samples. In this paper, we bypass this lower bound by considering restricted settings. The above $O(\sqrt{N})$ sample complexity identity tester is constructed so that it is not fooled by any (even inefficiently-sampled) distributions. However, in most applications, the distributions under consideration are efficiently samplable, and therefore it is enough to consider only identity testers that are not fooled by efficiently-sampled distributions. In this setting we can hope to construct efficient identity testers. We investigate relations between efficient verification of classical/quantum distributions with classical/quantum cryptography, showing the following results: (1). Classically efficiently samplable distributions are verifiable if and only if one-way functions do not exist. (2). Quantumly efficiently samplable distributions are verifiable by $\mathbf{P}^\mathbf{PP}$ with a polynomial number of samples. (3). Sampling-based quantum advantage can be verified quantumly (with a polynomial number of samples) if one-way puzzles do not exist. (4). If QEFID pairs exist, then some quantumly efficiently samplable distributions are not verifiable.

#84 Descriptor-Based Object-Aware Memory Systems: A Comprehensive Review

著者: Dong Tong

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.27070

要約:
The security and efficiency of modern computing systems are fundamentally undermined by the absence of a native architectural mechanism to propagate high-level program semantics, such as object identity, bounds, and lifetime, across the hardware/software interface. This paper presents a comprehensive survey of the architectural paradigm designed to bridge this semantic gap: descriptor-based, object-aware memory systems. By elevating the descriptor to a first-class architectural abstraction, this paradigm enables hardware to dynamically acquire and enforce the rich semantics of software-defined objects. This survey systematically charts the evolution and current landscape of this approach. We establish the foundational concepts of memory objects and descriptors and introduce a novel taxonomy of descriptor addressing modes, providing a structured framework for analyzing and comparing diverse implementations. Our unified analysis reveals how this paradigm holistically addresses the intertwined challenges of memory protection, management, and processing. As a culminating case study, we re-examine the CentroID model, demonstrating how its hybrid tagged-pointer encoding and descriptor processing mechanisms embody the path toward practical and efficient object-aware designs. Finally, we outline how the explicit cross-layer communication of object semantics provides a foundational research direction for next-generation cache hierarchies, unified virtual memory, and even 128-bit architectures.

#85 Rethinking Robust Adversarial Concept Erasure in Diffusion Models

diffusion

著者: Qinghong Yin, Yu Tian, Heming Yang, Xiang Chen, Xianlin Zhang, Xueming Li, Yue Zhan

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.27285

要約:
Concept erasure aims to selectively unlearning undesirable content in diffusion models (DMs) to reduce the risk of sensitive content generation. As a novel paradigm in concept erasure, most existing methods employ adversarial training to identify and suppress target concepts, thus reducing the likelihood of sensitive outputs. However, these methods often neglect the specificity of adversarial training in DMs, resulting in only partial mitigation. In this work, we investigate and quantify this specificity from the perspective of concept space, i.e., can adversarial samples truly fit the target concept space? We observe that existing methods neglect the role of conceptual semantics when generating adversarial samples, resulting in ineffective fitting of concept spaces. This oversight leads to the following issues: 1) when there are few adversarial samples, they fail to comprehensively cover the object concept; 2) conversely, they will disrupt other target concept spaces. Motivated by the analysis of these findings, we introduce S-GRACE (Semantics-Guided Robust Adversarial Concept Erasure), which grace leveraging semantic guidance within the concept space to generate adversarial samples and perform erasure training. Experiments conducted with seven state-of-the-art methods and three adversarial prompt generation strategies across various DM unlearning scenarios demonstrate that S-GRACE significantly improves erasure performance 26%, better preserves non-target concepts, and reduces training time by 90%. Our code is available at https://github.com/Qhong-522/S-GRACE.

#86 AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models

著者: Aashray Reddy, Andrew Zagula, Nicholas Saban, Kevin Zhu

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.02376

要約:
Large Language Models (LLMs) remain vulnerable to jailbreaking attacks where adversarial prompts elicit harmful outputs, yet most evaluations focus on single-turn interactions while real-world attacks unfold through adaptive multi-turn conversations. We present AutoAdv, a training-free framework for automated multi-turn jailbreaking that achieves up to 95% attack success rate on Llama-3.1-8B within six turns a 24 percent improvement over single turn baselines. AutoAdv uniquely combines three adaptive mechanisms: a pattern manager that learns from successful attacks to enhance future prompts, a temperature manager that dynamically adjusts sampling parameters based on failure modes, and a two-phase rewriting strategy that disguises harmful requests then iteratively refines them. Extensive evaluation across commercial and open-source models (GPT-4o-mini, Qwen3-235B, Mistral-7B) reveals persistent vulnerabilities in current safety mechanisms, with multi-turn attacks consistently outperforming single-turn approaches. These findings demonstrate that alignment strategies optimized for single-turn interactions fail to maintain robustness across extended conversations, highlighting an urgent need for multi-turn-aware defenses.

#87 Adaptive and Robust Data Poisoning Detection and Sanitization in Wearable IoT Systems using Large Language Models

backdoor

著者: W. K. M Mithsara, Ning Yang, Ahmed Imteaj, Hussein Zangoti, Abdur R. Shahid

公開日: Tue, 11 Nov 2025 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.02894

要約:
The widespread integration of wearable sensing devices in Internet of Things (IoT) ecosystems, particularly in healthcare, smart homes, and industrial applications, has required robust human activity recognition (HAR) techniques to improve functionality and user experience. Although machine learning models have advanced HAR, they are increasingly susceptible to data poisoning attacks that compromise the data integrity and reliability of these systems. Conventional approaches to defending against such attacks often require extensive task-specific training with large, labeled datasets, which limits adaptability in dynamic IoT environments. This work proposes a novel framework that uses large language models (LLMs) to perform poisoning detection and sanitization in HAR systems, utilizing zero-shot, one-shot, and few-shot learning paradigms. Our approach incorporates \textit{role play} prompting, whereby the LLM assumes the role of expert to contextualize and evaluate sensor anomalies, and \textit{think step-by-step} reasoning, guiding the LLM to infer poisoning indicators in the raw sensor data and plausible clean alternatives. These strategies minimize reliance on curation of extensive datasets and enable robust, adaptable defense mechanisms in real-time. We perform an extensive evaluation of the framework, quantifying detection accuracy, sanitization quality, latency, and communication cost, thus demonstrating the practicality and effectiveness of LLMs in improving the security and reliability of wearable IoT systems.

cs.CR updates on arXiv.org

📋 論文タイトル一覧