cs.CR updates on arXiv.org

更新日時: Mon, 24 Nov 2025 05:00:36 +0000
論文数: 31件
0件選択中

📋 論文タイトル一覧

1. AutoBackdoor: Automating Backdoor Attacks via LLM Agents backdooragent
2. Password Strength Analysis Through Social Network Data Exposure: A Combined Approach Relying on Data Reconstruction and Generative Models privacy
3. RampoNN: A Reachability-Guided System Falsification for Efficient Cyber-Kinetic Vulnerability Detection
4. Membership Inference Attacks Beyond Overfitting privacy
5. TICAL: Trusted and Integrity-protected Compilation of AppLications
6. AutoGraphAD: A novel approach using Variational Graph Autoencoders for anomalous network flow detection
7. Constant-Size Cryptographic Evidence Structures for Regulated AI Workflows
8. Steering in the Shadows: Causal Amplification for Activation Space Attacks in Large Language Models
9. Persistent BitTorrent Trackers
10. ThreadFuzzer: Fuzzing Framework for Thread Protocol
11. A Patient-Centric Blockchain Framework for Secure Electronic Health Record Management: Decoupling Data Storage from Access Control
12. A Robust Federated Learning Approach for Combating Attacks Against IoT Systems Under non-IID Challenges
13. MultiPriv: Benchmarking Individual-Level Privacy Reasoning in Vision-Language Models privacy
14. Differentially private testing for relevant dependencies in high dimensions privacy
15. Mitigating the Impact of Malware Evolution on API Sequence-based Windows Malware Detector
16. HidePrint: Protecting Device Anonymity by Obscuring Radio Fingerprints
17. Reason2Attack: Jailbreaking Text-to-Image Models via LLM Reasoning
18. Compact and Selective Disclosure for Verifiable Credentials
19. A TEE-based Approach for Security and Privacy in Decision Support privacy
20. The Coding Limits of Robust Watermarking for Generative Models intellectual property
21. GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments? agent
22. T2I-RiskyPrompt: A Benchmark for Safety Evaluation, Attack, and Defense on Text-to-Image Model
23. Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models
24. AudAgent: Automated Auditing of Privacy Policy Compliance in AI Agents privacyagent
25. SALT: Steering Activations towards Leakage-free Thinking in Chain of Thought
26. SoK: Security Evaluation of Wi-Fi CSI Biometrics: Attacks, Metrics, and Open Challenges
27. A Detailed Comparative Analysis of Blockchain Consensus Mechanisms
28. LLM-Agent-UMF: LLM-based Agent Unified Modeling Framework for Seamless Design of Multi Active/Passive Core-Agent Architectures agent
29. Defending the Edge: Representative-Attention Defense against Backdoor Attacks in Federated Learning backdoor
30. SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense
31. Adaptive and Robust Data Poisoning Detection and Sanitization in Wearable IoT Systems using Large Language Models backdoor
📄 論文詳細
backdooragent
著者: Yige Li, Zhe Li, Wei Zhao, Nay Myat Min, Hanxun Huang, Xingjun Ma, Jun Sun
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Backdoor attacks pose a serious threat to the secure deployment of large language models (LLMs), enabling adversaries to implant hidden behaviors triggered by specific inputs. However, existing methods often rely on manually crafted triggers and static data pipelines, which are rigid, labor-intensive, and inadequate for systematically evaluating modern defense robustness. As AI agents become increasingly capable, there is a growing need for more rigorous, diverse, and scalable \textit{red-teaming frameworks} that can realistically simulate backdoor threats and assess model resilience under adversarial conditions. In this work, we introduce \textsc{AutoBackdoor}, a general framework for automating backdoor injection, encompassing trigger generation, poisoned data construction, and model fine-tuning via an autonomous agent-driven pipeline. Unlike prior approaches, AutoBackdoor uses a powerful language model agent to generate semantically coherent, context-aware trigger phrases, enabling scalable poisoning across arbitrary topics with minimal human effort. We evaluate AutoBackdoor under three realistic threat scenarios, including \textit{Bias Recommendation}, \textit{Hallucination Injection}, and \textit{Peer Review Manipulation}, to simulate a broad range of attacks. Experiments on both open-source and commercial models, including LLaMA-3, Mistral, Qwen, and GPT-4o, demonstrate that our method achieves over 90\% attack success with only a small number of poisoned samples. More importantly, we find that existing defenses often fail to mitigate these attacks, underscoring the need for more rigorous and adaptive evaluation techniques against agent-driven threats as explored in this work. All code, datasets, and experimental configurations will be merged into our primary repository at https://github.com/bboylyg/BackdoorLLM.
privacy
著者: Maurizio Atzori, Eleonora Cal\`o, Loredana Caruccio, Stefano Cirillo, Giuseppe Polese, Giandomenico Solimando
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Although passwords remain the primary defense against unauthorized access, users often tend to use passwords that are easy to remember. This behavior significantly increases security risks, also due to the fact that traditional password strength evaluation methods are often inadequate. In this discussion paper, we present SODA ADVANCE, a data reconstruction tool also designed to enhance evaluation processes related to the password strength. In particular, SODA ADVANCE integrates a specialized module aimed at evaluating password strength by leveraging publicly available data from multiple sources, including social media platforms. Moreover, we investigate the capabilities and risks associated with emerging Large Language Models (LLMs) in evaluating and generating passwords, respectively. Experimental assessments conducted with 100 real users demonstrate that LLMs can generate strong and personalized passwords possibly defined according to user profiles. Additionally, LLMs were shown to be effective in evaluating passwords, especially when they can take into account user profile data.
著者: Kohei Tsujio, Mohammad Abdullah Al Faruque, Yasser Shoukry
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Detecting kinetic vulnerabilities in Cyber-Physical Systems (CPS), vulnerabilities in control code that can precipitate hazardous physical consequences, is a critical challenge. This task is complicated by the need to analyze the intricate coupling between complex software behavior and the system's physical dynamics. Furthermore, the periodic execution of control code in CPS applications creates a combinatorial explosion of execution paths that must be analyzed over time, far exceeding the scope of traditional single-run code analysis. This paper introduces RampoNN, a novel framework that systematically identifies kinetic vulnerabilities given the control code, a physical system model, and a Signal Temporal Logic (STL) specification of safe behavior. RampoNN first analyzes the control code to map the control signals that can be generated under various execution branches. It then employs a neural network to abstract the physical system's behavior. To overcome the poor scaling and loose over-approximations of standard neural network reachability, RampoNN uniquely utilizes Deep Bernstein neural networks, which are equipped with customized reachability algorithms that yield orders of magnitude tighter bounds. This high-precision reachability analysis allows RampoNN to rapidly prune large sets of guaranteed-safe behaviors and rank the remaining traces by their potential to violate the specification. The results of this analysis are then used to effectively guide a falsification engine, focusing its search on the most promising system behaviors to find actual vulnerabilities. We evaluated our approach on a PLC-controlled water tank system and a switched PID controller for an automotive engine. The results demonstrate that RampoNN leads to acceleration of the process of finding kinetic vulnerabilities by up to 98.27% and superior scalability compared to other state-of-the-art methods.
privacy
著者: Mona Khalil, Alberto Blanco-Justicia, Najeeb Jebreel, Josep Domingo-Ferrer
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Membership inference attacks (MIAs) against machine learning (ML) models aim to determine whether a given data point was part of the model training data. These attacks may pose significant privacy risks to individuals whose sensitive data were used for training, which motivates the use of defenses such as differential privacy, often at the cost of high accuracy losses. MIAs exploit the differences in the behavior of a model when making predictions on samples it has seen during training (members) versus those it has not seen (non-members). Several studies have pointed out that model overfitting is the major factor contributing to these differences in behavior and, consequently, to the success of MIAs. However, the literature also shows that even non-overfitted ML models can leak information about a small subset of their training data. In this paper, we investigate the root causes of membership inference vulnerabilities beyond traditional overfitting concerns and suggest targeted defenses. We empirically analyze the characteristics of the training data samples vulnerable to MIAs in models that are not overfitted (and hence able to generalize). Our findings reveal that these samples are often outliers within their classes (e.g., noisy or hard to classify). We then propose potential defensive strategies to protect these vulnerable samples and enhance the privacy-preserving capabilities of ML models. Our code is available at https://github.com/najeebjebreel/mia_analysis.
著者: Robert Krahn, Nikson Kanti Paul, Franz Gregor, Do Le Quoc, Andrey Brito, Andr\'e Martin, Christof Fetzer
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
During the past few years, we have witnessed various efforts to provide confidentiality and integrity for applications running in untrusted environments such as public clouds. In most of these approaches, hardware extensions such as Intel SGX, TDX, AMD SEV, etc., are leveraged to provide encryption and integrity protection on process or VM level. Although all of these approaches increase the trust in the application at runtime, an often overlooked aspect is the integrity and confidentiality protection at build time, which is equally important as maliciously injected code during compilation can compromise the entire application and system.In this paper, we present Tical, a practical framework for trusted compilation that provides integrity protection and confidentiality in build pipelines from source code to the final executable. Our approach harnesses TEEs as runtime protection but enriches TEEs with file system shielding and an immutable audit log with version history to provide accountability. This way, we can ensure that the compiler chain can only access trusted files and intermediate output, such as object files produced by trusted processes. Our evaluation using micro- and macro-benchmarks shows that Tical can protect the confidentiality and integrity of whole CI/CD pipelines with an acceptable performance overhead.
著者: Georgios Anyfantis, Pere Barlet-Ros
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Network Intrusion Detection Systems (NIDS) are essential tools for detecting network attacks and intrusions. While extensive research has explored the use of supervised Machine Learning for attack detection and characterisation, these methods require accurately labelled datasets, which are very costly to obtain. Moreover, existing public datasets have limited and/or outdated attacks, and many of them suffer from mislabelled data. To reduce the reliance on labelled data, we propose AutoGraphAD, a novel unsupervised anomaly detection approach based on a Heterogeneous Variational Graph Autoencoder. AutoGraphAD operates on heterogeneous graphs, made from connection and IP nodes that capture network activity within a time window. The model is trained using unsupervised and contrastive learning, without relying on any labelled data. The reconstruction, structural loss, and KL divergence are then weighted and combined in an anomaly score that is then used for anomaly detection. Overall, AutoGraphAD yields the same, and in some cases better, results than previous unsupervised approaches, such as Anomal-E, but without requiring costly downstream anomaly detectors. As a result, AutoGraphAD achieves around 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference, which represents a significant advantage for operational deployment.
著者: Leo Kao
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
This paper introduces constant-size cryptographic evidence structures, a general abstraction for representing verifiable audit evidence for AI workflows in regulated environments. Each evidence item is a fixed-size tuple of cryptographic fields, designed to (i) provide strong binding to workflow events and configurations, (ii) support constant-size storage and uniform verification cost per event, and (iii) compose cleanly with hash-chain and Merkle-based audit constructions. We formalize a simple model of regulated AI workflows, define syntax and algorithms for evidence structures, and articulate security goals such as audit integrity and non-equivocation. We present a generic hash-and-sign construction that instantiates this abstraction using a collision-resistant hash function and a standard digital signature scheme. We then show how to integrate the construction with hash-chained logs, Merkle-tree anchoring, and optionally trusted execution environments, and we analyze the asymptotic complexity of evidence generation and verification. Finally, we implement a prototype library and report microbenchmark results on commodity hardware, demonstrating that the per-event overhead of constant-size evidence is small and predictable. The design is informed by industrial experience with regulated AI systems at Codebat Technologies Inc., while the paper focuses on the abstraction, algorithms, and their security and performance characteristics, with implications for clinical trial management, pharmaceutical compliance, and medical AI governance.
著者: Zhiyuan Xu, Stanislav Abaimov, Joseph Gardiner, Sana Belguith
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Modern large language models (LLMs) are typically secured by auditing data, prompts, and refusal policies, while treating the forward pass as an implementation detail. We show that intermediate activations in decoder-only LLMs form a vulnerable attack surface for behavioral control. Building on recent findings on attention sinks and compression valleys, we identify a high-gain region in the residual stream where small, well-aligned perturbations are causally amplified along the autoregressive trajectory--a Causal Amplification Effect (CAE). We exploit this as an attack surface via Sensitivity-Scaled Steering (SSS), a progressive activation-level attack that combines beginning-of-sequence (BOS) anchoring with sensitivity-based reinforcement to focus a limited perturbation budget on the most vulnerable layers and tokens. We show that across multiple open-weight models and four behavioral axes, SSS induces large shifts in evil, hallucination, sycophancy, and sentiment while preserving high coherence and general capabilities, turning activation steering into a concrete security concern for white-box and supply-chain LLM deployments.
著者: Francois Xavier Wicht, Zhengwei Tong, Shunfan Zhou, Hang Yin, Aviv Yaish
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Private BitTorrent trackers enforce upload-to-download ratios to prevent free-riding, but suffer from three critical weaknesses: reputation cannot move between trackers, centralized servers create single points of failure, and upload statistics are self-reported and unverifiable. When a tracker shuts down (whether by operator choice, technical failure, or legal action) users lose their contribution history and cannot prove their standing to new communities. We address these problems by storing reputation in smart contracts and replacing self-reports with cryptographic attestations. Receiving peers sign receipts for transferred pieces, which the tracker aggregates and verifies before updating on-chain reputation. Trackers run in Trusted Execution Environments (TEEs) to guarantee correct aggregation and prevent manipulation of state. If a tracker is unavailable, peers use an authenticated Distributed Hash Table (DHT) for discovery: the on-chain reputation acts as a Public Key Infrastructure (PKI), so peers can verify each other and maintain access control without the tracker. This design persists reputation across tracker failures and makes it portable to new instances through single-hop migration in factory-deployed contracts. We formalize the security requirements, prove correctness under standard cryptographic assumptions, and evaluate a prototype on Intel TDX. Measurements show that transfer receipts adds less than 6\% overhead with typical piece sizes, and signature aggregation speeds up verification by $2.5\times$.
著者: Ilja Siro\v{s}, Jakob Heirwegh, Dave Singel\'ee, Bart Preneel
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
With the rapid growth of IoT, secure and efficient mesh networking has become essential. Thread has emerged as a key protocol, widely used in smart-home and commercial systems, and serving as a core transport layer in the Matter standard. This paper presents ThreadFuzzer, the first dedicated fuzzing framework for systematically testing Thread protocol implementations. By manipulating packets at the MLE layer, ThreadFuzzer enables fuzzing of both virtual OpenThread nodes and physical Thread devices. The framework incorporates multiple fuzzing strategies, including Random and Coverage-based fuzzers from CovFuzz, as well as a newly introduced TLV Inserter, designed specifically for TLV-structured MLE messages. These strategies are evaluated on the OpenThread stack using code-coverage and vulnerability-discovery metrics. The evaluation uncovered five previously unknown vulnerabilities in the OpenThread stack, several of which were successfully reproduced on commercial devices that rely on OpenThread. Moreover, ThreadFuzzer was benchmarked against an oracle AFL++ setup using the manually extended OSS-Fuzz harness from OpenThread, demonstrating strong effectiveness. These results demonstrate the practical utility of ThreadFuzzer while highlighting challenges and future directions in the wireless protocol fuzzing research space.
著者: Tanzim Hossain Romel, Kawshik Kumar Paul, Tanberul Islam Ruhan, Maisha Rahman Mim, Abu Sayed Md. Latiful Hoque
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
We present a patient-centric architecture for electronic health record (EHR) sharing that separates content storage from authorization and audit. Encrypted FHIR resources are stored off-chain; a public blockchain records only cryptographic commitments and patient-signed, time-bounded permissions using EIP-712. Keys are distributed via public-key wrapping, enabling storage providers to remain honest-but-curious without risking confidentiality. We formalize security goals (confidentiality, integrity, cryptographically attributable authorization, and auditability of authorization events) and provide a Solidity reference implementation deployed as single-patient contracts. On-chain costs for permission grants average 78,000 gas (L1), and end-to-end access latency for 1 MB records is 0.7--1.4s (mean values for S3 and IPFS respectively), dominated by storage retrieval. Layer-2 deployment reduces gas usage by 10--13x, though data availability charges dominate actual costs. We discuss metadata privacy, key registry requirements, and regulatory considerations (HIPAA/GDPR), demonstrating a practical route to restoring patient control while preserving security properties required for sensitive clinical data.
著者: Eyad Gad, Zubair Md Fadlullah, Mostafa M. Fouda
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
In the context of the growing proliferation of user devices and the concurrent surge in data volumes, the complexities arising from the substantial increase in data have posed formidable challenges to conventional machine learning model training. Particularly, this is evident within resource-constrained and security-sensitive environments such as those encountered in networks associated with the Internet of Things (IoT). Federated Learning has emerged as a promising remedy to these challenges by decentralizing model training to edge devices or parties, effectively addressing privacy concerns and resource limitations. Nevertheless, the presence of statistical heterogeneity in non-Independently and Identically Distributed (non-IID) data across different parties poses a significant hurdle to the effectiveness of FL. Many FL approaches have been proposed to enhance learning effectiveness under statistical heterogeneity. However, prior studies have uncovered a gap in the existing research landscape, particularly in the absence of a comprehensive comparison between federated methods addressing statistical heterogeneity in detecting IoT attacks. In this research endeavor, we delve into the exploration of FL algorithms, specifically FedAvg, FedProx, and Scaffold, under different data distributions. Our focus is on achieving a comprehensive understanding of and addressing the challenges posed by statistical heterogeneity. In this study, We classify large-scale IoT attacks by utilizing the CICIoT2023 dataset. Through meticulous analysis and experimentation, our objective is to illuminate the performance nuances of these FL methods, providing valuable insights for researchers and practitioners in the domain.
privacy
著者: Xiongtao Sun, Hui Li, Jiaming Zhang, Yujie Yang, Kaili Liu, Ruxin Feng, Wen Jun Tan, Wei Yang Bryan Lim
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Modern Vision-Language Models (VLMs) demonstrate sophisticated reasoning, escalating privacy risks beyond simple attribute perception to individual-level linkage. Current privacy benchmarks are structurally insufficient for this new threat, as they primarily evaluate privacy perception while failing to address the more critical risk of privacy reasoning: a VLM's ability to infer and link distributed information to construct individual profiles. To address this critical gap, we propose \textbf{MultiPriv}, the first benchmark designed to systematically evaluate individual-level privacy reasoning in VLMs. We introduce the \textbf{Privacy Perception and Reasoning (PPR)} framework and construct a novel, bilingual multimodal dataset to support it. The dataset uniquely features a core component of synthetic individual profiles where identifiers (e.g., faces, names) are meticulously linked to sensitive attributes. This design enables nine challenging tasks evaluating the full PPR spectrum, from attribute detection to cross-image re-identification and chained inference. We conduct a large-scale evaluation of over 50 foundational and commercial VLMs. Our analysis reveals: (1) Many VLMs possess significant, unmeasured reasoning-based privacy risks. (2) Perception-level metrics are poor predictors of these reasoning risks, revealing a critical evaluation gap. (3) Existing safety alignments are inconsistent and ineffective against such reasoning-based attacks. MultiPriv exposes systemic vulnerabilities and provides the necessary framework for developing robust, privacy-preserving VLMs.
privacy
著者: Patrick Bastian, Holger Dette, Martin Dunsche
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
We investigate the problem of detecting dependencies between the components of a high-dimensional vector. Our approach advances the existing literature in two important respects. First, we consider the problem under privacy constraints. Second, instead of testing whether the coordinates are pairwise independent, we are interested in determining whether certain pairwise associations between the components (such as all pairwise Kendall's $\tau$ coefficients) do not exceed a given threshold in absolute value. Considering hypotheses of this form is motivated by the observation that in the high-dimensional regime, it is rare and perhaps impossible to have a null hypothesis that can be modeled exactly by assuming that all pairwise associations are precisely equal to zero. The formulation of the null hypothesis as a composite hypothesis makes the problem of constructing tests already non-standard in the non-private setting. Additionally, under privacy constraints, state of the art procedures rely on permutation approaches that are rendered invalid under a composite null. We propose a novel bootstrap based methodology that is especially powerful in sparse settings, develop theoretical guarantees under mild assumptions and show that the proposed method enjoys good finite sample properties even in the high privacy regime. Additionally, we present applications in medical data that showcase the applicability of our methodology.
著者: Xingyuan Wei, Ce Li, Qiujian Lv, Ning Li, Degang Sun, Yan Wang
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
In dynamic Windows malware detection, deep learning models are extensively deployed to analyze API sequences. Methods based on API sequences play a crucial role in malware prevention. However, due to the continuous updates of APIs and the changes in API sequence calls leading to the constant evolution of malware variants, the detection capability of API sequence-based malware detection models significantly diminishes over time. We observe that the API sequences of malware samples before and after evolution usually have similar malicious semantics. Specifically, compared to the original samples, evolved malware samples often use the API sequences of the pre-evolution samples to achieve similar malicious behaviors. For instance, they access similar sensitive system resources and extend new malicious functions based on the original functionalities. In this paper, we propose a framework MME(Mitigating the impact of Malware Evolution), a framework that can enhance existing API sequence-based malware detectors and mitigate the adverse effects of malware evolution. To help detection models capture the similar semantics of these post-evolution API sequences, our framework represents API sequences using API knowledge graphs and system resource encodings and applies contrastive learning to enhance the model's encoder. Results indicate that, compared to regular Text-CNN, our framework can significantly reduce the false positive rate by 13.10% and improve the F1-Score by 8.47% on five years of data, achieving the best experimental results. Additionally, evaluations show that our framework can save on the human costs required for model maintenance. We only need 1% of the budget per month to reduce the false positive rate by 11.16% and improve the F1-Score by 6.44%.
著者: Gabriele Oligeri, Savio Sciancalepore
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Radio Frequency Fingerprinting (RFF) techniques allow a receiver to authenticate a transmitter by analyzing the physical layer of the radio spectrum. Although the vast majority of scientific contributions focus on improving the performance of RFF considering different parameters and scenarios, in this work, we consider RFF as an attack vector to identify a target device in the radio spectrum. \\ We propose, implement, and evaluate {\em HidePrint}, a solution to prevent identification through RFF without affecting the quality of the communication link between the transmitter and the receiver. {\em HidePrint} hides the transmitter's fingerprint against an illegitimate eavesdropper through the injection of controlled noise into the transmitted signal. We evaluate our solution against various state-of-the-art RFF techniques, considering several adversarial models, data from real-world communication links (wired and wireless), and protocol configurations. Our results show that the injection of a Gaussian noise pattern with a normalized standard deviation of (at least) 0.02 prevents device fingerprinting in all the considered scenarios, while affecting the Signal-to-Noise Ratio (SNR) of the received signal by only 0.1 dB. Moreover, we introduce {\em selective radio fingerprint disclosure}, a new technique that allows the transmitter to disclose the radio fingerprint to only a subset of intended receivers.
著者: Chenyu Zhang, Lanjun Wang, Yiwen Ma, Wenhui Li, An-An Liu
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Text-to-Image(T2I) models typically deploy safety filters to prevent the generation of sensitive images. Unfortunately, recent jailbreaking attack methods manually design instructions for the LLM to generate adversarial prompts, which effectively bypass safety filters while producing sensitive images, exposing safety vulnerabilities of T2I models. However, due to the LLM's limited understanding of the T2I model and its safety filters, existing methods require numerous queries to achieve a successful attack, limiting their practical applicability. To address this issue, we propose Reason2Attack(R2A), which aims to enhance the LLM's reasoning capabilities in generating adversarial prompts by incorporating the jailbreaking attack into the post-training process of the LLM. Specifically, we first propose a CoT example synthesis pipeline based on Frame Semantics, which generates adversarial prompts by identifying related terms and corresponding context illustrations. Using CoT examples generated by the pipeline, we fine-tune the LLM to understand the reasoning path and format the output structure. Subsequently, we incorporate the jailbreaking attack task into the reinforcement learning process of the LLM and design an attack process reward that considers prompt length, prompt stealthiness, and prompt effectiveness, aiming to further enhance reasoning accuracy. Extensive experiments on various T2I models show that R2A achieves a better attack success ratio while requiring fewer queries than baselines. Moreover, our adversarial prompts demonstrate strong attack transferability across both open-source and commercial T2I models.
著者: Alessandro Buldini, Carlo Mazzocca, Rebecca Montanari, Selcuk Uluagac
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Self-Sovereign Identity (SSI) is a novel identity model that empowers individuals with full control over their data, enabling them to choose what information to disclose, with whom, and when. This paradigm is rapidly gaining traction worldwide, supported by numerous initiatives such as the European Digital Identity (EUDI) Regulation or Singapore's National Digital Identity (NDI). For instance, by 2026, the EUDI Regulation will enable all European citizens to seamlessly access services across Europe using Verifiable Credentials (VCs). A key feature of SSI is the ability to selectively disclose only specific claims within a credential, enhancing the privacy protection of the identity owner. This paper proposes a novel mechanism designed to achieve Compact and Selective Disclosure for VCs (CSD-JWT). Our method leverages a cryptographic accumulator to encode claims within a credential into a unique, compact representation. We implemented CSD-JWT as an open-source solution and extensively evaluated its performance under various conditions. CSD-JWT provides significant memory savings, lowering usage by up to 46% compared to the state-of-the-art. It also minimizes network overhead by producing remarkably smaller Verifiable Presentations (VPs), with size reduction from 27% to 93%. Such features make CSD-JWT especially well-suited for resource-constrained devices, including hardware wallets designed for managing credentials.
privacy
著者: Edoardo Marangone, Eugenio Nerio Nemmi, Daniele Friolo, Giuseppe Ateniese, Ingo Weber, Claudio Di Ciccio
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Decision Support Systems are increasingly adopted to automate decision-making processes across industries, organizations and governments. However, decision support requires maintaining data privacy, integrity, and availability while ensuring customization, security, and verifiability of the decision process. Existing solutions fail to guarantee those properties altogether. Most commercial tools cater for data integrity and process customization but are centralized. This centralization potentially compromises data privacy and availability, as well as process security and verifiability. To overcome these limitations, we propose SPARTA, an approach based on Trusted Execution Environments (TEEs) that automates decision processes. To maintain data privacy, integrity, and availability, SPARTA employs efficient cryptographic techniques on notarized data with access mediated through user-defined access policies. Our solution also allows users to define decision rules, which are translated to certified software objects deployed within TEEs, thereby guaranteeing customization, verifiability, and security of the process. Based on experiments conducted on public benchmarks and synthetic data, we show that our approach is scalable and adds limited overhead compared to non-cryptographically secured solutions.
intellectual property
著者: Danilo Francati, Yevin Nikhel Goonatilake, Shubham Pawar, Daniele Venturi, Giuseppe Ateniese
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
We ask a basic question about cryptographic watermarking for generative models: to what extent can a watermark remain reliable when an adversary is allowed to corrupt the encoded signal? To study this question, we introduce a minimal coding abstraction that we call a zero-bit tamper-detection code. This is a secret-key procedure that samples a pseudorandom codeword and, given a candidate word, decides whether it should be treated as unmarked content or as the result of tampering with a valid codeword. It captures the two core requirements of robust watermarking: soundness and tamper detection. Within this abstraction we prove a sharp unconditional limit on robustness to independent symbol corruption. For an alphabet of size $q$, there is a critical corruption rate of $1 - 1/q$ such that no scheme with soundness, even relaxed to allow a fixed constant false positive probability on random content, can reliably detect tampering once an adversary can change more than this fraction of symbols. In particular, in the binary case no cryptographic watermark can remain robust if more than half of the encoded bits are modified. We also show that this threshold is tight by giving simple information-theoretic constructions that achieve soundness and tamper detection for all strictly smaller corruption rates. We then test experimentally whether this limit appears in practice by looking at the recent watermarking for images of Gunn, Zhao, and Song (ICLR 2025). We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding from recovering the codeword, erasing the watermark while leaving the image visually intact.
agent
著者: Chiyu Chen, Xinhao Song, Yunkai Chai, Yang Yao, Haodong Zhao, Lijun Li, Jie Li, Yan Teng, Gongshen Liu, Yingchun Wang
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Vision-Language Models (VLMs) are increasingly deployed as autonomous agents to navigate mobile graphical user interfaces (GUIs). Operating in dynamic on-device ecosystems, which include notifications, pop-ups, and inter-app interactions, exposes them to a unique and underexplored threat vector: environmental injection. Unlike prompt-based attacks that manipulate textual instructions, environmental injection corrupts an agent's visual perception by inserting adversarial UI elements (for example, deceptive overlays or spoofed notifications) directly into the GUI. This bypasses textual safeguards and can derail execution, causing privacy leakage, financial loss, or irreversible device compromise. To systematically evaluate this threat, we introduce GhostEI-Bench, the first benchmark for assessing mobile agents under environmental injection attacks within dynamic, executable environments. Moving beyond static image-based assessments, GhostEI-Bench injects adversarial events into realistic application workflows inside fully operational Android emulators and evaluates performance across critical risk scenarios. We further propose a judge-LLM protocol that conducts fine-grained failure analysis by reviewing the agent's action trajectory alongside the corresponding screenshot sequence, pinpointing failure in perception, recognition, or reasoning. Comprehensive experiments on state-of-the-art agents reveal pronounced vulnerability to deceptive environmental cues: current models systematically fail to perceive and reason about manipulated UIs. GhostEI-Bench provides a framework for quantifying and mitigating this emerging threat, paving the way toward more robust and secure embodied agents.
著者: Chenyu Zhang, Tairen Zhang, Lanjun Wang, Ruidong Chen, Wenhui Li, Anan Liu
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Using risky text prompts, such as pornography and violent prompts, to test the safety of text-to-image (T2I) models is a critical task. However, existing risky prompt datasets are limited in three key areas: 1) limited risky categories, 2) coarse-grained annotation, and 3) low effectiveness. To address these limitations, we introduce T2I-RiskyPrompt, a comprehensive benchmark designed for evaluating safety-related tasks in T2I models. Specifically, we first develop a hierarchical risk taxonomy, which consists of 6 primary categories and 14 fine-grained subcategories. Building upon this taxonomy, we construct a pipeline to collect and annotate risky prompts. Finally, we obtain 6,432 effective risky prompts, where each prompt is annotated with both hierarchical category labels and detailed risk reasons. Moreover, to facilitate the evaluation, we propose a reason-driven risky image detection method that explicitly aligns the MLLM with safety annotations. Based on T2I-RiskyPrompt, we conduct a comprehensive evaluation of eight T2I models, nine defense methods, five safety filters, and five attack strategies, offering nine key insights into the strengths and limitations of T2I model safety. Finally, we discuss potential applications of T2I-RiskyPrompt across various research fields. The dataset and code are provided in https://github.com/datar001/T2I-RiskyPrompt.
著者: Boyi Wei, Zora Che, Nathaniel Li, Udari Madhushani Sehwag, Jasper G\"otting, Samira Nedungadi, Julian Michael, Summer Yue, Dan Hendrycks, Peter Henderson, Zifan Wang, Seth Donoughe, Mantas Mazeika
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Open-weight bio-foundation models present a dual-use dilemma. While holding great promise for accelerating scientific research and drug development, they could also enable bad actors to develop more deadly bioweapons. To mitigate the risk posed by these models, current approaches focus on filtering biohazardous data during pre-training. However, the effectiveness of such an approach remains unclear, particularly against determined actors who might fine-tune these models for malicious use. To address this gap, we propose BioRiskEval, a framework to evaluate the robustness of procedures that are intended to reduce the dual-use capabilities of bio-foundation models. BioRiskEval assesses models' virus understanding through three lenses, including sequence modeling, mutational effects prediction, and virulence prediction. Our results show that current filtering practices may not be particularly effective: Excluded knowledge can be rapidly recovered in some cases via fine-tuning, and exhibits broader generalizability in sequence modeling. Furthermore, dual-use signals may already reside in the pretrained representations, and can be elicited via simple linear probing. These findings highlight the challenges of data filtering as a standalone procedure, underscoring the need for further research into robust safety and security strategies for open-weight bio-foundation models.
privacyagent
著者: Ye Zheng, Yidan Hu
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
AI agents can autonomously perform tasks and, often without explicit user consent, collect or disclose users' sensitive local data, which raises serious privacy concerns. Although AI agents' privacy policies describe their intended data practices, there remains limited transparency and accountability about whether runtime behavior matches those policies. To close this gap, we introduce AudAgent, a visual tool that continuously monitors AI agents' data practices in real time and guards compliance with stated privacy policies. AudAgent consists of four components for automated privacy auditing of AI agents. (i) Policy formalization: a novel cross-LLM voting mechanism to guarantee confidence of the parsed privacy policy model. (ii) Runtime annotation: a lightweight Presidio-based analyzer detects sensitive data and annotates data practices based on the AI agent's context and the privacy policy model. (iii) Compliance auditing: ontology graphs and automata-based checking connect the privacy policy model with runtime annotations, enabling on-the-fly compliance checking. (iv) User interface: an infrastructure-independent implementation visualizes the real-time execution trace of AI agents along with potential privacy policy violations, providing user-friendly transparency and accountability. We evaluate AudAgent with AI agents built using mainstream frameworks, demonstrating its effectiveness in detecting and visualizing privacy policy violations in real time. Using AudAgent, we also find that most privacy policies omit explicit safeguards for highly sensitive data such as SSNs, whose misuse violates legal requirements, and that many agents do not refuse handling such data via third-party tools, including those controlled by Claude, Gemini, and DeepSeek. AudAgent proactively blocks operations on such data, overriding the agents' original privacy policy and behavior.
著者: Shourya Batra, Pierce Tillman, Samarth Gaggar, Shashank Kesineni, Kevin Zhu, Sunishchal Dev, Ashwinee Panda, Vasu Sharma, Maheep Chaudhary
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
As Large Language Models (LLMs) evolve into personal assistants with access to sensitive user data, they face a critical privacy challenge: while prior work has addressed output-level privacy, recent findings reveal that LLMs often leak private information through their internal reasoning processes, violating contextual privacy expectations. These leaky thoughts occur when models inadvertently expose sensitive details in their reasoning traces, even when final outputs appear safe. The challenge lies in preventing such leakage without compromising the model's reasoning capabilities, requiring a delicate balance between privacy and utility. We introduce Steering Activations towards Leakage-free Thinking (SALT), a lightweight test-time intervention that mitigates privacy leakage in model's Chain of Thought (CoT) by injecting targeted steering vectors into hidden state. We identify the high-leakage layers responsible for this behavior. Through experiments across multiple LLMs, we demonstrate that SALT achieves reductions including $18.2\%$ reduction in CPL on QwQ-32B, $17.9\%$ reduction in CPL on Llama-3.1-8B, and $31.2\%$ reduction in CPL on Deepseek in contextual privacy leakage dataset AirGapAgent-R while maintaining comparable task performance and utility. Our work establishes SALT as a practical approach for test-time privacy protection in reasoning-capable language models, offering a path toward safer deployment of LLM-based personal agents.
著者: Gioliano de Oliveira Braga, Pedro Henrique dos Santos Rocha, Rafael Pimenta de Mattos Paix\~ao, Giovani Hoff da Costa, Gustavo Cavalcanti Morais, Louren\c{c}o Alves Pereira J\'unior
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Wi-Fi Channel State Information (CSI) has been repeatedly proposed as a biometric modality, often with reports of high accuracy and operational feasibility. However, the field lacks a consolidated understanding of its security properties, adversarial resilience, and methodological consistency. This Systematization of Knowledge (SoK) examines CSI-based biometric authentication through a security lens, analyzing how existing works diverge in sensing infrastructure, signal representations, feature pipelines, learning models, and evaluation methodologies. Our synthesis reveals systemic inconsistencies: reliance on aggregate accuracy metrics, limited reporting of FAR/FRR/EER, absence of per-user risk analysis, and scarce consideration of threat models or adversarial feasibility. To this end, we construct a unified evaluation framework to expose these issues empirically and demonstrate how security-relevant metrics such as per-class EER, Frequency Count of Scores (FCS), and the Gini Coefficient uncover risk concentration that remains hidden under traditional reporting practices. The resulting analysis highlights concrete attack surfaces--including replay, geometric mimicry, and environmental perturbation--and shows how methodological choices materially influence vulnerability profiles. Based on these findings, we articulate the security boundaries of current CSI biometrics and provide guidelines for rigorous evaluation, reproducible experimentation, and future research directions. This SoK offers the security community a structured, evidence-driven reassessment of Wi-Fi CSI biometrics and their suitability as an authentication primitive.
著者: Kaeli Andrews, Linh B. Ngo, Md Amiruzzaman
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
This paper presents a comprehensive comparative analysis of two dominant blockchain consensus mechanisms, Proof of Work (PoW) and Proof of Stake (PoS), evaluated across seven critical metrics: energy use, security, transaction speed, scalability, centralization risk, environmental impact, and transaction fees. Utilizing recent academic research and real-world blockchain data, the study highlights that PoW offers robust, time-tested security but suffers from high energy consumption, slower throughput, and centralization through mining pools. In contrast, PoS demonstrates improved scalability and efficiency, significantly reduced environmental impact, and more stable transaction fees, however it raises concerns over validator centralization and long-term security maturity. The findings underscore the trade-offs inherent in each mechanism and suggest hybrid designs may combine PoW's security with PoS's efficiency and sustainability. The study aims to inform future blockchain infrastructure development by striking a balance between decentralization, performance, and ecological responsibility.
agent
著者: Amine Ben Hassouna, Hana Chaari, Ines Belhaj
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
In an era where vast amounts of data are collected and processed from diverse sources, there is a growing demand for sophisticated AI systems capable of intelligently fusing and analyzing this information. To address these challenges, researchers have turned towards integrating tools into LLM-powered agents to enhance the overall information fusion process. However, the conjunction of these technologies and the proposed enhancements in several state-of-the-art works followed a non-unified software architecture, resulting in a lack of modularity and terminological inconsistencies among researchers. To address these issues, we propose a novel LLM-based Agent Unified Modeling Framework (LLM-Agent-UMF) that establishes a clear foundation for agent development from both functional and software architectural perspectives, developed and evaluated using the Architecture Tradeoff and Risk Analysis Framework (ATRAF). Our framework clearly distinguishes between the different components of an LLM-based agent, setting LLMs and tools apart from a new element, the core-agent, which plays the role of central coordinator. This pivotal entity comprises five modules: planning, memory, profile, action, and security -- the latter often neglected in previous works. By classifying core-agents into passive and active types based on their authoritative natures, we propose various multi-core agent architectures that combine unique characteristics of distinctive agents to tackle complex tasks more efficiently. We evaluate our framework by applying it to thirteen state-of-the-art agents, thereby demonstrating its alignment with their functionalities and clarifying overlooked architectural aspects. Moreover, we thoroughly assess five architecture variants of our framework by designing new agent architectures that combine characteristics of state-of-the-art agents to address specific goals. ...
backdoor
著者: Chibueze Peace Obioma, Youcheng Sun, Mustafa A. Mustafa
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Federated learning (FL) remains highly vulnerable to adaptive backdoor attacks that preserve stealth by closely imitating benign update statistics. Existing defenses predominantly rely on anomaly detection in parameter or gradient space, overlooking behavioral constraints that backdoor attacks must satisfy to ensure reliable trigger activation. These anomaly-centric methods fail against adaptive attacks that normalize update magnitudes and mimic benign statistical patterns while preserving backdoor functionality, creating a fundamental detection gap. To address this limitation, this paper introduces FeRA (Federated Representative Attention) -- a novel attention-driven defense that shifts the detection paradigm from anomaly-centric to consistency-centric analysis. FeRA exploits the intrinsic need for backdoor persistence across training rounds, identifying malicious clients through suppressed representation-space variance, an orthogonal property to traditional magnitude-based statistics. The framework conducts multi-dimensional behavioral analysis combining spectral and spatial attention, directional alignment, mutual similarity, and norm inflation across two complementary detection mechanisms: consistency analysis and norm-inflation detection. Through this mechanism, FeRA isolates malicious clients that exhibit low-variance consistency or magnitude amplification. Extensive evaluation across six datasets, nine attacks, and three model architectures under both Independent and Identically Distributed (IID) and non-IID settings confirm FeRA achieves superior backdoor mitigation. Under different non-IID settings, FeRA achieved the lowest average Backdoor Accuracy (BA), about 1.67% while maintaining high clean accuracy compared to other state-of-the-art defenses. The code is available at https://github.com/Peatech/FeRA_defense.git.
著者: Patryk Krukowski, {\L}ukasz Gorczyca, Piotr Helm, Kamil Ksi\k{a}\.zek, Przemys{\l}aw Spurek
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
Continual learning under adversarial conditions remains an open problem, as existing methods often compromise either robustness, scalability, or both. We propose a novel framework that integrates Interval Bound Propagation (IBP) with a hypernetwork-based architecture to enable certifiably robust continual learning across sequential tasks. Our method, SHIELD, generates task-specific model parameters via a shared hypernetwork conditioned solely on compact task embeddings, eliminating the need for replay buffers or full model copies and enabling efficient over time. To further enhance robustness, we introduce Interval MixUp, a novel training strategy that blends virtual examples represented as $\ell_{\infty}$ balls centered around MixUp points. Leveraging interval arithmetic, this technique guarantees certified robustness while mitigating the wrapping effect, resulting in smoother decision boundaries. We evaluate SHIELD under strong white-box adversarial attacks, including PGD and AutoAttack, across multiple benchmarks. It consistently outperforms existing robust continual learning methods, achieving state-of-the-art average accuracy while maintaining both scalability and certification. These results represent a significant step toward practical and theoretically grounded continual learning in adversarial settings.
backdoor
著者: W. K. M Mithsara, Ning Yang, Ahmed Imteaj, Hussein Zangoti, Abdur R. Shahid
公開日: Mon, 24 Nov 2025 00:00:00 -0500
要約:
The widespread integration of wearable sensing devices in Internet of Things (IoT) ecosystems, particularly in healthcare, smart homes, and industrial applications, has required robust human activity recognition (HAR) techniques to improve functionality and user experience. Although machine learning models have advanced HAR, they are increasingly susceptible to data poisoning attacks that compromise the data integrity and reliability of these systems. Conventional approaches to defending against such attacks often require extensive task-specific training with large, labeled datasets, which limits adaptability in dynamic IoT environments. This work proposes a novel framework that uses large language models (LLMs) to perform poisoning detection and sanitization in HAR systems, utilizing zero-shot, one-shot, and few-shot learning paradigms. Our approach incorporates \textit{role play} prompting, whereby the LLM assumes the role of expert to contextualize and evaluate sensor anomalies, and \textit{think step-by-step} reasoning, guiding the LLM to infer poisoning indicators in the raw sensor data and plausible clean alternatives. These strategies minimize reliance on curation of extensive datasets and enable robust, adaptable defense mechanisms in real-time. We perform an extensive evaluation of the framework, quantifying detection accuracy, sanitization quality, latency, and communication cost, thus demonstrating the practicality and effectiveness of LLMs in improving the security and reliability of wearable IoT systems.
生成日時: 2025-11-24 18:00:02