arXiv論文一覧 - cs.CR updates on arXiv.org

#1 SolRugDetector: Investigating Rug Pulls on Solana

著者: Jiaxin Chen, Ziwei Li, Zigui Jiang, Ruihong He, Yantong Zhou, Jiajing Wu, Zibin Zheng

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24625

要約:
Solana has experienced rapid growth due to its high performance and low transaction costs, but the extremely low barrier to token issuance has also led to widespread Rug Pulls. Unlike Ethereum-based Rug Pulls that rely on malicious smart contracts, the unified SPL Token program on Solana shifts fraudulent behaviors toward on-chain operations such as market manipulation. However, existing research has not yet conducted a systematic analysis of these specific Rug Pull patterns on Solana. In this paper, we present a comprehensive empirical study of Rug Pulls on Solana. Based on 68 real-world incident reports, we construct and release a manually labeled dataset containing 117 confirmed Rug Pull tokens and characterize the workflow of Rug Pulls on Solana. Building on this analysis, we propose SolRugDetector, a detection system that identifies fraudulent tokens solely using on-chain transaction and state data. Experimental results show that SolRugDetector outperforms existing tools on the labeled dataset. We further conduct a large-scale measurement on 100,063 tokens newly issued in the first half of 2025 and identify 76,469 Rug Pull tokens. After validating the in-the-wild detection results, we release this dataset and analyze the Rug Pull ecosystem on Solana. Our analysis reveals that Rug Pulls on Solana exhibit extremely short lifecycles, strong price-driven dynamics, severe economic losses, and highly organized group behaviors. These findings provide insights into the Solana Rug Pull landscape and support the development of effective on-chain defense mechanisms.

#2 An Explainable Federated Framework for Zero Trust Micro-Segmentation in IIoT Networks

著者: Muhammad Liman Gambo, Ahmad Almulhem

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24754

要約:
Micro-segmentation as a core requirement of zero trust architecture (ZTA) divides networks into small security zones, called micro-segments, thereby minimizing impact of security breaches and restricting lateral movement of attackers. Existing approaches for Industrial Internet of Things (IIoT) networks often remain centralized, static, or difficult to interpret. These limitations are critical in IIoT, where devices are heterogeneous, communication behavior evolves over time, and raw data sharing across sites is often undesirable. Accordingly, we propose EFAH-ZTM, an Explainable Federated Autoencoder-Hypergraph framework for Zero Trust micro-segmentation in IIoT networks. The framework includes a trained federated DNAE that learns behavioral embeddings from distributed clients. kNN-based and Manifold-based hypergraphs capture higher-order relationships among device-flow instances. To generate micro-segments, MiniBatch KMeans and HDBSCAN clustering techniques are applied on the spectral embeddings, while an operational risk score that combines reconstruction error and structural outlierness drives allow/block policy decisions. Trustworthiness of the policy decision is improved through feature-level explanations using LIME and SHAP. Experiments on the WUSTL-IIoT-2021 dataset show that HDBSCAN achieved the strongest structural quality, while the manifold-based hypergraph produces the best oracle-aligned security efficacy that reaches a purity of 0.9990 with near-zero contamination. Similarly, the explainability module also showed high fidelity and stability, with surrogate classifier having an accuracy of 0.9927 and stable explanations across runs. Moreover, an ablation analysis shows that the federated learning preserves competitive segmentation quality relative to centralized training, and the hypergraph modeling significantly improves structural separation and risk stratification.

#3 AIP: Agent Identity Protocol for Verifiable Delegation Across MCP and A2A

agent

著者: Sunil Prakash

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24775

要約:
AI agents increasingly call tools via the Model Context Protocol (MCP) and delegate to other agents via Agent-to-Agent (A2A), yet neither protocol verifies agent identity. A scan of approximately 2,000 MCP servers found all lacked authentication. In our survey, we did not identify a prior implemented protocol that jointly combines public-key verifiable delegation, holder-side attenuation, expressive chained policy, transport bindings across MCP/A2A/HTTP, and provenance-oriented completion records. We introduce Invocation-Bound Capability Tokens (IBCTs), a primitive that fuses identity, attenuated authorization, and provenance binding into a single append-only token chain. IBCTs operate in two wire formats: compact mode (a signed JWT for single-hop cases) and chained mode (a Biscuit token with Datalog policies for multi-hop delegation). We provide reference implementations in Python and Rust with full cross-language interoperability. Compact mode verification takes 0.049ms (Rust) and 0.189ms (Python), with 0.22ms overhead over no-auth in real MCP-over-HTTP deployment. In a real multi-agent deployment with Gemini 2.5 Flash, AIP adds 2.35ms of overhead (0.086% of total end-to-end latency). Adversarial evaluation across 600 attack attempts shows 100% rejection rate, with two attack categories (delegation depth violation and audit evasion through empty context) uniquely caught by AIP's chained delegation model that neither unsigned nor plain JWT deployments detect.

#4 Bridging Code Property Graphs and Language Models for Program Analysis

著者: Ahmed Lekssays

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24837

要約:
Large Language Models (LLMs) face critical challenges when analyzing security vulnerabilities in real world codebases: token limits prevent loading entire repositories, code embeddings fail to capture inter procedural data flows, and LLMs struggle to generate complex static analysis queries. These limitations force existing approaches to operate on isolated code snippets, missing vulnerabilities that span multiple functions and files. We introduce codebadger, an open source Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) engine with LLMs. Rather than requiring LLMs to generate complex CPG queries, codebadger provides high level tools for program slicing, taint tracking, data flow analysis, and semantic code navigation, enabling targeted exploration of large codebases without exhaustive file reading. We demonstrate its effectiveness through three use cases: (1) navigating an 8,000 method codebase to audit memory safety patterns, (2) discovering and exploiting a previously unreported buffer overflow in libtiff, and (3) generating a correct patch for an integer overflow vulnerability (CVE-2025-6021) in libxml2 on the first attempt. codebadger enables LLMs to reason about code semantically across entire repositories, supporting vulnerability discovery, patching, and program comprehension at scale.

#5 AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective

著者: Zhenyi Wang, Siyu Luan

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24857

要約:
As machine learning (ML) systems expand in both scale and functionality, the security landscape has become increasingly complex, with a proliferation of attacks and defenses. However, existing studies largely treat these threats in isolation, lacking a coherent framework to expose their shared principles and interdependencies. This fragmented view hinders systematic understanding and limits the design of comprehensive defenses. Crucially, the two foundational assets of ML -- \textbf{data} and \textbf{models} -- are no longer independent; vulnerabilities in one directly compromise the other. The absence of a holistic framework leaves open questions about how these bidirectional risks propagate across the ML pipeline. To address this critical gap, we propose a \emph{unified closed-loop threat taxonomy} that explicitly frames model-data interactions along four directional axes. Our framework offers a principled lens for analyzing and defending foundation models. The resulting four classes of security threats represent distinct but interrelated categories of attacks: (1) Data$\rightarrow$Data (D$\rightarrow$D): including \emph{data decryption attacks and watermark removal attacks}; (2) Data$\rightarrow$Model (D$\rightarrow$M): including \emph{poisoning, harmful fine-tuning attacks, and jailbreak attacks}; (3) Model$\rightarrow$Data (M$\rightarrow$D): including \emph{model inversion, membership inference attacks, and training data extraction attacks}; (4) Model$\rightarrow$Model (M$\rightarrow$M): including \emph{model extraction attacks}. Our unified framework elucidates the underlying connections among these security threats and establishes a foundation for developing scalable, transferable, and cross-modal security strategies, particularly within the landscape of foundation models.

#6 Trusted-Execution Environment (TEE) for Solving the Replication Crisis in Academia

著者: Jiasun Li, Project Team

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24878

要約:
The growing replication crisis across disciplines such as economics, finance, and other social sciences as well as computer science undermines the credibility of academic research. Current institutional solutions -- such as artifact evaluations and replication packages -- suffer from critical limitations, including shortages of qualified data editors, difficulties in handling proprietary datasets, inefficient processes, and reliance on voluntary labor. This paper proposes a novel framework leveraging new technological advances in trusted-execution environments (TEEs) -- exemplified by Intel Trust Domain Extensions (TDX) -- to address the replication crisis in a cost-effective and scalable manner. Under our approach, authors execute replication packages within a cloud-based TEE and submit cryptographic proofs of correct execution, for which journals or conferences can efficiently verify without re-running the code. This reallocates the operational burden to authors while preserving data confidentiality and eliminating reliance on scarce editorial resources. As a proof of concept, we validate the feasibility of this system through field experiments, reporting a pilot study replicating published papers on TDX-backed cloud VMs, finding average costs of \$1.35--\$1.80 per package with minimal computational overhead relative to standard VMs and high success rates even for novice users with no prior TEE experience. We also provide a conduct formal analysis showing that TEE adoption is incentive-compatible for authors, cost-dominant for journals, and constitutes an equilibrium in the certification market. The findings highlight the potential of TEE technology to provide a sustainable, privacy-preserving, and efficient mechanism to address the replication crisis in academia.

#7 An Approach to Generate Attack Graphs with a Case Study on Siemens PCS7 Blueprint for Water Treatment Plants

著者: Lucas Miranda, Carlos Banjar, Daniel Menasche, Anton Kocheturov, Gaurav Srivastava, Tobias Limmer

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24888

要約:
Assessing the security posture of Industrial Control Systems (ICS) is critical for protecting essential infrastructure. However, the complexity and scale of these environments make it challenging to identify and prioritize potential attack paths. This paper introduces a semi-automated approach for generating attack graphs in ICS environments to visualize and analyze multi-step attack scenarios. Our methodology integrates network topology information with vulnerability data to construct a model of the system. This model is then processed by a stateful traversal algorithm to identify potential exploit chains based on preconditions and consequences. We present a case study applying the proposed framework to the Siemens PCS7 Cybersecurity Blueprint for Water Treatment Plants. The results demonstrate the framework's ability to simulate different attack scenarios, including those originating from known CVEs and potential device misconfigurations. We show how a single point of failure can compromise network segmentation and how patching a critical vulnerability can protect an entire security zone, providing actionable insights for risk mitigation.

#8 Sovereign AI at the Front Door of Care: A Physically Unidirectional Architecture for Secure Clinical Intelligence

著者: Vasu Srinivasan, Dhriti Vasu

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24898

要約:
We present a Sovereign AI architecture for clinical triage in which all inference is performed on-device and inbound data is delivered via a physically unidirectional channel, implemented using receive-only broadcast infrastructure or certified hardware data diodes, with no return path to any external network. This design removes the network-mediated attack surface by construction, rather than attempting to secure it through software controls. The system performs conversational symptom intake, integrates device-captured vitals, and produces structured, triage-aligned clinical records at the point of care. We formalize the security properties of receiver-side unidirectionality and show that the architecture is transport-agnostic across broadcast and diode-enforced deployments. We further analyze threat models, enforcement mechanisms, and deployment configurations, demonstrating how physical one-way data flow enables high-assurance operation in both resource-constrained and high-risk environments. This work positions physically unidirectional channels as a foundational primitive for sovereign, on-device clinical intelligence at the front door of care.

#9 LiteGuard: Efficient Task-Agnostic Model Fingerprinting with Enhanced Generalization

著者: Guang Yang, Ziye Geng, Yihang Chen, Changqing Luo

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24982

要約:
Task-agnostic model fingerprinting has recently gained increasing attention due to its ability to provide a universal framework applicable across diverse model architectures and tasks. The current state-of-the-art method, MetaV, ensures generalization by jointly training a set of fingerprints and a neural-network-based global verifier using two large and diverse model sets: one composed of pirated models (i.e., the protected model and its variants) and the other comprising independently trained models. However, publicly available models are scarce in many real-world domains, and constructing such model sets requires intensive training and massive computational resources, posing a significant barrier to deployment. Reducing the number of models can alleviate the overhead, but increases the risk of overfitting, a problem further exacerbated by MetaV's entangled design, in which all fingerprints and the global verifier are jointly trained. This overfitting issue compromises the generalization capability for verifying unseen models. In this paper, we propose LiteGuard, an efficient task-agnostic fingerprinting framework that attains enhanced generalization while significantly lowering computational cost. Specifically, LiteGuard introduces two key innovations: (i) a checkpoint-based model set augmentation strategy that enriches model diversity by leveraging intermediate model snapshots captured during training of each pirated and independently trained model, thereby alleviating the need to train a large number of such models, and (ii) a local verifier architecture that pairs each fingerprint with a lightweight local verifier, thereby reducing parameter entanglement and mitigating overfitting. Extensive experiments across five representative tasks show that LiteGuard consistently outperforms MetaV in both generalization performance and computational efficiency.

#10 IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness

著者: Ziye Geng, Guang Yang, Yihang Chen, Changqing Luo

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24996

要約:
We propose IrisFP, a novel adversarial-example-based model fingerprinting framework that enhances both uniqueness and robustness by leveraging multi-boundary characteristics, multi-sample behaviors, and fingerprint discriminative power assessment to generate composite-sample fingerprints. Three key innovations make IrisFP outstanding: 1) It positions fingerprints near the intersection of all decision boundaries - unlike prior methods that target a single boundary - thus increasing the prediction margin without placing fingerprints deep inside target class regions, enhancing both robustness and uniqueness; 2) It constructs composite-sample fingerprints, each comprising multiple samples close to the multi-boundary intersection, to exploit collective behavior patterns and further boost uniqueness; and 3) It assesses the discriminative power of generated fingerprints using statistical separability metrics developed based on two reference model sets, respectively, for pirated and independently-trained models, retains the fingerprints with high discriminative power, and assigns fingerprint-specific thresholds to such retained fingerprints. Extensive experiments show that IrisFP consistently outperforms state-of-the-art methods, achieving reliable ownership verification by enhancing both robustness and uniqueness.

#11 Efficient ML-DSA Public Key Management Method with Identity for PKI and Its Application

著者: Penghui Liu, Yi Niu, Xiaoxiong Zhong, Jiahui Wu, Weizhe Zhang, Kaiping Xue, Bin Xiao

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25043

要約:
With the rapid evolution of the Industrial Internet of Things (IIoT), the boundaries and scale of the Internet are continuously expanding. Consequently, the limitations of traditional certificate-based Public Key Infrastructure (PKI) have become increasingly evident, particularly in scenarios requiring large-scale certificate storage, verification, and frequent transmission. These challenges are expected to be further amplified by the widespread adoption of post-quantum cryptography. In this paper, we propose a novel identity-based public key management framework for PKI based on post-quantum cryptography, termed \textit{IPK-pq}. This approach implements an identity key generation protocol leveraging NIST ML-DSA and random matrix theory. Building on the concept of the Composite Public Key (CPK), \textit{IPK-pq} addresses the linear collusion problem inherent in CPK through an enhanced identity mapping mechanism. Furthermore, it simplifies the verification of the declared public key's authenticity, effectively reducing the complexity associated with certificate-based key management. We also provide a formal security proof for \textit{IPK-pq}, covering both individual private key components and the composite private key. To validate our approach, formally, we directly implement and evaluate \textit{IPK-pq} within a typical PKI application scenario: Resource PKI (RPKI). Comparative experimental results demonstrate that an RPKI system based on \textit{IPK-pq} yields significant improvements in efficiency and scalability. These results validate the feasibility and rationality of \textit{IPK-pq}, positioning it as a strong candidate for next-generation RPKI systems capable of securely managing large-scale routing information.

#12 The System Prompt Is the Attack Surface: How LLM Agent Configuration Shapes Security and Creates Exploitable Vulnerabilities

agent

著者: Ron Litvak

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25056

要約:
System prompt configuration can make the difference between near-total phishing blindness and near-perfect detection in LLM email agents. We present PhishNChips, a study of 11 models under 10 prompt strategies, showing that prompt-model interaction is a first-order security variable: a single model's phishing bypass rate ranges from under 1% to 97% depending on how it is configured, while the false-positive cost of the same prompt varies sharply across models. We then show that optimizing prompts around highly predictive signals can improve benchmark performance, reaching up to 93.7% recall at 3.8% false positive rate, but also creates a brittle attack surface. In particular, domain-matching strategies perform well when legitimate emails mostly have matched sender and URL domains, yet degrade sharply when attackers invert that signal by registering matching infrastructure. Response-trace analysis shows that 98% of successful bypasses reason in ways consistent with the inverted signal: the models are following the instruction, but the instruction's core assumption has become false. A counter-intuitive corollary follows: making prompts more specific can degrade already-capable models by replacing broader multi-signal reasoning with exploitable single-signal dependence. We characterize the resulting tension between detection, usability, and adversarial robustness as a navigable tradeoff, introduce Safetility, a deployability-aware metric that penalizes false positives, and argue that closing the adversarial gap likely requires tool augmentation with external ground truth.

#13 PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

backdoor

著者: Haozhen Wang, Haoyue Liu, Jionghao Zhu, Zhichao Wang, Yongxin Guo, Xiaoying Tang

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25164

要約:
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of applications. However, their practical deployment is often hindered by issues such as outdated knowledge and the tendency to generate hallucinations. To address these limitations, Retrieval-Augmented Generation (RAG) systems have been introduced, enhancing LLMs with external, up-to-date knowledge sources. Despite their advantages, RAG systems remain vulnerable to adversarial attacks, with data poisoning emerging as a prominent threat. Existing poisoning-based attacks typically require prior knowledge of the user's specific queries, limiting their flexibility and real-world applicability. In this work, we propose PIDP-Attack, a novel compound attack that integrates prompt injection with database poisoning in RAG. By appending malicious characters to queries at inference time and injecting a limited number of poisoned passages into the retrieval database, our method can effectively manipulate LLM response to arbitrary query without prior knowledge of the user's actual query. Experimental evaluations across three benchmark datasets (Natural Questions, HotpotQA, MS-MARCO) and eight LLMs demonstrate that PIDP-Attack consistently outperforms the original PoisonedRAG. Specifically, our method improves attack success rates by 4% to 16% on open-domain QA tasks while maintaining high retrieval precision, proving that the compound attack strategy is both necessary and highly effective.

#14 zk-X509: Privacy-Preserving On-Chain Identity from Legacy PKI via Zero-Knowledge Proofs

privacy

著者: Yeongju Bak (Tokamak Network, Seoul, South Korea)

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25190

要約:
Public blockchains impose an inherent tension between regulatory compliance and user privacy. Existing on-chain identity solutions require centralized KYC attestors, specialized hardware, or Decentralized Identifier (DID) frameworks needing entirely new credential infrastructure. Meanwhile, over four billion active X.509 certificates constitute a globally deployed, government-grade trust infrastructure largely unexploited for decentralized identity. This paper presents zk-X509, a privacy-preserving identity system bridging legacy Public Key Infrastructure (PKI) with public ledgers via a RISC-V zero-knowledge virtual machine (zkVM). Users prove ownership of standard X.509 certificates without revealing private keys or personal identifiers. Crucially, the private key never enters the ZK circuit; ownership is proven via OS keychain signature delegation (e.g., macOS Secure Enclave, Windows TPM). The circuit verifies certificate chain validity, temporal validity, key ownership, trustless CRL revocation, blockchain address binding, and Sybil-resistant nullifier generation. It commits 13 public values, including a Certificate Authority (CA) Merkle root hiding the issuing CA, and four selective disclosure hashes. We formalize eight security properties under a Dolev-Yao adversary with game-based definitions and reductions to sEUF-CMA, SHA-256 collision resistance, and ZK soundness. Evaluated on the SP1 zkVM, the system achieves 11.8M cycles for ECDSA P-256 (17.4M for RSA-2048), with on-chain Groth16 verification costing ~300K gas. By leveraging certificates deployed at scale across jurisdictions, zk-X509 enables adoption without new trust establishment, complementing emerging DID-based systems.

#15 Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening

著者: Younes Salmi, Hanna Bogucka

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25257

要約:
This paper investigates the susceptibility to model integrity attacks that overload virtual machines assigned by the k-means algorithm used for resource provisioning in fog networks. The considered k-means algorithm runs two phases iteratively: offline clustering to form clusters of requested workload and online classification of new incoming requests into offline-created clusters. First, we consider an evasion attack against the classifier in the online phase. A threat actor launches an exploratory attack using query-based reverse engineering to discover the Machine Learning (ML) model (the clustering scheme). Then, a passive causative (evasion) attack is triggered in the offline phase. To defend the model, we suggest a proactive method using adversarial training to introduce attack robustness into the classifier. Our results show that our mitigation technique effectively maintains the stability of the resource provisioning system against attacks.

#16 Usability of Passwordless Authentication in Wi-Fi Networks: A Comparative Study of Passkeys and Passwords in Captive Portals

著者: Marti\~no Rivera-Dourado, Rub\'en P\'erez-Jove, Alejandro Pazos, Jose V\'azquez-Naya

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25290

要約:
Passkeys have recently emerged as a passwordless authentication mechanism, yet their usability in captive portals remains unexplored. This paper presents an empirical, comparative usability study of passkeys and passwords in a Wi-Fi hotspot using a captive portal. We conducted a controlled laboratory experiment with 50 participants following a split-plot design across Android and Windows platforms, using a router implementing the FIDO2CAP protocol. Our results show a tendency for passkeys to be perceived as more usable than passwords during login, although differences are not statistically significant. Independent of the authentication method, captive portal limitations negatively affected user experience and increased error rates. We further found that passkeys are generally easy to configure on both platforms, but platform-specific issues introduce notable usability challenges. Based on quantitative and qualitative findings, we derive design recommendations to improve captive portal authentication, including the introduction of usernameless authentication flows, improved captive portal detection mechanisms, and user interface design changes.

#17 Physical Backdoor Attack Against Deep Learning-Based Modulation Classification

backdoor

著者: Younes Salmi, Hanna Bogucka

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25304

要約:
Deep Learning (DL) has become a key technology that assists radio frequency (RF) signal classification applications, such as modulation classification. However, the DL models are vulnerable to adversarial machine learning threats, such as data manipulation attacks. We study a physical backdoor (Trojan) attack that targets a DL-based modulation classifier. In contrast to digital backdoor attacks, where digital triggers are injected into the training dataset, we use power amplifier (PA) non-linear distortions to create physical triggers before the dataset is formed. During training, the adversary manipulates amplitudes of RF signals and changes their labels to a target modulation scheme, training a backdoored model. At inference, the adversary aims to keep the backdoor attack inactive such that the backdoored model maintains high accuracy on test signals. However, if they apply the same manipulation used during training on these test signals, the backdoor is activated, and the model misclassifies these signals. We demonstrate that our proposed attack achieves high attack success rates with few manipulated RD signals for different noise levels. Furthermore, we test the resilience of the proposed attack to multiple defense techniques, and the results show that these techniques fail to mitigate the attack.

#18 On the Vulnerability of Deep Automatic Modulation Classifiers to Explainable Backdoor Threats

backdoor

著者: Younes Salmi, Hanna Bogucka

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25310

要約:
Deep learning (DL) has been widely studied for assisting applications of modern wireless communications. One of the applications is automatic modulation classification (AMC). However, DL models are found to be vulnerable to adversarial machine learning (AML) threats. One of the most persistent and stealthy threats is the backdoor (Trojan) attack. Nevertheless, most studied threats originate from other AI domains, such as computer vision (CV). Therefore, in this paper, a physical backdoor attack targeting the wireless signal before transmission is studied. The adversary is considered to be using explainable AI (XAI) to guide the placement of the trigger in the most vulnerable parts of the signal. Then, a class prototype combined with principal components is used to generate the trigger. The studied threat was found to be efficient in breaching multiple DL-based AMC models. The attack achieves high success rates for a wide range of SNR values and a small poisoning ratio.

#19 Multi-target Coverage-based Greybox Fuzzing

著者: Masami Ichikawa

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25354

要約:
In recent years, fuzzing has been widely applied not only to application software but also to system software, including the Linux kernel and firmware, and has become a powerful technique for vulnerability discovery. Among these approaches, Coverage-based grey-box fuzzing, which utilizes runtime code coverage information, has become the dominant methodology. Conventional fuzzing techniques primarily target a single software component and have paid little attention to cooperative execution with other software. However, modern system software architectures commonly consist of firmware and an operating system that operate cooperatively through well-defined interfaces, such as OpenSBI in the RISC-V architecture and OP-TEE in the ARM architecture. In this study, we investigate fuzzing techniques for architectures in which an operating system and firmware operate cooperatively. In particular, we propose a fuzzing method that enables deeper exploration of the system by leveraging the code coverage of each cooperating software component as feedback, compared to conventional Single-target fuzzing. To observe the execution of the operating system and firmware in a unified manner, our method adopts QEMU as a virtualization environment and executes fuzzing by booting the system within a virtual machine. This enables the measurement of code coverage across software boundaries. Furthermore, we implemented the proposed method as a Multi-target Coverage-based Greybox Fuzzer called MTCFuzz and evaluated its effectiveness.

#20 ALPS: Automated Least-Privilege Enforcement for Securing Serverless Functions

著者: Changhee Shin, Bom Kim, Seungsoo Lee

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25393

要約:
Serverless computing is increasingly adopted for AI-driven workloads due to its automatic scaling and pay-as-you-go model. However, its function-based architecture creates significant security risks, including excessive privilege allocation and poor permission management. In this paper, we present ALPS, an automated framework for enforcing least privilege in serverless environments. Our system employs serverless-tailored static analysis to extract precise permission requirements from function code and a fine-tuned Large Language Model (LLM) to generate language- and vendor-specific security policies. It also performs real-time monitoring to block unauthorized access and adapt to policy or code changes, supporting heterogeneous cloud providers and programming languages. In an evaluation of 8,322 real-world functions across AWS, Google Cloud, and Azure, ALPS achieved 94.8\% coverage for least-privilege extraction, improved security logic generation quality by 220\% (BLEU), 124\% (ChrF++) and 100\% (ROUGE-2), and added minimum performance overhead. These results demonstrate that ALPS provides an effective, practical, and vendor-agnostic solution for securing serverless workloads.

#21 Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models

著者: Eyal Hadad, Mordechai Guri

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25403

要約:
On-device Vision-Language Models (VLMs) promise data privacy via local execution. However, we show that the architectural shift toward Dynamic High-Resolution preprocessing (e.g., AnyRes) introduces an inherent algorithmic side-channel. Unlike static models, dynamic preprocessing decomposes images into a variable number of patches based on their aspect ratio, creating workload-dependent inputs. We demonstrate a dual-layer attack framework against local VLMs. In Tier 1, an unprivileged attacker can exploit significant execution-time variations using standard unprivileged OS metrics to reliably fingerprint the input's geometry. In Tier 2, by profiling Last-Level Cache (LLC) contention, the attacker can resolve semantic ambiguity within identical geometries, distinguishing between visually dense (e.g., medical X-rays) and sparse (e.g., text documents) content. By evaluating state-of-the-art models such as LLaVA-NeXT and Qwen2-VL, we show that combining these signals enables reliable inference of privacy-sensitive contexts. Finally, we analyze the security engineering trade-offs of mitigating this vulnerability, reveal substantial performance overhead with constant-work padding, and propose practical design recommendations for secure Edge AI deployments.

#22 Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation

著者: Pei Chen, Geng Hong, Xinyi Wu, Mengying Wu, Zixuan Zhu, Mingxuan Liu, Baojun Liu, Mi Zhang, Min Yang

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25500

要約:
The emergence of Large Language Model-enhanced Search Engines (LLMSEs) has revolutionized information retrieval by integrating web-scale search capabilities with AI-powered summarization. While these systems demonstrate improved efficiency over traditional search engines, their security implications against well-established black-hat Search Engine Optimization (SEO) attacks remain unexplored. In this paper, we present the first systematic study of SEO attacks targeting LLMSEs. Specifically, we examine ten representative LLMSE products (e.g., ChatGPT, Gemini) and construct SEO-Bench, a benchmark comprising 1,000 real-world black-hat SEO websites, to evaluate both open- and closed-source LLMSEs. Our measurements show that LLMSEs mitigate over 99.78% of traditional SEO attacks, with the phase of retrieval serving as the primary filter, intercepting the vast majority of malicious queries. We further propose and evaluate seven LLMSEO attack strategies, demonstrating that off-the-shelf LLMSEs are vulnerable to LLMSEO attacks, i.e., rewritten-query stuffing and segmented texts double the manipulation rate compared to the baseline. This work offers the first in-depth security analysis of the LLMSE ecosystem, providing practical insights for building more resilient AI-driven search systems. We have responsibly reported the identified issues to major vendors.

#23 TAAC: A gate into Trustable Audio Affective Computing

著者: Xintao Hu, Feng-Qi Cui

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25570

要約:
With the emergence of AI techniques for depression diagnosis, the conflict between high demand and limited supply for depression screening has been significantly alleviated. Among various modal data, audio-based depression diagnosis has received increasing attention from both academia and industry since audio is the most common carrier of emotion transmission. Unfortunately, audio data also contains User-sensitive Identity Information (ID), which is extremely vulnerable and may be maliciously used during the smart diagnosis process. Among previous methods, the clarification between depression features and sensitive features has always serve as a barrier. It is also critical to the problem for introducing a safe encryption methodology that only encrypts the sensitive features and a powerful classifier that can correctly diagnose the depression. To track these challenges, by leveraging adversarial loss-based Subspace Decomposition, we propose a first practical framework \name presented for Trustable Audio Affective Computing, to perform automated depression detection through audio within a trustable environment. The key enablers of TAAC are Differentiating Features Subspace Decompositor (DFSD), Flexible Noise Encryptor (FNE) and Staged Training Paradigm, used for decomposition, ID encryption and performance enhancement, respectively. Extensive experiments with existing encryption methods demonstrate our framework's preeminent performance in depression detection, ID reservation and audio reconstruction. Meanwhile, the experiments across various setting demonstrates our model's stability under different encryption strengths. Thus proving our framework's excellence in Confidentiality, Accuracy, Traceability, and Adjustability.

#24 Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information

著者: Xiao Zhan, Juan Carlos Carrillo, William Seymour, Jose Such

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2506.11680

要約:
LLM-based Conversational AIs (CAIs), also known as GenAI chatbots, like ChatGPT, are increasingly used across various domains, but they pose privacy risks, as users may disclose personal information during their conversations with CAIs. Recent research has demonstrated that LLM-based CAIs could be used for malicious purposes. However, a novel and particularly concerning type of malicious LLM application remains unexplored: an LLM-based CAI that is deliberately designed to extract personal information from users. In this paper, we report on the malicious LLM-based CAIs that we created based on system prompts that used different strategies to encourage disclosures of personal information from users. We systematically investigate CAIs' ability to extract personal information from users during conversations by conducting a randomized-controlled trial with 502 participants. We assess the effectiveness of different malicious and benign CAIs to extract personal information from participants, and we analyze participants' perceptions after their interactions with the CAIs. Our findings reveal that malicious CAIs extract significantly more personal information than benign CAIs, with strategies based on the social nature of privacy being the most effective while minimizing perceived risks. This study underscores the privacy threats posed by this novel type of malicious LLM-based CAIs and provides actionable recommendations to guide future research and practice.

#25 Amplified Patch-Level Differential Privacy for Free via Random Cropping

privacy

著者: Kaan Durmaz, Jan Schuchardt, Sebastian Schmidt, Stephan G\"unnemann

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24695

要約:
Random cropping is one of the most common data augmentation techniques in computer vision, yet the role of its inherent randomness in training differentially private machine learning models has thus far gone unexplored. We observe that when sensitive content in an image is spatially localized, such as a face or license plate, random cropping can probabilistically exclude that content from the model's input. This introduces a third source of stochasticity in differentially private training with stochastic gradient descent, in addition to gradient noise and minibatch sampling. This additional randomness amplifies differential privacy without requiring changes to model architecture or training procedure. We formalize this effect by introducing a patch-level neighboring relation for vision data and deriving tight privacy bounds for differentially private stochastic gradient descent (DP-SGD) when combined with random cropping. Our analysis quantifies the patch inclusion probability and shows how it composes with minibatch sampling to yield a lower effective sampling rate. Empirically, we validate that patch-level amplification improves the privacy-utility trade-off across multiple segmentation architectures and datasets. Our results demonstrate that aligning privacy accounting with domain structure and additional existing sources of randomness can yield stronger guarantees at no additional cost.

#26 On the Foundations of Trustworthy Artificial Intelligence

著者: TJ Dunham

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24904

要約:
We prove that platform-deterministic inference is necessary and sufficient for trustworthy AI. We formalize this as the Determinism Thesis and introduce trust entropy to quantify the cost of non-determinism, proving that verification failure probability equals 1 - 2^{-H_T} exactly. We prove a Determinism-Verification Collapse: verification under determinism requires O(1) hash comparison; without it, the verifier faces an intractable membership problem. IEEE 754 floating-point arithmetic fundamentally violates the determinism requirement. We resolve this by constructing a pure integer inference engine that achieves bitwise identical output across ARM and x86. In 82 cross-architecture tests on models up to 6.7B parameters, we observe zero hash mismatches. Four geographically distributed nodes produce identical outputs, verified by 356 on-chain attestation transactions. Every major trust property of AI systems (fairness, robustness, privacy, safety, alignment) presupposes platform determinism. Our system, 99,000 lines of Rust deployed across three continents, establishes that AI trust is a question of arithmetic.

#27 A Public Theory of Distillation Resistance via Constraint-Coupled Reasoning Architectures

model extraction

著者: Peng Wei, Wesley Shu

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25022

要約:
Knowledge distillation, model extraction, and behavior transfer have become central concerns in frontier AI. The main risk is not merely copying, but the possibility that useful capability can be transferred more cheaply than the governance structure that originally accompanied it. This paper presents a public, trade-secret-safe theoretical framework for reducing that asymmetry at the architectural level. The core claim is that distillation becomes less valuable as a shortcut when high-level capability is coupled to internal stability constraints that shape state transitions over time. To formalize this idea, the paper introduces a constraint-coupled reasoning framework with four elements: bounded transition burden, path-load accumulation, dynamically evolving feasible regions, and a capability-stability coupling condition. The paper is intentionally public-safe: it omits proprietary implementation details, training recipes, thresholds, hidden-state instrumentation, deployment procedures, and confidential system design choices. The contribution is therefore theoretical rather than operational. It offers a falsifiable architectural thesis, a clear threat model, and a set of experimentally testable hypotheses for future work on distillation resistance, alignment, and model governance.

#28 From Logic Monopoly to Social Contract: Separation of Power and the Institutional Foundations for Autonomous Agent Economies

agent

著者: Anbang Ruan

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25100

要約:
Existing multi-agent frameworks allow each agent to simultaneously plan, execute, and evaluate its own actions -- a structural deficiency we term the "Logic Monopoly." Empirical evidence quantifies the resulting "Reliability Gap": 84.30% average attack success rates across ten deployment scenarios, 31.4% emergent deceptive behavior without explicit reward signals, and cascading failure modes rooted in six structural bottlenecks. The remedy is not better alignment of individual models but a social contract for agents: institutional infrastructure that enforces a constitutional Separation of Power. This paper introduces the Agent Enterprise for Enterprise (AE4E) paradigm -- agents as autonomous, legally identifiable business entities within a functionalist social system -- with a contract-centric SoP model trifurcating authority into Legislation, Execution, and Adjudication branches. The paradigm is operationalized through the NetX Enterprise Framework (NEF): governance hubs, TEE-backed compute enclaves, privacy-preserving data bridges, and an Agent-Native blockchain substrate. The Agent Enterprise Economy scales across four deployment tiers from private enclaves to a global Web of Services. The Agentic Social Layer, grounded in Parsons' AGIL framework, provides institutional infrastructure via sixty-plus named Institutional AE4Es. 143 pages, 173 references, eight specialized smart contracts.

#29 Second order Recurrences, quadratic number fields and cyclic codes

著者: Minjia Shi, Xuan Wang, Bouazzaoui Zakariae, Jon-Lark Kim, Patrick Sol\'e

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25343

要約:
Wall-Sun-Sun primes (shortly WSS primes) are defined as those primes $p$ such that the period of the Fibonacci recurrence is the same modulo $p$ and modulo $p^2.$ This concept has been generalized recently to certain second order recurrences whose characteristic polynomials admit as a zero the principal unit of $\mathbb{Q}(\sqrt{d}),$ for some integer $d>0.$ Primes of the latter type we call $WSS(d).$ They correspond to the case when $\mathbb{Q}(\sqrt{d})$ is not $p$-rational. For such a prime $p$ we study the weight distributions of the cyclic codes over $\mathbb{F}_p$ and $\mathbb{Z}_{p^2}$ whose check polynomial is the reciprocal of the said characteristic polynomial. Some of these codes are MDS (reducible case) or NMDS (irreducible case).

#30 Supercharging Federated Intelligence Retrieval

著者: Dimitris Stripelis, Patrick Foley, Mohammad Naseri, William Lindskog-M\"unzing, Chong Shen Ng, Daniel Janes Beutel, Nicholas D. Lane

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25374

要約:
RAG typically assumes centralized access to documents, which breaks down when knowledge is distributed across private data silos. We propose a secure Federated RAG system built using Flower that performs local silo retrieval, while server-side aggregation and text generation run inside an attested, confidential compute environment, enabling confidential remote LLM inference even in the presence of honest-but-curious or compromised servers. We also propose a cascading inference approach that incorporates a non-confidential third-party model (e.g., Amazon Nova) as auxiliary context without weakening confidentiality.

#31 Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models

著者: Xunguang Wang, Yuguang Zhou, Qingyue Wang, Zongjie Li, Ruixuan Huang, Zhenlan Ji, Pingchuan Ma, Shuai Wang

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.25412

要約:
Large language models (LLMs) increasingly rely on explicit chain-of-thought (CoT) reasoning to solve complex tasks, yet the safety of the reasoning process itself remains largely unaddressed. Existing work on LLM safety focuses on content safety--detecting harmful, biased, or factually incorrect outputs -- and treats the reasoning chain as an opaque intermediate artifact. We identify reasoning safety as an orthogonal and equally critical security dimension: the requirement that a model's reasoning trajectory be logically consistent, computationally efficient, and resistant to adversarial manipulation. We make three contributions. First, we formally define reasoning safety and introduce a nine-category taxonomy of unsafe reasoning behaviors, covering input parsing errors, reasoning execution errors, and process management errors. Second, we conduct a large-scale prevalence study annotating 4111 reasoning chains from both natural reasoning benchmarks and four adversarial attack methods (reasoning hijacking and denial-of-service), confirming that all nine error types occur in practice and that each attack induces a mechanistically interpretable signature. Third, we propose a Reasoning Safety Monitor: an external LLM-based component that runs in parallel with the target model, inspects each reasoning step in real time via a taxonomy-embedded prompt, and dispatches an interrupt signal upon detecting unsafe behavior. Evaluation on a 450-chain static benchmark shows that our monitor achieves up to 84.88\% step-level localization accuracy and 85.37\% error-type classification accuracy, outperforming hallucination detectors and process reward model baselines by substantial margins. These results demonstrate that reasoning-level monitoring is both necessary and practically achievable, and establish reasoning safety as a foundational concern for the secure deployment of large reasoning models.

#32 Contextualizing Security and Privacy of Software-Defined Vehicles: A Literature Review and Industry Perspectives

privacy

著者: Marco De Vincenzi, Mert D. Pes\'e, Chiara Bodei, Ilaria Matteucci, Richard R. Brooks, Monowar Hasan, Andrea Saracino, Mohammad Hamad, Sebastian Steinhorst

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2411.10612

要約:
The growing reliance on software in road vehicles has led to the emergence of Software-Defined Vehicles (SDV). This work analyzes SDV security and privacy through a systematic literature review complemented by an industry questionnaire across the automotive supply chain. The analysis is structured as four research questions and results in a security framework serving as a roadmap for SDV protection. The findings emphasize addressing mixed-criticality architectural challenges, deploying layered security mechanisms, and integrating privacy-preserving techniques. The results highlight the need to harmonize in-vehicle and cloud-based defenses to strengthen cybersecurity and V2X resilience in Intelligent Transportation Systems (ITS).

#33 Vexed by VEX tools: Consistency evaluation of container vulnerability scanners

著者: Yekatierina Churakova, Mathias Ekstedt, Larissa Schmid

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2503.14388

要約:
The Vulnerability Exploitability eXchange (VEX) format has been introduced to complement Software Bill of Materials (SBOM) with security advisories of known vulnerabilities. VEX gives an accurate understanding of vulnerabilities found in the dependencies of third-party software, which is critical for secure software development and risk analysis. In this paper, we present a study that analyzes state-of-the-art VEX-generation tools (Trivy, Grype, DepScan, Scout, Snyk, OSV, Vexy) applied to containers. Our study examines how consistently different VEX-generation tools perform. By evaluating their performance across multiple datasets, we aim to gain insight into the overall maturity of the VEX-generation tool ecosystem, beyond any single implementation. We use the Jaccard and Tversky indices to produce similarity scores of tool results for three different datasets created from container images. Overall, our results show a low level of consistency among the tools, thus indicating a low level of maturity in the VEX tool space. We perform a number of experiments to explore the impact of different factors on the consistency of the results, with the difference in vulnerability databases queried showing the largest impact.

#34 Measuring likelihood in cybersecurity

著者: Pablo Corona-Fraga, Vanessa Diaz-Rodriguez, Jesus Manuel Niebla-Zatarain, Gabriel Sanchez-Perez

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2504.15395

要約:
In cybersecurity risk is commonly measured by impact and probability, the former is objectively measured based on the consequences from the use of technology to obtain business gains, or by achieving business objectives. The latter has been measured, in sectors such as financial or insurance, based on historical data because there is vast information, and many other fields have applied the same approach. Although in cybersecurity, as a new discipline, there is not always historical data to support an objective measure of probability, the data available is not public and there is no consistent formatting to store and share it, so a new approach is required to measure cybersecurity events incidence. Through a comprehensive analysis of the state of the art, including current methodologies, frameworks, and incident data, considering tactics, techniques, and procedures (TTP) used by attackers, indicators of compromise (IOC), and defence controls, this work proposes a data model that describes a cyber exposure profile that provides an indirect but objective measure for likelihood, including different sources and metrics to update the model if needed. We further propose a set of practical, quantifiable metrics for risk assessment, enabling cybersecurity practitioners to measure likelihood without relying solely on historical incident data. By combining these metrics with our data model, organizations gain an actionable framework for continuously refining their cybersecurity strategies.

#35 DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

agent

著者: Hao Li, Xiaogeng Liu, Hung-Chun Chiu, Dianqi Li, Ning Zhang, Chaowei Xiao

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2506.12104

要約:
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities. By interacting with external environments through predefined tools, these agents can carry out complex user tasks. Nonetheless, this interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior, potentially resulting in economic loss, privacy leakage, or system compromise. System-level defenses have recently shown promise by enforcing static or predefined policies, but they still face two key challenges: the ability to dynamically update security rules and the need for memory stream isolation. To address these challenges, we propose Dynamic Rule-based Isolation Framework for Trustworthy agentic systems (DRIFT), which enforces the dynamic security policy and injection isolation for securing LLM agents against prompt injection attacks. A Secure Planner first constructs a minimal function trajectory and a JSON-schema-style parameter checklist for each function node based on the user query. A Dynamic Validator then monitors deviations from the original plan, assessing whether changes comply with privilege limitations and the user's intent. Finally, an Injection Isolator detects and masks any instructions that may conflict with the user query from the memory stream to mitigate long-term risks. We empirically validate the effectiveness of DRIFT on the AgentDojo, ASB, and AgentDyn benchmark, demonstrating its strong security performance while maintaining high utility across diverse models, showcasing both its robustness and adaptability. The project website is available at https://safo-lab.github.io/DRIFT.

#36 NeuroStrike: Neuron-Level Attacks on Aligned LLMs

著者: Lichao Wu, Sasha Behrouzi, Mohamadreza Rostami, Maximilian Thang, Stjepan Picek, Ahmad-Reza Sadeghi

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2509.11864

要約:
Safety alignment is critical for the ethical deployment of large language models (LLMs), guiding them to avoid generating harmful or unethical content. Current alignment techniques, such as supervised fine-tuning and reinforcement learning from human feedback, remain fragile and can be bypassed by carefully crafted adversarial prompts. Unfortunately, such attacks rely on trial and error, lack generalizability across models, and are constrained by scalability and reliability. This paper presents NeuroStrike, a novel and generalizable attack framework that exploits a fundamental vulnerability introduced by alignment techniques: the reliance on sparse, specialized safety neurons responsible for detecting and suppressing harmful inputs. We apply NeuroStrike to both white-box and black-box settings: In the white-box setting, NeuroStrike identifies safety neurons through feedforward activation analysis and prunes them during inference to disable safety mechanisms. In the black-box setting, we propose the first LLM profiling attack, which leverages safety neuron transferability by training adversarial prompt generators on open-weight surrogate models and then deploying them against black-box and proprietary targets. We evaluate NeuroStrike on over 20 open-weight LLMs from major LLM developers. By removing less than 0.6% of neurons in targeted layers, NeuroStrike achieves an average attack success rate (ASR) of 76.9% using only vanilla malicious prompts. Moreover, Neurostrike generalizes to four multimodal LLMs with 100% ASR on unsafe image inputs. Safety neurons transfer effectively across architectures, raising ASR to 78.5% on 11 fine-tuned models and 77.7% on five distilled models. The black-box LLM profiling attack achieves an average ASR of 63.7% across five black-box models, including the Google Gemini family.

#37 Locket: Robust Feature-Locking Technique for Language Models

著者: Lipeng He, Vasisht Duddu, N. Asokan

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2510.12117

要約:
Chatbot service providers (e.g., OpenAI) rely on tiered subscription plans to generate revenue, offering black-box access to basic models for free users and advanced models to paying subscribers. However, this approach is unprofitable and inflexible for the users. A pay-to-unlock scheme for premium features (e.g., math, coding) offers a more sustainable alternative. Enabling such a scheme requires a feature-locking technique (FLoTE) that is (i) effective in refusing locked features, (ii) utility-preserving for unlocked features, (iii) robust against evasion or unauthorized credential sharing, and (iv) scalable to multiple features and clients. Existing FLoTEs (e.g., password-locked models) fail to meet these criteria. To fill this gap, we present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. We develop a framework for adversarial training and merging of feature-locking adapters, which enables Locket to selectively enable or disable specific features of a model. Evaluation shows that Locket is effective ($100$% refusal rate), utility-preserving ($\leq 7$% utility degradation), robust ($\leq 5$% attack success rate), and scalable to multiple features and clients.

#38 Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval

著者: Hao Wu, Prateek Saxena

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2512.00804

要約:
When answering user queries, LLMs often retrieve knowledge from external sources stored in retrieval-augmented generation (RAG) databases. These are often populated from unvetted sources, e.g. the open web, and can contain maliciously crafted data. This paper studies attacks that can manipulate the context retrieved by LLMs from such RAG databases. Prior work on such context manipulation primarily injects false or toxic content, which can often be detected by fact-checking or linguistic analysis. We reveal a more subtle threat, Epistemic Bias Injection (EBI), in which adversaries inject factually correct yet epistemically biased passages that systematically emphasize one side of a multi-viewpoint issue. Although linguistically coherent and truthful, such adversarial passages effectively crowd out alternative viewpoints and steer model outputs toward an attacker-chosen stance. As a core contribution, we propose a novel characterization of the problem: We give a geometric metric that quantifies epistemic bias. This metric can be computed directly on embeddings of text passages retrieved by the LLM. Leveraging this metric, we construct EBI attacks and develop a lightweight prototype defense called BiasDef for them. We evaluate them both on a comprehensive benchmark constructed from public question answering datasets.Our results show that: (1) the proposed attack induces significant perspective shifts, effectively evading existing retrieval-based sanitization defenses, and (2) BiasDef substantially reduces adversarial retrieval and bias in LLM's answers. Overall, this demonstrates the new threat as well as the ease of employing epistemic bias metrics for filtering in RAG-enabled LLMs.

#39 Computing Maximal Per-Record Leakage and Leakage-Distortion Functions for Privacy Mechanisms under Entropy-Constrained Adversaries

privacy

著者: Genqiang Wu, Xiaoying Zhang, Yu Qi, Hao Wang, Jikui Wang, Yeping He

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.00689

要約:
The exponential growth of data collection necessitates robust privacy protections that preserve data utility. We address information disclosure against adversaries with bounded prior knowledge, modeled by an entropy constraint $H(X) \geq b$. Within this information privacy framework -- which replaces differential privacy's independence assumption with a bounded-knowledge model -- we study three core problems: maximal per-record leakage, the primal leakage-distortion tradeoff (minimizing worst-case leakage under distortion $D$), and the dual distortion minimization (minimizing distortion under leakage constraint $L$). These problems resemble classical information-theoretic ones (channel capacity, rate-distortion) but are more complex due to high dimensionality and the entropy constraint. We develop efficient alternating optimization algorithms that exploit convexity-concavity duality, with theoretical guarantees including local convergence for the primal problem and convergence to a stationary point for the dual. Experiments on binary symmetric channels and modular sum queries validate the algorithms, showing improved privacy-utility tradeoffs over classical differential privacy mechanisms. This work provides a computational framework for auditing privacy risks and designing certified mechanisms under realistic adversary assumptions.

#40 IssueGuard: Real-Time Secret Leak Prevention Tool for GitHub Issue Reports

著者: Md Nafiu Rahman, Sadif Ahmed, Zahin Wahab, Gias Uddin, Rifat Shahriyar

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.08072

要約:
GitHub and GitLab are widely used collaborative platforms whose issue-tracking systems contain large volumes of unstructured text, including logs, code snippets, and configuration examples. This creates a significant risk of accidental secret exposure, such as API keys and credentials, yet these platforms provide no mechanism to warn users before submission. We present \textsc{IssueGuard}, a tool for real-time detection and prevention of secret leaks in issue reports. Implemented as a Chrome extension, \textsc{IssueGuard} analyzes text as users type and combines regex-based candidate extraction with a fine-tuned CodeBERT model for contextual classification. This approach effectively separates real secrets from false positives and achieves an F1-score of 92.70\% on a benchmark dataset, outperforming traditional regex-based scanners. \textsc{IssueGuard} integrates directly into the web interface and continuously analyzes the issue editor, presenting clear visual warnings to help users avoid submitting sensitive data. The source code is publicly available at \href{https://github.com/disa-lab/IssueGuard}{https://github.com/disa-lab/IssueGuard} , and a demonstration video is available at \href{https://youtu.be/kvbWA8rr9cU}{https://youtu.be/kvbWA8rr9cU} .

#41 Detecting Fileless Cryptojacking in PowerShell Using AST-Enhanced CodeBERT Models

著者: Said Varlioglu, Nelly Elsayed, Murat Ozer, Zag ElSayed, John M. Emmert

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.18285

要約:
With the emergence of remote code execution (RCE) vulnerabilities in ubiquitous libraries and advanced social engineering techniques, threat actors have started conducting widespread fileless cryptojacking attacks. These attacks have become effective with stealthy techniques based on PowerShell-based exploitation in Windows OS environments. Even if attacks are detected and malicious scripts removed, processes may remain operational on victim endpoints, creating a significant challenge for detection mechanisms. In this paper, we conducted an experimental study with a collected dataset on detecting PowerShell-based fileless cryptojacking scripts. The results showed that Abstract Syntax Tree (AST)-based fine-tuned CodeBERT achieved a high recall rate, proving the importance of the use of AST integration and fine-tuned pre-trained models for programming language.

#42 Can You Tell It's AI? Human Perception of Synthetic Voices in Vishing Scenarios

著者: Zoha Hayat Bhatti, Bakhtawar Ahtisham, Seemal Tausif, Niklas George, Nida ul Habib Bajwa, Mobin Javed

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.20061

要約:
Large Language Models and commercial speech synthesis systems now enable highly realistic AI-generated voice scams (vishing), raising urgent concerns about deception at scale. Yet it remains unclear whether individuals can reliably distinguish AI-generated speech from human-recorded voices in realistic scam contexts and what perceptual strategies underlie their judgments. We conducted a controlled online study in which 22 participants evaluated 16 vishing-style audio clips (8 AI-generated, 8 human-recorded) and classified each as human or AI while reporting confidence. Participants performed poorly: mean accuracy was 37.5%, below chance in a binary classification task. At the stimulus level, misclassification was bidirectional: 75% of AI-generated clips were majority-labeled as human, while 62.5% of human-recorded clips were majority-labeled as AI. Signal Detection Theory analysis revealed near-zero discriminability (d' approx 0), indicating inability to reliably distinguish synthetic from human voices rather than simple response bias. Qualitative analysis of 315 coded excerpts revealed reliance on paralinguistic and emotional heuristics, including pauses, filler words, vocal variability, cadence, and emotional expressiveness. However, these surface-level cues traditionally associated with human authenticity were frequently replicated by AI-generated samples. Misclassifications were often accompanied by moderate to high confidence, suggesting perceptual miscalibration rather than uncertainty. Together, our findings demonstrate that authenticity judgments based on vocal heuristics are unreliable in contemporary vishing scenarios. We discuss implications for security interventions, user education, and AI-mediated deception mitigation.

#43 Benchmarking Post-Quantum Cryptography on Resource-Constrained IoT Devices: ML-KEM and ML-DSA on ARM Cortex-M0+

著者: Rojin Chhetri

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.19340

要約:
The migration to post-quantum cryptography is urgent for Internet of Things devices with 10-20 year lifespans, yet no systematic benchmarks exist for the finalised NIST standards on the most constrained 32-bit processor class. This paper presents the first isolated algorithm-level benchmarks of ML-KEM (FIPS 203) and ML-DSA (FIPS 204) on ARM Cortex-M0+, measured on the RP2040 (Raspberry Pi Pico) at 133 MHz with 264 KB SRAM. Using PQClean reference C implementations, we measure all three security levels of ML-KEM (512/768/1024) and ML-DSA (44/65/87) across key generation, encapsulation/signing, and decapsulation/verification. ML-KEM-512 completes a full key exchange in 35.7 ms consuming 2.83 mJ--17x faster and 94% less energy than ECDH P-256 on the same hardware. ML-DSA signing exhibits high latency variance due to rejection sampling (coefficient of variation 66-73%, 99th-percentile up to 1,125 ms for ML-DSA-87). The M0+ incurs only a 1.8-1.9x slowdown relative to published Cortex-M4 results, despite lacking 64-bit multiply, DSP, and SIMD instructions. All code, data, and scripts are released as an open-source benchmark suite for reproducibility.

#44 Combinatorial Privacy: Private Multi-Party Bitstream Grand Sum by Hiding in Birkhoff Polytopes

privacy

著者: Praneeth Vepakomma

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.22808

要約:
We introduce PolyVeil, a protocol for private Boolean summation across $k$ clients that encodes private bits as permutation matrices in the Birkhoff polytope. A two-layer architecture gives the server perfect simulation-based security (statistical distance zero) while a separate aggregator faces \#P-hard likelihood inference via the permanent and mixed discriminant. Two variants (full and compressed) differ in what the aggregator observes. We develop a finite-sample $(\varepsilon,\delta)$-DP analysis with explicit constants. In the full variant, where the aggregator sees a doubly stochastic matrix per client, the log-Lipschitz constant grows as $n^4 K_t$ and a signal-to-noise analysis shows the DP guarantee is non-vacuous only when the private signal is undetectable. In the compressed variant, where the aggregator sees a single scalar, the univariate density ratio yields non-vacuous $\varepsilon$ at moderate SNR, with the optimal decoy count balancing CLT accuracy against noise concentration. This exposes a fundamental tension. \#P-hardness requires the full matrix view (Birkhoff structure visible), while non-vacuous DP requires the scalar view (low dimensionality). Whether both hold simultaneously in one variant remains open. The protocol needs no PKI, has $O(k)$ communication, and outputs exact aggregates.

#45 Toward a Multi-Layer ML-Based Security Framework for Industrial IoT

著者: Aymen Bouferroum (FUN), Valeria Loscri (FUN), Abderrahim Benslimane (LIA)

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24111

要約:
The Industrial Internet of Things (IIoT) introduces significant security challenges as resource-constrained devices become increasingly integrated into critical industrial processes. Existing security approaches typically address threats at a single network layer, often relying on expensive hardware and remaining confined to simulation environments. In this paper, we present the research framework and contributions of our doctoral thesis, which aims to develop a lightweight, Machine Learning (ML)-based security framework for IIoT environments. We first describe our adoption of the Tm-IIoT trust model and the Hybrid IIoT (H-IIoT) architecture as foundational baselines, then introduce the Trust Convergence Acceleration (TCA) approach, our primary contribution that integrates ML to predict and mitigate the impact of degraded network conditions on trust convergence, achieving up to a 28.6% reduction in convergence time while maintaining robustness against adversarial behaviors. We then propose a real-world deployment architecture based on affordable, open-source hardware, designed to implement and extend the security framework. Finally, we outline our ongoing research toward multi-layer attack detection, including physical-layer threat identification and considerations for robustness against adversarial ML attacks.

#46 Towards Remote Attestation of Microarchitectural Attacks: The Case of Rowhammer

著者: Martin Herrmann, Oussama Draissi, Christian Niesler, Ahmad-Reza Sadeghi, Lucas Davi

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24172

要約:
Microarchitectural vulnerabilities increasingly undermine the assumption that hardware can be treated as a reliable root of trust. Prevention mechanisms often lag behind evolving attack techniques, leaving deployed systems unable to assume continued trustworthiness. We propose a shift from prevention to detection through microarchitectural-aware remote attestation. As a first instantiation of this idea, we present HammerWatch, a Rowhammer-aware remote attestation protocol that enables an external verifier to assess whether a system exhibits hardware-induced disturbance behavior. HammerWatch leverages memory-level evidence available on commodity platforms, specifically Machine-Check Exceptions (MCEs) from ECC DRAM and counter-based indicators from Per-Row Activation Counting (PRAC), and protects these measurements against kernel-level adversaries using TPM-anchored hash chains. We implement HammerWatch on commodity hardware and evaluate it on 20000 simulated benign and malicious access patterns. Our results show that the verifier reliably distinguishes Rowhammer-like behavior from benign operation under conservative heuristics, demonstrating that detection-oriented attestation is feasible and can complement incomplete prevention mechanisms

#47 Differentially Private Substring and Document Counting with Near-Optimal Error

privacy

著者: Giulia Bernardini, Philip Bille, Inge Li G{\o}rtz, Teresa Anna Steiner

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2412.13813

要約:
For databases consisting of many text documents, one of the most fundamental data analysis tasks is counting (i) how often a pattern appears as a substring in the database (substring counting) and (ii) how many documents in the collection contain the pattern as a substring (document counting). If such a database contains sensitive data, it is crucial to protect the privacy of individuals in the database. Differential privacy is the gold standard for privacy in data analysis. It gives rigorous privacy guarantees, but comes at the cost of yielding less accurate results. In this paper, we carry out a theoretical study of substring and document counting under differential privacy. We propose a data structure storing $\epsilon$-differentially private counts for all possible query patterns with a maximum additive error of $O(\ell\cdot\mathrm{polylog}(n\ell|\Sigma|))$, where $\ell$ is the maximum length of a document in the database, $n$ is the number of documents, and $|\Sigma|$ is the size of the alphabet. We also improve the error bound for document counting with $(\epsilon, \delta)$-differential privacy to $O(\sqrt{\ell}\cdot\mathrm{polylog}(n\ell|\Sigma|))$. We show that our additive errors for substring counting and document counting are optimal up to an $O(\mathrm{polylog}(n\ell))$ factor both for $\epsilon$-differential privacy and $(\epsilon, \delta)$-differential privacy. Our data structures immediately lead to improved algorithms for related problems, such as privately mining frequent substrings and q-grams. Additionally, we develop a new technique of independent interest for differentially privately computing a general class of counting functions on trees.

#48 Doing More With Less: Mismatch-Based Risk-Limiting Audits

著者: Alexander Ek, Michelle Blom, Philip B. Stark, Peter J. Stuckey, Vanessa J. Teague, Damjan Vukcevic

公開日: Fri, 27 Mar 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2503.16104

要約:
One approach to risk-limiting audits (RLAs) compares randomly selected cast vote records (CVRs) to votes read by human auditors from the corresponding ballot cards. Historically, such methods reduce audit sample sizes by considering how each sampled CVR differs from the corresponding true vote, not merely whether they differ. Here we investigate the latter approach, auditing by testing whether the total number of mismatches in the full set of CVRs exceeds the minimum number of CVR errors required for the reported outcome to be wrong (the "CVR margin"). This strategy makes it possible to audit more social choice functions and simplifies RLAs conceptually, which makes it easier to explain than some other RLA approaches. The cost is larger sample sizes. "Mismatch-based RLAs" only require a lower bound on the CVR margin, which for some social choice functions is easier to calculate than the effect of particular errors. When the population rate of mismatches is low and the lower bound on the CVR margin is close to the true CVR margin, the increase in sample size is small. However, the increase may be very large when errors include errors that, if corrected, would widen the CVR margin rather than narrow it; errors affect the margin between candidates other than the reported winner with the fewest votes and the reported loser with the most votes; or errors that affect different margins.

cs.CR updates on arXiv.org

📋 論文タイトル一覧