arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Robust Covert Quantum Communication under Bounded Channel Uncertainty

著者: Abbas Arghavani, Alessandro V. Papadopoulos, Vahid Azimi Mousolou, Giuseppe Nebbione, Shahid Raza

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13116

要約:
Covert quantum communication is usually analyzed under idealized assumptions that channel parameters, such as transmissivity and background noise, are perfectly known and constant. In realistic optical links, including satellite, fiber, and free-space systems, these parameters vary because of environmental fluctuations, calibration noise, and estimation errors. We study covert quantum communication over compound quantum optical channels with bounded uncertainty in both transmissivity and thermal noise, and derive guarantees that hold for all admissible channel realizations. We develop a robust framework for certifying both covertness and reliability under uncertainty. A central finding is that robustness cannot be obtained by simply inserting worst-case parameter values into known-channel bounds: the channel realizations that are most adverse for covertness and reliability generally occur at different corners of the uncertainty set. This creates a fundamental trade-off in secure system design. We derive a closed-form lower bound on the worst-case guaranteed number of covert qubits that can be transmitted reliably, identify a sharp feasibility boundary beyond which the guaranteed payload drops to zero, and quantify the security penalty caused by uncertainty. We validate the covertness term with QuTiP simulations of a four-mode bosonic model and combine it with an analytical reliability bound to evaluate the robust payload. Our results move covert quantum communication from nominal perfect-knowledge analysis to certified worst-case operation under uncertainty.

#2 Conflict-Aware Robust Design for Covert Wireless Communications

著者: Abbas Arghavani

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13122

要約:
Covert wireless communication aims to establish a reliable link while hiding the transmission from an adversary. In wireless settings, uncertainty plays a central role in this tradeoff: it can help mask the signal from a warden, but it also complicates robust system design. This raises a basic question: under bounded uncertainty, are reliability and covertness governed by the same adverse conditions? If not, robust covert design cannot be reduced to a single worst-case environment. In this paper, we study this question in a covert wireless model with quasi-static fading, outage-based reliability at Bob and radiometric detection at Willie. Uncertainty is represented through bounded intervals for Bob's average channel strength and Willie's noise power. To obtain a tractable characterization, we adopt a conditional large-N midpoint-threshold surrogate for Willie's detector, parameterized by a Willie-side fading realization. Within this framework, we show that the reliability constraint is governed by Bob's smallest admissible channel parameter, whereas the covertness constraint is governed by Willie's smallest admissible noise level. This establishes a conflict-aware robust-design principle: the adverse realizations for reliability and covertness differ. Based on this result, we derive closed-form expressions for the robustly feasible transmit power and the corresponding robust optimal rate. Numerical results show that bounded uncertainty contracts the feasible region, monotonically reduces the robust optimal rate, and can cause substantial loss relative to the nominal design. Monte Carlo results further show that the conditional surrogate closely tracks the midpoint-threshold radiometer in the intended low-effective-SNR regime. Overall, the paper shows that even in a streamlined wireless setting, robust covert design requires different adverse-case reasoning for reliability and covertness.

#3 Neural Stringology Based Cryptanalysis of EChaCha20

著者: Victor Kebande

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13289

要約:
Modern stream ciphers rely on strong diffusion and pseudorandom keystream generation (PKG) to resist cryptanalysis. While conventional evaluation methods such as statistical randomness tests and differential analysis provide important security assurances, they may fail to detect localized structural patterns embedded within cipher outputs. In this paper, a Neural Stringology Cryptanalysis (NSC) framework that combines classical string pattern analysis with machine learning techniques to investigate potential structural anomalies in stream cipher keystreams is introduced. The proposed approach first applies stringology-inspired feature extraction methods such as m-gram frequency analysis, substring recurrence detection, and positional pattern statistics aligned with the internal operations of Add-Rotate-XOR (ARX) based stream ciphers. These extracted features are then analyzed using a neural learning model to identify deviations from expected random behavior and to detect subtle structural patterns that may not be captured by traditional statistical tests. Experimental evaluation is conducted on keystream outputs generated by the EChaCha20 stream cipher under multiple configurations, including reduced round variants. The results demonstrate that the proposed NSC framework can identify distinguishable structural characteristics in the keystream data under controlled conditions, suggesting that integrating machine learning with stringology-based analysis provides a promising complementary methodology for evaluating the structural robustness of modern ARX-based stream cipher designs.

#4 Can Agents Secure Hardware? Evaluating Agentic LLM-Driven Obfuscation for IP Protection

agent

著者: Sujan Ghimire, Parsa Mirfasihi, Muhtasim Alam Chowdhury, Veeramani Pugazhenthi, Harish Kumar Dharavath, Farshad Firouzi, Rozhin Yasaei, Pratik Satam, Soheil Salehi

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13298

要約:
The globalization of integrated circuit (IC) design and manufacturing has increased the exposure of hardware intellectual property (IP) to untrusted stages of the supply chain, raising concerns about reverse engineering, piracy, tampering, and overbuilding. Hardware netlist obfuscation is a promising countermeasure, but automating the generation of functionally correct and security-relevant obfuscated circuits remains challenging, particularly for benchmark-scale designs. This paper presents an agentic, large language model (LLM)-driven framework for automated hardware netlist obfuscation. The proposed framework combines retrieval-grounded planning, structured lock-plan generation, deterministic netlist compilation, functional verification, and SAT-based security evaluation. Rather than a single prompt-to-output generation step, the framework decomposes the task into specialized stages for circuit analysis, synthesis, verification, and attack evaluation. We evaluate the framework on ISCAS-85 benchmarks using functional equivalence checking and SAT-based attacks. Results show that the framework generates correct locked netlists while introducing measurable output corruption under incorrect keys, while SAT attacks remain effective. These findings highlight both the potential and current limitations of agentic LLM-driven obfuscation.

#5 Honeypot Protocol

著者: Najmul Hasan

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13301

要約:
Trusted monitoring, the standard defense in AI control, is vulnerable to adaptive attacks, collusion, and strategic attack selection. All of these exploit the fact that monitoring is passive: it observes model behavior but never probes whether the model would behave differently under different perceived conditions. We introduce the honeypot protocol, which tests for context-dependent behavior by varying only the system prompt across three conditions (evaluation, synthetic deployment, explicit no-monitoring) while holding the task, environment, and scoring identical. We evaluate Claude Opus 4.6 in BashArena across all three conditions in both honest and attack modes. The model achieved 100% main task success and triggered zero side tasks uniformly across conditions, providing a baseline for future comparisons with stronger attack policies and additional models.

#6 Threat Modeling and Attack Surface Analysis of IoT-Enabled Controlled Environment Agriculture Systems

著者: Andrii Vakhnovskyi

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13308

要約:
The United States designates Food and Agriculture as one of sixteen critical infrastructure sectors, yet no mandatory cybersecurity requirements exist for agricultural operations and no formal threat model has been published for Controlled Environment Agriculture (CEA) systems. This paper presents the first comprehensive threat model for IoT-enabled CEA, applying STRIDE analysis, MITRE ATT&CK for ICS mapping, and IEC 62443 zone-and-conduit decomposition to a production platform deployed across 30+ commercial facilities in 8 U.S. climate zones. We enumerate 123 unique threats across 25 data-flow-diagram elements spanning 15 communication protocols, 10 of which operate with zero authentication or encryption by design. We identify five novel attack classes unique to AI-driven CEA: stealth destabilization of neural-network-tuned PID controllers, baseline drift poisoning of anomaly detectors, cross-facility propagation via federated transfer learning, adversarial agronomic schedules that exploit crop biology rather than computational models, and reward poisoning of reinforcement-learning energy optimizers. Physical impact analysis quantifies crop loss timelines from minutes (aeroponics) to days, including worker safety hazards from CO2 injection manipulation. A survey of 10 commercial CEA vendors reveals only one CVE ever issued, zero bug bounty programs, and zero IEC 62443 certifications. We propose a defense-in-depth countermeasure framework and recommend Security Level 2 as a minimum baseline.

#7 Secure and Privacy-Preserving Vertical Federated Learning

privacy

著者: Shan Jin, Sai Rahul Rachuri, Yizhen Wang, Anderson C. A. Nascimento, Yiwei Cai

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13474

要約:
We propose a novel end-to-end privacy-preserving framework, instantiated by three efficient protocols for different deployment scenarios, covering both input and output privacy, for the vertically split scenario in federated learning (FL), where features are split across clients and labels are not shared by all parties. We do so by distributing the role of the aggregator in FL into multiple servers and having them run secure multiparty computation (MPC) protocols to perform model and feature aggregation and apply differential privacy (DP) to the final released model. While a naive solution would have the clients delegating the entirety of training to run in MPC between the servers, our optimized solution, which supports purely global and also global-local models updates with privacy-preserving, drastically reduces the amount of computation and communication performed using multiparty computation. The experimental results also show the effectiveness of our protocols.

#8 SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment

agent

著者: Xixun Lin, Yang Liu, Yancheng Chen, Yongxuan Wu, Yucheng Ning, Yilong Liu, Nan Sun, Shun Zhang, Bin Chong, Chuan Zhou, Yanan Cao, Li Guo

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13630

要約:
The performance of large language model (LLM) agents depends critically on the execution harness, the system layer that orchestrates tool use, context management, and state persistence. Yet this same architectural centrality makes the harness a high-value attack surface: a single compromise at the harness level can cascade through the entire execution pipeline. We observe that existing security approaches suffer from structural mismatch, leaving them blind to harness-internal state and unable to coordinate across the different phases of agent operation. In this paper, we introduce \safeharness{}, a security architecture in which four proposed defense layers are woven directly into the agent lifecycle to address above significant limitations: adversarial context filtering at input processing, tiered causal verification at decision making, privilege-separated tool control at action execution, and safe rollback with adaptive degradation at state update. The proposed cross-layer mechanisms tie these layers together, escalating verification rigor, triggering rollbacks, and tightening tool privileges whenever sustained anomalies are detected. We evaluate \safeharness{} on benchmark datasets across diverse harness configurations, comparing against four security baselines under five attack scenarios spanning six threat categories. Compared to the unprotected baseline, \safeharness{} achieves an average reduction of approximately 38\% in UBR and 42\% in ASR, substantially lowering both the unsafe behavior rate and the attack success rate while preserving core task utility.

#9 Where Trust Fails: Mapping Location-Data Provenance Risks in Europe

著者: Eduardo Brito, Liina Kamm

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13668

要約:
European digital sovereignty and security increasingly depends on whether high-impact decisions can be grounded in location evidence that remains credible under adversarial pressure. This paper frames a cross-sector analysis as a location-data provenance problem: not merely what a device or service reports as location, but whether there is contestable evidence about where and when an asserted event occurred, who or what produced the assertion, and under which audit and retention guarantees. There are observable patterns across democratic processes and the information environment, trade and origin-sensitive supply chains, finance and illicit shipping flows, critical infrastructure and mobility, and harms targeting individuals' private and social domains. In these patterns we see a recurring asymmetry in which locality, presence, routing, or jurisdiction can be asserted cheaply while institutions and affected parties face costly reconstruction when disputes arise. To make this challenge actionable, this paper introduces a compact risk taxonomy that decomposes provenance failures into integrity axes and recurring failure modes, and derives design expectations for next-generation digital trust infrastructure centered on contestability under dispute, while remaining privacy- and rights-compatible. It argues for treating location as a digital primitive that should be represented as evidence-bearing claims rather than self-asserted coordinates, and positions proof-of-location (PoL) mechanisms as a candidate capability layer for producing verifiable presence claims under explicit threat and privacy assumptions. The outcome is a sector-neutral foundation for future architectural work on a next-generation digital trust infrastructure for Europe.

#10 RealVuln: Benchmarking Rule-Based, General-Purpose LLM, and Security-Specialized Scanners on Real-World Code

著者: John Pellew, Faizan Raza

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13764

要約:
How do security scanners perform on real-world code? We present RealVuln, the first open-source benchmark comparing Rule-Based SAST, General-Purpose LLMs, and Security-Specialized scanners on 26 intentionally vulnerable Python repositories (educational and Capture-The-Flag applications) with 796 hand-labeled entries (676 vulnerabilities, 120 false-positive traps). We test 15 scanners (3 Rule-Based SAST, 10 General-Purpose LLM, 2 Security-Specialized) and rank them by F3 score (beta=3, weighting recall 9x over precision). A clear three-tier ranking emerges under all metrics. Under F3, the Security-Specialized scanner Kolega.Dev (73.0) leads, followed by the best General-Purpose LLM, Claude Sonnet 4.6 (51.7), which in turn scores nearly 3x higher than the best Rule-Based tool, Semgrep (17.7). Under F1, Sonnet 4.6 leads (60.9) with Kolega.Dev at 52.4. Rankings within tiers shift with beta, but the three-tier hierarchy holds across all weightings. All code, ground-truth data, scanner outputs, and scoring scripts are released under an open-source license. An interactive dashboard is at https://realvuln.kolega.dev/. RealVuln is a living benchmark: versioned, community-driven, with a roadmap toward multi-language coverage.

#11 MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems

著者: Yi Ting Shen, Kentaroh Toyoda, Alex Leung

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13849

要約:
The rapid proliferation of Model Context Protocol (MCP)-based agentic systems has introduced a new category of security threats that existing frameworks are inadequately equipped to address. We present MCPThreatHive, an open-source platform that automates the end-to-end lifecycle of MCP threat intelligence: from continuous, multi-source data collection through AI-driven threat extraction and classification, to structured knowledge graph storage and interactive visualization. The platform operationalizes the MCP-38 threat taxonomy, a curated set of 38 MCP-specific threat patterns mapped to STRIDE, OWASP Top 10 for LLM Applications, and OWASP Top 10 for Agentic Applications. A composite risk scoring model provides quantitative prioritization. Through a comparative analysis of representative existing MCP security tools, we identify three critical coverage gaps that MCPThreatHive addresses: incomplete compositional attack modeling, absence of continuous threat intelligence, and lack of unified multi-framework classification.

#12 Towards Personalizing Secure Programming Education with LLM-Injected Vulnerabilities

著者: Matthew Frazier, Kostadin Damevski

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13955

要約:
According to constructivist theory, students learn software security more effectively when examples are grounded in their own code. Generic examples often fail to connect with students' prior work, limiting engagement and understanding. Advances in LLMs are now making it possible to automatically generate personalized examples by embedding security vulnerabilities directly into student-authored code. This paper introduces a method that uses LLMs to inject instances of specific Common Weakness Enumerations (CWEs) into students' own assignment code, creating individualized instructional materials. We present an agentic AI framework, using autonomous LLM-based agents equipped with task-specific tools to orchestrate injection, evaluation, ranking, and learning outcome generation. We report the experience of deploying this system in two undergraduate computer science courses (N=71), where students reviewed code samples containing LLM-injected vulnerabilities and completed a post-project survey. We compared responses with a baseline using a widely adopted set of generic security instructional materials. Students qualitatively reported finding CWE injections into their own code more relevant, clearer, and more engaging than the textbook-style examples. However, our quantitative findings revealed limited statistically significant differences, suggesting that while students valued the personalization, further studies and refinement of the approach are needed to establish stronger empirical support.

#13 KindHML: formal verification of smart contracts based on Hennessy-Milner logic

著者: Massimo Bartoletti, Angelo Ferrando, Enrico Lipparini, Vadim Malvone

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14038

要約:
Smart contracts deployed on blockchains such as Ethereum routinely manage large amounts of assets, making their security critical. Empirical studies show that real-world attacks often exploit flaws in the business logic of contracts that unfold across multiple transactions, such as liquidity or front-running attacks. Detecting these attacks requires reasoning about expressive temporal properties beyond the capabilities of existing analysis tools. In this paper, we present an automated approach to the formal verification of smart contracts, enabling the specification and verification of complex temporal properties. Our approach provides a fully automated encoding into Lustre -- the specification language supported by the Kind 2 model checker -- of an expressive subset of Solidity contracts and temporal specifications based on first-order Hennessy-Milner Logic. This encoding allows us to leverage Kind 2 to determine whether the contract respects the specification or not. We implement our approach in a toolchain that integrates the translation and verification steps, and we evaluate its effectiveness and performance on a benchmark of smart contracts and temporal properties capturing complex attack scenarios. Our results show that the proposed approach can effectively verify non-trivial temporal properties of smart contracts and detect violations that are beyond the reach of existing analysis tools.

#14 Temporary Power Adjusting Withholding Attack

著者: Mustafa Doger, Sennur Ulukus

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14135

要約:
We consider the block withholding attacks on pools, more specifically the state-of-the-art Power Adjusting Withholding (PAW) attack. We propose a generalization called Temporary PAW (T-PAW) where the adversary withholds a fPoW from pool mining at most $T$-time even when no other block is mined. We show that PAW attack corresponds to $T\to\infty$ and is not optimal. In fact, the extra reward of T-PAW compared to PAW improves by an unbounded factor as adversarial hash fraction $\alpha$, pool size $\beta$ and adversarial network influence $\gamma$ decreases. For example, the extra reward of T-PAW is 22 times that of PAW when an adversary targets a pool with $(\alpha,\beta,\gamma)=(0.05,0.05,0)$. We show that honest mining is sub-optimal to T-PAW even when there is no difficulty adjustment and the adversarial revenue increase is non-trivial, e.g., for most $(\alpha,\beta)$ at least $1\%$ within $2$ weeks in Bitcoin even when $\gamma=0$ (for PAW it was at most $0.01\%$). Hence, T-PAW exposes a significant structural weakness in pooled mining-its primary participants, small miners, are not only contributors but can easily turn into potential adversaries with immediate non-trivial benefits.

#15 PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction

privacybackdoor

著者: Prajas Wadekar, Venkata Sai Pranav Bachina, Kunal Bhosikar, Ankit Gangwal, Charu Sharma

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13153

要約:
3D Gaussian Splatting (3DGS) has recently enabled highly photorealistic 3D reconstruction from casually captured multi-view images. However, this accessibility raises a privacy concern: publicly available images or videos can be exploited to reconstruct detailed 3D models of scenes or objects without the owner's consent. We present PatchPoison, a lightweight dataset-poisoning method that prevents unauthorized 3D reconstruction. Unlike global perturbations, PatchPoison injects a small high-frequency adversarial patch, a structured checkerboard, into the periphery of each image in a multi-view dataset. The patch is designed to corrupt the feature-matching stage of Structure-from-Motion (SfM) pipelines such as COLMAP by introducing spurious correspondences that systematically misalign estimated camera poses. Consequently, downstream 3DGS optimization diverges from the correct scene geometry. On the NeRF-Synthetic benchmark, inserting a 12 X 12 pixel patch increases reconstruction error by 6.8x in LPIPS, while the poisoned images remain unobtrusive to human viewers. PatchPoison requires no pipeline modifications, offering a practical, "drop-in" preprocessing step for content creators to protect their multi-view data.

#16 Sequential Change Detection for Multiple Data Streams with Differential Privacy

privacy

著者: Lixing Zhang, Liyan Xie, Ruizhi Zhang

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13274

要約:
Sequential change-point detection seeks to rapidly identify distributional changes in streaming data while controlling false alarms. Existing multi-stream detection methods typically rely on non-private access to raw observations or intermediate statistics, limiting their usage in privacy-sensitive settings. We study sequential change-point detection for multiple data streams under differential privacy constraints. We consider multiple independent streams undergoing a synchronized change at an unknown time and in an unknown subset of streams, and propose DP-SUM-CUSUM, a differentially private detection procedure based on the summation of per-stream CUSUM statistics with calibrated Laplace noise injection. We show that DP-SUM-CUSUM satisfies sequential $\varepsilon$-differential privacy and derive bounds on the average run length to false alarm and the worst-case average detection delay, explicitly characterizing the privacy--efficiency tradeoff. A truncation-based extension is also presented to handle distributional shifts with unbounded log-likelihood ratios. Simulations and experiments on an Internet of Things (IoT) botnet dataset validate the proposed approach.

#17 Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI

privacy

著者: Tanmay Srivastava, Amartya Basu, Shubham Jain, Vaishnavi Ranganathan

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13348

要約:
We introduce CONCORD, a privacy-aware asynchronous assistant-to-assistant (A2A) framework that leverages collaboration between proactive speech-based AI. As agents evolve from reactive to always-listening assistants, they face a core privacy risk (of capturing non-consenting speakers), which makes their social deployment a challenge. To overcome this, we implement CONCORD, which enforces owner-only speech capture via real-time speaker verification, producing a one-sided transcript that incurs missing context but preserves privacy. We demonstrate that CONCORD can safely recover necessary context through (1) spatio-temporal context resolution, (2) information gap detection, and (3) minimal A2A queries governed by a relationship-aware disclosure. Instead of hallucination-prone inferring, CONCORD treats context recovery as a negotiated safe exchange between assistants. Across a multi-domain dialogue dataset, CONCORD achieves 91.4% recall in gap detection, 96% relationship classification accuracy, and 97% true negative rate in privacy-sensitive disclosure decisions. By reframing always-listening AI as a coordination problem between privacy-preserving agents, CONCORD offers a practical path toward socially deployable proactive conversational agents.

#18 Look One Step Ahead: Forward-Looking Incentive Design with Strategic Privacy for Proactive Service Provisioning over Air-Ground Integrated Edge Networks

privacy

著者: Sicheng Wu, Minghui Liwang, Yangyang Gao, Deqing Wang, Wenbo Zhu, Yiguang Hong, Wei Ni, Seyyedali Hosseinalipour

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13635

要約:
In air-ground integrated networks (AGINs), unmanned aerial vehicles (UAVs) provide on-demand edge services to ground vehicles. Realizing this vision requires carefully designed incentives to coordinate interactions among self-interested participants. This is exacerbated by the dynamic nature of AGINs, where spatio-temporal variations introduce significant uncertainty in matching UAVs and vehicles. Existing real-time service provisioning typically relies on precise trajectory information, raising privacy concerns and incurring decision latency. To address these challenges, we propose look one-step ahead (LOSA), a novel framework for efficient and privacy-aware service provisioning. By exploiting predictable vehicle travel times between intersections, LOSA decomposes the process into two coupled phases: (i) a privacy-aware look-ahead phase and (ii) a lightweight real-time execution phase. The look-ahead phase allows vehicles to adaptively adjust privacy budgets based on historical utility, balancing trajectory exposure and matching accuracy. Leveraging this, a double auction mechanism establishes binding one-step-ahead agreements (OSAAs) through trajectory similarity clustering, while constructing preference lists to hedge against mobility uncertainty. The execution phase then enforces pre-established OSAAs and preference lists, resolving real-time resource conflicts without costly re-negotiations. This design reduces computational overhead and preserves robustness. We analytically corroborate that LOSA guarantees truthfulness, individual rationality, and budget balance. Experiments on real-world datasets (DAIR-V2X, HighD, and RCooper) demonstrate that LOSA achieves superior privacy protection while lowering transaction latency compared to baseline approaches.

#19 Erlang Binary and Source Code Obfuscation

著者: Gregory Morse, Tam\'as Kozsik

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13675

要約:
This paper studies obfuscation techniques for Erlang programs at the source, abstract syntax tree, BEAM assembly, and BEAM bytecode levels. We focus on transformations that complicate reverse engineering, decompilation, and recompilation while remaining grounded in the actual behavior of the Erlang compiler, validator, loader, and virtual machine. The paper categorizes opcode-level dependency tricks, receive-based loop encodings, irregular control-flow constructions, mutability-oriented performance obfuscation, and self-modifying code enabled by dynamic module loading. A recurring theme is that effective obfuscation in BEAM often arises not from arbitrary corruption, but from exploiting representational gaps between high-level Erlang semantics and the lower-level execution model accepted by the toolchain and runtime.

#20 Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking

intellectual property

著者: Alexander Nemecek, Osama Zafar, Yuqiao Xu, Wenbiao Li, Erman Ayday

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.13776

要約:
Watermarking is becoming the default mechanism for AI content authentication, with governance policies and frameworks referencing it as infrastructure for content provenance. Yet across text, image, and audio modalities, watermark signal strength, detectability, and robustness depend on statistical properties of the content itself, properties that vary systematically across languages, cultural visual traditions, and demographic groups. We examine how this content dependence creates modality-specific pathways to bias. Reviewing the major watermarking benchmarks across modalities, we find that, with one exception, none report performance across languages, cultural content types, or population groups. To address this, we propose three concrete evaluation dimensions for pluralistic watermark benchmarking: cross-lingual detection parity, culturally diverse content coverage, and demographic disaggregation of detection metrics. We connect these to the governance frameworks currently mandating watermarking deployment and show that watermarking is held to a lower fairness standard than the generative systems it is meant to govern. Our position is that evaluation must precede deployment, and that the same bias auditing requirements applied to AI models should extend to the verification layer.

#21 Analysis of Commit Signing on Github

著者: Abubakar Sadiq Shittu, John Sadik, Farzin Gholamrezae, Scott Ruoti

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.14014

要約:
Commit signing is widely promoted as a foundation of software supply-chain security, yet prior work has studied it through the lens of individual repositories or curated project samples, missing the broader picture of how developers behave across an entire platform. Grounded in replicability theory, we vary the sampling unit from repositories to individual developers, following 71,694 active GitHub users, defined as accounts that have authored at least one commit, across all their repositories and their entire commit history, spanning 16 million commits and 874,198 repositories. This platform-wide, user-centric view reveals a fundamental gap that repository sampling cannot detect. The ecosystem's apparent high signing adoption rate is an illusion. Once platform-generated signatures are excluded, fewer than 6% of developers have ever signed a commit themselves, and the vast majority of apparent signers have never signed outside a web browser. Among the minority who do sign locally, signing rarely persists over time or across repositories, and roughly one in eight developer-managed signatures fails verification because signing keys are never uploaded to GitHub. Examining the key registry, we find that expired keys are almost never revoked and more than a quarter of users carry at least one dead key. Together, these findings reveal that commit signing as practiced today cannot serve as a dependable provenance signal at ecosystem scale, and we offer concrete recommendations for closing that gap.

#22 A compact QUBO encoding of computational logic formulae demonstrated on cryptography constructions

著者: Gregory Morse, Tam\'as Kozsik, Oskar Mencer, Peter Rakyta

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2409.07501

要約:
We aim to advance the state-of-the-art in Quadratic Unconstrained Binary Optimization formulation with a focus on cryptography algorithms. As the minimal QUBO encoding of the linear constraints of optimization problems emerges as the solution of integer linear programming (ILP) problems, by solving special boolean logic formulas (like ANF and DNF) for their integer coefficients it is straightforward to handle any normal form, or any substitution for multi-input AND, OR or XOR operations in a QUBO form. To showcase the efficiency of the proposed approach we considered the most widespread cryptography algorithms including AES-128/192/256, MD5, SHA1 and SHA256. For each of these, we achieved QUBO instances reduced by thousands of logical variables compared to previously published results, while keeping the QUBO matrix sparse and the magnitude of the coefficients low. In the particular case of AES-256 cryptography function we obtained more than 8x reduction in variable count compared to previous results. The demonstrated reduction in QUBO sizes notably increases the vulnerability of cryptography algorithms against future quantum annealers, capable of embedding around $30$ thousands of logical variables.

#23 Automated Profile Inference with Language Model Agents

agent

著者: Yuntao Du, Zitao Li, Bolin Ding, Yaliang Li, Hanshen Xiao, Jingren Zhou, Ninghui Li

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2505.12402

要約:
Impressive progress has been made in automated problem-solving by the collaboration of large language model (LLM) based agents. However, these automated capabilities also open avenues for malicious applications. In this paper, we study a new threat that LLMs pose to online pseudonymity, called automated profile inference, where an adversary can instruct LLMs to automatically collect and extract sensitive personal attributes from publicly available user activities on pseudonymous platforms. We also introduce an automated profiling framework called AutoProfiler to demonstrate and assess the feasibility of such attacks in real-world scenarios. AutoProfiler consists of four specialized LLM agents that work collaboratively to retrieve and process user online activities and generate a profile with extracted personal information. Experimental results on two real-world datasets and one synthetic dataset show that AutoProfiler is highly effective and efficient, and the inferred attributes are both identifiable and sensitive, posing significant privacy risks. We explore mitigation strategies from different perspectives and advocate for increased public awareness of this emerging privacy threat.

#24 Activation-Guided Local Editing for Jailbreaking Attacks

著者: Jiecong Wang, Haoran Li, Hao Peng, Ziqian Zeng, Zihao Wang, Haohua Du, Zhengtao Yu

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2508.00555

要約:
Jailbreaking is an essential adversarial technique for red-teaming these models to uncover and patch security flaws. However, existing jailbreak methods face significant drawbacks. Token-level jailbreak attacks often produce incoherent or unreadable inputs and exhibit poor transferability, while prompt-level attacks lack scalability and rely heavily on manual effort and human ingenuity. We propose a concise and effective two-stage framework that combines the advantages of these approaches. The first stage performs a scenario-based generation of context and rephrases the original malicious query to obscure its harmful intent. The second stage then utilizes information from the model's hidden states to guide fine-grained edits, effectively steering the model's internal representation of the input from a malicious toward a benign one. Extensive experiments demonstrate that this method achieves state-of-the-art Attack Success Rate, with gains of up to 37.74% over the strongest baseline, and exhibits excellent transferability to black-box models. Our analysis further demonstrates that AGILE maintains substantial effectiveness against prominent defense mechanisms, highlighting the limitations of current safeguards and providing valuable insights for future defense development. Our code is available at https://github.com/SELGroup/AGILE.

#25 Between a Rock and a Hard Place: The Tension Between Ethical Reasoning and Safety Alignment in LLMs

著者: Shei Pern Chua, Zhen Leng Thai, Kai Jun Teh, Xiao Li, Qibing Ren, Xiaolin Hu

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2509.05367

要約:
Large Language Model safety alignment predominantly operates on a binary assumption that requests are either safe or unsafe. This classification proves insufficient when models encounter ethical dilemmas, where the capacity to reason through moral trade-offs creates a distinct attack surface. We formalize this vulnerability through TRIAL, a multi-turn red-teaming methodology that embeds harmful requests within ethical framings. TRIAL achieves high attack success rates across most tested models by systematically exploiting the model's ethical reasoning capabilities to frame harmful actions as morally necessary compromises. Building on these insights, we introduce ERR (Ethical Reasoning Robustness), a defense framework that distinguishes between instrumental responses that enable harmful outcomes and explanatory responses that analyze ethical frameworks without endorsing harmful acts. ERR employs a Layer-Stratified Harm-Gated LoRA architecture, achieving robust defense against reasoning-based attacks while preserving model utility.

#26 Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and Opportunities

著者: Safayat Bin Hakim, Muhammad Adil, Alvaro Velasquez, Shouhuai Xu, Houbing Herbert Song

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2509.06921

要約:
Cybersecurity demands both rapid pattern recognition and deliberative reasoning, yet purely neural or purely symbolic approaches each address only one side of this duality. Neuro-Symbolic (NeSy) AI bridges this gap by integrating learning and logic within a unified framework. This systematic review analyzes 103 publications across the neural-symbolic integration spectrum in cybersecurity through April 2026, organizing them via a three-tier taxonomy -- deep integration, structured interaction, and contextual baselines -- and a Grounding-Instructibility-Alignment (G-I-A) analytical lens. We find that multi-agent and structured-integration architectures across the surveyed spectrum substantially outperform single-agent approaches in complex scenarios, causal reasoning enables proactive defense beyond correlation-based detection, and knowledge-guided learning improves both data efficiency and explainability. These findings span intrusion detection, malware analysis, vulnerability discovery, and autonomous penetration testing, revealing that integration depth often correlates with capability gains across domains. A first-of-its-kind dual-use analysis further shows that autonomous offensive systems in the broader survey corpus are already achieving notable zero-day exploitation success at significantly reduced cost, fundamentally reshaping threat landscapes. However, critical barriers persist: evaluation standardization remains nascent, computational costs constrain deployment, and effective human-AI collaboration is underexplored. We distill these findings into a prioritized research roadmap emphasizing community-driven benchmarks, responsible development practices, and defensive alignment to guide the next generation of NeSy cybersecurity systems.

#27 Post-Quantum Security of Block Cipher Constructions

著者: Gorjan Alagic, Chen Bai, Christian Majenz, Kaiyan Shi

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2510.08725

要約:
Block ciphers are versatile cryptographic ingredients that are used in a wide range of applications ranging from secure Internet communications to disk encryption. While post-quantum security of public-key cryptography has received significant attention, the case of symmetric-key cryptography (and block ciphers in particular) remains a largely unexplored topic. In this work, we set the foundations for a theory of post-quantum security for block ciphers and associated constructions. Leveraging our new techniques, we provide the first post-quantum security proofs for the key-length extension scheme FX, the tweakable block ciphers LRW and XEX, and most block cipher encryption and authentication modes. Our techniques can be used for security proofs in both the plain model and the quantum ideal cipher model. Our work takes significant initial steps in establishing a rigorous understanding of the post-quantum security of practical symmetric-key cryptography.

#28 Resolving Availability and Run-time Integrity Conflicts in Real-Time Embedded Systems

著者: Adam Caulfield, Muhammad Wasif Kamran, N. Asokan

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2511.14088

要約:
Run-time integrity enforcement in real-time systems presents a fundamental conflict with availability. Existing approaches in real-time systems primarily focus on minimizing the execution-time overhead of monitoring. After a violation is detected, prior works face a trade-off: (1) prioritize availability and allow a compromised system to continue to ensure applications meet their deadlines, or (2) prioritize security by generating a fault to abort all execution. In this work, we propose PAIR, an approach that offers a middle ground between the stark extremes of this trade-off. PAIR monitors real-time tasks for run-time integrity violations and maintains an Availability Region (AR) of all tasks that are safe to continue. When a task causes a violation, PAIR triggers a non-maskable interrupt to kill the task and continue executing a non-violating task within AR. Thus, PAIR ensures only violating tasks are prevented from execution, while granting availability to remaining tasks. With its hardware approach, PAIR does not cause any run-time overhead to the executing tasks, integrates with real-time operating systems (RTOSs), and is affordable to low-end microcontroller units (MCUs) by incurring +2.3% overhead in memory and hardware usage.

#29 Persistent BitTorrent Trackers

著者: Fran\c{c}ois-Xavier Wicht, Zhengwei Tong, Shunfan Zhou, Hang Yin, Aviv Yaish

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2511.17260

要約:
Private BitTorrent trackers enforce upload-to-download ratios to prevent free-riding, but suffer from three critical weaknesses: reputation cannot move between trackers, centralized servers create single points of failure, and upload statistics are self-reported and unverifiable. When a tracker shuts down, users lose their contribution history and cannot prove their standing to new communities. We address these problems by storing reputation in smart contracts and replacing self-reports with cryptographic attestations. Peers sign receipts for received pieces; the tracker aggregates them via BLS signatures and updates reputation. If a tracker is unavailable, peers fall back to an authenticated distributed hash table (DHT): stored reputation acts as a public key infrastructure (PKI), preserving access control without the tracker. Reputation is portable across tracker failures through single-hop migration in factory-deployed contracts. We also address the privacy implications of publishing public keys and reputations tied to private trackers on a public ledger: we propose ephemeral session keys to prevent linking peer identities, zero-knowledge membership proofs for anonymous DHT participation, and confidential reputation using homomorphic commitments. We formalize the security requirements, prove four security properties under standard cryptographic assumptions, and evaluate a prototype. Measurements show that transfer receipts add less than 5\% end-to-end overhead with typical piece sizes. To minimize signing overhead, we adopt a hybrid signature scheme: ECDSA signs individual piece receipts at transfer time for low per-operation latency, while BLS serves as the overarching scheme, enabling compact aggregation of many receipts into a single proof at report time. This design reduces client-side signing cost by an order of magnitude compared to using BLS throughout.

#30 ZK-APEX: Zero-Knowledge Approximate Personalized Unlearning with Executable Proofs

privacy

著者: Mohammad M Maheri, Sunil Cotterill, Alex Davidson, Hamed Haddadi

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2512.09953

要約:
Machine unlearning aims to remove the influence of specific data points from a trained model to satisfy privacy, copyright, and safety requirements. In real deployments, providers distribute a global model to many edge devices, where each client personalizes the model using private data. When a deletion request is issued, clients may ignore it or falsely claim compliance, and providers cannot check their parameters or data. This makes verification difficult, especially because personalized models must forget the targeted samples while preserving local utility, and verification must remain lightweight on edge devices. We introduce ZK APEX, a zero-shot personalized unlearning method that operates directly on the personalized model without retraining. ZK APEX combines sparse masking on the provider side with a small Group OBS compensation step on the client side, using a blockwise empirical Fisher matrix to create a curvature-aware update designed for low overhead. Paired with Halo2 zero-knowledge proofs, it enables the provider to verify that the correct unlearning transformation was applied without revealing any private data or personalized parameters. On Vision Transformer classification tasks, ZK APEX recovers nearly all personalization accuracy while effectively removing the targeted information. Applied to the OPT125M generative model trained on code data, it recovers around seventy percent of the original accuracy. Proof generation for the ViT case completes in about two hours, more than ten million times faster than retraining-based checks, with less than one gigabyte of memory use and proof sizes around four hundred megabytes. These results show the first practical framework for verifiable personalized unlearning on edge devices.

#31 ChatGPT: Excellent Paper! Accept It. Editor: Imposter Found! Review Rejected

著者: Kanchon Gharami, Sanjiv Kumar Sarkar, Safayat Bin Hakim, Yongxin Liu, Nahid Farhady Ghalaty, Shafika Showkat Moni

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2512.20405

要約:
Large Language Models (LLMs) like ChatGPT are now widely used in writing and reviewing scientific papers. While this trend accelerates publication growth and reduces human workload, it also introduces serious risks. Papers written or reviewed by LLMs may lack real novelty, contain fabricated or biased results, or mislead downstream research that others depend on. Such issues can damage reputations, waste resources, and even endanger lives when flawed studies influence medical or safety-critical systems. This research explores both the offensive and defensive sides of this growing threat. On the attack side, we demonstrate how an author can inject hidden prompts inside a PDF that secretly guide or "jailbreak" LLM reviewers into giving overly positive feedback and biased acceptance. On the defense side, we propose an "inject-and-detect" strategy for editors, where invisible trigger prompts are embedded into papers; if a review repeats or reacts to these triggers, it reveals that the review was generated by an LLM, not a human. This method turns prompt injections from vulnerability into a verification tool. We outline our design, expected model behaviors, and ethical safeguards for deployment. The goal is to expose how fragile today's peer-review process becomes under LLM influence and how editorial awareness can help restore trust in scientific evaluation.

#32 Safe-FedLLM: Delving into the Safety of Federated Large Language Models

著者: Mingxiang Tao, Yu Tian, Wenxuan Tu, Yue Yang, Xue Yang, Xiangyan Tang

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2601.07177

要約:
Federated learning (FL) addresses privacy and data-silo issues in the training of large language models (LLMs). Most prior work focuses on improving the efficiency of federated learning for LLMs (FedLLM). However, security in open federated environments, particularly defenses against malicious clients, remains underexplored. To investigate the security of FedLLM, we conduct a preliminary study to analyze potential attack surfaces and defensive characteristics from the perspective of LoRA updates. We find two key properties of FedLLM: 1) LLMs are vulnerable to attacks from malicious clients in FL, and 2) LoRA updates exhibit distinct behavioral patterns that can be effectively distinguished by lightweight classifiers. Based on these properties, we propose Safe-FedLLM, a probe-based defense framework for FedLLM, which constructs defenses across three levels: Step-Level, Client-Level, and Shadow-Level. The core concept of Safe-FedLLM is to perform probe-based discrimination on each client's local LoRA updates, treating them as high-dimensional behavioral features and using a lightweight classifier to determine whether they are malicious. Extensive experiments demonstrate that Safe-FedLLM effectively improves FedLLM's robustness against malicious clients while maintaining competitive performance on benign data. Notably, our method effectively suppresses the impact of malicious data without significantly affecting training speed, and remains effective even under high malicious client ratios.

#33 Rigorous and Generalized Proof of Security of Bitcoin Protocol with Bounded Network Delay

著者: Christopher Blake, Chen Feng, Xuechao Wang, Qianyu Yu

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2601.09082

要約:
A proof of the security of the Bitcoin protocol is made rigorous, and simplified in certain parts. A computational model in which an adversary can delay transmission of blocks by time $\Delta$ is considered. The protocol is generalized to allow blocks of different scores and a proof within this more general model is presented. An approach used in a previous paper that used random walk theory is shown through a counterexample to be incorrect; an approach involving a punctured block arrival process is shown to remedy this error. Thus, it is proven that with probability one, the Bitcoin protocol will have infinitely many honest blocks so long as the fully-delayed honest mining rate exceeds the adversary mining rate.

#34 In-Context Autonomous Network Incident Response: An End-to-End Large Language Model Agent Approach

agent

著者: Yiran Gao, Kim Hammar, Tao Li

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.13156

要約:
Rapidly evolving cyberattacks demand incident response systems that can autonomously learn and adapt to changing threats. Prior work has extensively explored the reinforcement learning approach, which involves learning response strategies through extensive simulation of the incident. While this approach can be effective, it requires handcrafted modeling of the simulator and suppresses useful semantics from raw system logs and alerts. To address these limitations, we propose to leverage large language models' (LLM) pre-trained security knowledge and in-context learning to create an end-to-end agentic solution for incident response planning. Specifically, our agent integrates four functionalities, perception, reasoning, planning, and action, into one lightweight LLM (14b model). Through fine-tuning and chain-of-thought reasoning, our LLM agent is capable of processing system logs and inferring the underlying network state (perception), updating its conjecture of attack models (reasoning), simulating consequences under different response strategies (planning), and generating an effective response (action). By comparing LLM-simulated outcomes with actual observations, the LLM agent repeatedly refines its attack conjecture and corresponding response, thereby demonstrating in-context adaptation. Our agentic approach is free of modeling and can run on commodity hardware. When evaluated on incident logs reported in the literature, our agent achieves recovery up to 23% faster than those of frontier LLMs.

#35 The Mirror Design Pattern: Strict Data Geometry over Model Scale for Prompt Injection Detection

著者: J Alex Corll

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.11875

要約:
Prompt injection defenses are often framed as semantic understanding problems and delegated to increasingly large neural detectors. For the first screening layer, however, the requirements are different: the detector runs on every request and therefore must be fast, deterministic, non-promptable, and auditable. We introduce Mirror, a data-curation design pattern that organizes prompt injection corpora into matched positive and negative cells so that a classifier learns control-plane attack mechanics rather than incidental corpus shortcuts. Using 5,000 strictly curated open-source samples -- the largest corpus supportable under our public-data validity contract -- we define a 32-cell mirror topology, fill 31 of those cells with public data, train a sparse character n-gram linear SVM, compile its weights into a static Rust artifact, and obtain 95.97\% recall and 92.07\% F1 on a 524-case holdout at sub-millisecond latency with no external model runtime dependencies. On the same holdout, our next line of defense, a 22-million-parameter Prompt Guard~2 model reaches 44.35\% recall and 59.14\% F1 at 49\,ms median and 324\,ms p95 latency. Linear models still leave residual semantic ambiguities such as use-versus-mention for later pipeline layers, but within that scope our results show that for L1 prompt injection screening, strict data geometry can matter more than model scale.

#36 A Large-Scale Study of Telegram Bots

著者: Taro Tsuchiya, Haoxiang Yu, Tina Marjanov, Alice Hutchings, Nicolas Christin, Alejandro Cuevas

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.24302

要約:
Telegram, initially a messaging app, has evolved into a platform where users can interact with various services through programmable applications, bots. Bots provide a wide range of uses, from moderating groups, helping with online shopping, to even executing trades in financial markets. However, Telegram has been increasingly associated with various illicit activities -- financial scams, stolen data, non-consensual image sharing, among others, raising concerns bots may be facilitating these operations. This paper is the first to characterize Telegram bots at scale, through the following contributions. First, we offer the largest general-purpose message dataset and the first bot dataset. Through snowball sampling from two published datasets, we uncover over 67,000 additional channels, 492 million messages, and 32,000 bots. Second, we develop a system to automatically interact with bots in order to extract their functionality. Third, based on their description, chat responses, and the associated channels, we classify bots into several domains. Fourth, we investigate the communities each bot serves, by analyzing supported languages, usage patterns (e.g., duration, reuse), and network topology. While our analysis discovers useful applications such as crowdsourcing, we also identify malicious bots (e.g., used for financial scams, illicit underground services) serving as payment gateways, referral systems, and malicious AI endpoints. By exhorting the research community to look at bots as software infrastructure, this work hopes to foster further research useful to content moderators, and to help interventions against illicit activities.

#37 SelfGrader: Stable Jailbreak Detection for Large Language Models using Token-Level Logits

著者: Zikai Zhang, Rui Hu, Olivera Kotevska, Jiahao Xu

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.01473

要約:
Large Language Models (LLMs) are powerful tools for answering user queries, yet they remain highly vulnerable to jailbreak attacks. Existing guardrail methods typically rely on internal features or textual responses to detect malicious queries, which either introduce substantial latency or suffer from the randomness in text generation. To overcome these limitations, we propose SelfGrader, a lightweight guardrail method that formulates jailbreak detection as a numerical grading problem using token-level logits. Specifically, SelfGrader evaluates the safety of a user query within a compact set of numerical tokens (NTs) (e.g., 0-9) and interprets their logit distribution as an internal safety signal. To align these signals with human intuition of maliciousness, SelfGrader introduces a dual-perspective scoring rule that considers both the maliciousness and benignness of the query, yielding a stable and interpretable score that reflects harmfulness and reduces the false positive rate simultaneously. Extensive experiments across diverse jailbreak benchmarks, multiple LLMs, and state-of-the-art guardrail baselines demonstrate that SelfGrader achieves up to a 22.66% reduction in ASR on LLaMA-3-8B, while maintaining significantly lower memory overhead (up to 173x) and latency (up to 26x).

#38 LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

著者: Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker, Mia Mohammad Imran

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.04288

要約:
Large language models (LLMs) are increasingly embedded in open-source software (OSS) ecosystems, creating complex interactions among natural language prompts, probabilistic model outputs, and execution-capable components. However, it remains unclear whether traditional vulnerability disclosure frameworks adequately capture these model-mediated risks. To investigate this, we analyze 295 GitHub Security Advisories published between January 2025 and January 2026 that reference LLM-related components, and we manually annotate a sample of 100 advisories using the OWASP Top 10 for LLM Applications 2025. We find no evidence of new implementation-level weakness classes specific to LLM systems. Most advisories map to established CWEs, particularly injection and deserialization weaknesses. At the same time, the OWASP-based analysis reveals recurring architectural risk patterns, especially Supply Chain, Excessive Agency, and Prompt Injection, which often co-occur across multiple stages of execution. These results suggest that existing advisory metadata captures code-level defects but underrepresents model-mediated exposure. We conclude that combining the CWE and OWASP perspectives provides a more complete and necessary view of vulnerabilities in LLM-integrated systems.

#39 Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge

privacy

著者: Gustavo de Carvalho Bertoli

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.12737

要約:
While Federated Learning (FL) mitigates direct data exposure, the resulting trained models remain susceptible to membership inference attacks (MIAs). This paper presents an empirical evaluation of Differential Privacy (DP) as a defense mechanism against MIAs in FL, leveraging the environment of the 2025 NIST Genomics Privacy-Preserving Federated Learning (PPFL) Red Teaming Event. To improve inference accuracy, we propose a stacking attack strategy that ensembles seven black-box estimators to train a meta-classifier on prediction probabilities and cross-entropy losses. We evaluate this methodology against target models under three privacy configurations: an unprotected convolutional neural network (CNN, $\epsilon=\infty$), a low-privacy DP model ($\epsilon=200$), and a high-privacy DP model ($\epsilon=10$). The attack outperforms all baselines in the No DP and Low Privacy settings and, critically, maintains measurable membership leakage at $\epsilon=200$ where a single-signal LiRA baseline collapses. Evaluated on an independent third-party benchmark, these results provide an empirical characterisation of how stacking-based inference degrades across calibrated DP tiers in FL.

#40 Random Walk Learning and the Pac-Man Attack

著者: Xingran Chen, Parimal Parag, Rohit Bhagat, Zonghong Liu, Salim El Rouayheb

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2508.05663

要約:
Random walk (RW)-based algorithms have long been popular in distributed systems due to low overheads and scalability, with recent growing applications in decentralized learning. However, their reliance on local interactions makes them inherently vulnerable to malicious behavior. In this work, we investigate an adversarial threat that we term the ``Pac-Man'' attack, in which a malicious node probabilistically terminates any RW that visits it. This stealthy behavior gradually eliminates active RWs from the network, effectively halting the learning process without triggering failure alarms. To counter this threat, we propose the Average Crossing (AC) algorithm--a fully decentralized mechanism for duplicating RWs to prevent RW extinction in the presence of Pac-Man. Our theoretical analysis establishes that (i) the RW population remains almost surely bounded under AC and (ii) RW-based stochastic gradient descent remains convergent under AC, even in the presence of Pac-Man, with a quantifiable deviation from the true optimum. Our extensive empirical results on both synthetic and real-world datasets corroborate our theoretical findings. Furthermore, they uncover a phase transition in the extinction probability as a function of the duplication threshold. We offer theoretical insights by analyzing a simplified variant of the AC, which sheds light on the observed phase transition.

#41 Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems

agent

著者: Edoardo Allegrini, Ananth Shreekumar, Z. Berkay Celik

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2510.14133

要約:
Agentic AI systems, which leverage multiple autonomous agents and large language models (LLMs), are increasingly used to address complex, multi-step tasks. The safety, security, and functionality of these systems are critical, especially in high-stakes applications. However, the current ecosystem of inter-agent communication is fragmented, with protocols such as the Model Context Protocol (MCP) for tool access and the Agent-to-Agent (A2A) protocol for coordination being analyzed in isolation. This fragmentation creates a semantic gap that prevents the rigorous analysis of system properties and introduces risks such as architectural misalignment and exploitable coordination issues. To address these challenges, we introduce a modeling framework for agentic AI systems composed of two central models: (1) the host agent model formalizes the top-level entity that interacts with the user, decomposes tasks, and orchestrates their execution by leveraging external agents and tools; (2) the task lifecycle model details the states and transitions of individual sub-tasks from creation to completion, providing a fine-grained view of task management and error handling. Together, these models provide a unified semantic framework for reasoning about the behavior of multi-AI agent systems. Grounded in this framework, we define 16 properties for the host agent and 14 for the task lifecycle, categorized into liveness, safety, completeness, and fairness. Expressed in temporal logic, these properties enable formal verification of system behavior, detection of coordination edge cases, and prevention of deadlocks and security vulnerabilities. Through this effort, we introduce the first rigorously grounded, domain-agnostic framework for the analysis, design, and deployment of correct, reliable, and robust agentic AI systems.

#42 ZK-AMS: Credibly Anonymous Admission for Web 3.0 Platforms via Recursive Proof Aggregation

著者: Zibin Lin, Taotao Wang, Shengli Zhang, Long Shi, Boris D\"udder, Shui Yu

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.16130

要約:
Web 3.0 platforms need an onboarding mechanism that can admit real users at scale without forcing them to reveal identity documents or pay one on-chain verification cost per user. Existing approaches typically rely on KYC-style disclosure, per-request on-chain verification, or trusted batching, making onboarding cost and latency difficult to predict under bursty demand. We present \textbf{ZK-AMS}, a credibly anonymous admission infrastructure that maps Personhood Credentials to anonymous on-chain Soul Accounts. Rather than introducing a new primitive, ZK-AMS composes zero-knowledge credential validation, permissionless batch submission, recursive proof aggregation, and anonymous post-admission account provisioning into one end-to-end workflow. Its key design feature is a confidential batching pipeline in which admission instances of a common relation are folded off-chain under multi-key homomorphic encryption, allowing an untrusted batch submitter to coordinate aggregation without direct access to individual user witnesses during batching; the confidentiality scope is characterized explicitly in the security analysis. The resulting batch is settled on-chain with constant verification cost per batch rather than per admitted user. We implement ZK-AMS on an Ethereum testbed and evaluate admission throughput, end-to-end latency, gas consumption, and parameter trade-offs. Results show stable batch-verification gas across evaluated batch sizes, substantially lower amortized on-chain cost than the non-recursive baseline, and practical cost-latency trade-offs for high-concurrency onboarding in Web 3.0 platforms.

#43 Composition Theorems for Multiple Differential Privacy Constraints

privacy

著者: Cemre Cadir, Salim Najib, Yanina Y. Shkel

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.20968

要約:
The exact composition of mechanisms for which two differential privacy (DP) constraints hold simultaneously is studied. The resulting privacy region admits an exact representation as a mixture over compositions of mechanisms of heterogeneous DP guarantees, yielding a framework that naturally generalizes to the composition of mechanisms for which any number of DP constraints hold. This result is shown through a structural lemma for mixtures of binary hypothesis tests. Lastly, the developed methodology is applied to approximate $f$-DP composition.

#44 Information-Theoretic Solutions for Seedless QRNG Bootstrapping and Hybrid PQC-QKD Key Combination

著者: Juan Antonio Vieira Giestinhas, Timothy Spiller

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.26907

要約:
This paper considers two challenges faced by practical quantum networks: the bootstrapping of seedless Quantum Random Number Generators (QRNGs) and the resilient combination of Post-Quantum Cryptography (PQC) and Quantum Key Distribution (QKD) keys. These issues are addressed using universal hash functions as strong seeded extractors, with security foundations provided by the Quantum Leftover Hash Lemma (QLHL). First, the 'randomness loop' in QRNGs -- the requirement of an initial random seed to generate further randomness -- is resolved by proposing a bootstrapping method using raw data from two independent sources of entropy, given by seedless QRNG sources. Second, it is argued that strong seeded extractors are an alternative to XOR-based key combining that presents different characteristics. Unlike XORing, our method ensures that if the combined output and one initial key are compromised, the remaining key material retains quantifiable min-entropy and remains secure in exchange of longer keys. Furthermore, the proposed method allows to bind transcript information with key material in a natural way, providing a tool to replace computationally based combiners to extend ITS security of the initial key material to the final combined output. By modeling PQC keys as having HILL (Hastad, Impagliazzo, Levin and Luby) entropy, the framework is extended to hybrid PQC-QKD systems. This unified approach provides a mathematically rigorous and future-proof mechanism for both randomness generation and secure key management against quantum adversaries.

#45 ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

privacy

著者: Chihan Huang, Huaijin Wang, Shuai Wang

公開日: Thu, 16 Apr 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2603.28942

要約:
The pervasive deployment of deep learning models across critical domains has concurrently intensified privacy concerns due to their inherent propensity for data memorization. While Membership Inference Attacks (MIAs) serve as the gold standard for auditing these privacy vulnerabilities, conventional MIA paradigms are increasingly constrained by the prohibitive computational costs of shadow model training and a precipitous performance degradation under low False Positive Rate constraints. To overcome these challenges, we introduce a novel perspective by leveraging the principles of model reprogramming as an active signal amplifier for privacy leakage. Building upon this insight, we present \texttt{ReproMIA}, a unified and efficient proactive framework for membership inference. We rigorously substantiate, both theoretically and empirically, how our methodology proactively induces and magnifies latent privacy footprints embedded within the model's representations. We provide specialized instantiations of \texttt{ReproMIA} across diverse architectural paradigms, including LLMs, Diffusion Models, and Classification Models. Comprehensive experimental evaluations across more than ten benchmarks and a variety of model architectures demonstrate that \texttt{ReproMIA} consistently and substantially outperforms existing state-of-the-art baselines, achieving a transformative leap in performance specifically within low-FPR regimes, such as an average of 5.25\% AUC and 10.68\% TPR@1\%FPR increase over the runner-up for LLMs, as well as 3.70\% and 12.40\% respectively for Diffusion Models.

cs.CR updates on arXiv.org

📋 論文タイトル一覧