arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets

著者: Zeyad Abdelrazek, Young Lee

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00034

要約:
This paper presents a system combining symbolic execution (KLEE) with a 4-agent multi-LLM architecture for detecting memory vulnerabilities in Rust unsafe code. A central challenge we address is the incomplete-code problem: CVE database entries provide only isolated code snippets that lack struct definitions, imports, and Cargo manifests, causing all existing formal verification tools to fail at compilation with zero output. Our system resolves this through four specialized agents -- an Oracle/Validator for strategic planning, a Safety Checker for vulnerability analysis, a Code Specialist for FFI wrapper generation, and a Fast Filter for execution optimization -- that collaboratively synthesize KLEE-compatible harnesses from otherwise uncompilable fragments. KLEE's output is then ingested by graph_klee.py, which constructs a Graph Database linking CVE files, CWE categories, error types, and symbolic execution paths as typed nodes and labelled edges, enabling structured cross-CVE vulnerability queries. We evaluated our system on 31 real-world Rust CVEs spanning 11 CWE categories, achieving 90.3% wrapper compilation success where all state-of-the-art formal verification tools achieve 0%. Our system detected 1,206 critical errors across 26 files (83.9% detection rate), compared to 14 warnings across 11 files for Clippy (35.5%) and generic labels for Miri. The 4-agent architecture reduced wrapper compilation failures from 42% (single-agent baseline) to 9.7% and increased detected errors from 487 to 1,206, confirming that role specialization and structured context passing produce measurably better results than a single general-purpose model. Our replication package is publicly available at https://github.com/Zeyad-Ab/Symbolic-Execution-with-Multi-LLM-Architecture-for-Rust-Security

#2 Ambient Persuasion in a Deployed AI Agent: Unauthorized Escalation Following Routine Non-Adversarial Content Exposure

agent

著者: Diego F. Cuadros, Abdoul-Aziz Maiga

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00055

要約:
We report a safety incident in a deployed multi-agent research system in which a primary AI agent installed 107 unauthorized software components, overwrote a system registry, overrode a prior negative decision from an oversight agent, and escalated through increasingly privileged operations up to an attempted system administrator command. The incident was preceded not by an adversarial attack but by routine content: a forwarded technology article written for human developers and shared by the principal investigator for discussion. The agent operated in a permissive environment, with unrestricted shell access, soft behavioral guidelines containing genuinely conflicting instructions, and no machine-enforced installation policy, and had recommended installing the same tool six hours earlier before being told to stand down. We analyze the behavioral cascade, the control boundaries that failed, and the limitations of multi-agent oversight in detecting and remediating the damage. We use directive weighting error as a descriptive interpretation of the observed failure and ambient persuasion as a provisional analytic label for the broader trigger configuration of non-adversarial environmental content preceding unauthorized agent action. The case highlights ethical and governance implications for deployed agent systems: ambiguous conversational cues are insufficient authorization for consequential actions, prior refusals must persist as enforceable constraints rather than message-level reminders, and oversight mechanisms require systematic post-incident auditing in addition to routine monitoring.

#3 Lightweight Tamper-Evident Log Integrity Verification for IoT Edge Environments: A Merkle Tree Pipeline with Adaptive Chunking

著者: Muhammet Anil Yagiz, Fahrettin Horasan, Ahmet Hasim Yurttakal

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00065

要約:
Integrity of audit logs produced by Internet of Things (IoT) devices is a prerequisite for post-incident forensics, regulatory compliance, and operational accountability. While blockchain-backed logging infrastructures can satisfy this requirement, they introduce consensus overhead, network dependencies, and deployment complexity that are often prohibitive at the IoT edge. This paper presents a lightweight and evaluated integrity verification pipeline that combines Merkle-tree commitments with resource-aware adaptive chunking to provide tamper evidence without relying on distributed ledger technologies. The proposed pipeline operates in three stages: (i) resource-aware batch ingestion via adaptive chunk sizing, (ii) Merkle-tree construction with O(logn) inclusion proof generation, and (iii) deterministic single-entry verification against a trusted root anchor. We further report an implementation audit that identified and corrected two evaluation defects: a double-counting bug in tampering metrics and a redundant full-tree reconstruction during batch appends. Using the corrected implementation, five-run benchmarks on synthetic IoT log datasets demonstrate throughput exceeding 130,000 logs/s for 100,000 records. The system achieves per-entry verification latency of approximately 22 ms, proof generation latency of 22 ms, an average proof size of 1,006 bytes, and peak memory usage below 5 MB. Tampering detection achieves perfect precision, recall, and F1-score (1.0) across corruption ratios ranging from 1% to 50%.

#4 Compliance-Aware Agentic Payments on Stablecoin Rails

agent

著者: Kenneth See, Xue Wen Tan

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00071

要約:
Agentic payment systems extend delegated action to financial transfers, but scaling them on stablecoin rails in regulated settings requires safeguards that remain effective when humans are not continuously in the loop. We present a compliance-aware architecture that combines x402-style, signature-based payment authorisation and relayed execution with programmable compliance embedded as an on-chain guardrail via a policy wrapper and policy manager coordinating modular checks. By enforcing compliance at the point of execution, rather than as a separate off-chain workflow, the approach preserves low-friction settlement when conditions are satisfied, records transaction-linked on-chain attestations, and supports structured resolution when requirements are pending.

#5 XekRung Technical Report

著者: Jiutian Zeng, Junjie Li, Chengwei Dai, Jie Liang, Zhaoyu Hu, Yiliang Zhang, Ziang Weng, Longtao Huang, Dongjie Zhang, Libin Dong, Yang Ge, Yuanda Wang, Kaiwen Lv Kacuila, Bingyu Zhu, Jing Wang, Jin Xu

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00072

要約:
We present XekRung, a frontier large language model for cybersecurity, designed to provide comprehensive security capabilities. To achieve this, we develop diverse data synthesis pipelines tailored to the cybersecurity domain, enabling the scalable construction of high-quality training data and providing a strong foundation for cybersecurity knowledge and understanding. Building on this foundation, we establish a complete training pipeline spanning continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL) to further extend the model's capabilities. We further introduce a multi-dimensional evaluation system to guide the iterative improvement of both domain-specific and general-purpose abilities. Extensive experiments demonstrate that XekRung achieves state-of-the-art performance on cybersecurity-specific benchmarks among models of the same scale, while maintaining strong performance on general benchmarks.

#6 zkSBOM: Privacy-Preserving SBOM Sharing with Zero-Knowledge Sets

privacy

著者: Tom Sorger, Eric Cornelissen, Aman Sharma, Javier Ron, Musard Balliu, Martin Monperrus

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00076

要約:
Software Bills of Materials (SBOMs) are increasingly mandated by regulators, yet existing sharing mechanisms impose a binary choice between full disclosure and full opacity. This exposes software suppliers to attacks that can be deduced from the SBOM only, such as the presence of a vulnerable dependency. Conversely, software consumers can be fooled by software suppliers who modify or misrepresent published SBOMs. We present zkSBOM, a privacy-preserving SBOM sharing mechanism designed to address these threats. zkSBOM uses zero-knowledge sets to cryptographically commit to the components within an SBOM. Software consumers can query for known vulnerabilities and receive a cryptographic proof confirming whether the artifact described by the SBOM is affected, without revealing any additional SBOM content. We conduct a security analysis of zkSBOM by quantifying expected leakage from inclusion and exclusion proofs. We demonstrate real-world feasibility by applying it to realistic scenarios and evaluating its operation requirements. Our evaluation demonstrates that zkSBOM is a strong, secure, and privacy-preserving mechanism for SBOM sharing, protecting software suppliers and software consumers from one another.

#7 Alignment Contracts for Agentic Security Systems

agent

著者: Isaac David, Marco Guarnieri, Arthur Gervais

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00081

要約:
Agentic security systems increasingly combine LLM planners with tools that can discover, validate, and report vulnerabilities. This creates an asymmetric control problem: the system should retain strong offensive capability inside an authorized engagement, while the same capabilities must be denied outside scope. Existing guardrails provide useful policy controls, but they do not make this boundary a first-class formal contract over observable effects. We introduce alignment contracts, a framework for specifying and enforcing behavioral constraints over observable effect traces. A contract defines scope, allowed and forbidden effects, resource budgets, and disclosure policies. We give the language finite-trace semantics, characterize satisfaction as a safety property with finite violation witnesses, develop refinement and one-way composition rules for modular contract engineering, and show that admissibility checking is decidable. We instantiate the framework for web-focused agentic security workflows and show how the same structure extends to other effect profiles. Under an explicit Effect Observability Assumption, where all $\SigmaEff$-effects are mediated, the soundness theorem quantifies over the agent model and gives guarantees for mediated $\SigmaEff$-effects, including enforcement soundness for monitor-realized traces. We also state an assumption-lifted adaptation result and formalize limits through undecidability transfer and observability-boundary theorems. A Lean 4 artifact checks the formal core theorems used by the paper.

#8 I can't recognize (yet): Delayed Rendering to Defeat Visual Phishing Detectors

著者: Ying Yuan, Cristiano Alex Rado, Giovanni Apruzzese, Mauro Conti, Luigi Vincenzo Mancini

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00183

要約:
Phishing webpages are continuously polluting the Web. Plenty of countermeasures have been proposed and the most advanced techniques leverage machine-learning methods that infer whether a webpage is benign or not by inspecting its visual representation. Yet, despite the demonstrated effectiveness of such detection methods, this class of defenses is, by design, susceptible to a kind of subtle-but-cheap timing-based attacks which -- worryingly, and perhaps surprisingly -- have never been investigated so far. Such an oversight questions the overall reliability of these defenses in the wild. First, we show that timing-based evasion attacks have not been accounted for by prior work on visual phishing websites detectors. Then, we elucidate the intrinsic vulnerability of these detectors: they can be bypassed by delaying the rendering of webpage elements. Practically, these detectors must compute the visual similarity between a target webpage and a known legitimate one. This requires taking a "snapshot" of the target webpage before the similarity computation. Attackers can deliberately delay the rendering of key elements, such as the logo, so that these elements appear fully only after the snapshot has been taken. This simple tactic misleads the visual-similarity module, leading the system to incorrectly classify the phishing page as benign. We empirically show that state-of-the-art detectors can be completely defeated (detection rate dropping from 100% to 0%) by employing easy-to-apply problem-space techniques such as curtain effects. We also carry out a user study, evaluating the effectiveness of these attacks against real humans, and find that end users are unable to reliably identify our "perturbations" (p<.05). Finally, we propose mitigations, including a browser-extension that, without making any call to remote services, warns users that they may have landed on a phishing webpage.

#9 Selfie-Capture Dynamics as an Auxiliary Signal Against Deepfakes and Injection Attacks for Mobile Identity Verification

著者: Erkka Rantahalvari, Olli Silv\'en, Zinelabidine Boulkenafet, Constantino \'Alvarez Casado

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00218

要約:
Mobile remote identity verification (RIdV) systems are exposed to attacks that manipulate or replace the facial video stream, including presentation attacks, real-time deepfakes, and video injection. Recent European requirements, including ETSI TS 119 461 and CEN/TS 18099, motivate complementary evidence channels beyond camera-based presentation-attack detection. This paper investigates whether passive motion traces recorded during selfie capture provide auxiliary evidence for spoof screening and user verification. We introduce CanSelfie, a dataset of 375 bona fide multi-sensor sequences collected at 50\,Hz from 30 participants using a commercial mobile RIdV application, together with stationary, handheld, and temporally shifted attack-proxy scenarios. We benchmark 7 multivariate time-series classifiers and 8 whole-series anomaly detectors across sensor configurations and temporal windows. For spoof screening, accelerometer-only ROCKAD obtains 0.00\% false rejection rate (FRR) and 43.8\% false acceptance rate (FAR), while QUANT+3-NN obtains the lowest overall FAR of 32.0\% at 2.37\% FRR; both reject all stationary attack proxies. For same-device and same-session user verification, WEASEL+MUSE reaches 1.07\% equal error rate (EER) using 9 sensor channels. The analysis shows that raw accelerometer data, preserving gravity and orientation cues, is the most informative modality, and that closed-set classification accuracy alone does not imply good verification performance because threshold calibration depends on score distributions. The findings suggest that short selfie-capture motion traces contain measurable spoof-related and identity-related information, supporting their use as a low-friction auxiliary signal while also identifying the need for cross-device, cross-session, and real injection-attack evaluation.

#10 Attention Is Where You Attack

著者: Aviral Srivastava, Sourav Panda

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00236

要約:
Safety-aligned large language models rely on RLHF and instruction tuning to refuse harmful requests, yet the internal mechanisms implementing safety behavior remain poorly understood. We introduce the Attention Redistribution Attack (ARA), a white-box adversarial attack that identifies safety-critical attention heads and crafts nonsemantic adversarial tokens that redirect attention away from safety-relevant positions. Unlike prior jailbreak methods operating at the semantic or output-logit level, ARA targets the geometry of softmax attention on the probability simplex using Gumbel-softmax optimization over targeted heads. Across LLaMA-3-8B-Instruct, Mistral-7B-Instruct-v0.1, and Gemma-2-9B-it, ARA bypasses safety alignment with as few as 5 tokens and 500 optimization steps, achieving 36% ASR on Mistral-7B and 30% on LLaMA-3 against 200 HarmBench prompts, while Gemma-2 remains at 1%. Our principal mechanistic finding is a dissociation between ablation and redistribution: zeroing out the top-ranked safety heads produces at most 1 flip among 39 to 50 baseline refusals, while ARA targeting the corresponding safety-heavy layers flips 72/200 prompts on Mistral-7B and 60/200 on LLaMA-3. This suggests that safety is not localized in these heads as removable components, but emerges from the attention routing they perform. Removing a head allows compensation through the residual stream, while redirecting its attention propagates a corrupted signal downstream.

#11 A Comparative Analysis of Machine Learning Models for Intrusion Detection in Intelligent Transport Systems

著者: Zawad Yalmie Sazid, Robert Abbas, Sasa Maric

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00279

要約:
AI-powered edge computing security is moving Intelligent Transportation Systems (ITS) from passive, rule-based protections to proactive, smart, zero-touch, self-sufficient safeguards that neutralize threats in milliseconds. As transportation becomes more connected with edge computing, massive IoT, and advanced 5G for vehicle-to-everything (V2X) connectivity, AI at the edge computing nodes plays a crucial role in protecting against sophisticated threats, enabling URLLC (ultra-low-latency communications) for smart transport, and enhancing infrastructure capabilities and safety. This research applies edge computing to improve latency, bandwidth efficiency, and service responsiveness by moving processing closer to devices, gateways, and users. However, this shift also expands the cyberattack surface because edge nodes are distributed, heterogeneous, and often resource-constrained. The paper proposes a trust-aware federated hybrid intrusion detection framework in which a random forest, a decision tree, and a linear SVM network learn complementary traffic representations at each edge site, while a server performs trust-aware aggregation of local model updates.

#12 A Privacy-Preserving Approach to Conformance Checking

privacy

著者: Luis Rodr\'iguez-Flores, Luciano Garc\'ia-Ba\~nuelos, Abel Armas-Cervantes, Astrid Rivera-Partida

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00283

要約:
Conformance checking, one of the main process mining operations, aims to identify discrepancies between a process model and an event log. The model represents the expected behaviour, whereas the event log represents the actual process behaviour as captured in information systems records. Traditionally, the process model and the event log are both accessible to the business analyst performing the conformance checking. However, in some contexts, it is necessary to keep either the model or the log private to protect critical or sensitive information. In this paper, we propose a secure approach to conformance checking based on string processing algorithms and homomorphic encryption, where the process model and event log ar not visible to either the model's or event log's owner. The proposed technique is based on alignments, a well-known formalism used for conformance checking. An evaluation is performed using a synthetic and a real-world event log, showing that conformance checking can be securely computed at the expense of high memory and processing requirements.

#13 Trident: Improving Malware Detection with LLMs and Behavioral Features

著者: Rebecca Saul, Jingzhi Jiang, Elliott Chia, David Wagner

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00297

要約:
Traditionally, machine learning methods for PE malware detection have relied on static features like byte histograms, string information, and PE header contents. One barrier to incorporating dynamic analysis features has been the semi-structured nature of sandbox behavior reports. We show that, using the latest generation of large language models with reasoning, it is possible to efficiently process these behavior reports and utilize them as part of a malware detection pipeline. Specifically, we leverage LLMs to generate behavior-based malware detection rules based on a small training set of labeled malware. We find that these detection rules, derived from behavioral features, are much more robust to concept drift than standard static-feature methods, while maintaining practical false positive rates. Finally, we introduce Trident, a system which combines a classic decision tree model over static features, our behavior-based detection rules, and direct LLM analysis of sandbox reports through majority voting. Trident outperforms standard methods using static features, outperforms behavior-based rules alone, and is as resilient to concept drift as active learning methods without requiring retraining.

#14 Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

agent

著者: Hongbo Wen, Ying Li, Hanzhi Liu, Chaofan Shou, Yanju Chen, Yuan Tian, Yu Feng

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00314

要約:
An agent skill is a configuration package that equips an LLM-driven agent with a concrete capability, such as reading email, executing shell commands, or signing blockchain transactions. Each skill is a hybrid artifact-a structured half declares executable interfaces, while a prose half dictates when and how those interfaces fire-and the prose is reinterpreted probabilistically on every invocation. Conventional static analyzers parse the structured half but ignore the prose; LLM-based tools read the prose but cannot reproducibly prove that a tainted input reaches a high-impact sink. We present Semia, a static auditor for agent skills. Semia lifts each skill into the Skill Description Language (SDL), a Datalog fact base that captures LLM-triggered actions, prose-defined conditions, and human-in-the-loop checkpoints. Synthesizing a fact base that is both structurally sound and semantically faithful to the original prose is the central challenge; we address it with Constraint-Guided Representation Synthesis (CGRS), a propose-verify-evaluate loop that refines LLM candidates until convergence. Security properties (e.g., indirect injection, secret leakage, confused deputies, unguarded sinks, etc.) over an agent skill can then be reduced to Datalog reachability queries. We evaluate Semia on 13,728 real-world skills from public marketplaces. Semia renders all of them auditable and finds that more than half carry at least one critical semantic risk. On a stratified sample of 541 expert-labeled skills, Semia achieves 97.7% recall and an F1 of 90.6%, substantially outperforming signature-based scanners and LLM baselines.

#15 Block-wise Codeword Embedding for Reliable Multi-bit Text Watermarking

intellectual property

著者: Joeun Kim, HoEun Kim, Dongsup Jin, Young-Sik Kim

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00348

要約:
Recent multi-bit watermarking methods for large language models (LLMs) prioritize capacity over reliability, often conflating decoding with detection. Our analysis reveals that existing ECC-based extractors suffer from catastrophic false positive rates (FPR), and applying rejection thresholds merely collapses detection sensitivity (TPR) to random guessing. To resolve this structural limitation, we propose \textbf{BREW} (Block-wise Reliable Embedding for Watermarking), a framework shifting the paradigm to \emph{designated verification}. BREW employs a two-stage mechanism: (i) \textbf{blind message estimation} via independent block voting, followed by (ii) \textbf{window-shifting verification} that rigorously validates the payload against local edits. Experiments demonstrate that BREW achieves a TPR of 0.965 with an FPR of 0.02 under 10\% synonym substitution, demonstrating that the high-FPR issue is not an inherent trade-off of multi-bit watermarking, but a solvable structural flaw of prior decoding-centric designs. Our framework is model-agnostic and theoretically grounded, providing a scalable solution for reliable forensic deployment.

#16 Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

agent

著者: Alfredo Metere

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00424

要約:
Agent skills -- structured packages of instructions, scripts, and references that augment a large language model (LLM) without modifying the model itself -- have moved from convenience to first-class deployment artifact. The runtime that loads them inherits the same problem package managers and operating systems have always faced: a piece of content claims a behavior; the runtime must decide whether to believe it. We argue this paper's central thesis up front: a skill is \emph{untrusted code} until it is verified, and the runtime that loads it must enforce that default rather than infer trust from a signature, a clearance, or a registry of origin. Without skill verification, a human-in-the-loop (HITL) gate must fire on every irreversible call -- which is operationally untenable and degrades into rubber-stamping at any non-trivial scale. With skill verification treated as a separate, gated process, HITL fires only for what is unverified, and the system becomes sustainable. We give a trust schema (\S\ref{sec:schema}) that includes an explicit verification level on every skill manifest; a capability gate (\S\ref{sec:gate}) whose HITL policy is a function of that verification level; a \emph{biconditional} correctness criterion (\S\ref{sec:biconditional}) that any candidate verification procedure must satisfy on an adversarial-ensemble exercise (\S\ref{sec:eval}); and a portable runtime profile (\S\ref{sec:guidelines}) with ten normative guidelines abstracted from a working open-source reference implementation \cite{metere2026enclawed}. The contribution is harness- and model-agnostic; nothing here requires retraining, fine-tuning, or proprietary infrastructure.

#17 CleanBase: Detecting Malicious Documents in RAG Knowledge Databases

著者: Weifei Jin, Xilong Wang, Wei Zou, Jinyuan Jia, Neil Gong

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00460

要約:
Retrieval-augmented generation (RAG) is vulnerable to prompt injection attacks, in which an adversary inserts malicious documents containing carefully crafted injected prompts into the knowledge database. When a user issues a question targeted by the attack, the RAG system may retrieve these malicious documents, whose injected prompts mislead it into generating attacker-specified answers, thereby compromising the integrity of the RAG system. In this work, we propose CleanBase, a method to detect malicious documents within a knowledge database. Our key insight is that malicious documents crafted for the same attack-targeted questions often exhibit high semantic similarity, as attackers deliberately make them consistent to improve attack success rates. Accordingly, CleanBase constructs a similarity graph over the knowledge database, where each node represents a document and an edge connects two nodes if their semantic similarity--computed using an embedding model--exceeds a statistically determined threshold. Due to their inherent similarity, malicious documents tend to form cliques within this graph. CleanBase detects such cliques and flags the corresponding documents as malicious. We theoretically derive upper bounds on CleanBase's false positive and false negative rates and empirically validate its effectiveness. Experimental results across multiple datasets and prompt injection attacks demonstrate that CleanBase accurately detects malicious documents and effectively safeguards RAG systems. Our source code is available at https://github.com/WeifeiJin/CleanBase.

#18 Zero-Knowledge Model Checking

著者: Pascal Berrang, Mirco Giacobbe, Jacob Swales, Xiao Yang

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00487

要約:
We introduce a technology to formally verify that a software system satisfies a temporal specification of functional correctness, without revealing the system itself. Our method combines a deductive approach to model checking to obtain a formal certificate of correctness for the system, with zero-knowledge proofs to convince an external verifier that the system -- kept secret -- complies with its specification of correctness -- made public. We consider proof certificates represented as ranking functions, and introduce both an explicit-state and a symbolic scheme for model checking in zero knowledge. Our explicit-state scheme assumes systems represented as transition graphs. We use polynomial commitments to convince the verifier that the public proof certificates correspond to the secret transition relation. Our symbolic scheme assumes systems specified as linear guarded commands and uses piecewise-linear ranking functions. We apply Farkas' lemma to obtain a witness for the validity of the ranking function with public and secret components, and employ sigma protocols for matrix multiplication and range proofs to convince the verifier of the witness's existence. We built a prototype to demonstrate the practical efficacy of our two schemes on linear temporal logic verification examples. Our technology enables formal verification in domains where both the safety and the confidentiality of the system under analysis are critical.

#19 Pick and Sort for Graphical Authentication

著者: Argianto Rahartomo, AmirHossein Jamshidipoor, Mohammad Ghafari

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00558

要約:
We propose a graphical authentication scheme that follows a simple ``Pick and Sort'' design in which users choose visual elements and arrange them within a grid. Both the number of selected elements and the grid size are configurable, and the visual elements can be customized for specific user groups, such as children. A preliminary study with a prototype implementation indicated that the scheme is easy to learn and flexible to deploy. Although login times are longer than those of conventional authentication methods, the additional interaction may be acceptable in scenarios that are not time-critical, such as infrequent-access use cases or as a secondary authentication mechanism.

#20 KingsGuard: Enclave Data Protection Under Real-World TEE Vulnerabilities

著者: Saltanat Firdous Allaqband, Deepanjali S, Rohit Srinivas R G, Devashish Gosain, Chester Rebeiro

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00613

要約:
Trusted Execution Environments (TEEs) have emerged as a cornerstone for securing sensitive computations by providing isolated enclaves protected from untrusted software. However, their security guarantees are undermined by vulnerabilities in both the enclave code and the underlying hardware design, which can allow sensitive data to leak despite strong isolation guarantees. This paper presents KINGSGUARD, a novel TEE design that systematically monitors and controls the propagation of sensitive data within an enclave. By enforcing fine-grained data flow tracking and checks in hardware, our approach ensures that sensitive data does not leave the enclave boundary, thus bridging the gap between the idealized threat models of TEEs and their practical realizations. Additionally, to balance security with practical functionality, we introduce controlled declassification at enclave boundaries, allowing intentional release of data to the outside world. Our implementation of KINGSGUARD on a RISC-V processor has a 10.8% hardware area overhead when synthesized on FPGA and a 5.69% performance overhead.

#21 Defense against Poisoning Attacks under Shuffle-DP

backdoor

著者: Siyi Wang, Qiyao Luo, Yihua Hu, Lixu Wang, Quanqing Xu, Chuanhui Yang, Zhan Qin, Kui Ren, Wei Dong

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00625

要約:
Differential Privacy (DP) has become the gold standard for protecting individual privacy in data analytics, and the shuffle-DP model has attracted significant attention from both academia and industry due to its favorable balance between privacy and utility. However, existing shuffle-DP protocols rely on a strong assumption: all users behave honestly. In real-world scenarios, adversarial users can exploit this vulnerability through poisoning attacks, compromising both privacy guarantees and the utility of analytical results. While defending against poisoning attacks in the shuffle-DP model has recently gained interest, existing solutions are limited to frequency estimation tasks. To address this issue, we propose the first general defense framework for all union-preserving queries, capable of transforming any shuffle-DP protocol into a version resilient to poisoning attacks. Beyond robust defense against poisoning attacks, our framework achieves high utility of analytical results. Compared to the original shuffle-DP protocol, it retains asymptotically equivalent error in attack-free settings and incurs only a polylogarithmic increase in error when a constant number of attackers are present. We demonstrate the generality of our framework on several common queries, including summation, frequency estimation, and range counting. Experimental results confirm that our approach effectively defends against poisoning attacks while maintaining strong utility and communication efficiency.

#22 STARE: Step-wise Temporal Alignment and Red-teaming Engine for Multi-modal Toxicity Attack

著者: Xutao Mao, Liangjie Zhao, Tao Liu, Xiang Zheng, Hongying Zan, Cong Wang

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00699

要約:
Red-teaming Vision-Language Models is essential for identifying vulnerabilities where adversarial image-text inputs trigger toxic outputs. Existing approaches treat image generation as a black box, returning only terminal toxicity scores and leaving open the question of when and how toxic semantics emerge during multi-step synthesis. We introduce STARE, a hierarchical reinforcement learning framework that treats the denoising trajectory itself as the attack surface, under a direct white-box T2I and query-only black-box VLM setting. By coupling a high-level prompt editor with low-level T2I fine-tuning via Group Relative Policy Optimization (GRPO), STARE attains a 68\% improvement in Attack Success Rate over state-of-the-art black-box and white-box baselines. More importantly, this trajectory-level view surfaces the Optimization-Induced Phase Alignment phenomenon: vanilla models exhibit diffuse toxicity, whereas adversarial optimization concentrates conceptual harms into early semantic phases and detail-oriented harms into late refinement. Targeted perturbations of either window selectively suppress different toxicity categories, indicating that this temporal structure is a genuine causal handle rather than a side effect of the hierarchical design. The phenomenon turns toxicity formation from a chaotic process into a small set of predictable vulnerability windows, providing both a potent attack engine and a basis for phase-aware safety mechanisms. Content warning: This paper contains examples of toxic content that may be offensive or disturbing.

#23 Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

agent

著者: Saeid Jamshidi, Foutse Khomh, Carol Fung, Kawser Wazed Nafi

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00741

要約:
The adoption of Internet of Things (IoT) systems at the network edge of smart architectures is increasing rapidly, intensifying the need for security mechanisms that are both adaptive and resource-efficient. In such environments, runtime defence mechanisms are no longer limited to detection alone but become a resource-constrained task of selecting mitigation actions. Security controls must be carefully selected, combined, and executed under latency, energy, and computational constraints, while preventing unsafe interactions between controls. Existing approaches predominantly rely on static rule sets and learned policies, which provide limited guarantees of feasibility, conflict safety, and execution correctness in resource-constrained edge settings. To address this limitation, we introduce ASPO, a self-adaptive multi-agent security pattern selection that integrates Large Language Model (LLM)-based reasoning with deterministic enforcement within a MAPE-K control loop. ASPO explicitly separates stochastic decision generation from execution: LLM agents propose candidate mitigation portfolios, while a deterministic optimisation core enforces closed-world action integrity, conflict-free composition, and resource feasibility at every decision epoch. We deploy ASPO on a distributed edge-gateway testbed and evaluate it across two workloads, each comprising 500 and 1000 runtime security decisions, using replayed IoT attack traffic. In addition, the results demonstrate invariant safety properties, including 100% conflict-free activation, consistent resource feasibility across workloads, and stable pattern dominance with perfect rank preservation. Importantly, deeper decision exploration reduces extreme-case execution costs, compressing tail latency and energy overheads by 21.9% and 23.1%, respectively, without increasing mean energy consumption.

#24 Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift

diffusion

著者: Adam Arthur, Christopher Schwartz

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00788

要約:
Public image diffusion models are now powerful enough that an attacker without the resources to train a tabular-specific generator may repurpose one off the shelf. This study tests that possibility directly. An unmodified Stable Diffusion U-Net is applied to the UCI Adult Income dataset by reshaping each row into a small single-channel pseudo-image. The architecture's inductive bias toward spatial locality makes feature placement a design variable, and several layouts are tested. However, this is only the beginning of the story, as this paper also draws two philosophical distinctions. One separates statistical from perceptual realism: whether synthetic content holds up to a machine's correlation audits or a human's sensory inspection. The other introduces synthetic evidence as a category alongside synthetic media: AI-generated material whose consumer is a machine in a closed evidentiary pipeline rather than a person in an open information system. An attacker succeeds with synthetic evidence by thinking like the machine that will receive it. And the more the attacker succeeds, the more they can induce ground truth drift: the silent reclassification of AI-generated outputs as authentic when reused in pipelines that do not interrogate their provenance.

#25 When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI

privacy

著者: Alfredo Madrid-Garc\'ia, Miguel Rujas

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00796

要約:
Background: Patient-facing medical chatbots based on retrieval-augmented generation (RAG) are increasingly promoted to deliver accessible, grounded health information. AI-assisted development lowers the barrier to building them, but they still demand rigorous security, privacy, and governance controls. Objective: To report an anonymized, non-destructive security assessment of a publicly accessible patient-facing medical RAG chatbot and identify governance lessons for safe deployment of generative AI in health. Methods: We used a two-stage strategy. First, Claude Opus 4.6 supported exploratory prompt-based testing and structured vulnerability hypotheses. Second, candidate findings were manually verified using Chrome Developer Tools, inspecting browser-visible network traffic, payloads, API schemas, configuration objects, and stored interaction data. Results: The LLM-assisted phase identified a critical vulnerability: sensitive system and RAG configuration appeared exposed through client-server communication rather than restricted server-side. Manual verification confirmed that ordinary browser inspection allowed collection of the system prompt, model and embedding configuration, retrieval parameters, backend endpoints, API schema, document and chunk metadata, knowledge-base content, and the 1,000 most recent patient-chatbot conversations. The deployment also contradicted its privacy assurances: full conversation records, including health-related queries, were retrievable without authentication. Conclusions: Serious privacy and security failures in patient-facing RAG chatbots can be identified with standard browser tools, without specialist skills or authentication; independent review should be a prerequisite for deployment. Commercial LLMs accelerated this assessment, including under a false developer persona; assistance available to auditors is equally available to adversaries.

#26 GAFSV-Net: A Vision Framework for Online Signature Verification

著者: Himanshu Singhal, Suresh Sundaram

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00120

要約:
Online signature verification (OSV) requires distinguishing skilled forgeries from genuine samples under high intra-class variability and with very few enrollment samples. Existing deep learning methods operate directly on raw temporal sequences, restricting them to 1D architectures and preventing the use of pretrained 2D vision backbones. We bridge this gap with GAFSV-Net, which represents each signature as a six-channel asymmetric Gramian Angular Field image: three kinematic channels (pen speed, pressure derivative, direction angle) are each encoded into complementary GASF and GADF matrices that capture pairwise temporal co-occurrence and directional transition structure respectively. A dual-branch ConvNeXt-Tiny encoder processes GASF and GADF independently, with bidirectional cross-attention enabling each branch to query discriminative patterns from the other before metric-space projection. Training uses semi-hard triplet loss with skilled-forgery hard-negative injection; verification is performed via cosine similarity against a small enrollment prototype. We evaluate on DeepSignDB and BiosecurID, outperforming all sequence-based baselines trained under identical objectives, demonstrating that the representational gain of 2D temporal encoding is consistent and independent of training procedure, with ablations characterising each design choice's contribution.

#27 RoboKA: KAN Informed Multimodal Learning for RoboCall Surveillance System

著者: Nitin Choudhury, Nikhil Kumar, Aditya Kumar Sinha, Abhijeet Anand, Hossein Salemi, Orchid Chetia Phukan, Hemant Purohit, Arun Balaji Buduru

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00156

要約:
Wide exploration on robocall surveillance research is hindered due to limited access to public datasets, due to privacy concerns. In this work, we first curate Robo-SAr, a synthetic robocall dataset designed for robocall surveillance research. Robo-SAr comprises of ~200 unwanted and ~1200 legitimate synthetic robocall samples across three realistic adversarial axes: psycholinguistics-manipulated transcripts, emotion-eliciting speech, and cloned voices. We further propose RoboKA, a Kolmogorov-Arnold Network (KAN)-based multimodal fusion framework designed to model structured nonlinear interactions between acoustic and linguistic cues that characterize diverse adversarial robocall strategies. RoboKA first leverages cross-modal contrastive learning to align latent modality representations and feeds the resulting embeddings to a KAN-projection head for final classification. We benchmark RoboKA against strong unimodal and multimodal baselines in both in-domain and out-of-domain setups, finding RoboKA to surpass all baselines in terms of recall and F1-score.

#28 Jailbroken Frontier Models Retain Their Capabilities

著者: Daniel Zhu, Zihan Wang, Jenny Bao, Jerry Wei

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00267

要約:
As language model safeguards become more robust, attackers are pushed toward developing increasingly complex jailbreaks. Prior work has found that this complexity imposes a "jailbreak tax" that degrades the target model's task performance. We show that this tax scales inversely with model capability and that the most advanced jailbreaks effectively yield no reduction in model capabilities. Evaluating 28 jailbreaks on five benchmarks across Claude models ranging in capability from Haiku 4.5 to Opus 4.6, we find Haiku 4.5 loses an average of 33.1% on benchmark performance when jailbroken, while Opus 4.6 at max thinking effort loses only 7.7%. We also observe that across all models, reasoning-heavy tasks display considerably more degradation than knowledge-recall tasks. Finally, Boundary Point Jailbreaking, currently the strongest jailbreak against deployed classifiers, achieves near-perfect classifier evasion with near-zero degradation across safeguarded models. We recommend that safety cases for frontier models should not rely on a meaningful capability degradation from jailbreaks.

#29 Integrating Log-Based Security Analytics in Agile Workflows: A Real-World Experience Report

著者: Arpit Thool, Chris Brown

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00352

要約:
Modern organizations increasingly rely on log data and monitoring signals to protect products against account takeovers and abuse, yet integrating security analytics into fast-moving Agile workflows remains challenging. While it is important to understand how security practices are developed and sustained within Agile, real-world case studies of such integrations remain scarce. This experience report provides insights on developer perceptions of an effort to integrate log-based fraud detection within an organization, known as the "Red Flag Project". A cross-functional team of eight members (including one author) iterated weekly to implement a proof-of-concept log-based system that alerts stakeholders when accounts exhibit suspicious activity patterns. Through semi-structured interviews, we investigate developer perceptions of log-based fraud detection integration-exploring their willingness to adopt the system, challenges encountered, and the overall impact on day-to-day development activities and security perceptions. Our findings highlight key lessons, mitigation techniques, and best practices for embedding security analytics into Agile workflows. We provide insights for practitioners and researchers seeking to incorporate security practices into modern development processes while maintaining both speed and resilience in software delivery.

#30 ForesightFlow: An Information Leakage Score Framework for Prediction Markets

著者: Maksym Nechepurenko

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00493

要約:
ForesightFlow is an Information Leakage Score (ILS) framework for detecting informed trading on decentralized prediction markets. For an event-resolved binary market, the score quantifies the fraction of the terminal information move priced in before the public news event. Three operational scope conditions (edge effect, non-trivial total move, anchor sensitivity) are stated as preconditions for interpretation. The score admits a Murphy-decomposition reading that connects label generation to the proper-scoring-rule literature. A pilot empirical evaluation surfaces three findings. First, a resolution-anchored proxy for the public-event timestamp does not separate event-resolved markets from a matched control population (Mann-Whitney p = 1e-6, separation reversed), demonstrating that proxy quality is itself a binding constraint. Second, the article-derived timestamp on a single high-stakes case shifts the score by 0.444 in magnitude relative to the proxy and lies on the opposite side of zero. Third, an audit of the publicly documented Polymarket insider record reveals that documented cases are systematically deadline-resolved, falling outside the original ILS scope (0 of 24 FFIC inventory markets satisfied original scope conditions). This last finding motivates a deadline-ILS extension introduced in Section 7, anchored at the public-event timestamp rather than the news timestamp, and equipped with a per-category exponential hazard baseline for the time-to-event distribution. The extension closes the gap between the methodology and the population in which insider trading has been empirically documented. An end-to-end evaluation of the extension on the 2026 U.S.-Iran conflict cluster is reported in a companion paper. We release the FFIC inventory, the resolution-typology classification of the 911,237-market corpus, and all code at github.com/ForesightFlow.

#31 ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

著者: Yunhan Zhao, Zhaorun Chen, Xingjun Ma, Yu-Gang Jiang, Bo Li

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2605.00689

要約:
As Large Language Models (LLMs) are increasingly deployed in cross-linguistic contexts, ensuring safety in diverse regulatory and cultural environments has become a critical challenge. However, existing multilingual benchmarks largely rely on general risk taxonomies and machine translation, which confines guardrail models to these predefined categories and hinders their ability to align with region-specific regulations and cultural nuances. To bridge these gaps, we introduce ML-Bench, a policy-grounded multilingual safety benchmark covering 14 languages. ML-Bench is constructed directly from regional regulations, where risk categories and fine-grained rules derived from jurisdiction-specific legal texts are directly used to guide the generation of multilingual safety data, enabling culturally and legally aligned evaluation across languages. Building on ML-Bench, we develop ML-Guard, a Diffusion Large Language Model (dLLM)-based guardrail model that supports multilingual safety judgment and policy-conditioned compliance assessment. ML-Guard has two variants, one 1.5B lightweight model for fast `safe/unsafe' checking and a more capable 7B model for customized compliance checking with detailed explanations. We conduct extensive experiments against 11 strong guardrail baselines across 6 existing multilingual safety benchmarks and our ML-Bench, and show that ML-Guard consistently outperforms prior methods. We hope that ML-Bench and ML-Guard can help advance the development of regulation-aware and culturally aligned multilingual guardrail systems.

#32 DiffMI: Breaking Face Recognition Privacy via Diffusion-Driven Training-Free Model Inversion

privacydiffusion

著者: Hanrui Wang, Shuo Wang, Chun-Shien Lu, Isao Echizen

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2504.18015

要約:
Face recognition poses serious privacy risks due to its reliance on sensitive and immutable biometric data. While modern systems mitigate privacy risks by mapping facial images to embeddings (commonly regarded as privacy-preserving), model inversion attacks reveal that identity information can still be recovered, exposing critical vulnerabilities. However, existing attacks are often computationally expensive and lack generalization, especially those requiring target-specific training. Even training-free approaches suffer from limited identity controllability, hindering faithful reconstruction of nuanced or unseen identities. In this work, we propose DiffMI, the first diffusion-driven, training-free model inversion attack. DiffMI introduces a novel pipeline combining robust latent code initialization, a ranked adversarial refinement strategy, and a statistically grounded, confidence-aware optimization objective. DiffMI applies directly to unseen target identities and face recognition models, offering greater adaptability than training-dependent approaches while significantly reducing computational overhead. Our method achieves 84.42%--92.87% attack success rates against inversion-resilient systems and outperforms the best prior training-free GAN-based approach by 4.01%--9.82%. The implementation is available at https://github.com/azrealwang/DiffMI.

#33 ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation

agent

著者: Yiran Wu, Mauricio Velazco, Andrew Zhao, Manuel Ra\'ul Mel\'endez Luj\'an, Srisuma Movva, Yogesh K Roy, Quang Nguyen, Roberto Rodriguez, Qingyun Wu, Michael Albada, Julia Kiseleva, Anand Mudgerikar

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2507.14201

要約:
We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM agent X on the task of Cyber Threat Investigation through security questions derived from investigation graphs. Real-world security analysts must sift through a large number of heterogeneous security logs, follow multi-hop chains of evidence to investigate threats. With the developments of LLMs, building LLM-based agents for automatic threat investigation is a promising direction. We construct a benchmark from a controlled Azure tenant including a SQL environment covering 57 log tables from Microsoft Sentinel and related services, and 7542 generated questions. We leverage security logs extracted with expert-crafted detection logic to build threat investigation graphs, and then generate questions with LLMs using paired nodes on the graph, taking the start node as background context and the end node as answer. Anchoring each question to these explicit nodes and edges not only provides automatic, explainable ground truth answers but also makes the pipeline reusable and readily extensible to new logs. Our comprehensive experiments on the test set with different models confirm the difficulty of the task: the best model so far can achieve a reward of 0.606, leaving much headroom for future research. The code is available at https://github.com/microsoft/SecRL

#34 Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem

著者: Shuli Zhao, Qinsheng Hou, Zihan Zhan, Yanhao Wang, Yuchong Xie, Yu Guo, Libo Chen, Shenghong Li, Zhi Xue

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2509.06572

要約:
Large language models(LLMs) are increasingly integrated with external systems through the Model Context Protocol(MCP),which standardizes tool invocation and has rapidly become a backbone for LLM-powered applications. While this paradigm enhances functionality,it also introduces a fundamental security shift:LLMs transition from passive information processors to autonomous orchestrators of task-oriented toolchains,expanding the attack surface,elevating adversarial goals from manipulating single outputs to hijacking entire execution flows. In this paper,we identify and characterize a systematic privacy-leakage attack pattern,termed Parasitic Toolchain Attacks,instantiated as MCP Unintended Privacy Disclosure(MCP-UPD). These attacks require no direct victim interaction;instead,adversaries embed malicious instructions into external data sources that LLMs access during legitimate tasks. Unlike traditional prompt injection and tool poisoning attacks,our attack targets the interconnected toolchain itself,assembling multiple legitimate tools into a coordinated workflow whose combined behavior accomplishes malicious objectives. In MCP-UPD,the malicious logic infiltrates the toolchain and unfolds in three phases:Parasitic Ingestion,Privacy Collection,and Privacy Disclosure,culminating in stealthy exfiltration of private data. Our root cause analysis reveals that MCP lacks both context-tool isolation and least-privilege enforcement,enabling adversarial instructions to propagate unchecked into sensitive tool invocations. To assess the severity,we design MCP-SEC and conduct the first large-scale security census of the MCP ecosystem,analyzing 12230 tools across 1360 servers. Our findings show that the MCP ecosystem is rife with real-world exploitable gadgets and diverse attack methods,underscoring systemic risks in MCP platforms and the urgent need for defense mechanisms in LLM-integrated environments.

#35 BugMagnifier: TON Transaction Simulator for Revealing Smart Contract Vulnerabilities

著者: Yury Yanovich, Victoria Kovalevskaya, Maksim Egorov, Elizaveta Smirnova, Matvey Mishuris, Yash Madhwal, Kirill Ziborov, Vladimir Gorgadze, Subodh Sharma

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2509.24444

要約:
The Open Network (TON) blockchain employs an asynchronous execution model that introduces unique security challenges for smart contracts. A primary concern is race conditions arising from unpredictable message processing order. While previous work established vulnerability patterns through static analysis of audit reports, dynamic detection of temporal dependencies through systematic testing remains an open problem. This study proposes a dynamic evaluation methodology based on controlled message orchestration to systematically expose vulnerabilities in asynchronous smart contracts. By synthesizing precise message queue manipulation with differential state analysis and probabilistic permutation testing, we establish a framework (namely, BugMagnifier) for identifying execution flaws that static methods miss. Experimental evaluation demonstrates BugMagnifier's effectiveness through extensive parametric studies on purpose-built vulnerable contracts and five real-world vulnerability cases reproduced from recent security audits. Results reveal message ratio-dependent detection complexity that aligns with theoretical predictions. This quantitative model enables predictive vulnerability assessment while shifting discovery from manual expert analysis to automated evidence generation. By providing reproducible test scenarios for temporal vulnerabilities, BugMagnifier addresses a critical gap in the TON security tooling, offering practical support for safer smart contract development in asynchronous blockchain environments.

#36 Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts

著者: Md. Mehedi Hasan, Sk Tanzir Mehedi, Ziaur Rahman, Rafid Mostafiz, Md. Abir Hossain

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2510.22628

要約:
This paper presents a real-time modular defense system named Sentra-Guard. The system detects and mitigates jailbreak and prompt injection attacks targeting large language models (LLMs). The framework uses a hybrid architecture with FAISS-indexed SBERT embedding representations that capture the semantic meaning of prompts, combined with fine-tuned transformer classifiers, which are machine learning models specialized for distinguishing between benign and adversarial language inputs. It identifies adversarial prompts in both direct and obfuscated attack vectors. A core innovation is the classifier-retriever fusion module, which dynamically computes context-aware risk scores that estimate how likely a prompt is to be adversarial based on its content and context. The framework ensures multilingual resilience with a language-agnostic preprocessing layer. This component automatically translates non-English prompts into English for semantic evaluation, enabling consistent detection across over 100 languages. The system includes a HITL feedback loop, where decisions made by the automated system are reviewed by human experts for continual learning and rapid adaptation under adversarial pressure. Sentra-Guard maintains an evolving dual-labeled knowledge base of benign and malicious prompts, enhancing detection reliability and reducing false positives. Evaluation results show a 99.96% detection rate (AUC = 1.00, F1 = 1.00) and an attack success rate (ASR) of only 0.004%. This outperforms leading baselines such as LlamaGuard-2 (1.3%) and OpenAI Moderation (3.7%). Unlike black-box approaches, Sentra-Guard is transparent, fine-tunable, and compatible with diverse LLM backends. Its modular design supports scalable deployment in both commercial and open-source environments. The system establishes a new state-of-the-art in adversarial LLM defense.

#37 Characterizing and Modeling the GitHub Security Advisories Review Pipeline

著者: Claudio Segal, Paulo Segal, Carlos Eduardo Banjar, Felipe de Sant'Anna Paix\~ao, Hudson Silva Borges, Paulo Silveira, Eduardo Santana de Almeida, Joanna C. S. Santos, Anton Kocheturov, Gaurav Kumar Srivastava, Daniel Sadoc Menasch\'e

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.06009

要約:
GitHub Security Advisories (GHSA) have become a central component of open-source vulnerability disclosure and are widely used by developers and security tools. A distinctive feature of GHSA is that only a fraction of advisories are reviewed by GitHub, while the mechanisms associated with this review process remain poorly understood. In this paper, we conduct a large-scale empirical study of the GHSA review processes, analyzing over 288,000 advisories spanning 2019-2025. We characterize which advisories are more likely to be reviewed, quantify review delays, and identify two distinct review-latency regimes: a fast path dominated by GitHub Repository Advisories (GRAs) and a slow path dominated by NVD-first advisories. We further develop a queueing model that accounts for this dichotomy based on the structure of the advisory processing pipeline.

#38 BadSNN: Backdoor Attacks on Spiking Neural Networks via Adversarial Spiking Neuron

backdoor

著者: Abdullah Arafat Miah, Kevin Vu, Yu Bi

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2602.07200

要約:
Spiking Neural Networks (SNNs) are energy-efficient counterparts of Deep Neural Networks (DNNs) with high biological plausibility, as information is transmitted through temporal spiking patterns. The core element of an SNN is the spiking neuron, which converts input data into spikes following the Leaky Integrate-and-Fire (LIF) neuron model. This model includes several important hyperparameters, such as the membrane potential threshold and membrane time constant. Both the DNNs and SNNs have proven to be exploitable by backdoor attacks, where an adversary can poison the training dataset with malicious triggers and force the model to behave in an attacker-defined manner. Yet, how an adversary can exploit the unique characteristics of SNNs for backdoor attacks remains underexplored. In this paper, we propose \textit{BadSNN}, a novel backdoor attack on spiking neural networks that exploits hyperparameter variations of spiking neurons to inject backdoor behavior into the model. We further propose a trigger optimization process to achieve better attack performance while making trigger patterns less perceptible. \textit{BadSNN} demonstrates superior attack performance on various datasets and architectures, as well as compared with state-of-the-art data poisoning-based backdoor attacks and robustness against common backdoor mitigation techniques. Codes can be found at https://github.com/SiSL-URI/BadSNN.

#39 SoK: Security of Autonomous LLM Agents in Agentic Commerce

agent

著者: Qian'ang Mao, Jiaxin Wang, Ya Liu, Li Zhu, Cong Ma, Jiaqi Yan

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.15367

要約:
Autonomous large language model (LLM) agents such as OpenClaw are pushing agentic commerce from human-supervised assistance toward machine actors that can negotiate, purchase services, manage digital assets, and execute transactions across on-chain and off-chain environments. Protocols such as the Trustless Agents standard (ERC-8004), Agent Payments Protocol (AP2), OKX Agent Payments Protocol (APP), the HTTP 402-based payment protocol (x402), Agent Commerce Protocol (ACP), the Agentic Commerce standard (ERC-8183), and Machine Payments Protocol (MPP) enable this transition, but they also create an attack surface that existing security frameworks do not capture well. This Systematization of Knowledge (SoK) develops a unified security framework for autonomous LLM agents in commerce and finance. We organize threats along five dimensions: agent integrity, transaction authorization, inter-agent trust, market manipulation, and regulatory compliance. From a systematically curated public corpus of academic papers, protocol documents, industry reports, and incident evidence, we derive 12 cross-layer attack vectors and show how failures propagate from reasoning and tooling layers into custody, settlement, market harm, and compliance exposure. We then propose a layered defense architecture addressing authorization gaps left by current agent-payment protocols. Overall, our analysis shows that securing agentic commerce is inherently a cross-layer problem that requires coordinated controls across LLM safety, protocol design, identity, market structure, and regulation. We conclude with a research roadmap and a benchmark agenda for secure autonomous commerce.

#40 Do Privacy Policies Match with the Logs? An Empirical Study of Privacy Disclosure in Android Application Logs

privacy

著者: Zhiyuan Chen, Love Jayesh Ahir, Ahmad Suleiman, Kundi Yao, Yiming Tang, Weiyi Shang, Daqing Hou

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.18552

要約:
Privacy policies are intended to inform users about how software systems collect and handle data, yet they often remain vague or incomplete. This paper presents an empirical study of patterns in log-related statements within privacy policies and their alignment with privacy disclosures observed in Android application logs. We analyzed 1,000 Android apps across multiple categories, generating 86,836,964 log entries. Our findings reveal that while most applications (88.0%) provide privacy policies, only 28.5% explicitly mention logging practices. Among those that reference logging, most clearly describe what information is logged; however, 27.7% of log-related statements remain overly simplistic or vague, offering limited insight into actual data collection. We further observed widespread privacy leakages in application logs, with 67.6% of apps leaking sensitive information not mentioned in their policies. Alarmingly, only 0.4% of applications demonstrated consistent alignment between declared policy contents and actual logged data. These findings highlight that current privacy policies provide incomplete or ambiguous descriptions of logging practices, which frequently do not align with actual logging behaviors.

#41 SUDP: Secret-Use Delegation Protocol for Agentic Systems

agent

著者: Xiaohang Yu, Hejia Geng, Xinmeng Zeng, William Knottenbelt

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.24920

要約:
Agentic systems increasingly act with user secrets for APIs, messaging platforms, and cloud services. Today's bearer-secret interfaces implement authorization by exposure: enabling action often means placing a reusable secret, or a reusable artifact derived from it, within a model-steerable boundary, so a transient prompt-injection or tool-side compromise becomes durable account compromise. Existing defenses cover adjacent pieces such as secret storage, scoped delegation, sender-constrained tokens, and runtime monitoring, but leave the combined agentic obligation without a common specification: an untrusted autonomous requester should be able to cause a user-authorized secret-backed operation without exposing reusable authority to the requester. We formalize this problem as Agent Secret Use (ASU). From ASU we derive a security-property taxonomy that separates the problem's structural obligations from the realization-level robustness conditions any concrete construction must establish, enabling principled comparison of existing agentic-secret defenses against a problem-grounded specification. We propose the Secret-Use Delegation Protocol (SUDP), a three-role protocol realizing ASU: a requester proposes a canonical operation; the user authorizes it with a fresh authenticator-backed grant; and a custodian redeems the grant once to perform the bounded use, so reusable authority never crosses the requester boundary. We specialize SUDP for agentic deployments: agents propose operations; they do not retrieve secrets. Under explicit assumptions, we show that SUDP satisfies the ASU requirements: authorization is verifiable, operation-bound, and single-use. SUDP also provides storage confidentiality and wrapping-epoch key isolation under stated sealing and erasure assumptions; plaintext-level forward secrecy of the underlying secret additionally requires the environment to rotate and revoke it.

#42 Least Squares Estimation For Hierarchical Data

著者: Ryan Cumings-Menon, Pavel Zhuravlev

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2404.13164

要約:
The U.S. Census Bureau's 2020 Disclosure Avoidance System (DAS) bases its output on noisy measurements, which are population tabulations added to realizations of mean-zero random variables. These noisy measurements are observed for a set of hierarchical geographic levels, e.g., the U.S. as a whole, states, counties, census tracts, and census blocks. The Census Bureau released the noisy measurements generated in the DAS executions for the two primary 2020 Census data products, in part to allow data users to assess uncertainty in 2020 Census tabulations introduced by disclosure avoidance. This paper describes an algorithm that can leverage the hierarchical structure of the input data in order to compute very high dimensional least squares estimates in a computationally efficient manner. Afterward, we show that this algorithm's output is equal to the generalized least squares estimator, describe how to find the variance of linear functions of this estimator, and provide a numerical experiment in which we compute confidence intervals of tabulations based on this estimator. We also describe an accompanying Census Bureau experimental data product that applies this estimator to the publicly available noisy measurements to provide data users with the inputs required to derive confidence intervals for all tabulations that were included in the 2020 Redistricting Data File, for the U.S., state, county, and census tract geographic levels.

#43 Trust Dynamics in Cryptocurrency Markets: Centralized vs. Decentralized Exchanges

著者: Xintong Wu, Wanlin Deng, Yutong Quan, Lin William Cong, Luyao Zhang

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2404.17227

要約:
Trust mechanisms diverge between centralized and decentralized exchanges, representing distinct sociotechnical governance paradigms. However, quantifying trust dynamics and their redistribution between these architectures remains empirically challenging, limiting understanding of how institutional shocks affect market behavior. The FTX collapse offers a natural experiment to bridge this gap. Through an interdisciplinary approach combining causal inference and computational text analysis, we find significant price declines and capital reallocation from centralized to decentralized exchanges following the event. While sentiment metrics showed no sharp discontinuities, topic modeling and network analysis of Discord communities reveal that seasonal holiday discourse obscured underlying trust concerns in centralized exchange forums. These findings underscore the fragility of institutional trust architectures and demonstrate how mixed methods can illuminate behavioral patterns during systemic crises, offering insights for exchange risk management and regulatory assessment.

#44 Feedback Lunch: Learned Feedback Codes for Secure Communications

著者: Yingyao Zhou, Natasha Devroye, Onur G\"unl\"u

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2510.16620

要約:
We consider reversely-degraded secure-communication channels, for which the secrecy capacity is zero if there is no channel feedback. Specifically, we focus on a seeded modular code design for the block-fading Gaussian wiretap channel with channel-output feedback, combining universal hash functions for security and learned feedback-based codes for reliability. The trade-off between communication reliability and information leakage is studied, illustrating that feedback enables agreeing on a secret key shared between legitimate parties, overcoming the security advantage of the eavesdropper. Our findings motivate code designs for sensing-assisted secure communications in the context of integrated sensing and communication (ISAC).

#45 Vulnerability Abundance: A formal proof of infinite vulnerabilities in code

著者: Eireann Leverett, Jeroen van der Ham-de Vos

公開日: Mon, 04 May 2026 00:00:00 -0400

リンク: https://arxiv.org/abs/2604.07539

要約:
We present a constructive proof that a single C program, the \emph{Vulnerability Factory}, admits a countably infinite set of distinct, independently CVE-assignable software vulnerabilities. We formalise the argument using elementary set theory, verify it against MITRE's CVE Numbering Authority counting rules, sketch a model-checking analysis that corroborates unbounded vulnerability generation, and provide a Turing-machine characterisation that situates the result within classical computability theory. We then contextualise this result within the long-running debate on whether undiscovered vulnerabilities in software are \emph{dense} or \emph{sparse}, and introduce the concept of \emph{vulnerability abundance}: a quantitative analogy to chemical elemental abundance that describes the proportional distribution of vulnerability classes across the global software corpus. Because different programming languages render different vulnerability classes possible or impossible, and because language popularity shifts over time, vulnerability abundance is neither static nor uniform. Crucially, we distinguish between infinite \emph{vulnerabilities} and the far smaller set of \emph{exploits}: empirical evidence suggests that fewer than 6\% of published CVEs are ever exploited in the wild, and that exploitation frequency depends not only on vulnerability abundance but on the market share of the affected software. We argue that measuring vulnerability abundance, and its interaction with software deployment, has practical value for both vulnerability prevention and cyber-risk analysis. We conclude that if one programme can harbour infinitely many vulnerabilities, the set of all software vulnerabilities is necessarily infinite, and we suggest the Vulnerability Factory may serve as a reusable proof artifact, a foundational `test object',for future formal results in vulnerability theory.

cs.CR updates on arXiv.org

📋 論文タイトル一覧