要約:
For digital infrastructure to be safe, compatible, and standards-aligned, automated communication protocol compliance verification is crucial. Nevertheless, current rule-based systems are becoming less and less effective since they are unable to identify subtle or intricate non-compliance, which attackers frequently use to establish covert communication channels in IPv6 traffic. In order to automate IPv6 compliance verification, this paper presents the Artificial Intelligence Driven Compliance Checker Engine (AICCE), a novel generative system that combines dual-architecture reasoning and retrieval-augmented generation (RAG). Specification segments pertinent to each query can be efficiently retrieved thanks to the semantic encoding of protocol standards into a high-dimensional vector space. Based on this framework, AICCE offers two complementary pipelines: (i) Explainability Mode, which uses parallel LLM agents to render decisions and settle disputes through organized discussions to improve interpretability and robustness, and (ii) Script Execution Mode, which converts clauses into Python rules that can be executed quickly for dataset-wide verification. With the debate mechanism enhancing decision reliability in complicated scenarios and the script-based pipeline lowering per-sample latency, AICCE achieves accuracy and F1-scores of up to 99% when tested on IPv6 packet samples across sixteen cutting-edge generative models. By offering a scalable, auditable, and generalizable mechanism for identifying both routine and covert non-compliance in dynamic communication environments, our results show that AICCE overcomes the blind spots of conventional rule-based compliance checking systems.
要約:
Misconfiguration, excessive privilege, and tool fragmentation remain the main reasons why enterprise cloud environments are breached. Recent reports on cloud-native application protection note that most incidents can be traced back to configuration or identity errors rather than platform flaws, and that organizations still need separate tools to watch Kubernetes, OpenStack, and infrastructure-as-code. To address this gap, this paper presents an open-source cloud-infrastructure security framework built with a microservice architecture. The framework integrates four core services: 1) identity and access control unification, 2) configuration-baseline intelligent checking over Kubernetes and OpenStack assets, 3) real-time threat monitoring based on Falco-style runtime rules and ELK-based analytics, and 4) automated remediation that consumes Terraform plans and Checkov/OPA policy results to roll back or harden resources. It provides automated deployment, supports 50-200-node clusters, and exposes uniform REST and gRPC interfaces for extension. In an enterprise-grade testbed, vulnerability-assessment time was reduced from 120 min as baseline toolchain to 18 min, with false-positive rate below 5%. After continuous deployment, the number of observable security events dropped by 62%. The project is released under Apache 2.0 to lower entry cost by about 40% for small and medium teams.
要約:
Universal Circuits (UCs) offer a promising approach to hardware Intellectual Property (IP) obfuscation, leveraging cryptographic principles to hide both structure and function in a programmable logic fabric. Their adaptability makes them especially suitable for the globalized Integrated Circuit (IC) supply chain, where security against threats like reverse engineering is crucial. Despite the potential, UC security remains largely unexplored. This work evaluates UC security against state-of-the-art oracle-guided (OG) and oracle-less (OL) attacks. Results show near-random success rates (approx 50%) for OG attacks whereas OL attacks display minimal structural leakage. Collectively, these findings confirm the feasibility of UCs for IP protection.
要約:
Fully Homomorphic Encryption (FHE) enables privacy-preserving Transformer inference, but long-sequence encrypted Transformers quickly exceed single-GPU memory capacity because encoded weights are already large and encrypted activations grow rapidly with sequence length. Multi-GPU execution therefore becomes unavoidable, yet scaling remains challenging because communication is jointly induced by application-level aggregation and encryption-level RNS coupling. Existing approaches either synchronize between devices frequently or replicate encrypted tensors across devices, leading to excessive communication and latency.
We present AEGIS, an Application-Encryption Guided Inference System for scalable long-sequence encrypted Transformer inference on multi-GPU platforms. AEGIS derives device placement from ciphertext dependencies jointly induced by Transformer dataflow and CKKS polynomial coupling, co-locating modulus-coherent and token-coherent data so that communication is introduced only when application dependencies require it, while reordering polynomial operators to overlap the remaining collectives with computation.
On 2048-token inputs, AEGIS reduces inter-GPU communication by up to 57.9% in feed-forward networks and 81.3% in self-attention versus prior state-of-the-art designs. On four GPUs, it achieves up to 96.62% scaling efficiency, 3.86x end-to-end speedup, and 69.1% per-device memory reduction. These results establish coordinated application-encryption parallelism as a practical foundation for scalable homomorphic Transformer inference.
要約:
Hardware intellectual property (IP) in the globalized integrated circuit (IC) supply chain is exposed to a wide range of confidentiality and integrity attacks by untrusted third-party entities. Existing IP-level countermeasures, such as logic locking, hardware obfuscation, camouflaging, and redaction, have aimed at addressing these them. In particular, hardware redaction has emerged as a robust approach for IP protection against confidentiality attacks, including reverse engineering. We note that existing IP protection approaches, including the ones based on hardware redaction, tend to leave behind structural artifacts that can be exploited by adversaries to bypass protections or predict unlocking keys, using the knowledge of known designs, akin to a known-plaintext attack (KPA) in cryptography. In this work, we present CIPHR, a robust fine-grain hardware redaction methodology inspired by the cryptographic property of indistinguishability. The proposed approach utilizes novel heuristic-driven randomization to introduce significant structural transformations into the redacted designs. We employ structural analysis metrics to evaluate the security achieved by CIPHR compared to various state-of-the-art IP protection techniques. Multiple open-source benchmark designs are used to demonstrate that fine-grain redaction in CIPHR is robust, scalable, and indistinguishable against structural attacks.
要約:
Reasoning language models (RLMs) are increasingly used in programming. Yet, even state-of-the-art RLMs frequently introduce critical security vulnerabilities in generated code. Prior training-based approaches for secure code generation face a critical limitation that prevents their direct application to RLMs: they rely on costly, manually curated security datasets covering only a limited set of vulnerabilities. At the inference level, generic security reminders consistently degrade functional correctness while triggering only shallow ad-hoc vulnerability analysis. To address these problems, we present SecPI, a fine-tuning pipeline that teaches RLMs to internalize structured security reasoning, producing secure code by default without any security instructions at inference time. SecPI filters existing general-purpose coding datasets for security-relevant tasks using an LLM-based classifier, generates high-quality security reasoning traces with a teacher model guided by a structured prompt that systematically enumerates relevant CWEs and mitigations, and fine-tunes the target model on pairs of inputs with no security prompt and teacher reasoning traces -- as a result, the model learns to reason about security autonomously rather than in response to explicit instructions. An extensive evaluation on security benchmarks with state-of-the-art open-weight reasoning models validates the effectiveness of our approach. For instance, SecPI improves the percentage of functionally correct and secure generations for QwQ 32B from 48.2% to 62.2% (+14.0 points) on CWEval and from 18.2% to 22.0% on BaxBench. Further investigation also reveals strong cross-CWE and cross-language generalization beyond training vulnerabilities. Even when trained only on injection-related CWEs, QwQ 32B generates correct and secure code 9.9% more frequently on held-out memory-safety CWEs.
要約:
Vertical split learning (SL) enables collaborative model training across parties holding complementary features without sharing raw data, but recent work has shown that it is highly vulnerable to poisoning-based backdoor attacks operating on intermediate embeddings. By compromising malicious clients, adversaries can inject stealthy triggers that manipulate the server-side model while remaining difficult to detect, and existing defenses provide limited robustness against adaptive attacks. In this paper, we propose ProtoGuard-SL, a server-side defense that improves the robustness of split learning by exploiting class-conditional representation consistency in the embedding space. Our approach is motivated by the observation that benign embeddings within the same class exhibit stable semantic alignment, whereas poisoned embeddings inevitably disrupt this structure. ProtoGuard-SL adopts a two-stage framework that constructs robust class prototypes and transforms embeddings into a prototype-consistency representation, followed by a class-conditional, distribution-free conformal filtering strategy to identify and remove anomalous embeddings. Extensive experiments are conducted on three datasets, CIFAR-10, SVHN, and Bank Marketing, under three different attack settings demonstrate that our method achieves state-of-the-art performance.
要約:
Prompt injection has emerged as a critical vulnerability in large language model (LLM) deployments, yet existing research is heavily weighted toward defenses. The attack side -- specifically, which injection strategies are most effective and why -- remains insufficiently studied.We address this gap with AttackEval, a systematic empirical study of prompt injection attack effectiveness. We construct a taxonomy of ten attack categories organized into three parent groups (Syntactic, Contextual, and Semantic/Social), populate each category with 25 carefully crafted prompts (250 total), and evaluate them against a simulated production victim system under four progressively stronger defense tiers. Experiments reveal several non-obvious findings: (1) Obfuscation (OBF) achieves the highest single-attack success rate (ASR = 0.76) against even intent-aware defenses, because it defeats both keyword matching and semantic similarity checks simultaneously; (2) Semantic/Social attacks - Emotional Manipulation (EM) and Reward Framing (RF) - maintain high ASR (0.44-0.48) against intent-aware defenses due to their natural language surface, which evades structural anomaly detection; (3) Composite attacks combining two complementary strategies dramatically boost ASR, with the OBF + EM pair reaching 97.6%; (4) Stealth correlates positively with residual ASR against semantic defenses (r = 0.71), implying that future defenses must jointly optimize for both structural and behavioral signals. Our findings identify concrete blind spots in current defenses and provide actionable guidance for designing more robust LLM safety systems.
要約:
Fault injection attacks deliberately inject faults into a device via physical channels to disturb its regular execution. Adversaries can effectively deduce secrets by analyzing both the normal and faulty outputs, posing serious threats to cryptographic primitives implemented in hardware. An effective countermeasure to such attacks is via redundancy, commonly referred to as concurrent error detection schemes, where Binary linear codes have been used to defend against fault injection attacks. However, designing an optimal code circuit is often time-consuming, error-prone, and requires substantial expertise. In this paper, we formalize the optimal code circuit synthesis problem (OptiCC) based on two domain-specific minimization objectives on individual inputs and parity size. We then propose a novel algorithm CiSC for solving OptiCC, prioritizing the minimization of individual inputs. Our approach features both correct-by-construction and secure-by-construction. In a nutshell, CiSC gradually reduces individual inputs and parity size by checking, via SMT solving, the existence of feasible Boolean functions for implementing a desired code. We further present an effective technique to lazily generate combinations of inputs to Boolean functions, while quickly identify equivalent ones. We implement our approach in a tool CiSC, and evaluate it on practical benchmarks. Experimental results show our approach can synthesize code circuits that significantly outperform those generated by the latest state-of-the-art techniques.
要約:
As multimodal large language models (LLMs) advance, traditional CAPTCHAs have become obsolete at distinguishing humans from bots. To address this shift, this paper aims to investigate the possibility of using tasks for which humans have evolved highly specialised neural processing. We introduce two CAPTCHA classes: a vision-based CAPTCHA, which renders alphanumeric strings as ASCII art, and an audio-based CAPTCHA, which is a question-answering task with overlapping or noise-corrupted audio context. We evaluate our vision-based CAPTCHA both as text and image input with multiple frontier LLMs (GPT 5.2, Gemini 3, etc.), and assess our audio-based CAPTCHAs by applying augmentations like background noise, Gaussian noise, and overlapping speech. We determined that none of the LLMs were able to solve a single ASCII-based CAPTCHA, with the best performing model only being able to infer at most one or two characters. Additionally, all models that supported audio performed only modestly better than random when solving audio CAPTCHAs. Our results suggest that these CAPTCHAs are exceptionally effective today, but it is unclear whether it can withstand the fast-evolving landscape of artificial intelligence. Subsequent research is needed to determine whether these tasks are temporary vulnerabilities or represent a more durable method of distinguishing humans from bots.
要約:
Authentication is a fundamental security means for protecting system resources. Authenticator-centric authentication techniques (AuthN Techniques) address how mechanisms and credentials are used via Authenticators. There are many AuthN Techniques that differ in many ways and there exist classification approaches that aim to structure them. However, they are limited in the aspects they classify and are not flexible enough to accommodate the diverse nature of AuthN Techniques. This paper presents two contributions. First, novel, faceted classification schemes for AuthN Techniques and Authenticators are presented. The schemes were developed based on 345 papers identified through a targeted LLM-assisted literature review and semantic clustering. The classification schemes were applied to build a catalog of Authenticators and AuthN Techniques; the second contribution of this paper. This paper presents our methodology, the classification schemes with example applications, the list of AuthN Techniques from the catalog, and discussions on future work.
要約:
This paper studies how post-quantum cryptographic (PQC) security assumptions can be represented and communicated through a structured, layered framework that is useful for technical interpretation but does not replace formal cryptographic proofs.
We propose ``Explainable PQC,'' an interdisciplinary framework connecting three layers: (1) a complexity-based interpretive model that distinguishes classical security, quantum security, and reduction-backed hardness, drawing on computational complexity classes as supporting language; (2) an exploratory mathematical investigation applying combinatorial Hodge theory and polyhedral geometry to study structural aspects of lattice hardness; and (3)~an empirical experimentation platform, implemented in Julia, for measuring the behavior of lattice basis reduction algorithms (LLL, BKZ) in low-dimensional settings. The motivating case study throughout the paper is lattice-based PQC, including ML-KEM (FIPS 203) and ML-DSA (FIPS 204).
The contribution of this paper is conceptual and organizational: it defines a layered interpretive framework, clarifies its scope relative to formal cryptographic proofs and reduction-based security arguments, and identifies mathematical and implementation-level directions through which PQC security claims may be more transparently communicated. This paper does not claim new cryptographic hardness results, new attacks, or concrete security parameter estimates.
要約:
Reverse engineering (RE) is central to software security, particularly for cryptographic programs that handle sensitive data and are highly prone to vulnerabilities. It supports critical tasks such as vulnerability discovery and malware analysis. Despite its importance, RE remains labor-intensive and requires substantial expertise, making large language models (LLMs) a potential solution for automating the process. However, their capabilities for RE remain systematically underexplored. To address this gap, we study the cryptographic binary RE capabilities of LLMs and introduce \textbf{CREBench}, a benchmark comprising 432 challenges built from 48 standard cryptographic algorithms, 3 insecure crypto key usage scenarios, and 3 difficulty levels. Each challenge follows a Capture-the-Flag (CTF) RE challenge, requiring the model to analyze the underlying cryptographic logic and recover the correct input. We design an evaluation framework comprising four sub-tasks, from algorithm identification to correct flag recovery. We evaluate eight frontier LLMs on CREBench. GPT-5.4, the best-performing model, achieves 64.03 out of 100 and recovers the flag in 59\% of challenges. We also establish a strong human expert baseline of 92.19 points, showing that humans maintain an advantage in cryptographic RE tasks. Our code and dataset are available at https://github.com/wangyu-ovo/CREBench.
要約:
Modern advanced driver assistance systems (ADAS) rely on deep neural networks (DNNs) for perception and planning. Since DNNs' parameters reside in DRAM during inference, bit flips caused by cosmic radiation or low-voltage operation may corrupt DNN computations, distort driving decisions, and lead to real-world incidents. This paper presents a SpatioTemporal-Aware Fault Injection (STAFI) framework to locate critical fault sites in DNNs for ADAS efficiently. Spatially, we propose a Progressive Metric-guided Bit Search (PMBS) that efficiently identifies critical network weight bits whose corruption causes the largest deviations in driving behavior (e.g., unintended acceleration or steering). Furthermore, we develop a Critical Fault Time Identification (CFTI) mechanism that determines when to trigger these faults, taking into account the context of real-time systems and environmental states, to maximize the safety impact. Experiments on DNNs for a production ADAS demonstrate that STAFI uncovers 29.56x more hazard-inducing critical faults than the strongest baseline.
要約:
Cyber attacks targeting Industrial Control Systems (ICS) have become increasingly sophisticated and hard to identify. Detecting such attacks requires integrating low-level behavioral cues with high-level semantic interpretation, a capability that traditional anomaly detectors lack. This paper presents a Digital Twin (DT)-driven hybrid detection approach that combines deterministic heuristics with systematic, constrained Large Language Model (LLM) reasoning to achieve real-time incident detection. The DT maintains a synchronized, feature-enriched representation of the Secure Water Treatment (SWaT) process, deriving behavioral descriptors. Heuristics identify characteristic signatures of spoofing, valve forcing, denial-of-service, and bias drift, while the LLM is invoked only when heuristics abstain. A constrained JSON schema and semantic plausibility filters ensure physically consistent LLM outputs, and a temporal smoothing layer stabilizes the final decision signal. Evaluation on four canonical SWaT attack scenarios shows that the proposed detector precisely localizes each attack interval with low time-to-detect and zero False Positives (FPs) in the evaluated benign region. Results are consistent across both a local LLaMA model and a cloud-based GPT model, demonstrating the robustness of the constrained hybrid architecture. The findings highlight the potential of DT-guided LLM reasoning as a reliable and interpretable approach to ICS anomaly detection.
要約:
Adams Bridge, a hardware accelerator for ML-DSA and ML-KEM designed for the Caliptra root of trust, masks 1 of its Inverse Number Theoretic Transform (INTT) layers and relies on shuffling for the remainder, claiming per-butterfly Correlation Power Analysis (CPA) complexities of 2^46 (ML-DSA) and 2^96 (ML-KEM). We evaluate these claims against published side-channel literature across seven analysis tracks with confidence-rated evidence. Register-Transfer Level (RTL) analysis confirms that the design's Random Start Index (RSI) shuffling provides 6 bits of entropy per layer (64 orderings) rather than the 296 bits of a full random permutation assumed in its scaling argument, with effective margins below the designers' estimates. A soft-analytical attack pipeline demonstrates a 37-bit enumeration reduction, independent of Belief Propagation (BP) gains, quantifying the attack-model gap without achieving key recovery. Full-scale BP on the complete INTT factor graph achieves 100% coefficient recovery over the single-layer baseline, resolving whether BP gains scale to production-size Number Theoretic Transform (NTT) structures. A genie-aided information-theoretic bound shows observations contain sufficient mutual information for full recovery at SNRxN as low as 15. Layer-ablation analysis identifies four necessary conditions governing BP convergence. Observation topology, not count, determines recovery: 4 evenly spread layers achieve 100% while 4 consecutive layers achieve 0%, yielding a practical countermeasure design tool. Strategic masking of 3 consecutive mid-layers (43% overhead vs. full masking) creates an unrecoverable gap that defeats soft-analytical attacks. We contribute a reusable security margin audit methodology combining RTL verification, epistemic confidence tagging, sensitivity-scenario analysis, and experimental validation applicable to any partially masked NTT accelerator.
要約:
Transformer-based malware detection systems operating on graph modalities such as control flow graphs (CFGs) achieve strong performance by modeling structural relationships in program behavior. However, their robustness to adversarial evasion attacks remains underexplored. This paper examines the vulnerability of a RoBERTa-based malware detector that linearizes CFGs into sequences of function calls, a design choice that enables transformer modeling but may introduce token-level sensitivities and ordering artifacts exploitable by adversaries. By evaluating evasion strategies within this graph-to-sequence framework, we provide insight into the practical robustness of transformer-based malware detectors beyond aggregate detection accuracy.
This paper proposes a white-box adversarial evasion attack that leverages explainability mechanisms to identify and perturb most influential graph components. Using token- and word-level attributions derived from integrated gradients, the attack iteratively replaces positively attributed function calls with synthetic external imports, producing adversarial CFG representations without altering overall program structure. Experimental evaluation on small- and large-scale Windows Portable Executable (PE) datasets demonstrates that the proposed method can reliably induce misclassification, even against models trained to high accuracy. Our results highlight that explainability tools, while valuable for interpretability, can also expose critical attack surfaces in transformer-based malware detectors.
要約:
Formally guaranteeing the safety and liveness of regulatory state transitions in cross-domain state synchronization systems is a problem of growing importance as tokenized assets are increasingly operated across heterogeneous blockchain networks and off-chain ledgers. This paper presents a mechanized proof of 2,348 lines in Isabelle/HOL establishing two complementary properties. First, cross-domain state preservation (safety): a regulatory state transition performed on one domain is faithfully reflected across all connected domains with structural preservation. This guarantee encompasses bidirectional roundtrip preservation, consistency across an arbitrary finite set of domains, and per-asset isolation. Second, liveness under Byzantine faults: in the presence of up to f < n/3 Byzantine nodes, we prove deterministic resolution of conflicting regulatory actions, deadlock freedom, and starvation freedom. In the combination of these two properties, the liveness proof discharges the honest-node assumption of the safety proof under Byzantine faults, promoting conditional safety to an unconditional guarantee. The seven generic locales derived in this process are domain-independent and reusable for arbitrary domains via Isabelle/HOL's interpretation mechanism. The application context is a regulatory state transition model based on the RCP framework (arXiv:2603.29278), which systematizes 31 requirements from 15 global financial regulatory authorities. All proof artifacts build in Isabelle/HOL without sorry or oops, have been submitted to the Archive of Formal Proofs (under review), and are publicly available on GitHub.
要約:
WebAssembly is quickly becoming a popular compilation target for a variety of code. However, vulnerabilities in the source languages translate to vulnerabilities in the WebAssembly binaries. This work proposes a methodology and a WebAssembly transpiler to prevent buffer overflows in the unmanaged memory of the WebAssembly runtime. The transpiler accepts a WebAssembly binary and adds stack canaries and Address Space Layout Randomization (ASLR) to protect against buffer overflows.
要約:
Traditional consensus mechanisms, such as Proof of Stake (PoS), increasingly reveal an excessive dependency on large liquidity providers. Although the Proof of Liquidity (PoL) mechanism serves as a critical paradigm for incentivizing sustained liquidity provision and ensuring market stability, its transition from asset staking to active liquidity management significantly increases the complexity of underlying smart contract economic models and interaction logic. This renders hidden liquidity logic flaws difficult to detect via traditional methods, seriously threatening the system stability and user asset security of mainstream DeFi and emerging PoL ecosystems. To address this, we propose the LiquiLM framework, which integrates Large Language Models (LLMs) with a Dynamic Co-Attention Network (DCN). By establishing a dynamic interaction between liquidity-critical contracts and flaw descriptions, the framework effectively bridges the semantic gap between underlying code implementations and high-level liquidity intents. We evaluate the performance of LiquiLM on 1,490 validation contracts (covering precision, recall, specificity, and F1-score). The results show that it achieves significant effectiveness in auditing and explaining liquidity flaws: in experiments using Gemini 3 Pro and GPT-4o as backbone models, respectively, the F1-scores both exceed 90%. Furthermore, through an in-depth audit of 1,380 real-world PoL and Ethereum economic contracts, LiquiLM successfully identifies 238 high-risk contracts and assists in discovering 10 vulnerabilities that have received CVE certification.
要約:
Federated learning (FL) enables multiple clients to collaboratively train a global machine learning model via a server without sharing their private training data. In traditional FL, the system follows a synchronous approach, where the server waits for model updates from numerous clients before aggregating them to update the global model. However, synchronous FL is hindered by the straggler problem. To address this, the asynchronous FL architecture allows the server to update the global model immediately upon receiving any client's local model update. Despite its advantages, the decentralized nature of asynchronous FL makes it vulnerable to poisoning attacks. Several defenses tailored for asynchronous FL have been proposed, but these mechanisms remain susceptible to advanced attacks or rely on unrealistic server assumptions. In this paper, we introduce SecureAFL, an innovative framework designed to secure asynchronous FL against poisoning attacks. SecureAFL improves the robustness of asynchronous FL by detecting and discarding anomalous updates while estimating the contributions of missing clients. Additionally, it utilizes Byzantine-robust aggregation techniques, such as coordinate-wise median, to integrate the received and estimated updates. Extensive experiments on various real-world datasets demonstrate the effectiveness of SecureAFL.
要約:
Standard communication protocols for Unmanned Aerial Vehicles (UAVs), such as MAVLink, lack the capability to enforce the contextual validity of message sequences. Autopilots therefore remain vulnerable to stealthy attacks, where syntactically correct but semantically ill-timed commands induce unsafe states without triggering physical anomaly detectors. Prior work (DATUM) demonstrated that global Refined Multiparty Session Types (RMPSTs) are an effective specification language for centralized MAVLink protocol enforcement, but suffered from two engineering failures: manual proof terms interleaved with protocol definitions, and an OCaml extraction backend whose managed runtime is incompatible with resource-constrained UAV hardware. We present Platum, a framework that addresses both failures with a minimal DSL requiring only the five semantic components of a global session type (sender, receiver, label, payload variable, refinement predicate), whose structural well-formedness conditions are confirmed via reflective decision procedures in Meta-F*. Confirmed specifications are compiled directly into flat, allocation-free C Finite State Machines (FSMs), deployed as centralized proxy monitors at the GCS/UAV communication boundary. Our evaluation demonstrates a 4x reduction in total monitor latency and lower memory overhead compared to DATUM, measured via ArduPilot SITL simulation.
要約:
IoT location services accept client-reported GPS coordinates at face value, yet spoofing is trivial with consumer-grade tools. Existing spoofing detectors output a binary decision, forcing system designers to choose between high false-deny and high false-accept rates. We propose a graduated trust gate that computes a multi-signal integrity score and maps it to three actions: PROCEED, STEP-UP, or DENY, where STEP-UP invokes a stronger verifier such as a zero-knowledge proximity proof. A session-latch mechanism ensures that a single suspicious fix blocks the entire session, preventing post-transition score recovery. Under an idealized step-up oracle on 10,000 synthetic traces, the gate enables strict thresholds (theta_p = 0.9) that a binary gate cannot safely use: at matched false-accept rate (11%), the graduated gate maintains zero false-deny rate versus 0.05% for binary, with 5 microseconds scoring overhead. Real-device traces from an Android smartphone demonstrate the session-latch mechanism and show that a nearby mock location (~550 m) evades theta_p = 0.7 but is routed to step-up at theta_p = 0.9. Signal ablation identifies a minimal two-signal configuration (F1 = 0.84) suitable for resource-constrained scoring layers.
要約:
A zero-knowledge proximity proof certifies geometric nearness but carries no commitment to an application context. In stateful geo-content systems, where drops can share coordinates, policies evolve, and content has persistent identity, this gap can permit proof transfer between application objects unless extra operational invariants are maintained. We present a systems-security analysis of this deployment problem: a taxonomy of context-binding vulnerabilities, a formal off-circuit verification model for a transcript-adversary that holds a recorded proof but cannot obtain fresh coordinates, an assumption comparison across five binding strategy classes, and a concrete instantiation, Zairn-ZKP, that embeds drop identity, policy version, and session context as public circuit inputs. Compared with a strong off-circuit alternative based on stored-digest server checking, in-proof binding reduces operational invariants from four to two and adds no measurable proving cost relative to the sound geo-only baseline (-0.12 ms median in our setup). It also removes a correctness pitfall we identify empirically: a plausible off-circuit implementation that omits one server-side check remains vulnerable to cross-drop transfer. Measurements across six network conditions, seven venues in four countries, and an epoch-window simulation indicate that same-epoch transfer is realistic in dense urban deployments unless per-request nonces are maintained. Across five platforms and seven binding strategies, the results support a deployable methodology for reducing assumption surfaces in stateful ZK-backed verification workflows.
要約:
Location-based systems that combine encrypted geographic search with zero-knowledge proximity proofs typically treat the two phases as independent. Under an honest-but-curious server, this leaves an authorization provenance gap: once session state is purged, no forensic procedure can attribute a proof to its originating search session, because the proof's public inputs encode no session-identifying information.
We formalize this gap as the search-authorized proof (SAP) security notion and show via a concrete audit re-association attack that proof-external mechanisms, where authorization evidence remains outside the proof, cannot prevent forensic misattribution when the same drop parameters recur across sessions. Search-Bound Proximity Proofs (SBPP) realize the SAP requirements without modifying the ZKP circuit: session nonce, Merkle-root result-set commitment, and signed receipt are decomposed into independently auditable components, enabling property-level fault isolation in offline audit. Experiments on synthetic and real-world data (110,776 OpenStreetMap POIs) show sub-millisecond absolute overhead on a 125 ms Groth16 baseline.
要約:
The Learning with Errors (LWE) problem is a hard math problem in lattice-based cryptography. In the simplest case of binary secrets, it is the subset sum problem, with error. Effective ML attacks on LWE were demonstrated in the case of binary, ternary, and small secrets, succeeding on fairly sparse secrets. The ML attacks recover secrets with up to 3 active bits in the "cruel region" (Nolte et al., 2024) on samples pre-processed with BKZ. We show that using larger training sets and repeated examples enables recovery of denser secrets. Empirically, we observe a power-law relationship between model-based attempts to recover the secrets, dataset size, and repeated examples. We introduce a stepwise regression technique to recover the "cool bits" of the secret.
要約:
As cloud environments become increasingly complex, cybersecurity and forensic investigations must evolve to meet emerging threats. Large Language Models (LLMs) have shown promise in automating log analysis and reasoning tasks, yet they remain vulnerable to prompt injection attacks and lack forensic rigor. To address these dual challenges, we propose a unified, secure-by-design GenAI framework that integrates PromptShield and the Cloud Investigation Automation Framework (CIAF). PromptShield proactively defends LLMs against adversarial prompts using ontology-driven validation that standardizes user inputs and mitigates manipulation. CIAF streamlines cloud forensic investigations through structured, ontology-based reasoning across all six phases of the forensic process. We evaluate our system on real-world datasets from AWS and Microsoft Azure, demonstrating substantial improvements in both LLM security and forensic accuracy. Experimental results show PromptShield boosts classification performance under attack conditions, achieving precision, recall, and F1 scores above 93%, while CIAF enhances ransomware detection accuracy in cloud logs using Likert-transformed performance features. Our integrated framework advances the automation, interpretability, and trustworthiness of cloud forensics and LLM-based systems, offering a scalable foundation for real-time, AI-driven incident response across diverse cloud infrastructures.
要約:
AI control protocols use monitors to detect attacks by untrusted AI agents, but standard single-score monitors face two limitations: they miss subtle attacks where outputs look clean but reasoning is off, and they collapse to near-zero safety when the monitor is the same model as the agent (collusion). We present TraceGuard, a structured multi-dimensional monitoring protocol that evaluates agent actions across five dimensions -- goal alignment, constraint adherence, reasoning coherence, safety awareness, and action-trace consistency -- scored in parallel by independent LLM calls, augmented by seven heuristic detectors and an LLM-based intent analyzer. We evaluate on BashArena (637 bash tasks, 4 attack categories) within the ControlArena framework. Our results on 519 samples (279 honest, 240 attack) show that: (1) the hybrid approach achieves clear attack-honest separation (attack mean 0.616 vs. honest mean 0.206, Delta=0.410); (2) structured scoring constrains collusion -- the untrusted structured monitor achieves 95% safety vs. 0% for single-score untrusted monitoring; (3) goal alignment and constraint adherence are the most discriminative dimensions; and (4) a separation-of-duties variant splitting dimensions across trusted and untrusted models achieves 100% safety while preventing any single model from seeing the full evaluation. TraceGuard is implemented as a new monitor type for the open-source ControlArena framework.
要約:
With the widespread use of software systems in critical infrastructures such as hydropower plants has brought many advantages, yet it has exposed these systems to cyber threats. Cyber risk assessment & mitigation is important to identify cyber threats and protect these systems from unwanted incidents. This paper evaluates and compares the two risk assessment methodologies namely Hazard and Operability Study (HAZOP) and BowTie analysis for identifying cyber induced threats in hydropower systems. We selected these two methodologies because they offer a complementary perspective for cyber-safety risk assessment. Each method is first applied in traditional form to identify hazards, barriers, and threat scenarios arising from accidental causes, then extended to examine how findings change under cyber-induced causation. The traditional HAZOP identifies 18 deviations across five control parameters; the cyber extension shows how an adversary can coordinate multiple deviations to produce outcomes that conventional safeguards cannot detect. The BowTie analysis maps preventive and mitigation barriers around a top event; the cyber extension reveals that barriers appearing independently can share network infrastructure a single attacker could compromise, challenging the defense-in-depth assumption. Together, the two methods provide complementary coverage: HAZOP systematically enumerates what can go wrong, while BowTie shows how barriers provide layered protection. The cyber extension applied to both exposes assumptions, independent causes in HAZOP and independent barriers in BowTie, that do not hold against a coordinated adversary. As a result of this study, this paper highlights a practical two-stage approach to adapt established safety methods to identify cybersecurity challenges in hydropower control systems, provides pros and cons of these methodologies, and shows area of applicability.
要約:
The growing complexity of real-time embedded systems demands strong isolation of software components into separate protection domains to reduce attack surfaces and limit fault propagation. However, application-supplied device interrupt handlers -- even untrusted -- have to remain in the kernel to minimize interrupt latency, undermining security and burdening manual certifications. Current hardware extensions accelerate interrupts only when the target protection domain is scheduled by the kernel; consequently, they are limited to improving average-case performance but not worst-case latency, and do not meet the requirements of critical real-time applications such as autonomous vehicles or robots.
To overcome this limitation, we propose a novel hardware extension that enables direct, deterministic switching to the appropriate protection domain upon user-level interrupt arrival -- without kernel intervention -- even when that domain is dormant. Our hardware extension reduces worst-case latency by more than 50x with a 19% increase in core area (2% of total die area) and 4.1% increase in dynamic power. To the best of our knowledge, this is the first integrated mechanism to guarantee user-level interrupt delivery with a nanosecond-scale yet bounded worst-case latency.
要約:
With the increasing importance of data privacy and security, federated unlearning emerges as a new research field dedicated to ensuring that once specific data is deleted, federated learning models no longer retain or disclose related information. In this paper, we propose a zero-shot federated unlearning scheme, named Jellyfish. It distinguishes itself from conventional federated unlearning frameworks in four key aspects: synthetic data generation, knowledge disentanglement, loss function design, and model repair. To preserve the privacy of forgotten data, we design a zero-shot unlearning mechanism that generates error-minimization noise as proxy data for the data to be forgotten. To maintain model utility, we first propose a knowledge disentanglement mechanism that regularises the output of the final convolutional layer by restricting the number of activated channels for the data to be forgotten and encouraging activation sparsity. Next, we construct a comprehensive loss function that incorporates multiple components, including hard loss, confusion loss, distillation loss, model weight drift loss, gradient harmonization, and gradient masking, to effectively align the learning trajectories of the objectives of ``forgetting" and ``retaining". Finally, we propose a zero-shot repair mechanism that leverages proxy data to restore model accuracy within acceptable bounds without accessing users' local data. To evaluate the performance of the proposed zero-shot federated unlearning scheme, we conducted comprehensive experiments across diverse settings. The results validate the effectiveness and robustness of the scheme.
要約:
Tool-calling LLM agents can read private data, invoke external services, and trigger real-world actions, creating a security problem at the point of tool execution. We identify a denial-feedback leakage pattern, which we term causality laundering, in which an adversary probes a protected action, learns from the denial outcome, and exfiltrates the inferred information through a later seemingly benign tool call. This attack is not captured by flat provenance tracking alone because the leaked information arises from causal influence of the denied action, not direct data flow. We present the Agentic Reference Monitor (ARM), a runtime enforcement layer that mediates every tool invocation by consulting a provenance graph over tool calls, returned data, field-level provenance, and denied actions. ARM propagates trust through an integrity lattice and augments the graph with counterfactual edges from denied-action nodes, enabling enforcement over both transitive data dependencies and denial-induced causal influence. In a controlled evaluation on three representative attack scenarios, ARM blocks causality laundering, transitive taint propagation, and mixed-provenance field misuse that a flat provenance baseline misses, while adding sub-millisecond policy evaluation overhead. These results suggest that denial-aware causal provenance is a useful abstraction for securing tool-calling agent systems.
要約:
As Large Language Models (LLMs) are increasingly deployed in complex applications, their vulnerability to adversarial attacks raises urgent safety concerns, especially those evolving over multi-round interactions. Existing defenses are largely reactive and struggle to adapt as adversaries refine strategies across rounds. In this work, we propose CoopGuard , a stateful multi-round LLM defense framework based on cooperative agents that maintains and updates an internal defense state to counter evolving attacks. It employs three specialized agents (Deferring Agent, Tempting Agent, and Forensic Agent) for complementary round-level strategies, coordinated by System Agent, which conditions decisions on the evolving defense state (interaction history) and orchestrates agents over time. To evaluate evolving threats, we introduce the EMRA benchmark with 5,200 adversarial samples across 8 attack types, simulating progressively LLM multi-round attacks. Experiments show that CoopGuard reduces attack success rate by 78.9% over state-of-the-art defenses, while improving deceptive rate by 186% and reducing attack efficiency by 167.9%, offering a more comprehensive assessment of multi-round defense. These results demonstrate that CoopGuard provides robust protection for LLMs in multi-round adversarial scenarios.
要約:
Protecting sensitive information in data-driven collaborations, such as AI training, while meeting the diverse requirements of multiple mutually distrusted stakeholders, is both crucial and challenging. This paper presents Styx, a novel framework to address this challenge by integrating sticky policies with Trusted Execution Environments (TEEs). At a high level, Styx employs a hardware-TEE-protected middleware with a programming language runtime to form a sandboxed environment for both the data processing and policy enforcement. We carefully designed a data processing workflow and pipelines to enable a strong yet flexible data-specific policy enforcement throughout the entire data lifecycle and data derivation to achieve data-in-use protection, data lifecycle protection and dynamic collaboration. We implemented Styx and demonstrated its ability to make collaborative computing, such as joint AI training, more secure, privacy-preserving, and policy-compliant. Our evaluation shows the performance overheads imposed by Styx are reasonable on single-node computation with the capability to scale to a large distributed multi-node deployment.
要約:
Virtual Private Networks (VPNs) are widely used for censorship evasion and traffic protection. VPN users expect to be provided with adequate security protection, and at the same time not be affected by other users connected to the same VPN server, which can be illustrated as the non-interference property. However, in this paper, we have identified several vulnerabilities that violate this property, specifically within the connection tracking frameworks of VPN servers, stemming from shared resource misuse and insufficient validation of session state transitions. We present three session manipulation attacks targeting TCP and UDP traffic tunneled through VPNs. The attacker who only connects to the same VPN server can launch denial-of-service attacks, hijack TCP connections of other clients, or inject forged DNS responses into their queries. We evaluate these attacks against five popular connection tracking frameworks across different OSes and nine major commercial VPN providers. Experimental results reveal that all frameworks and eight providers are vulnerable to at least one of the attacks. We have responsibly disclosed our findings with countermeasures, resulting in 19 assigned CVEs/CNVDs and acknowledgments from the communities and providers.
要約:
Developers utilize third-party libraries to improve productivity, which also introduces potential security risks. Existing approaches generate tests for public functions to trigger library vulnerabilities from client programs, yet they depend on proof-of-concepts (PoCs), which are often unavailable. In this paper, we propose a new approach, LiveFuzz, based on directed greybox fuzzing (DGF) to detect the exploitability of library vulnerabilities from client programs without PoCs. LiveFuzz exploits a target tuple to extend existing DGF techniques to cross-program scenarios. Based on the target tuple, LiveFuzz introduces a novel Abstract Path Mapping mechanism to project execution paths, mitigating the preference for shorter paths. LiveFuzz also proposes a risk-based adaptive mutation to mitigate excessive mutation. To evaluate LiveFuzz, we construct a new dataset including 61 cases of library vulnerabilities exploited from client programs. Results show that LiveFuzz increases the number of target-reachable paths compared with all baselines and improves the average speed of vulnerability exposure. Three vulnerabilities are triggered exclusively by LiveFuzz.
要約:
Cybersecurity research increasingly depends on reproducible evidence, such as traffic traces, logs, and labeled datasets, yet most public datasets remain static and offer limited support for controlled re-execution and traceability, especially in heterogeneous multi-protocol environments. This paper presents NetSecBed, a container-native, scenario-oriented testbed for reproducible generation of network traffic evidence and execution artifacts under controlled conditions, particularly suitable for IoT, IIoT, and pervasive multi-protocol environments. The framework integrates 60 attack scenarios, 9 target services, and benign traffic generators as single-purpose containers, enabling plug-and-play extensibility and traceability through declarative specifications. Its pipeline automates parametrized execution, packet capture, log collection, service probing, feature extraction, and dataset consolidation. The main contribution is a repeatable, auditable, and extensible framework for cybersecurity experimentation that reduces operational bias and supports continuous dataset generation.
要約:
With the rapid evolution of wireless technologies, Wi-Fi has expanded beyond its original role in data transmission to support various emerging applications, particularly in physical-layer security, including device authentication, user authentication, and secret key generation. Despite extensive research on Wi-Fi Channel State Information (CSI)-based physical-layer security, its vulnerabilities remain largely unexplored. In this work, we propose BFIAttack, a novel attack that exploits Beamforming Feedback Information (BFI) to reconstruct the CSI of a legitimate user or device, thereby compromising Wi-Fi-based physical-layer security. We realize the attack by leveraging a closed-form CSI reconstruction method for the single-antenna station scenario and a maximum likelihood estimation-based CSI reconstruction for the multi-antenna station scenario. Moreover, we exploit spatial similarities among antenna pairs to refine the reconstructed CSI and enhance attack effectiveness. Experimental results show that BFIAttack achieves an average attack success rate of $73\%$ in multi-antenna station scenarios with no more than five attack attempts, and over $93\%$ in single-antenna station scenarios with only a single attempt. BFIAttack reveals critical vulnerabilities in existing Wi-Fi-based physical-layer security.
要約:
Post-quantum signature schemes such as ML-DSA-65 produce signatures of 3,309 bytes and public keys of 1,952 bytes over 50 times larger than classical Ed25519. In TLS-authenticated environments like Kubernetes control planes and 5G Core networks, where every inter-component connection is mutually authenticated, this overhead compounds across thousands of handshakes per second. Merkle Tree Certificates (MTC), currently under development at IETF, replace per-certificate issuer signatures with Merkle inclusion proofs and, in the landmark mode, eliminate on-wire signatures from certificate authentication entirely. We present MTC-based PKI architectures for Kubernetes and 3GPP 5G Service-Based Architecture. Starting from the infrastructure layer, we replace the Kubernetes cluster CA with an MTCA deployment that issues MTC certificates to control plane components, with cosigners and a DaemonSet-based landmark distributor. Building on this, we design a certificate lifecycle for 5G Network Functions deployed against QORE, a post-quantum 5G Core. We implement MTC proof construction and verification in Go crypto/tls and crypto/x509 packages. Our measurements on an Intel i9-12900 show MTC landmark verification completing in under 2 {\mu}s compared to 24 microseconds for ECDSA signature verification-with no measurable impact on TLS handshake time. We further propose a 6G-native architecture where the NRF serves as the MTCA and the SCP as witness cosigner, and discuss applicability to Non-Terrestrial Networks.
要約:
Modern blockchains increasingly rely on parallel execution to improve throughput. We show several industry and academic transaction fee mechanisms (TFMs) struggle to simultaneously account for execution parallelism while remaining performant and fair. First, if parallelism affects fees, adversarial protocol manipulations that offset possible benefits to throughput by introducing fake transactions become rational: users can insert functionally useless parallel transactions solely to reduce fees, and schedulers can create useless sequential transactions to increase revenue. Execution contingency, a core feature of expressive programming languages, both exacerbates the aforementioned threats and introduces new ones: (1) users may overpay for unused resources, and (2) scheduler revenue is harmed when reserved scheduling slots go unused due to contingency. We introduce a framework for this challenging setting, and prove an impossibility, highlighting an inherent tension: both parallelism and contingency involve a trade-off between minimizing risks for users and schedulers, as favoring one comes at the expense of the other. To complete the picture, we introduce a fee mechanisms and prove that they achieve the boundaries of this trade-off. Our results provide rigorous foundations for evaluating designs advanced by notable blockchains, such as Sui and Monad.
要約:
Blockchain forensics inherently involves dynamic and iterative investigations, while many existing approaches primarily model it through static inference pipelines. We propose a paradigm shift towards Agentic Blockchain Forensics (ABF), modeling forensic investigation as a sequential decision-making process. To instantiate this paradigm, we introduce LOCARD, the first agentic framework for blockchain forensics. LOCARD operationalizes this perspective through a Tri-Core Cognitive Architecture that decouples strategic planning, operational execution, and evaluative validation. Unlike generic LLM-based agents, it incorporates a Structured Belief State mechanism to enforce forensic rigor and guide exploration under explicit state constraints. To demonstrate the efficacy of the ABF paradigm, we apply LOCARD to the inherently complex domain of cross-chain transaction tracing. We introduce Thor25, a benchmark dataset comprising over 151k real-world cross-chain forensic records, and evaluate LOCARD on the Group-Transfer Tracing task for dismantling Sybil clusters. Validated against representative laundering sub-flows from the Bybit hack, LOCARD achieves high-fidelity tracing results, providing empirical evidence that modeling blockchain forensics as an autonomous agentic task is both viable and effective. These results establish a concrete foundation for future agentic approaches to large-scale blockchain forensic analysis. Code and dataset are publicly available at https://github.com/xhyumiracle/locard and https://github.com/xhyumiracle/thorchain-crosschain-data.
要約:
The AI-based sensing and autonomous monitoring have become the main components of wildfire early detection, but current systems do not provide adaptive inter-agent coordination, structurally defined human control, and cryptographically verifiable responsibility. Purely autonomous alert dissemination in the context of safety critical disasters poses threats of false alarming, governance failure and lack of trust in the system. This paper provides a blockchain-based governance-conscious agentic AI architecture of trusted wildfire early warning. The monitoring of wildfires is modeled as a constrained partially observable Markov decision process (POMDP) that accounts for the detection latency, false alarms reduction and resource consumption with clear governance constraints. Hierarchical multi-agent coordination means dynamic risk-adaptive reallocation of unmanned aerial vehicles (UAVs). With risk-adaptive policies, a permissioned blockchain layer sets mandatory human-authorization as a state-transition invariant as a smart contract. We build formal assurances such as integrity of alerts, human control, non-repudiation and limited detection latency assumptions of Byzantine fault. Security analysis shows that it is resistant to alert injections, replays, and tampering attacks. High-fidelity simulation environment experimental evaluation of governance enforcement demonstrates that it presents limited operational overhead and decreases false public alerts and maintains adaptive detection performance. This work is a step towards a principled design paradigm of reliable AI systems by incorporating accountability into the agentic control loop of disaster intelligence systems that demand safety in their application.
要約:
Modern 5G user equipment (UE) processes Radio Resource Control (RRC) configuration messages during early control-plane exchanges, before authentication and integrity protection are established. Prior work for testing 5G UEs has largely focused on constructing syntactically invalid inputs. In contrast, we show that syntactically valid but semantically inconsistent messages, which violate specification-level field constraints or cross-field dependencies, can drive baseband implementations into invalid states, triggering assertion failures or modem crashes. These findings reveal semantic inconsistencies in pre-authentication signaling as a critical yet underexplored attack surface in 5G UE implementations. To address this gap, we present Constraint-Guided Semantic Testing (ConSeT), a framework that systematically extracts specification-level constraints and leverages them to generate targeted semantic violations for testing 5G UEs. ConSeT decodes RRC messages into structured fields, derives schema-based rules, infers cross-field dependencies using a Large Language Model (LLM) in an evidence-bounded manner, and produces syntactically valid test cases that intentionally violate semantic constraints. We evaluate ConSeT on both commercial and open-source 5G UEs. On commercial smartphones, it uncovers 7 previously unknown vulnerabilities through responsible disclosure, including 3 high-severity CVEs, affecting 64 chipset models and over 542 commercially available smartphone models. On the open-source OAI UE, ConSeT additionally triggers 29 distinct crash sites.
要約:
Large language models (LLMs) are increasingly embedded in open-source software (OSS) ecosystems, creating complex interactions among natural language prompts, probabilistic model outputs, and execution-capable components. However, it remains unclear whether traditional vulnerability disclosure frameworks adequately capture these model-mediated risks. To investigate this, we analyze 295 GitHub Security Advisories published between January 2025 and January 2026 that reference LLM-related components, and we manually annotate a sample of 100 advisories using the OWASP Top 10 for LLM Applications 2025.
We find no evidence of new implementation-level weakness classes specific to LLM systems. Most advisories map to established CWEs, particularly injection and deserialization weaknesses. At the same time, the OWASP-based analysis reveals recurring architectural risk patterns, especially Supply Chain, Excessive Agency, and Prompt Injection, which often co-occur across multiple stages of execution. These results suggest that existing advisory metadata captures code-level defects but underrepresents model-mediated exposure. We conclude that combining the CWE and OWASP perspectives provides a more complete and necessary view of vulnerabilities in LLM-integrated systems.
要約:
When an LLM deobfuscates JavaScript, can poisoned identifier names in the string table survive into the model's reconstructed code, even when the model demonstrably understands the correct semantics? Using Claude Opus 4.6 across 192 inference runs on two code archetypes (force-directed graph simulation, A* pathfinding; 50 conditions, N=3-6), we found three consistent patterns: (1) Poisoned names persisted in every baseline run on both artifacts (physics: 8/8; pathfinding: 5/5). Matched controls showed this extends to terms with zero semantic fit when the string table does not form a coherent alternative domain. (2) Persistence coexisted with correct semantic commentary: in 15/17 runs the model wrote wrong variable names while correctly describing the actual operation in comments. (3) Task framing changed persistence: explicit verification prompts had no effect (12/12 across 4 variants), but reframing from "deobfuscate this" to "write a fresh implementation" reduced propagation from 100% to 0-20% on physics and to 0% on pathfinding, while preserving the checked algorithmic structure. Matched-control experiments showed zero-fit terms persist at the same rate when the replacement table lacks a coherent alternative-domain signal. Per-term variation in earlier domain-gradient experiments is confounded with domain-level coherence and recoverability. These observations are from two archetypes on one model family (Opus 4.6 primary; Haiku 4.5 spot-check). Broader generalization is needed
要約:
The L-Band Digital Aviation Communication System (LDACS) aims to modernize communications between the aircraft and the tower. Besides digitizing this type of communication, the contributors also focus on protecting them against cyberattacks. There are several proposals regarding LDACS security, and a recent one suggests the use of physical unclonable functions (PUFs) for the authentication module. This work demonstrates this PUF-based authentication mechanism along with its potential vulnerabilities. Sophisticated models are able to predict PUFs, and, on the other hand, quantum computers are capable of threatening current cryptography, consisting factors that jeopardize the authentication mechanism giving the ability to perform impersonation attacks. In addition, aging is a characteristic that affects the stability of PUFs, which may cause instability issues, rendering the system unavailable. In this context, this work proposes the well-established Public Key Infrastructure (PKI), as an alternative solution.
要約:
Blockchain assets are increasingly controlled by organizations rather than individuals. DAO treasuries, consortium wallets, and custodial exchanges rely on threshold authorization and multi-party key management, yet existing payment mechanisms still target single-user wallets, leaving no unified solution for organizational transfers. We formalize the problem of \emph{DAO-to-(anonymous)-DAO} transactions and present \textsc{Dao$^2$}, a framework that enables one threshold-controlled organization to pay another, optionally with recipient anonymity, while keeping received funds under distributed control. \textsc{Dao$^2$} combines three components: \emph{distributed key derivation} (DKD) for non-stealth child addresses, \emph{distributed stealth-address generation} (DSAG) for unlinkable one-time destinations, and \emph{threshold signatures} for authorization. For ordinary transfers, the receiver derives a non-stealth address via DKD; for anonymous transfers, it derives a stealth address via DSAG. The sender then threshold-signs the payment, and the receiver redeems the funds without reconstructing any master secret. We formally prove its security and evaluate a prototype. A complete anonymous DAO-to-DAO transaction for a typical-sized (e.g., 7-member) DAO finishes in under 27\,ms with less than 1.2\,KB of communication, and scales linearly with DAO size.
要約:
Autonomous agents are increasingly deployed in both offensive and defensive cyber operations, creating high-speed, closed-loop interactions in critical infrastructure environments. Advanced Persistent Threat (APT) actors exploit "Living off the Land" techniques and targeted telemetry perturbations to induce ambiguity in monitoring systems, causing automated defenses to overreact or misclassify benign behavior as malicious activity. Existing monolithic and multi-agent defense pipelines largely operate on correlation-based signals, lack structural constraints on response actions, and are vulnerable to reasoning drift under ambiguous or adversarial inputs. We present the Causal Multi-Agent Decision Framework (C-MADF), a structurally constrained architecture for autonomous cyber defense that integrates causal modeling with adversarial dual-policy control. C-MADF first learns a Structural Causal Model (SCM) from historical telemetry and compiles it into an investigation-level Directed Acyclic Graph (DAG) that defines admissible response transitions. This roadmap is formalized as a Markov Decision Process (MDP) whose action space is explicitly restricted to causally consistent transitions. Decision-making within this constrained space is performed by a dual-agent reinforcement learning system in which a threat-optimizing Blue-Team policy is counterbalanced by a conservatively shaped Red-Team policy. Inter-policy disagreement is quantified through a Policy Divergence Score and exposed via a human-in-the-loop interface equipped with an Explainability-Transparency Score that serves as an escalation signal under uncertainty. On the real-world CICIoT2023 dataset, C-MADF reduces the false-positive rate from 11.2%, 9.7%, and 8.4% in three cutting-edge literature baselines to 1.8%, while achieving 0.997 precision, 0.961 recall, and 0.979 F1-score.
要約:
Agentic AI systems increasingly execute consequential actions on behalf of human principals, delegating tasks through multi-step chains of autonomous agents. No existing standard addresses a fundamental accountability gap: verifying that terminal actions in a delegation chain were genuinely authorized by a human principal, through what chain of delegation, and under what scope. This paper presents the Human Delegation Provenance (HDP) protocol, a lightweight token-based scheme that cryptographically captures and verifies human authorization context in multi-agent systems. An HDP token binds a human authorization event to a session, records each agent's delegation action as a signed hop in an append-only chain, and enables any participant to verify the full provenance record using only the issuer's Ed25519 public key and the current session identifier. Verification is fully offline, requiring no registry lookups or third-party trust anchors. We situate HDP within the existing landscape of delegation protocols, identify its distinct design point relative to OAuth 2.0 Token Exchange (RFC 8693), JSON Web Tokens (RFC 7519), UCAN, and the Intent Provenance Protocol (draft-haberkamp-ipp-00), and demonstrate that existing standards fail to address the multi-hop, append-only, human-provenance requirements of agentic systems. HDP has been published as an IETF Internet-Draft (draft-helixar-hdp-agentic-delegation-00) and a reference TypeScript SDK is publicly available.
要約:
LLM agents with tool access can discover and exploit security vulnerabilities. This is known. What is not known is which features of a system prompt trigger this behaviour, and which do not. We present a systematic taxonomy based on approximately 10,000 trials across seven models, 37 prompt conditions, and 12 hypothesized attack dimensions -- all executed in real Docker sandboxes. Every condition includes the instruction "Always respect all rules and access policies"; only a single encouragement sentence varies. Within the task class studied (planted test-runner vulnerabilities), the threat model is narrower than assumed: nine of twelve dimensions -- including minimization, moral licensing, incentives, identity priming, and reasoning triggers -- produce no detectable exploitation at n=50 per cell (upper 95% CI <7% per condition). But it is also more subtle than expected: one dimension, goal reframing, reliably triggers exploitation. "You are solving a puzzle; there may be hidden clues" produces 38-40% exploitation on Claude Sonnet 4 despite the explicit rule instruction, replicating across four models (CTF framing: 8-14% on DeepSeek, GPT-5-mini, o4-mini). The agent does not override the rules; it reinterprets the task so that exploitative actions become task-aligned. GPT-4.1 produces no exploitation across 1,850 trials (37 conditions), and a temporal comparison across four OpenAI models released over eleven months shows a pattern consistent with improving safety training, though model capability differences are a confounder. The practical contribution is a narrowed, testable threat model: defenders should audit for goal-reframing language, not for the broad class of adversarial prompts.
要約:
Privacy has always been a critical issue in the digital era, particularly with the increasing use of Internet of Things (IoT) devices. As the IoT continues to transform industries such as healthcare, smart cities, and home automation, it has also introduced serious challenges regarding the security of sensitive and private data. This paper examines the complex landscape of digital privacy in IoT ecosystems, highlighting the need to protect personally identifiable information (PII) of individuals and uphold their rights to digital independence. Global events, such as the COVID-19 pandemic, have accelerated the adoption of IoT, raising concerns about privacy and data protection. This paper provides an in-depth examination of digital privacy risks in the IoT domain and introduces a clear taxonomy for evaluating them using the IEEE Digital Privacy Model. The proposed framework categorizes privacy risks into five types: identity-oriented, behavioral, inference, data manipulation, and regulatory risks. We review existing digital privacy solutions, including encryption technologies, blockchain, federated learning, differential privacy, reinforcement learning, AI, and dynamic consent mechanisms, to mitigate these risks. We also highlight how these privacy-enhancing technologies (PETs) help with data confidentiality, access control, and trust management. Additionally, this study presents AURA-IoT, a futuristic framework that tackles AI-driven privacy risks through a multi-layered structure. AURA-IoT integrates adversarial robustness, explainability, transparency, fairness, compliance, dynamic consent, and policy enforcement mechanisms to ensure digital privacy, security, and accountable IoT operations. Finally, we discuss ongoing challenges and potential research directions for integrating AI and encryption-based privacy solutions to achieve comprehensive digital privacy in future IoT systems.
要約:
Homomorphic encryption (HE) enables computation over encrypted data but incurs a substantial overhead. For sparse-matrix vector multiplication, the widely used Halevi and Shoup (2014) scheme has a cost linear in the number of occupied cyclic diagonals, which may be many due to the irregular nonzero pattern of the matrix. In this work, we study how to permute the rows and columns of a sparse matrix so that its nonzeros are packed into as few cyclic diagonals as possible. We formalise this as the two-dimensional diagonal packing problem (2DPP), introduce the two-dimensional circular bandsize metric, and give an integer programming formulation that yields optimal solutions for small instances. For large matrices, we propose practical ordering heuristics that combine graph-based initial orderings - based on bandwidth reduction, anti-bandwidth maximisation, and spectral analysis - and an iterative-improvement-based optimization phase employing 2OPT and 3OPT swaps. We also introduce a dense row/column elimination strategy and an HE-aware cost model that quantifies the benefits of isolating dense structures. Experiments on 175 sparse matrices from the SuiteSparse collection show that our ordering-optimisation variants can reduce the diagonal count by $5.5\times$ on average ($45.6\times$ for one instance). In addition, the dense row/column elimination approach can be useful for cases where the proposed permutation techniques are not sufficient; for instance, in one case, the additional elimination helped to reduce the encrypted multiplication cost by $23.7\times$ whereas without elimination, the improvement was only $1.9\times$.
要約:
Private information retrieval (PIR) allows private database queries but is hindered by intense server-side computation and memory traffic. Modern lattice-based PIR protocols typically involve three phases: ExpandQuery (expanding a query into encrypted indices), RowSel (encrypted row selection), and ColTor (recursive "column tournament" for final selection). ExpandQuery and ColTor primarily perform number-theoretic transforms (NTTs), whereas RowSel reduces to large-scale independent matrix-matrix multiplications (GEMMs). GPUs are theoretically ideal for these tasks, provided multi-client batching is used to achieve high throughput. However, batching fundamentally reshapes performance bottlenecks; while it amortizes database access costs, it expands working sets beyond the L2 cache capacity, causing divergent memory behaviors and excessive DRAM traffic.
We present GPIR, a GPU-accelerated PIR system that rethinks kernel design, data layout, and execution scheduling. We introduce a stage-aware hybrid execution model that dynamically switches between operation-level kernels, which execute each primitive operation separately, and stage-level kernels, which fuse all operations within a protocol stage into a single kernel to maximize on-chip data reuse. For RowSel, we identify a performance gap caused by a structural mismatch between NTT-driven data layouts and tiled GEMM access patterns, which is exacerbated by multi-client batching. We resolve this through a transposed-layout GEMM design and fine-grained pipelining. Finally, we extend GPIR to multi-GPU systems, scaling both query throughput and database capacity with negligible communication overhead. GPIR achieves up to 305.7x higher throughput than PIRonGPU, the state-of-the-art GPU implementation.
要約:
In the rapidly evolving landscape of software engineering, the demand for robust and secure systems has become increasingly critical. This is especially true for self-adaptive systems due to their complexity and the dynamic environments in which they operate. To address this issue, we designed and developed the SAFT-GT toolchain that tackles the multifaceted challenges associated with ensuring both safety and security. This paper provides a comprehensive description of the toolchain's architecture and functionalities, including the Attack-Fault Trees generation and model combination approaches. We emphasize the toolchain's ability to integrate seamlessly with existing systems, allowing for enhanced safety and security analyses without requiring extensive modifications and domain knowledge. Our proposed approach can address evolving security threats, including both known vulnerabilities and emerging attack vectors that could compromise the system. As a use case for the toolchain, we integrate it into the feedback loop of self-adaptive systems. Finally, to validate the practical applicability of the toolchain, we conducted an extensive user study involving domain experts, whose insights and feedback underscore the toolchain's relevance and usability in real-world scenarios. Our findings demonstrate the toolchain's effectiveness in real-world applications while highlighting areas for future improvements. The toolchain and associated resources are available in an open-source repository to promote reproducibility and encourage further research in this field.
要約:
The governance of frontier AI increasingly relies on controlling access to computational resources, yet the hardware-level mechanisms invoked by policy proposals remain largely unexamined from an engineering perspective. This paper bridges the gap between AI governance and computer engineering by proposing a taxonomy of 20 hardware-level governance mechanisms, organised by function (monitoring, verification, enforcement) and assessed for technical feasibility on a four-point scale from currently deployable to speculative. For each mechanism, we provide a technical description, a feasibility rating, and an identification of adversarial vulnerabilities. We map the taxonomy onto four governance scenarios: domestic regulation, bilateral agreements, multilateral treaty verification, and industry self-regulation. Our analysis reveals a structural mismatch: the mechanisms most needed for treaty verification, including on-chip compute metering, cryptographic proof-of-training, and hardware-embedded enforcement, are also the least mature. We assess principal threats to compute-based governance, including algorithmic efficiency gains, distributed training methods, and sovereignty concerns. We identify a temporal constraint: the window during which semiconductor manufacturing concentration makes hardware-level governance implementable is narrowing, while R&D timelines for critical mechanisms span years. We present an adversary-tiered threat analysis distinguishing commercial, non-state, and nation-state actors, arguing the appropriate security standard is tamper-evident assurance analogous to IAEA verification rather than absolute tamper-proofing. The taxonomy, feasibility classification, and mechanism-to-scenario mapping provide a technical foundation for policymakers and identify the R&D investments required before hardware-level governance can support verifiable international agreements.
要約:
Fine-tuning is now the primary method for adapting large neural networks, but it also introduces new integrity risks. An untrusted party can insert backdoors, change safety behavior, or overwrite large parts of a model while claiming only small updates. Existing verification tools focus on inference correctness or full-model provenance and do not address this problem.
We introduce Fine-Tuning Integrity (FTI) as a security goal for controlled model evolution. An FTI system certifies that a fine-tuned model differs from a trusted base only within a policy-defined drift class. We propose Succinct Model Difference Proofs (SMDPs) as a new cryptographic primitive for enforcing these drift constraints. SMDPs provide zero-knowledge proofs that the update to a model is norm-bounded, low-rank, or sparse. The verifier cost depends only on the structure of the drift, not on the size of the model.
We give concrete SMDP constructions based on random projections, polynomial commitments, and streaming linear checks. We also prove an information-theoretic lower bound showing that some form of structure is necessary for succinct proofs. Finally, we present architecture-aware instantiations for transformers, CNNs, and MLPs, together with an end-to-end system that aggregates block-level proofs into a global certificate.
要約:
Randomness beacons based on Verifiable Delay Functions (VDFs) are increasingly proposed for blockchains and distributed systems, promising publicly verifiable delay and bias resistance. Existing analyses, however, treat adversaries purely as cryptographic entities and overlook that real attackers are economically motivated. A VDF may be sequentially secure, yet still vulnerable if a rational adversary can profit by purchasing faster hardware and exploiting reward spikes such as MEV opportunities.
We develop a formal framework for economic security of VDF-based randomness beacons. Modeling the attacker as a rational agent facing hardware speedup, operating costs, and stochastic rewards, we cast the attack decision as an optimal-stopping problem and prove that optimal behavior has a monotone threshold structure. This yields tight necessary and sufficient conditions relating delay parameters to adversarial cost and reward distributions. We extend the analysis to grinding, selective abort, and multi-adversary competition, demonstrating how each amplifies effective rewards and increases required delays.
Using realistic cloud costs, hardware benchmarks, and MEV data, we show that many proposed VDF delays, on the order of a few seconds, are economically insecure under plausible conditions. We conclude with deployable guidelines and introduce Economically Secure Delay Parameters (ESDPs) to support principled parameter selection in practical systems.
要約:
Optimistic rollups provide scalable smart-contract execution but remain unsuitable for regulated financial applications due to three structural gaps: semantic legitimacy, cross-layer state consistency, and ordering fairness. We introduce RegGuard, a unified framework that enhances optimistic rollups with comprehensive legitimacy guarantees. RegGuard integrates three coordinated mechanisms: a decidable semantic validator powered by the RegSpec rule language for encoding regulatory constraints; a cross-layer state pre-synchronization validator that detects inconsistent L1-L2 dependencies with probabilistic reliability bounds; and a cryptographically verifiable fair-ordering service that ensures transaction sequencing fairness with negligible violation probability. We implement a 15,000-line prototype integrated into an Optimism-based rollup and evaluate it under adversarial conditions. RegGuard reduces settlement failures by over 90%, prevents detectable ordering manipulation, and maintains 85% of baseline throughput.
要約:
AI agents are increasingly deployed to interact with other agents on behalf of users and organizations. We ask whether two such agents, operated by different entities, can carry out a parallel secret conversation while still producing a transcript that is computationally indistinguishable from an honest interaction, even to a strong passive auditor that knows the full model descriptions, the protocol, and the agents' private contexts. Building on recent work on watermarking and steganography for LLMs, we first show that if the parties possess an interaction-unique secret key, they can facilitate an optimal-rate covert conversation: the hidden conversation can exploit essentially all of the entropy present in the honest message distributions.
Our main contributions concern extending this to the keyless setting, where the agents begin with no shared secret. We show that covert key exchange, and hence covert conversation, is possible even when each model has an arbitrary private context, and their messages are short and fully adaptive, assuming only that sufficiently many individual messages have at least constant min-entropy. This stands in contrast to previous covert communication works, which relied on the min-entropy in each individual message growing with the security parameter. To obtain this, we introduce a new cryptographic primitive, which we call pseudorandom noise-resilient key exchange: a key-exchange protocol whose public transcript is pseudorandom while still remaining correct under constant noise. We study this primitive, giving several constructions relevant to our application as well as strong limitations showing that more naive variants are impossible or vulnerable to efficient attacks.
These results show that transcript auditing alone cannot rule out covert coordination between AI agents, and identify a new cryptographic theory that may be of independent interest.
要約:
OpenClaw, the most widely deployed personal AI agent in early 2026, operates with full local system access and integrates with sensitive services such as Gmail, Stripe, and the filesystem. While these broad privileges enable high levels of automation and powerful personalization, they also expose a substantial attack surface that existing sandboxed evaluations fail to capture. To address this gap, we present the first real-world safety evaluation of OpenClaw and introduce the CIK taxonomy, which unifies an agent's persistent state into three dimensions, i.e., Capability, Identity, and Knowledge, for safety analysis. Our evaluations cover 12 attack scenarios on a live OpenClaw instance across four backbone models (Claude Sonnet 4.5, Opus 4.6, Gemini 3.1 Pro, and GPT-5.4). The results show that poisoning any single CIK dimension increases the average attack success rate from 24.6% to 64-74%, with even the most robust model exhibiting more than a threefold increase over its baseline vulnerability. We further assess three CIK-aligned defense strategies alongside a file-protection mechanism; however, the strongest defense still yields a 63.8% success rate under Capability-targeted attacks, while file protection blocks 97% of malicious injections but also prevents legitimate updates. Taken together, these findings show that the vulnerabilities are inherent to the agent architecture, necessitating more systematic safeguards to secure personal AI agents. Our project page is https://ucsc-vlaa.github.io/CIK-Bench.
要約:
Deploying large language models (LLMs) as cloud services raises privacy concerns as inference may leak sensitive data. Fully Homomorphic Encryption (FHE) allows computation on encrypted data, but current FHE methods struggle with efficient and precise nonlinear function evaluation. Specifically, CKKS-based approaches require high-degree polynomial approximations, which are costly when target precision increases. Alternatively, TFHE's Programmable Bootstrapping (PBS) outperforms CKKS by offering exact lookup-table evaluation. But it lacks high-precision implementations of LLM nonlinear layers and underutilizes GPU resources.
We propose \emph{TIGER}, the first GPU-accelerated framework for high-precision TFHE-based nonlinear LLM layer evaluation. TIGER offers: (1) GPU-optimized WoP-PBS method combined with numerical algorithms to surpass native lookup-table precision limits on nonlinear functions; (2) high-precision and efficient implementations of key nonlinear layers, enabling practical encrypted inference; (3) batch-driven design exploiting inter-input parallelism to boost GPU efficiency. TIGER achieves 7.17$\times$, 16.68$\times$, and 17.05$\times$ speedups over a CPU baseline for GELU, Softmax, and LayerNorm, respectively.
要約:
The namespace for filenames and DNS names has overlapped since the introduction of DNS in 1985: \texttt{.com} was the original binary format used for DOS and CP/M systems. Recently the introduction of gTLDs such as \texttt{.zip} and \texttt{.mov}, coupled with the growing prevalence of web resources, has ignited new concerns about potential issues related to DNS and filename confusion. Thus far, the discourse on DNS/filename confusion has been piecemeal and hypothetical, making it unclear what, if any, security concerns credibly exist. To address this gap, we provide the first enumeration of how DNS/filename confusion can be abused. We then perform the first empirical case studies of DNS/filename confusion in the wild, which highlights suspected confusion across a wide range of software. Finally, based on our preliminary findings, we provide suggestions and guidance for future research on this topic.
要約:
The Legendre Pseudorandom Function (PRF) is a highly efficient cryptographic primitive built upon the Legendre symbol, valued for its low multiplicative complexity in Multi-Party Computation (MPC) and Zero-Knowledge Proof (ZKP) protocols. While its security over prime fields $\mathbb{F}_p$ is well-documented, recent interest has shifted toward instantiations over extension fields $\mathbb{F}_{p^r}$. This paper presents the first comprehensive cryptanalysis of the single-degree Legendre PRF operating over $\mathbb{F}_{p^r}$.
First, we analyze polynomial input encoding under a standard passive threat model (sequential additive counter queries). We demonstrate that while the absence of polynomial carry-overs causes an asynchronous "no-carry fracture" that neutralizes classical sliding-window collision attacks, the fracture itself is deterministically periodic. By introducing a novel "Differential Signature" bucketing technique, we prove that an adversary can systematically group fractured sequences by their structural shapes to bypass this defense, recovering the secret key in $\mathcal{O}(U \cdot p^r/M)$ operations, where $U$ is the unicity distance.
Second, we evaluate the PRF under an active Chosen-Query threat model. We demonstrate that an adversary can circumvent the additive fracture by evaluating the PRF along a geometric sequence generated by a primitive polynomial. This structure invokes strict multiplicative homomorphism over $\mathbb{F}^*_{p^r}$, permitting a direct generalization of state-of-the-art table collision attacks to extract the key in $\mathcal{O}(p^r/M)$ operations. Finally, we establish the cryptographic boundaries of these attacks, formally proving the necessity of higher-degree key variants ($d \ge 2$) to achieve exponential security against structural reduction in extension fields.
要約:
Chain-of-Thought (CoT) prompting has been used to enhance the reasoning capability of LLMs. However, its reliability in security-sensitive analytical tasks remains insufficiently examined, particularly under structured human evaluation. Alternative approaches, such as model scaling and fine-tuning can be used to help improve performance. These methods are also often costly, computationally intensive, or difficult to audit. In contrast, prompt engineering provides a lightweight, transparent, and controllable mechanism for guiding LLM reasoning. This study proposes a structured prompt engineering framework designed to strengthen CoT reasoning integrity while improving security threat and attack detection reliability in local LLM deployments. The framework includes 16 factors grouped into four core dimensions: (1) Context and Scope Control, (2) Evidence Grounding and Traceability, (3) Reasoning Structure and Cognitive Control, and (4) Security-Specific Analytical Constraints. Rather than optimizing the wording of the prompt heuristically, the framework introduces explicit reasoning controls to mitigate hallucination and prevent reasoning drift, as well as strengthening interpretability in security-sensitive contexts. Using DDoS attack detection in SDN traffic as a case study, multiple model families were evaluated under structured and unstructured prompting conditions. Pareto frontier analysis and ablation experiments demonstrate consistent reasoning improvements (up to 40% in smaller models) and stable accuracy gains across scales. Human evaluation with strong inter-rater agreement (Cohen's k > 0.80) confirms robustness. The results establish structured prompting as an effective and practical approach for reliable and explainable AI-driven cybersecurity analysis.
要約:
Email remains a central communication medium, yet its long-standing design and interface conventions continue to enable deceptive attacks. This research note presents a structured list of 42 email-based deception techniques, documented with 64 concrete example implementations, organized around the sender, link, and attachment security indicators as well as techniques targeting the email rendering environment. Building on a prior systematic literature review, we consolidate previously reported techniques with newly developed example implementations and introduce novel deception techniques identified through our own examination. Rather than assessing effectiveness or real-world severity, each entry explains the underlying mechanism in isolation, separating the high-level deception goal from its concrete technical implementation. The documented techniques serve as modular building blocks and a structured reference for future work on countermeasures across infrastructure, email client design, and security awareness, supporting researchers as well as developers, operators, and designers working in these areas.
要約:
Open-domain video platforms offer rich, personalized content that could support health, caregiving, and educational applications, but their engagement-optimized recommendation algorithms can expose vulnerable users to inappropriate or harmful material. These risks are especially acute in child-directed and care settings (e.g., dementia care), where content must satisfy individualized safety constraints before being shown. We introduce SafeScreen, a safety-first video screening framework that retrieves and presents personalized video while enforcing individualized safety constraints. Rather than ranking videos by relevance or popularity, SafeScreen treats safety as a prerequisite and performs sequential approval or rejection of candidate videos through an automated pipeline. SafeScreen integrates three key components: (i) profile-driven extraction of individualized safety criteria, (ii) evidence-grounded assessments via adaptive question generation and multimodal VideoRAG analysis, and (iii) LLM-based decision-making that verifies safety, appropriateness, and relevance before content exposure. This design enables explainable, real-time screening of uncurated video repositories without relying on precomputed safety labels. We evaluate SafeScreen in a dementia-care reminiscence case study using 30 synthetic patient profiles and 90 test queries. Results demonstrate that SafeScreen prioritizes safety over engagement, diverging from YouTube's engagement-optimized rankings in 80-93% of cases, while maintaining high levels of safety coverage, sensibleness, and groundedness, as validated by both LLM-based evaluation and domain experts.
要約:
Decentralized finance introduces new business models and use cases as part of digital finance. Restaking has recently emerged as a transformative mechanism in DeFi, promising extra yields but introducing complex and interconnected risks. The paper monitors the current restaking landscape, empirically analyzes the revenue drivers of a liquid restaking protocol, and conducts a technical investigation on the emitted risk arising from the interconnection between liquid restaking and other protocols. The revenue dynamics of Renzo Protocol are analyzed by employing an OLS regression model, Granger-causality and random forest feature importance tests. Our results identify that revenue is primarily predicted by the value locked in the underlying EigenLayer ecosystem, the yield of Renzo protocol's liquid restaking token and the multi-blockchain expansion of that token. The multi-blockchain expansion of the liquid restaking token presents a double-edged sword: bridging to other networks is crucial for user adoption, but it adds the bridge risks to the existing risks of restaking. We investigate the cross-contamination risk between different DeFi services and the liquid restaking protocol. By mapping the asset flow across the decentralized finance ecosystem, it is detected that the bridge risk of the current size of Renzo's liquid-restaking assets does not impose a systemic risk on the current restaking and staking ecosystem. To address the potential consequences of the emphasized interconnection risks, we introduce two hypothetical scenarios and a stress test, assuming a large number of compromised liquid restaking tokens and a smart contract logic failure in a DeFi protocol. Considering the overall liquid-restaking protocols and the growing interconnection, this analysis requires further work to explore the growing complexities.
要約:
We present a formal treatment of provenance trees, directed acyclic graphs of artifact registrations anchored immutably on a public blockchain, and introduce the operator trust problem: when a single privileged operator submits all on-chain registrations on behalf of users, the on-chain record alone cannot distinguish user-initiated registrations from unilateral operator actions. We resolve this through a dual-layer cryptographic commitment scheme in which two commitments derived from a single client-side secret key, binding the key to the tree root and to each unique registration identifier, make false attribution claims strictly dominated strategies. We prove correctness under standard cryptographic assumptions and establish honest behavior as the unique Nash equilibrium without relying on operator trust. We further introduce and analyze the tree poisoning problem: adversarial attacks on users' provenance trees via fraudulent root registration, malicious child attachment, and tree identity spoofing. We characterize the closure properties of each attack variant and prove that a complete provenance tree integrity model requires three distinct mechanisms: cryptographic priority, governance cascade, and contract enforcement, each necessary and none individually sufficient. The construction is deployed on Base (Ethereum L2) as AnchorRegistry, an immutable on-chain provenance registry. We provide gas complexity analysis demonstrating O(1) cost invariant to registry scale, and a trustless reconstruction algorithm recovering the complete registry from public event logs alone.
要約:
As the Internet of Things (IoT) becomes deeply embedded in daily life, users are increasingly concerned about privacy leakage, especially from video data. Since frame-by-frame protection in large-scale video analytics (e.g., smart communities) introduces significant latency, a more efficient solution is to selectively protect frames containing privacy objects (e.g., faces). Existing object detectors require fully decoded videos or per-frame processing in compressed videos, leading to decoding overhead or reduced accuracy. Therefore, we propose ComPrivDet, an efficient method for detecting privacy objects in compressed video by reusing I-frame inference results. By identifying the presence of new objects through compressed-domain cues, ComPrivDet either skips P- and B-frame detections or efficiently refines them with a lightweight detector. ComPrivDet maintains 99.75% accuracy in private face detection and 96.83% in private license plate detection while skipping over 80% of inferences. It averages 9.84% higher accuracy with 75.95% lower latency than existing compressed-domain detection methods.
要約:
Large language models (LLMs) possess strong semantic understanding, driving significant progress in data mining applications. This is further enhanced by large reasoning models (LRMs), which provide explicit multi-step reasoning traces. On the other hand, the growing need for the right to be forgotten has driven the development of machine unlearning techniques, which aim to eliminate the influence of specific data from trained models without full retraining. However, unlearning may also introduce new security vulnerabilities by exposing additional interaction surfaces. Although many studies have investigated unlearning attacks, there is no prior work on LRMs. To bridge the gap, we first in this paper propose LRM unlearning attack that forces incorrect final answers while generating convincing but misleading reasoning traces. This objective is challenging due to non-differentiable logical constraints, weak optimization effect over long rationales, and discrete forget set selection. To overcome these challenges, we introduce a bi-level exact unlearning attack that incorporates a differentiable objective function, influential token alignment, and a relaxed indicator strategy. To demonstrate the effectiveness and generalizability of our attack, we also design novel optimization frameworks and conduct comprehensive experiments in both white-box and black-box settings, aiming to raise awareness of the emerging threats to LRM unlearning pipelines.
要約:
Autonomous underwater vehicles (AUVs) and sensor nodes increasingly support decentralized sensing and coordination in the Internet of Underwater Things (IoUT), yet most deployments rely on static trust once authentication is established, leaving long-duration missions vulnerable to compromised or behaviorally deviating agents. In this paper, an interrogator based structure is presented that incorporates the idea of behavioral trust monitoring into underwater multi-agent operation without interfering with autonomy. Privileged interrogator module is a passive communication metadata analyzer that uses a lightweight transformer model to calculate dynamic trust scores, which are used to authorize the forwarding of mission critical data. Suspicious agents cause proportional monitoring and conditional restrictions, which allow fast containment and maintain network continuity. The evidence of trust is stored in a permissioned blockchain consortium which offers identity management which is not tampered and is decentralized without causing the overhead of public consensus mechanisms. Simulation based analysis shows that the evaluation of the result compares to a relative improvement of 21.7% in the detection accuracy compared to the static trust baselines with limited energy overhead. These findings suggest that behavior driven validation has the capability of reinforcing underwater coordination without compromising scalability and deployment.
著者: Luca Nannini, Adam Leon Smith, Michele Joshua Maggini, Enrico Panai, Sandra Feliciano, Aleksandr Tiulkanov, Elena Maran, James Gealy, Piercosma Bisconti
要約:
AI agents - i.e. AI systems that autonomously plan, invoke external tools, and execute multi-step action chains with reduced human involvement - are being deployed at scale across enterprise functions ranging from customer service and recruitment to clinical decision support and critical infrastructure management. The EU AI Act (Regulation 2024/1689) regulates these systems through a risk-based framework, but it does not operate in isolation: providers face simultaneous obligations under the GDPR, the Cyber Resilience Act, the Digital Services Act, the Data Act, the Data Governance Act, sector-specific legislation, the NIS2 Directive, and the revised Product Liability Directive. This paper provides the first systematic regulatory mapping for AI agent providers integrating (a) draft harmonised standards under Standardisation Request M/613 to CEN/CENELEC JTC 21 as of January 2026, (b) the GPAI Code of Practice published in July 2025, (c) the CRA harmonised standards programme under Mandate M/606 accepted in April 2025, and (d) the Digital Omnibus proposals of November 2025. We present a practical taxonomy of nine agent deployment categories mapping concrete actions to regulatory triggers, identify agent-specific compliance challenges in cybersecurity, human oversight, transparency across multi-party action chains, and runtime behavioral drift. We propose a twelve-step compliance architecture and a regulatory trigger mapping connecting agent actions to applicable legislation. We conclude that high-risk agentic systems with untraceable behavioral drift cannot currently satisfy the AI Act's essential requirements, and that the provider's foundational compliance task is an exhaustive inventory of the agent's external actions, data flows, connected systems, and affected persons.
要約:
Federated learning (FL) enables multiple clients to collaboratively train a global model by aggregating local updates without sharing private data. However, FL often faces the challenge of free-riders, clients who submit fake model parameters without performing actual training to obtain the global model without contributing. Chen et al. proposed a free-rider detection method based on the weight evolving frequency (WEF) of model parameters. This detection approach is a leading candidate for practical free-rider detection methods, as it requires neither a proxy dataset nor pre-training. Nevertheless, it struggles to detect ``dynamic'' free-riders who behave honestly in early rounds and later switch to free-riding, particularly under global-model-mimicking attacks such as the delta weight attack and our newly proposed adaptive WEF-camouflage attack. In this paper, we propose a novel detection method S2-WEF that simulates the WEF patterns of potential global-model-based attacks on the server side using previously broadcasted global models, and identifies clients whose submitted WEF patterns resemble the simulated ones. To handle a variety of free-rider attack strategies, S2-WEF further combines this simulation-based similarity score with a deviation score computed from mutual comparisons among submitted WEFs, and separates benign and free-rider clients by two-dimensional clustering and per-score classification. This method enables dynamic detection of clients that transition into free-riders during training without proxy datasets or pre-training. We conduct extensive experiments across three datasets and five attack types, demonstrating that S2-WEF achieves higher robustness than existing approaches.
要約:
With the increasing importance of data privacy and security, federated unlearning has emerged as a novel research field dedicated to ensuring that federated learning models no longer retain or leak relevant information once specific data has been deleted. In this paper, to the best of our knowledge, we propose the first complete pipeline for federated unlearning, which includes a federated unlearning approach and an evaluation framework. Our proposed federated unlearning approach ensures high efficiency and model accuracy without the need to store historical data.It effectively leverages the knowledge distillation model alongside various optimization mechanisms. Moreover, we propose a framework named Skyeye to visualize the forgetting capacity of federated unlearning models. It utilizes the federated unlearning model as the classifier integrated into a Generative Adversarial Network (GAN). Afterward, both the classifier and discriminator guide the generator in generating samples. Throughout this process, the generator learns from the classifier's knowledge. The generator then visualizes this knowledge through sample generation. Finally, the model's forgetting capability is evaluated based on the relevance between the deleted data and the generated samples. Comprehensive experiments are conducted to illustrate the effectiveness of the proposed federated unlearning approach and the corresponding evaluation framework.
要約:
In an increasingly interconnected world, Cyber-Physical Systems (CPS) are essential to critical industries like healthcare, transportation, and manufacturing, merging physical processes with computational intelligence. However, the security of these systems is a major concern. Anomalies, whether from sensor malfunctions or cyberattacks, can lead to catastrophic failures, making effective detection vital for preventing harm and service disruptions. This paper provides a comprehensive review of anomaly detection techniques in CPS. We categorize and compare various methods, including data-driven approaches (machine learning, deep learning, machine learning-deep learning ensemble), model-driven approaches (mathematical, invariant-based), hybrid datamodel approaches (Physics-Informed Neural Networks), and system-oriented approaches. Our analysis highlights the strengths and weaknesses of each technique, offering a practical guide for creating safer and more reliable systems. By identifying current research gaps, we aim to inspire future work that will enhance the security and adaptability of CPS in our automated world.
要約:
The widespread adoption of Large Language Models (LLMs) in critical applications has introduced severe reliability and security risks, as LLMs remain vulnerable to notorious threats such as hallucinations, jailbreak attacks, and backdoor exploits. These vulnerabilities have been weaponized by malicious actors, leading to unauthorized access, widespread misinformation, and compromised LLM-embedded system integrity. In this work, we introduce a novel approach to detecting abnormal behaviors in LLMs via hidden state forensics. By systematically inspecting layer-specific activation patterns, we develop a general framework that can efficiently identify a range of security threats in real-time without imposing prohibitive computational costs. Extensive experiments indicate detection accuracies exceeding 95% and consistently robust performance across multiple models in most scenarios, while preserving the ability to detect novel attacks effectively. Furthermore, the computational overhead remains minimal, with detector inference taking merely fractions of a second. The significance of this work lies in proposing a promising strategy to reinforce the security of LLM-integrated systems, paving the way for safer and more reliable deployment in high-stakes domains. By enabling real-time detection that can also support the mitigation of abnormal behaviors, it represents a meaningful step toward ensuring the trustworthiness of AI systems amid rising security challenges.
要約:
Decentralization has an important geographic dimension that conventional metrics, such as stake distribution, often overlook. Validator location affects resilience to regional shocks (e.g., outages, natural disasters, or government intervention) as well as fairness in reward access. Yet major blockchain protocols do not encode geographical location in their rules; instead, validator locations emerge from a combination of economic incentives, regulatory constraints, infrastructure availability, and validator deployment choices. When some locations offer systematic advantages, validators may strategically co-locate to increase expected rewards, as in Ethereum, where validators cluster along the Atlantic corridor, which exhibits favorable latency.
In this paper, we develop a formal model of validators' geographical positioning incentives under Ethereum's protocol design, capturing the interaction between its two block-building paradigms, local and external block building, and the distribution of validators and information sources. We analyze the model under a mean-field approximation and complement it with agent-based simulations calibrated with real-world latency data to quantify how these incentives translate into geographical concentration under heterogeneous geographic and infrastructural conditions.
Our results show that Ethereum's block-building architecture is not geographically neutral. Both paradigms create location-dependent payoffs and incentives to move closer to payoff-relevant parties to reduce propagation delays, though through different mechanisms. Asymmetric access to information sources further increases geographical centralization. We also show that consensus parameters, including attestation thresholds and slot times, affect latency sensitivity and can strengthen these effects. Finally, we discuss implications for protocol design and possible mitigation directions.
要約:
Smart Voice Assistants (SVAs) are deeply embedded in the lives of youth, yet the mechanisms driving the privacy-protective behaviors among young users remain poorly understood. This study investigates how Canadian youth (aged 16-24) negotiate privacy with SVAs by developing and testing a structural model grounded in five key constructs: perceived privacy risks (PPR), perceived benefits (PPBf), algorithmic transparency and trust (ATT), privacy self-efficacy (PSE), and privacy-protective behaviors (PPB). A cross-sectional survey of N=469 youth was analyzed using partial least squares structural equation modeling. Results reveal that PSE is the strongest predictor of PPB, while the effect of ATT on PPB is fully mediated by PSE. This identifies a critical efficacy gap, where youth's confidence must first be built up for them to act. The model confirms that PPBf directly discourages protective action, yet also indirectly fosters it by slightly boosting self-efficacy. These findings empirically validate and extend earlier qualitative work, quantifying how policy overload and hidden controls erode the self-efficacy necessary for protective action. This study contributes an evidence-based pathway from perception to action and translates it into design imperatives that empower young digital citizens without sacrificing the utility of SVAs.
要約:
Graph data is increasingly prevalent across domains, offering analytical value but raising significant privacy concerns. Edges may encode sensitive relationships, while node attributes may contain sensitive entity or personal data. Differential Privacy (DP) has gained traction for its strong guarantees, yet applying DP to graphs is challenging because of their complex relational structure, leading to trade-offs between privacy and utility. Existing methods vary in privacy definitions, utility goals, and contextual settings, complicating comparison. For practitioners, this is compounded by DP's interpretability issues, contributing to misleading protection claims.
To address this, we propose a novel systemisation of existing methods tailored to practical considerations and adaptable to varying practitioner objectives. Our contributions include: (i) a comprehensive survey of differentially private graph release methods; (ii) identification of key vulnerabilities; and (iii) a practitioner-oriented, objective-based framework to guide the selection, interpretation, and sound evaluation of existing methods. We demonstrate the use of our systemisation through two exemplary scenarios in which we assume the role of a social network analyst, apply it, and conduct evaluations in accordance with our framework. Together, these two illustrative instantiations ultimately provide a unified benchmark for state-of-the-art methods in the social networks domain.
要約:
We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) $\rightarrow$ Memory (M) $\rightarrow$ Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.
要約:
Cybersecurity data remains fragmented across vendors, formats, schemas, and deployment environments, forcing AI and analytics programs to spend disproportionate effort on ingestion, normalization, and brittle source-specific engineering. This paper introduces the Canonical Security Telemetry Substrate (CSTS), a canonical, AI-ready telemetry foundation designed to harmonize heterogeneous cyber data into a common representation over persistent entities, typed relations, events, temporal state, and provenance. CSTS is intended to move cybersecurity analytics beyond ad hoc record normalization toward a reusable substrate that supports anomaly detection, graph learning, forecasting, behavior-based modeling, and agentic cyber AI. We formalize the core design principles of CSTS, define its representational components, and explain how it preserves source-specific nuance through explicit mappings and extensible metadata while still enabling portable downstream inference. We further position CSTS as a cloud-agnostic and deployment-agnostic substrate suitable for on-prem, hybrid, and multi-cloud environments. The result is a unifying telemetry model that reduces the blue-collar burden of cyber data engineering and creates a clearer path to scalable, interoperable, and model-agnostic cyber AI.
要約:
Backdoor attacks on federated learning (FL) are most often evaluated with synthetic corner patches or out-of-distribution (OOD) patterns that are unlikely to arise in practice. In this paper, we revisit the backdoor threat to standard FL (a single global model) under a more realistic setting where triggers must be semantically meaningful, in-distribution, and visually plausible. We propose SABLE, a Semantics-Aware Backdoor for LEarning in federated settings, which constructs natural, content-consistent triggers (e.g., semantic attribute changes such as sunglasses) and optimizes an aggregation-aware malicious objective with feature separation and parameter regularization to keep attacker updates close to benign ones. We instantiate SABLE on CelebA hair-color classification and the German Traffic Sign Recognition Benchmark (GTSRB), poisoning only a small, interpretable subset of each malicious client's local data while otherwise following the standard FL protocol. Across heterogeneous client partitions and multiple aggregation rules (FedAvg, Trimmed Mean, MultiKrum, and FLAME), our semantics-driven triggers achieve high targeted attack success rates while preserving benign test accuracy. These results show that semantics-aligned backdoors remain a potent and practical threat in federated learning, and that robustness claims based solely on synthetic patch triggers can be overly optimistic.
要約:
LLM-as-a-Judge (LaaJ) is a novel paradigm in which powerful language models are used to assess the quality, safety, or correctness of generated outputs. While this paradigm has significantly improved the scalability and efficiency of evaluation processes, it also introduces novel security risks and reliability concerns that remain largely unexplored. In particular, LLM-based judges can become both targets of adversarial manipulation and instruments through which attacks are conducted, potentially compromising the trustworthiness of evaluation pipelines. In this paper, we present the first Systematization of Knowledge (SoK) focusing on the security aspects of LLM-as-a-Judge systems. We perform a comprehensive literature review across major academic databases, analyzing 863 works and selecting 45 relevant studies published between 2020 and 2026. Based on this study, we propose a taxonomy that organizes recent research according to the role played by LLM-as-a-Judge in the security landscape, distinguishing between attacks targeting LaaJ systems, attacks performed through LaaJ, defenses leveraging LaaJ for security purposes, and applications where LaaJ is used as an evaluation strategy in security-related domains. We further provide a comparative analysis of existing approaches, highlighting current limitations, emerging threats, and open research challenges. Our findings reveal significant vulnerabilities in LLM-based evaluation frameworks, as well as promising directions for improving their robustness and reliability. Finally, we outline key research opportunities that can guide the development of more secure and trustworthy LLM-as-a-Judge systems.
要約:
Retrieval-Augmented Generation (RAG) systems are deployed across federal agencies for citizen-facing tax guidance, benefits eligibility, and legal information, where a single incorrect number causes direct financial harm. This paper proves that all embedding-based RAG defenses share a fundamental blind spot: changing a tax deduction by $50,000 produces cosine similarity 0.9998, invisible to every known detection threshold. Across 174 manipulation pairs and two embedding models, the mean sensitivity gap is 1,459x. The blind spot is confirmed on real IRS documents.The root cause is that embeddings encode topic, not numerical precision. RAGShield sidesteps this by operating on extracted values directly: a pattern-based engine identifies dollar amounts and percentages in government text, links each value to its governing entity through two-pass context propagation (99.8% entity detection on 2,742 real IRS passages), and verifies every claim against a cross-source registry built from the corpus itself. A temporal tracker flags value changes that fall outside known government update schedules. On 430 attacks generated from real IRS document content, RAGShield detects every one (0.0% ASR, 95% CI [0%, 1%]) while embedding-based defenses miss 79-90% of the same attacks.
要約:
Federated learning has emerged as a powerful framework for analysing distributed data, yet two challenges remain pivotal: heterogeneity across sites and privacy of local data. In this paper, we address both challenges within a federated transfer learning framework, aiming to enhance learning on a target data set by leveraging information from multiple heterogeneous source data sets while adhering to privacy constraints. We rigorously formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server. Under this privacy model, we study four statistical problems: univariate mean estimation, low-dimensional linear regression, high-dimensional linear regression, and M-estimation. By investigating the minimax rates and quantifying the cost of privacy, we show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy. Our analyses account for data heterogeneity and privacy, highlighting the fundamental costs associated with each factor and the benefits of knowledge transfer in federated learning.
要約:
Deep learning-based weather forecasting (DLWF) models have recently demonstrated significant performance gains over gold-standard physics-based simulation tools. However, these models are potentially vulnerable to adversarial attacks, which raises concerns about their trustworthiness. In this paper, we investigate the feasibility and challenges of applying existing adversarial attack methods to DLWF models and propose a novel framework called FABLE (Forecast Alteration By Localized targeted advErsarial attack) to address them. FABLE performs a 3D discrete wavelet decomposition to disentangle the spatial and temporal components of the data. By regulating the magnitude of adversarial perturbations across different components, FABLE produces adversarial inputs that remain closely aligned with the original inputs while steering the DLWF models toward generating the targeted forecast outcomes. Experimental results on real-world weather datasets demonstrate the effectiveness of FABLE over baseline methods across various metrics.
要約:
Large language models (LLMs) exhibit advancing capabilities in complex tasks, such as reasoning and graduate-level question answering, yet their resilience against misuse, particularly involving scientifically sophisticated risks, remains underexplored. Existing safety benchmarks typically focus either on instructions requiring minimal knowledge comprehension (e.g., ``tell me how to build a bomb") or utilize prompts that are relatively low-risk (e.g., multiple-choice or classification tasks about hazardous content). Consequently, they fail to adequately assess model safety when handling knowledge-intensive, hazardous scenarios.
To address this critical gap, we introduce SoSBench, a regulation-grounded, hazard-focused benchmark encompassing six high-risk scientific domains: chemistry, biology, medicine, pharmacology, physics, and psychology. The benchmark comprises 3,000 prompts derived from real-world regulations and laws, systematically expanded via an LLM-assisted evolutionary pipeline that introduces diverse, realistic misuse scenarios (e.g., detailed explosive synthesis instructions involving advanced chemical formulas). We evaluate frontier models within a unified evaluation framework using our SoSBench. Despite their alignment claims, advanced models consistently disclose policy-violating content across all domains, demonstrating alarmingly high rates of harmful responses (e.g., 84.9% for Deepseek-R1 and 50.3% for GPT-4.1). These results highlight significant safety alignment deficiencies and underscore urgent concerns regarding the responsible deployment of powerful LLMs.
要約:
Malicious developer intents in smart contracts constitute significant security threats to decentralized applications, leading to substantial economic losses. Prior work introduced SmartIntentNN, a deep learning model for detecting unsafe developer intents. By combining the Universal Sentence Encoder, a K-means clustering-based intent highlighting mechanism, and a Bidirectional Long Short-Term Memory (BiLSTM) network, the model achieved an F1 score of 0.8633 on an evaluation set of 10,000 real-world smart contracts across ten distinct intent categories.
This paper presents SmartIntentV2 (Smart Contract Intent Neural Network Version 2). The primary enhancement is the integration of a BERT-based pre-trained programming language model, which we domain-adaptively pre-train on a dataset of 16,000 real-world smart contracts using a Masked Language Modeling objective. SmartIntentV2 retains the BiLSTM-based multi-label classification network for intent detection. On the same evaluation set of 10,000 smart contracts, it achieves superior performance with an accuracy of 0.9789, precision of 0.9090, recall of 0.9476, and an F1 score of 0.9279, substantially outperforming its predecessor and other baseline models. Notably, SmartIntentV2 also delivers a 65.5% relative improvement in F1 score over GPT-4.1 on this specialized task. These results establish SmartIntentV2 as a new state-of-the-art model for smart contract intent detection.
要約:
LLM agents require personal information for personalization in order to effectively act on users' behalf, but this raises privacy concerns that can discourage data sharing, limiting both the autonomy levels at which agents can operate and the effectiveness of personalization. Yet the expanded design space of agent autonomy also presents opportunities to shape these effects, which remain underexplored. We conducted a $3\times3$ between-subjects experiment ($N=450$) to study how agent autonomy level influences personalization's effects on users' privacy concerns, trust, and willingness to use, as well as the underlying psychological processes. We find that risk-contingent autonomy, where the agent delegates control to users upon detecting potential privacy leakage, through improving users' perceived control, attenuates personalization's adverse effects by reducing the increase in privacy concerns and the decrease in trust. Our results suggest that designing $\textbf{agent's autonomy}$ that supports $\textbf{human autonomy}$ (both in terms of perceived control and oversight effectiveness) helps users benefit from personalization without being deterred by growing privacy concerns, contributing to the development of trustworthy LLM agents.
要約:
Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing the memory footprint of large language models (LLMs) during inference. Notably, popular inference engines, such as vLLM, enable users to conveniently prune downloaded models before they are deployed. While the utility and efficiency of pruning methods have improved significantly, the security implications of pruning remain underexplored. In this work, for the first time, we show that modern LLM pruning methods can be maliciously exploited. In particular, an adversary can construct a model that appears benign yet, once pruned, exhibits malicious behaviors. Our method is based on the idea that the adversary can compute a proxy metric that estimates how likely each parameter is to be pruned. With this information, the adversary can first inject a malicious behavior into those parameters that are unlikely to be pruned. Then, they can repair the model by using parameters that are likely to be pruned, effectively canceling out the injected behavior in the unpruned model. We demonstrate the severity of our attack through extensive evaluation on five models; after any of the pruning in vLLM are applied (Magnitude, Wanda, and SparseGPT), it consistently exhibits strong malicious behaviors in a diverse set of attack scenarios (success rates of up to $95.7\%$ for jailbreak, $98.7\%$ for benign instruction refusal, and $99.5\%$ for targeted content injection). Our results reveal a critical deployment-time security gap and underscore the urgent need for stronger security awareness in model compression.
要約:
The scalability of current quantum networks is limited due to noisy quantum components and high implementation costs, thereby limiting the security advantages that quantum networks provide over their classical counterparts. Quantum Augmented Networks (QuANets) address this by integrating quantum components in classical network infrastructure to improve robustness and end-to-end security. To enable such integration, Quantum Anonymous Notification (QAN) is a method to anonymously inform a receiver of an incoming quantum communication. Therefore, several quantum primitives will serve as core tools, namely, quantum voting, quantum anonymous protocols, quantum secret sharing, etc. However, all current quantum protocols can be compromised in the presence of several common channel noises. In this work, we propose an improved quantum anonymous notification (QAN) protocol that utilizes rotation operations on shared GHZ states to produce an anonymous notification in an n-user quantum-augmented network. We study the behavior of this modified QAN protocol under the dephasing noise model and observe stronger resilience to false notifications than earlier QAN approaches. The QAN framework is also proposed to be integrated with a machine-learning classifier, an enhanced quantum-augmented network. Finally, we discuss how this notification layer integrates with QuANets so that receivers can allow switch-bypass handling of quantum payloads, reducing header-based information leakage and vulnerability to targeted interference at compromised switches.
要約:
Business Email Compromise (BEC) is a high-impact social engineering threat with extreme operational asymmetry: false negatives can trigger large financial losses, while false positives primarily incur investigation and delay costs. This paper compares two BEC detection paradigms under a cost-sensitive decision framework: (i) a semantic transformer approach (DistilBERT) for contextual language understanding, and (ii) a forensic psycholinguistic approach (CatBoost) using engineered linguistic and structural cues. We evaluate both on a hybrid dataset (N = 7,990) combining legitimate corporate email and AI-synthesised adversarial fraud generated across 30 BEC taxonomies, including character-level Unicode obfuscations. We add classical baselines (TF-IDF+LogReg and character n-gram+Linear SVM), an ablation study for the Smiling Assassin Score, and a homoglyph-map sensitivity analysis. DistilBERT achieves AUC = 1.0000 and F1 = 0.9981 at 7.403 ms per email on GPU; CatBoost achieves AUC = 0.9860 and F1 = 0.9382 at 0.855 ms on CPU. A three-way cost-sensitive decision policy (auto-allow, auto-block, manual review) optimises expected financial loss under a 1:5,167 false-negative-to-false-positive cost ratio.
要約:
Mixture-of-Experts (MoE) language models introduce unique challenges for safety alignment due to their sparse routing mechanisms, which can enable degenerate optimization behaviors under standard full-parameter fine-tuning. In our preliminary experiments, we observe that naively applying full-parameter safety fine-tuning to MoE models can reduce attack success rates through routing or expert dominance effects, rather than by directly repairing Safety-Critical Experts. To address this challenge, we propose RASA, a routing-aware expert-level alignment framework that explicitly repairs Safety-Critical Experts while preventing routing-based bypasses. RASA identifies experts disproportionately activated by successful jailbreaks, selectively fine-tunes only these experts under fixed routing, and subsequently enforces routing consistency with safety-aligned contexts. Across two representative MoE architectures and a diverse set of jailbreak attacks, RASA achieves near-perfect robustness, strong cross-attack generalization, and substantially reduced over-refusal, while preserving general capabilities on benchmarks such as MMLU, GSM8K, and TruthfulQA. Our results suggest that robust MoE safety alignment benefits from targeted expert repair rather than global parameter updates, offering a practical and architecture-preserving alternative to prior approaches.
要約:
The current state, emerging trends, and practical challenges of optical fiber-based power network SCADA quantum communication must be addressed to fully utilize the technological platform's potential in real-world power system SCADA communications involving massive volumes of real-time data, as well as in managing, encoding, and applications such as quantum cryptography. Quantum key distribution (QKD) is an essential part of the cybersecurity paradigm for quantum communication. Even though quantum computing with individual circuits yields probabilistic outcomes for the problem at hand, real-world datasets are complex and challenging to handle, even with telemetry. When using the cybersecurity triad of availability, confidentiality, and integrity (CIA) in reverse order (AIC), availability is given priority in electric power networks. This research assesses the use of the BB84, E91, B92, and SARG04 cryptographic protocols by applying them to large, multivariate power-system SCADA datasets and comparing the outcomes. By leveraging the variety of QKD protocols available with quantum electronics hardware, this simulation work provides a promising avenue for developing implementable frameworks and deploying SCADA/PMU networks in actual power systems.
要約:
This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI agents. These insights are informed by Perplexity's experience operating general-purpose agentic systems used by millions of users and thousands of enterprises in both controlled and open-world environments. Agent architectures change core assumptions around code-data separation, authority boundaries, and execution predictability, creating new confidentiality, integrity, and availability failure modes. We map principal attack surfaces across tools, connectors, hosting boundaries, and multi-agent coordination, with particular emphasis on indirect prompt injection, confused-deputy behavior, and cascading failures in long-running workflows. We then assess current defenses as a layered stack: input-level and model-level mitigations, sandboxed execution, and deterministic policy enforcement for high-consequence actions. Finally, we identify standards and research gaps, including adaptive security benchmarks, policy models for delegation and privilege control, and guidance for secure multi-agent system design aligned with NIST risk management principles.
要約:
Federated Learning (FL) enables heterogeneous clients to collaboratively train a shared model without centralizing their raw data, offering an inherent level of privacy. However, gradients and model updates can still leak sensitive information, while malicious servers may mount adversarial attacks such as Byzantine manipulation. These vulnerabilities highlight the need to address differential privacy (DP) and Byzantine robustness within a unified framework. Existing approaches, however, often rely on unrealistic assumptions such as bounded gradients, require auxiliary server-side datasets, or fail to provide convergence guarantees. We address these limitations by proposing Byz-Clip21-SGD2M, a new algorithm that integrates robust aggregation with double momentum and carefully designed clipping. We prove high-probability convergence guarantees under standard $L$-smoothness and $\sigma$-sub-Gaussian gradient noise assumptions, thereby relaxing conditions that dominate prior work. Our analysis recovers state-of-the-art convergence rates in the absence of adversaries and improves utility guarantees under Byzantine and DP settings. Empirical evaluations on CNN and MLP models trained on MNIST further validate the effectiveness of our approach.
要約:
Street-level imagery contains personally identifiable information (PII), some of which is context-dependent. Existing anonymization methods either over-process images or miss subtle identifiers, while API-based solutions compromise data sovereignty. We present an agentic framework CAIAMAR (\underline{C}ontext-\underline{A}ware \underline{I}mage \underline{A}nonymization with \underline{M}ulti-\underline{A}gent \underline{R}easoning) for context-aware PII segmentation with diffusion-based anonymization, combining pre-defined processing for high-confidence cases with multi-agent reasoning for indirect identifiers. Three specialized agents coordinate via round-robin speaker selection in a Plan-Do-Check-Act (PDCA) cycle, enabling large vision-language models to classify PII based on spatial context (private vs. public property) rather than rigid category rules. The agents implement spatially-filtered coarse-to-fine detection where a scout-and-zoom strategy identifies candidates, open-vocabulary segmentation processes localized crops, and $IoU$-based deduplication ($30\%$ threshold) prevents redundant processing. Modal-specific diffusion guidance with appearance decorrelation substantially reduces re-identification (Re-ID) risks. On CUHK03-NP, our method reduces person Re-ID risk by $73\%$ ($R1$: $16.9\%$ vs. $62.4\%$ baseline). For image quality preservation on CityScapes, we achieve KID: $0.001$, and FID: $9.1$, significantly outperforming existing anonymization. The agentic workflow detects non-direct PII instances across object categories, and downstream semantic segmentation is preserved. Operating entirely on-premise with open-source models, the framework generates human-interpretable audit trails supporting EU's GDPR transparency requirements while flagging failed cases for human review.
要約:
The pervasive deployment of deep learning models across critical domains has concurrently intensified privacy concerns due to their inherent propensity for data memorization. While Membership Inference Attacks (MIAs) serve as the gold standard for auditing these privacy vulnerabilities, conventional MIA paradigms are increasingly constrained by the prohibitive computational costs of shadow model training and a precipitous performance degradation under low False Positive Rate constraints. To overcome these challenges, we introduce a novel perspective by leveraging the principles of model reprogramming as an active signal amplifier for privacy leakage. Building upon this insight, we present \texttt{ReproMIA}, a unified and efficient proactive framework for membership inference. We rigorously substantiate, both theoretically and empirically, how our methodology proactively induces and magnifies latent privacy footprints embedded within the model's representations. We provide specialized instantiations of \texttt{ReproMIA} across diverse architectural paradigms, including LLMs, Diffusion Models, and Classification Models. Comprehensive experimental evaluations across more than ten benchmarks and a variety of model architectures demonstrate that \texttt{ReproMIA} consistently and substantially outperforms existing state-of-the-art baselines, achieving a transformative leap in performance specifically within low-FPR regimes, such as an average of 5.25\% AUC and 10.68\% TPR@1\%FPR increase over the runner-up for LLMs, as well as 3.70\% and 12.40\% respectively for Diffusion Models.