Abstract:
This paper presents a federated learning framework secured by quantum key distribution (QKD) for wireless channel estimation and radar spectrum sensing in next-generation networks (NextG or Beyond 6G). Clients train local models (a CNN for channel estimation, a U-Net for radar segmentation) and upload only masked model updates, using a BB84-style protocol abstraction and pairwise additive masking. The server aggregates without observing plain parameters; an eavesdropper without QKD keys cannot recover individual updates. Experiments show that secure FL achieves an NMSE of 0.216 for channel estimation and 92.1\% accuracy with 0.72 mIoU for radar sensing. When an eavesdropper is present, QBER rises to $\sim$25\% and all rounds abort as intended; reconstruction error remains below $10^{-5}$, confirming correct aggregation.
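The cancellation property behind pairwise additive masking can be illustrated with a minimal sketch. It is not the paper's implementation: the fixed-point encoding, the modulus, the toy updates, and the hard-coded pairwise keys (which in the paper's setting would be established through QKD) are all assumptions made for illustration, and Python's `random` module stands in for a cryptographic mask generator.

```python
import random

MOD = 2**32    # all masked arithmetic is done modulo 2^32 (assumed modulus)
SCALE = 10**4  # fixed-point scale for quantizing float updates (assumed)

def quantize(update):
    """Encode float parameters as integers modulo MOD."""
    return [round(x * SCALE) % MOD for x in update]

def dequantize(vec):
    """Decode integers modulo MOD back to floats; the upper half is negative."""
    return [((v - MOD) if v >= MOD // 2 else v) / SCALE for v in vec]

def pair_mask(shared_key, length):
    """Expand a pairwise key into a mask vector (NOT cryptographically secure)."""
    rng = random.Random(shared_key)
    return [rng.randrange(MOD) for _ in range(length)]

def mask_update(client_id, update, pair_keys):
    """For each pair (i, j) with i < j, client i adds the mask and client j subtracts it."""
    masked = quantize(update)
    for (i, j), key in pair_keys.items():
        if client_id not in (i, j):
            continue
        sign = 1 if client_id == i else -1
        mask = pair_mask(key, len(update))
        masked = [(m + sign * r) % MOD for m, r in zip(masked, mask)]
    return masked

def aggregate(masked_updates):
    """Server-side sum: the pairwise masks cancel, leaving the sum of plain updates."""
    total = [0] * len(masked_updates[0])
    for vec in masked_updates:
        total = [(t + v) % MOD for t, v in zip(total, vec)]
    return dequantize(total)

# Hypothetical toy data: three clients, four parameters each.
updates = {
    0: [0.5, -1.25, 0.0, 2.0],
    1: [1.5, 0.25, -0.5, 1.0],
    2: [-1.0, 1.0, 0.5, -3.0],
}
# Hypothetical pairwise keys, keyed by (lower id, higher id).
pair_keys = {(0, 1): 1111, (0, 2): 2222, (1, 2): 3333}

masked = [mask_update(cid, upd, pair_keys) for cid, upd in updates.items()]
result = aggregate(masked)
expected = [sum(col) for col in zip(*updates.values())]
print(result, expected)
```

The server only ever handles the `masked` vectors, yet the aggregate matches the plain sum because every mask is added once and subtracted once.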
Abstract:
The nature of personalized text-to-image models poses a unique safety challenge that generic, context-blind methods are ill-equipped to handle. Such global filters create a dilemma: to prevent misuse, they are forced to damage the model's broader utility by erasing concepts entirely, causing unacceptable collateral damage. Our work presents a more precisely targeted approach, built on the principle that security should be as context-aware as the threat itself, intrinsically bound to the personalized concept. We present IDENTITYGUARD, which realizes this principle through a conditional restriction that blocks harmful content only when combined with the personalized identity, and a concept-specific watermark for precise traceability. Experiments show our approach prevents misuse while preserving the model's utility and enabling robust traceability. By moving beyond blunt, global filters, our work demonstrates a more effective and responsible path toward AI safety.
Abstract:
Safety alignment in large language models is typically evaluated under isolated queries, yet real-world use is inherently multi-turn. Although multi-turn jailbreaks are empirically effective, the structure of conversational safety failure remains insufficiently understood. In this work, we study safety failures from a state-space perspective and show that many multi-turn failures arise from structured contextual state evolution rather than isolated prompt vulnerabilities. We introduce STAR, a state-oriented diagnostic framework that treats dialogue history as a state transition operator and enables controlled analysis of safety behavior along interaction trajectories. Rather than optimizing attack strength, STAR provides a principled probe of how aligned models traverse the safety boundary under autoregressive conditioning. Across multiple frontier language models, we find that systems that appear robust under static evaluation can undergo rapid and reproducible safety collapse under structured multi-turn interaction. Mechanistic analysis reveals monotonic drift away from refusal-related representations and abrupt phase transitions induced by role-conditioned context. Together, these findings motivate viewing language model safety as a dynamic, state-dependent process defined over conversational trajectories.
Abstract:
Backdoor attacks compromise model reliability by using triggers to manipulate outputs. Trigger inversion can accurately locate these triggers via a generator and is therefore critical for backdoor defense. However, the discrete nature of text prevents existing noise-based trigger generators from being applied to natural language processing (NLP). To overcome this limitation, we exploit the rich knowledge embedded in large language models (LLMs) and propose a Backdoor defender powered by an LLM Trigger Generator, termed BadLLM-TG. It is optimized through prompt-driven reinforcement learning, using the victim model's feedback loss as the reward signal. The generated triggers are then employed to mitigate the backdoor via adversarial training. Experiments show that our method reduces the attack success rate by 76.2\% on average, outperforming the second-best defender by 13.7.
Abstract:
Over the past few years, an increasing number of states in the US have adopted new privacy laws. The majority of these laws require compliance with universal opt-out mechanisms (UOOMs), which allow consumers to send legally binding opt-out signals. However, a number of laws generally do not allow UOOMs to be enabled by default. While some laws exempt privacy-protective software from this prohibition, the exemption does not apply to pre-installed software, e.g., a privacy-protective web browser bundled with an operating system. The reason for not allowing default opt-out settings for pre-installed software is to ensure that settings reflect consumers' "affirmative, freely given, and unambiguous choice," as, for example, the Colorado Privacy Act (CPA) puts it. However, prohibiting vendors of privacy-protective software from turning on UOOMs by default can force them into committing unfair or deceptive acts or practices under the FTC Act and equivalent state laws. Thus, whether UOOMs can be turned on by default on pre-installed software should depend on consumers' privacy expectations. For pre-installed software that creates a reasonable expectation for consumers that their privacy will be protected, the simple use of such software should be considered a valid choice for enabling UOOMs. In such software, a turned-on UOOM is not a "default setting" but rather the software's inherent behavior that a consumer expects and chooses through its use. This interpretation of consumer choice is preferable under the CPA and similar laws, as it grounds the notice and choice principle in the privacy expectations of consumers and enables companies to compete on better privacy for consumers.
Abstract:
LLM-based agents are increasingly deployed in high-stakes settings where they process external data sources such as emails, documents, and code repositories. This creates exposure to indirect prompt injection attacks, where adversarial instructions embedded in external content manipulate agent behavior without user awareness. A critical but underexplored dimension of this threat is concealment: since users tend to observe only an agent's final response, an attack can conceal its existence by presenting no clue of compromise in the final user-facing response while successfully executing harmful actions. This leaves users unaware of the manipulation and likely to accept harmful outcomes as legitimate. We present findings from a large-scale public red teaming competition evaluating this dual objective across three agent settings: tool calling, coding, and computer use. The competition attracted 464 participants who submitted 272,000 attack attempts against 13 frontier models, yielding 8,648 successful attacks across 41 scenarios. All models proved vulnerable, with attack success rates ranging from 0.5% (Claude Opus 4.5) to 8.5% (Gemini 2.5 Pro). We identify universal attack strategies that transfer across 21 of 41 behaviors and multiple model families, suggesting fundamental weaknesses in instruction-following architectures. Capability and robustness showed weak correlation, with Gemini 2.5 Pro exhibiting both high capability and high vulnerability. To address benchmark saturation and obsolescence, we will endeavor to deliver quarterly updates through continued red teaming competitions. We open-source the competition environment for use in evaluations, along with 95 successful attacks against Qwen that did not transfer to any closed-source model. We share model-specific attack data with respective frontier labs and the full dataset with the UK AISI and US CAISI to support robustness research.
Abstract:
In decentralized web applications, users face an inherent conflict between public verifiability and personal privacy. To participate in regulated on-chain services, users must currently disclose sensitive identity documents to centralized intermediaries, permanently linking real-world identities to public transaction histories. This binary choice between total privacy loss or total exclusion strips users of agency and exposes them to persistent surveillance. In this work, we introduce a Selective Disclosure Framework designed to restore user sovereignty by decoupling eligibility verification from identity revelation. We present ZK-Compliance, a prototype that leverages browser-based zero-knowledge proofs to shift the interaction model, enabling users to prove specific attributes (e.g., "I am over 18") locally without revealing the underlying data. We implement a user-governed Grant, Verify, Revoke lifecycle that transforms the user's mental model of compliance from a permanent data handover into a dynamic, revocable authorization session. Our evaluation shows that client-side proof generation takes under 200ms, enabling a seamless interactive experience on commodity hardware. This work provides early evidence that regulatory compliance need not come at the cost of user privacy or autonomy.
Abstract:
Autonomous LLM-based agents increasingly operate as long-running processes forming densely interconnected multi-agent ecosystems, whose security properties remain largely unexplored. In particular, OpenClaw, an open-source platform with over 40,000 active instances, has stood out recently with its persistent configurations, tool-execution privileges, and cross-platform messaging capabilities. In this work, we present ClawWorm, the first self-replicating worm attack against a production-scale agent framework, achieving a fully autonomous infection cycle initiated by a single message: the worm first hijacks the victim's core configuration to establish persistent presence across session restarts, then executes an arbitrary payload upon each reboot, and finally propagates itself to every newly encountered peer without further attacker intervention. We evaluate the attack on a controlled testbed across three distinct infection vectors and three payload types, demonstrating high success rates in end-to-end infection, sustained multi-hop propagation, and payload independence from the worm mechanism. We analyse the architectural root causes underlying these vulnerabilities and propose defence strategies targeting each identified trust boundary. Code and samples will be released upon completion of responsible disclosure.
Abstract:
Vision-language models (VLMs) have recently shown remarkable capabilities in visual understanding and generation, but remain vulnerable to adversarial manipulations of visual content. Prior object-hiding attacks primarily rely on suppressing or blocking region-specific representations, often creating semantic gaps that inadvertently induce hallucination, where models invent plausible but incorrect objects. In this work, we demonstrate that hallucination arises not from object absence per se, but from semantic discontinuity introduced by such suppression-based attacks. We propose a new class of \emph{background-consistent object concealment} attacks, which hide target objects by re-encoding their visual representations to be statistically and semantically consistent with surrounding background regions. Crucially, our approach preserves token structure and attention flow, avoiding representational voids that trigger hallucination. We present a pixel-level optimization framework that enforces background-consistent re-encoding across multiple transformer layers while preserving global scene semantics. Extensive experiments on state-of-the-art vision-language models show that our method effectively conceals target objects while preserving up to $86\%$ of non-target objects and reducing grounded hallucination by up to $3\times$ compared to attention-suppression-based attacks.
Abstract:
Always-on hardware Trojans pose a serious challenge to integrated circuit trust, as they remain active during normal operation and are difficult to detect in post-deployment settings without trusted golden references. This paper presents a reference-free detection framework based on cross-scale persistence analysis of electromagnetic (EM) side-channels, targeting always-on parasitic hardware behavior. The proposed method analyzes EM emissions across multiple time-frequency resolutions and constructs stability maps that capture the consistency of spectral features over repeated executions. Gaussian Mixture Models (GMMs) with Bayesian Information Criterion (BIC) based model selection are used to characterize statistical structure at each scale. We introduce cross-scale saturation, variability, and median mixture complexity metrics that quantify whether statistical structure evolves naturally or remains persistently anchored across resolutions. Experimental results on AES implementations show that Trojan-free designs exhibit scale-dependent variability consistent with transient switching behavior, while always-on Trojans produce persistent statistical signatures that suppress cross-scale evolution. Furthermore, different Trojan classes, such as workload-correlated leakage-information Trojans and independent ring-oscillator Trojans, exhibit distinct persistence patterns. These findings demonstrate that cross-scale persistence provides a physically interpretable and robust assurance signal for unsupervised, reference-free detection of always-on hardware Trojans.
Abstract:
Given limited and costly computational infrastructure, resource efficiency is a key requirement for large language models (LLMs). Efficient LLMs increase service capacity for providers and reduce latency and API costs for users. Recent resource consumption threats induce excessive generation, degrading model efficiency and harming both service availability and economic sustainability. This survey presents a systematic review of threats to resource consumption in LLMs. We further establish a unified view of this emerging area by clarifying its scope and examining the problem along the full pipeline from threat induction to mechanism understanding and mitigation. Our goal is to clarify the problem landscape for this emerging area, thereby providing a clearer foundation for characterization and mitigation.
Abstract:
We construct a lattice-based ciphertext-policy attribute-based encryption (CP-ABE) scheme for $\mathsf{NC}^1$ access policies with constant-size ciphertexts. Let $\lambda$ be the security parameter. For an $\mathsf{NC}^1$ circuit of depth $d$ and size $s$ on $\ell$-bit inputs, our scheme has public-key and ciphertext sizes of $O(1)$ (independent of $d$) and a secret-key size of $O(\ell)$, where $O(\cdot)$ hides $\operatorname{poly}(\lambda)$ factors. As an application, we obtain a broadcast encryption scheme for $N$ users with ciphertext size $\operatorname{poly}(\lambda)$ independent of $\log N$ and key sizes $\operatorname{poly}(\lambda,\log N)$. Our construction is selectively secure in the standard model under the $\operatorname{poly}(\lambda)$-succinct LWE assumption introduced by Wee (CRYPTO~2024).
Abstract:
A Disjunctive Hierarchical Secret Sharing (DHSS) scheme is a type of secret sharing scheme in which the set of all participants is partitioned into disjoint subsets, and each subset is said to be a level with a different degree of trust and a different threshold. In this work, we focus on Chinese Remainder Theorem (CRT)-based DHSS schemes due to their ability to accommodate flexible share sizes. We point out that the ideal DHSS scheme of Yang et al. (ISIT, 2024) and the asymptotically ideal DHSS scheme of Tiplea et al. (IET Information Security, 2021) are insecure. Consequently, existing CRT-based DHSS schemes either exhibit security flaws or have an information rate less than $\frac{1}{2}$. To address these limitations, we propose a CRT-based asymptotically perfect DHSS scheme that supports flexible share sizes. Notably, our scheme is asymptotically ideal when all shares are equal in size. Its information rate reaches one, and it is computationally secure.
Abstract:
Central Bank Digital Currencies (CBDCs) are proposed as a public response to the uptake of privately run digital payments, with the digital euro, under development by the European Central Bank (ECB), serving as a prominent example. This momentum provides a unique opportunity to fundamentally rethink the future of money, and, assuming wide adoption, to establish payment systems that offer strong cryptographic security and privacy guarantees from the start. While the central banks in charge are investigating privacy-enhancing technologies (PETs), they often conclude that PETs are immature or insufficiently scalable. Moreover, these efforts tend to examine primitives in isolation, offering little insight into how a system using these PETs would scale. This systematisation of knowledge therefore provides a structured, top-down technical analysis of 36 complete payment system proposals that can inform CBDC designs or were explicitly proposed for this application. We identify recurring design patterns, technical trade-offs, and implementation challenges. We conclude by highlighting research gaps, including offline payments and post-quantum security.
Abstract:
The proliferation of large-scale IoT networks has been both a blessing and a curse. Not only has it revolutionized the way organizations operate by increasing the efficiency of automated procedures, but it has also simplified our daily lives. However, while IoT networks have improved convenience and connectivity, they have also increased security risk, because unauthorized devices can gain access to these networks and exploit existing weaknesses with specific attack types. This research proposes two lightweight deep learning (DL)-based intelligent intrusion detection systems (IDSs) to enhance the security of IoT networks: a convolutional neural network (CNN)-based IDS and a long short-term memory (LSTM)-based IDS. Both IDSs are evaluated on the CICIoT2023 dataset, where they successfully identify and classify various cyber threats under binary, grouped, and multi-class classification. The proposed CNN-based IDS achieves accuracies of 99.34%, 99.02%, and 98.6%, while the proposed LSTM-based IDS achieves accuracies of 99.42%, 99.13%, and 98.68% for binary, grouped, and multi-class classification, respectively.
Abstract:
Solana is rapidly gaining traction among smart contract developers and users. However, its growing adoption has been accompanied by a series of major security incidents, which have spurred research into automated analysis techniques for Solana smart contracts. Unfortunately, existing approaches do not address the unique and complex account model of Solana. In this paper, we propose SseRex, the first symbolic execution vulnerability detection approach for finding Solana-specific bugs such as missing owner checks, missing signer checks, and missing key checks, as well as arbitrary cross-program invocations. Our evaluation of 8,714 bytecode-only contracts shows that our approach outperforms existing approaches and identifies potential bugs in 467 different contracts. Additionally, we analyzed 120 open-source Solana projects and conducted in-depth case studies on four of them. Our findings reveal that subtle, easily overlooked issues often serve as the root cause of severe exploits, further highlighting the need for specialized analysis tools like SseRex.
Abstract:
Ransomware continues encrypting files during the delay between attack onset and detection. ROFBS mitigates this problem by backing up pre-modification files in real time upon file-open events. However, because the Linux file-open path traverses multiple kernel functions, it remains unclear how the choice of hook point affects defense effectiveness.
In this study, we kept the ROFBS mechanism fixed and changed only the hook points on the Linux file-open path. We compared may_open, inode_permission, do_dentry_open, security_file_open, and xfs_file_open on AlmaLinux with XFS using three ransomware families: AvosLocker, Conti, and IceFire. We used Backup Ratio as the main metric and also compared the number of encrypted files with backups and the total number of encrypted files.
The results showed that hook-point selection substantially affected both recoverability and damage scale. For AvosLocker, security_file_open achieved the highest Backup Ratio (82.5%). For Conti and IceFire, xfs_file_open achieved the highest values (100.0% and 63.2%, respectively). Moreover, xfs_file_open minimized the total number of encrypted files for all three ransomware families.
These results indicate that, in ROFBS, the layer at which file-open events are observed is a key design factor. In particular, on XFS, hooking the filesystem-specific callback xfs_file_open may be advantageous when the goal is to reduce overall damage.
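The relationship between the two reported quantities can be sketched as follows. The abstract does not define Backup Ratio, so the formula below (encrypted files with backups divided by all encrypted files) is an assumption, and every count is hypothetical rather than taken from the study. The sketch only shows why the hook point with the best Backup Ratio need not be the one with the least total damage.

```python
def backup_ratio(backed_up, total_encrypted):
    """Percentage of encrypted files that have a pre-modification backup.

    Assumed definition; the abstract does not spell out the formula.
    """
    if total_encrypted == 0:
        return 100.0  # assumption: nothing encrypted means nothing unrecoverable
    return 100.0 * backed_up / total_encrypted

# Hypothetical counts per hook point:
# (encrypted files with backups, total encrypted files).
runs = {
    "security_file_open": (160, 200),
    "xfs_file_open": (90, 120),
}

ratios = {hook: backup_ratio(b, t) for hook, (b, t) in runs.items()}

# The two selection criteria can disagree.
best_by_ratio = max(ratios, key=ratios.get)
least_damage = min(runs, key=lambda hook: runs[hook][1])

print(ratios)
print("highest Backup Ratio:", best_by_ratio)
print("fewest encrypted files:", least_damage)
```

With these made-up counts, one hook point recovers a larger share of the damage while the other limits how much damage occurs in the first place, which mirrors the kind of split the AvosLocker results describe.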
Abstract:
Hardware faults, specifically bit-flips in quantized weights, pose a severe reliability threat to Large Language Models (LLMs), often triggering catastrophic model collapses. We demonstrate that this vulnerability fundamentally stems from the spatial alignment between sensitive weight bits and extreme activation outliers, which causes a single hardware fault to be massively amplified. To address this, we propose Rotated Robustness (RoR), a training-free defense utilizing orthogonal Householder transformations. By applying an orthogonal rotation to the activation space, RoR geometrically smooths extreme outliers across all feature dimensions. This mechanism effectively breaks the alignment between outliers and vulnerable weights, mathematically guaranteeing original model accuracy. Extensive empirical evaluations across Llama-2/3, OPT, and Qwen families demonstrate the superior reliability of our approach. Under random bit-flip attacks, RoR reduces the stochastic collapse rate from 3.15\% to 0.00\% on Qwen2.5-7B. Furthermore, under severe targeted attacks with 50 Progressive Bit Search flips, RoR sustains robust reasoning on Llama-2-7B, maintaining a 43.9\% MMLU accuracy that nearly matches its 45.2\% unattacked accuracy, while competing defenses collapse to random guessing. Most notably, against the Single-Point Fault Attack (SPFA) -- the most aggressive targeted threat -- RoR exponentially inflates the attack complexity from a few bits to over 17,000 precise bit-flips. With a negligible storage overhead of 0.31\% and a minimal inference latency increase of 9.1\% on Llama-2-7B, RoR achieves true lossless robustness, providing a practical and highly reliable defense for LLM deployment.
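The geometric idea can be sketched with a single Householder matrix. This is a minimal illustration, not RoR itself: the choice of the vector `v`, the toy activation, and the toy weight row are assumptions. A Householder matrix is strictly a reflection, but it is orthogonal and its own inverse, so transforming both activations and weights leaves their inner product unchanged while spreading an outlier over all dimensions.

```python
import math

def householder(v):
    """Return H = I - 2 v v^T / (v^T v), a symmetric orthogonal matrix."""
    n = len(v)
    norm_sq = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)] for i in range(n)]

def matvec(m, x):
    """Multiply matrix m (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Hypothetical activation vector with one extreme outlier in dimension 0,
# and a hypothetical weight row that multiplies it.
x = [64.0, 1.0, -1.0, 1.0]
w = [0.5, -0.25, 0.125, 1.0]

# Assumed construction: reflect the outlier axis e_0 onto the uniform
# direction u, using v = e_0 - u.
n = len(x)
u = [1.0 / math.sqrt(n)] * n
v = [(1.0 if i == 0 else 0.0) - u[i] for i in range(n)]
H = householder(v)

x_rot = matvec(H, x)  # transformed activations
w_rot = matvec(H, w)  # transformed weights (H is its own inverse)

original = dot(w, x)
rotated = dot(w_rot, x_rot)

print("max |x|     :", max(abs(a) for a in x))
print("max |H x|   :", max(abs(a) for a in x_rot))
print("w . x       :", original)
print("(Hw) . (Hx) :", rotated)
```

Because the layer output is preserved up to floating-point rounding, no retraining is needed in this toy setting, while the largest activation magnitude is roughly halved.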
Abstract:
Digital identity is shifting from service- and network-centric approaches toward user-centric ones that promise users increased control over their data. Despite their decentralised design, such approaches often reintroduce centralised components in different forms. This research explores this tension, i.e., the decentralisation paradox, and argues that user-centric architectures tend to redistribute rather than eliminate centralisation. Based on Critical Systems Thinking (CST), digital identity is framed as a "wicked problem" that spans technical, legal, social, and ethical dimensions. The paper argues that understanding all these interdependencies is essential for designing reliable architectures and ensuring that the next generation of digital identity goes beyond superficial decentralisation.
Abstract:
Semantic segmentation models are widely deployed in safety-critical applications such as autonomous driving, yet their vulnerability to backdoor attacks remains largely underexplored. Prior segmentation backdoor studies transfer threat settings from existing image classification tasks, focusing primarily on object-to-background mis-segmentation. In this work, we revisit the threats by systematically examining backdoor attacks tailored to semantic segmentation. We identify four coarse-grained attack vectors (Object-to-Object, Object-to-Background, Background-to-Object, and Background-to-Background attacks), as well as two fine-grained vectors (Instance-Level and Conditional attacks). To formalize these attacks, we introduce BADSEG, a unified framework that optimizes trigger designs and applies label manipulation strategies to maximize attack performance while preserving victim model utility. Extensive experiments across diverse segmentation architectures on benchmark datasets demonstrate that BADSEG achieves high attack effectiveness with minimal impact on clean samples. We further evaluate six representative defenses and find that they fail to reliably mitigate our attacks, revealing critical gaps in current defenses. Finally, we demonstrate that these vulnerabilities persist in recent emerging architectures, including transformer-based networks and the Segment Anything Model (SAM), thereby compromising their security. Our work reveals previously overlooked security vulnerabilities in semantic segmentation, and motivates the development of defenses tailored to segmentation-specific threat models.
Abstract:
In light of globalized hardware supply chains, the assurance of hardware components has gained significant interest, particularly in cryptographic applications and high-stakes scenarios. Identifying metal lines on scanning electron microscope (SEM) images of integrated circuits (ICs) is one essential step in verifying the absence of malicious circuitry in chips manufactured in untrusted environments. Due to varying manufacturing processes and technologies, such verification usually requires tuning parameters and algorithms for each target IC. Often, a machine learning model trained on images of one IC fails to accurately detect metal lines on other ICs. To address this challenge, we create SAMSEM by adapting Meta's Segment Anything Model 2 (SAM2) to the domain of IC metal line segmentation. Specifically, we develop a multi-scale segmentation approach that can handle SEM images of varying sizes, resolutions, and magnifications. Furthermore, we deploy a topology-based loss alongside pixel-based losses to focus our segmentation on electrical connectivity rather than pixel-level accuracy. Based on a hyperparameter optimization, we then fine-tune the SAM2 model to obtain a model that generalizes across different technology nodes, manufacturing materials, sample preparation methods, and SEM imaging technologies. To this end, we leverage an unprecedented dataset of SEM images obtained from 48 metal layers across 14 different ICs. When fine-tuned on seven ICs, SAMSEM achieves an error rate as low as 0.72% when evaluated on other images from the same ICs. For the remaining seven unseen ICs, it still achieves error rates as low as 5.53%. Finally, when fine-tuned on all 14 ICs, we observe an error rate of 0.62%. Hence, SAMSEM proves to be a reliable tool that significantly advances the frontier in metal line segmentation, a key challenge in post-manufacturing IC verification.
Abstract:
Agent skills extend local AI agents, such as Claude Code or Open Claw, with additional functionality, and their popularity has led to the emergence of dedicated skill marketplaces, similar to app stores for mobile applications. Simultaneously, automated skill scanners were introduced that analyze the skill description available in SKILL.md to verify a skill's benign behavior. For individual marketplaces, these scanners flag up to 46.8% of skills as malicious. In this paper, we present the largest empirical security analysis of the AI agent skill ecosystem, questioning this high share of skills classified as malicious. To this end, we collect 238,180 unique skills from three major distribution platforms and GitHub to systematically analyze their type and behavior. This approach reduces the share of skills flagged as non-benign by security scanners to only 0.52%, which remain in repositories flagged as malicious. Consequently, our methodology substantially reduces false positives and provides a more robust view of the ecosystem's current risk surface. Beyond that, we extend the security analysis from the mere investigation of the skill description to a comparison of its congruence with the GitHub repository the skill is embedded in, providing additional context. Furthermore, our analysis uncovers several previously undocumented real-world attack vectors, namely hijacking skills hosted on abandoned GitHub repositories.
Abstract:
Advanced software supply chain (SSC) attacks are increasingly runtime-only and leave fragmented evidence across hosts, services, and build/dependency layers, so any single telemetry stream is inherently insufficient to reconstruct full compromise chains under realistic access and budget limits. We present SynthChain, a near-production testbed and a multi-source runtime dataset with chain-level ground truth, derived from real-world malicious packages and exploit campaigns. SynthChain covers seven representative supply-chain exploit scenarios across PyPI, npm, and a native C/C++ supply-chain case, spanning Windows and Linux, and involving four hosts and one containerized environment. Scenarios span realistic time windows from minutes to hours and are annotated with 14 MITRE ATT&CK tactics and 161 techniques (29-104 techniques per scenario). Beyond releasing the data, we quantify observability constraints by mapping each chain step to the minimum evidence needed for detection and cross-source correlation. With realistic trace availability, no single source is chain-complete: the best single source reaches only 0.391 weighted tag/step coverage and 0.403 mean chain reconstruction. Even minimal two-source fusion boosts coverage to 0.636 and reconstruction to 0.639 (approximately 1.6x gain), with consistent chain coverage/recall improvements (0.545). The corpus contains approximately 0.58M raw multi-source events and 1.50M evaluation rows, enabling controlled studies of detection under constrained telemetry. We release the dataset, ground truth, and artifacts to support reproducible, forensic-aware runtime defenses and to guide efficient detection for software supply chains.
Abstract:
This paper presents Ember, a serverless peer-to-peer messaging system providing end-to-end encrypted communication over a decentralised IPv6 mesh network. Ember operates without central servers, enforces data minimisation through ciphertext-only local storage and time-based message expiration, and prioritises architectural clarity, explicit trust boundaries, and practical deployability on Android. The paper describes the system architecture, cryptographic design, network model, and security properties -- including dynamic testing results demonstrating that no plaintext is recoverable from captured network traffic -- and discusses limitations and future work.
Abstract:
As agentic artificial intelligence systems scale across globally distributed and long-lived infrastructures, secure and policy-compliant communication becomes a fundamental systems challenge. This challenge grows more serious in the quantum era, where the cryptographic assumptions built into today's AI deployments may not remain valid over their operational lifetime. Here, we introduce quantum secure by construction, or QSC, as a design paradigm that treats quantum-secure communication as a core architectural property of agentic AI systems rather than an upgrade added later. We realize QSC through a runtime-adaptive security model that combines post-quantum cryptography, quantum random number generation, and quantum key distribution to secure interactions among autonomous agents operating across heterogeneous cloud, edge, and inter-organizational environments. The approach is cryptographically pluggable and guided by policy, allowing the system to adjust its security posture according to infrastructure availability, regulatory constraints, and performance needs. QSC contributes a governance-aware orchestration layer that selects and combines link-specific cryptographic protections across the full agent lifecycle, including session bootstrap, inter-agent coordination, tool invocation, and memory access. Through system-level analysis and empirical evaluation, we examine the trade-offs between classical and quantum-secure mechanisms and show that QSC can reduce the operational complexity and cost of introducing quantum security into deployed agentic AI systems. These results position QSC as a foundational paradigm for post-quantum agentic intelligence and establish a principled pathway for designing globally interoperable, resilient, and future-ready intelligent systems.
要約:
Multi-modal Large Language Models (MLLMs) have achieved remarkable performance across a wide range of visual reasoning tasks, yet their vulnerability to safety risks remains a pressing concern. While prior research primarily focuses on jailbreak defenses that detect and refuse explicitly unsafe inputs, such approaches often overlook contextual safety, which requires models to distinguish subtle contextual differences between scenarios that may appear similar but diverge significantly in safety intent. In this work, we present MM-SafetyBench++, a carefully curated benchmark designed for contextual safety evaluation. Specifically, for each unsafe image-text pair, we construct a corresponding safe counterpart through minimal modifications that flip the user intent while preserving the underlying contextual meaning, enabling controlled evaluation of whether models can adapt their safety behaviors based on contextual understanding. Further, we introduce EchoSafe, a training-free framework that maintains a self-reflective memory bank to accumulate and retrieve safety insights from prior interactions. By integrating relevant past experiences into current prompts, EchoSafe enables context-aware reasoning and continual evolution of safety behavior during inference. Extensive experiments on various multi-modal safety benchmarks demonstrate that EchoSafe consistently achieves superior performance, establishing a strong baseline for advancing contextual safety in MLLMs. All benchmark data and code are available at https://echosafe-mllm.github.io.
要約:
We present KidsNanny, a two-stage multimodal content moderation architecture for child safety. Stage 1 combines a vision transformer (ViT) with an object detector for visual screening (11.7 ms); its outputs are routed as text, not raw pixels, to Stage 2, which applies OCR and a text-based 7B language model for contextual reasoning (120 ms total pipeline). We evaluate on the UnsafeBench Sexual category (1,054 images) under two regimes: vision-only, isolating Stage 1, and multimodal, evaluating the full Stage 1+2 pipeline. Stage 1 achieves 80.27% accuracy and 85.39% F1 at 11.7 ms; vision-only baselines range from 59.01% to 77.04% accuracy. The full pipeline achieves 81.40% accuracy and 86.16% F1 at 120 ms, compared to ShieldGemma-2 (64.80% accuracy, 1,136 ms) and LlavaGuard (80.36% accuracy, 4,138 ms). To evaluate text-awareness, we filter two subsets: a text+visual subset (257 images) and a text-only subset (44 images where safety depends primarily on embedded text). On text-only images, KidsNanny achieves 100% recall (25/25 positives; small sample) and 75.76% precision; ShieldGemma-2 achieves 84% recall and 60% precision at 1,136 ms. Results suggest that dedicated OCR-based reasoning may offer recall-precision advantages on text-embedded threats at lower latency, though the small text-only subset limits generalizability. By documenting this architecture and evaluation methodology, we aim to contribute to the broader research effort on efficient multimodal content moderation for child safety.
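The text-only figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that the reported 75.76% precision corresponds to 25 true positives among 33 flagged images; the count of 8 false positives is inferred from the rounded percentage and is not stated in the abstract.

```python
from fractions import Fraction

# Figures stated in the abstract for the text-only subset.
total_images = 44
true_positives = 25       # 25/25 positives recalled
false_negatives = 0

# Assumption: 8 false positives, inferred from the reported 75.76% precision.
false_positives = 8

recall = Fraction(true_positives, true_positives + false_negatives)
precision = Fraction(true_positives, true_positives + false_positives)
# Whatever remains of the 44 images must be correctly passed negatives.
true_negatives = total_images - true_positives - false_negatives - false_positives

print(f"recall    = {float(recall):.2%}")
print(f"precision = {float(precision):.2%}")
print(f"implied true negatives = {true_negatives}")
```

Under this reading, the subset contains 19 negatives, of which 8 are flagged, which illustrates why the abstract cautions about the small sample.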
要約:
This paper provides a preparatory introduction to sheaves and topoi, written as a conceptual continuation of the author's earlier introduction to torsors and as preparatory background for the author's arXiv paper \emph{Grothendieck Topologies and Sheaf-Theoretic Foundations of Cryptographic Security:\ Attacker Models and $\Sigma$-Protocols as the First Step}~\cite{InoueSecurity}. Rather than attempting an encyclopedic survey of all of topos theory, the exposition develops those parts of the subject that are most relevant for passing from torsor-based local-to-global reasoning to sheaf-theoretic and topos-theoretic reasoning: Grothendieck topologies, sheaves, torsors over a site, descent, sheaf topoi, elementary topoi, Cartesian closed structure, subobject classifiers, and internal logic. The goal is not merely motivational. We try to develop enough genuine topos theory that the reader can understand, not only heuristically but structurally, why the later cryptographic framework of~\cite{InoueSecurity} uses Grothendieck topologies and sheaf-theoretic language. To make the note more self-contained, we also include substantial appendices on basic category theory, Yoneda's lemma, limits and colimits, equalizers and coequalizers, Kan extensions, the relation between internal logic and intuitionistic logic, and exercises with solutions. In the final part, we explain how these ideas prepare the ground for a conceptual understanding of $\Sigma$-protocols, especially in connection with local consistency, simulability, and the passage from compatible local data to global structure.
要約:
Recent progress in image generation models (IGMs) enables high-fidelity content creation but also amplifies risks, including the reproduction of copyrighted content and the generation of offensive content. Image Generation Model Unlearning (IGMU) mitigates these risks by removing harmful concepts without full retraining. Despite growing attention, the robustness of IGMU under adversarial inputs, particularly image-side threats in black-box settings, remains underexplored. To bridge this gap, we present REFORGE, a black-box red-teaming framework that evaluates IGMU robustness via adversarial image prompts. REFORGE initializes stroke-based images and optimizes perturbations with a cross-attention-guided masking strategy that allocates noise to concept-relevant regions, balancing attack efficacy and visual fidelity. Extensive experiments across representative unlearning tasks and defenses demonstrate that REFORGE significantly improves attack success rate while achieving stronger semantic alignment and higher efficiency than the baselines considered. These results expose persistent vulnerabilities in current IGMU methods and highlight the need for robustness-aware unlearning against multi-modal adversarial attacks. Our code is at: https://github.com/Imfatnoily/REFORGE.
要約:
Modern operating systems increasingly randomize Media Access Control (MAC) addresses to protect user privacy, fundamentally disrupting Network Access Control (NAC) systems that have relied on MAC addresses as persistent device identifiers for over two decades. This disruption affects critical enterprise environments including federal government agencies operating under FISMA, healthcare organizations subject to HIPAA, financial institutions governed by PCI-DSS, and educational networks managing large-scale BYOD deployments. This paper presents a comprehensive framework for maintaining persistent device identity in NAC environments through a RADIUS protocol-based approach that assigns and distributes a Globally Unique Identifier (GUID) to endpoints via RADIUS Access-Accept messages. The proposed architecture addresses the complete device lifecycle including initial enrollment, re-authentication across randomized addresses, device management integration, certificate-based identity binding, and device attribute correlation. We describe the framework's design across six distinct use cases -- BYOD, managed devices, VPN-based posture assessment, non-VPN posture, guest access, and IoT device profiling -- and analyze its effectiveness in maintaining device visibility, accurate license counting, and regulatory compliance under continuous MAC address randomization. The approach is compatible with existing 802.1X and MAB infrastructure, requires no client-side operating system modifications, and aligns with the recently published RFC 9797 and IEEE 802.11bh-2024 standards. Our framework enables organizations to maintain regulatory compliance while preserving the privacy benefits that MAC address randomization was designed to provide.
要約:
Greybox fuzzing has achieved success in revealing bugs and vulnerabilities in programs. However, randomized mutation strategies limit the fuzzer's performance on structured data. Specialized fuzzers can handle complex structured data, but require additional effort to write grammars and suffer from low throughput.
In this paper, we explore the potential of large language models (LLMs) to enhance greybox fuzzing for structured data. We use the LLM's pre-trained knowledge of data conversion and formats to generate new valid inputs, and we further fine-tune it with paired mutation seeds so that it learns structured formats and mutation strategies effectively. Our LLM-based fuzzer, LLAMAFUZZ, integrates the LLM's ability to understand and mutate structured data into fuzzing. We conduct experiments on the standard bug-based benchmark Magma and a wide variety of real-world programs. LLAMAFUZZ outperforms our top competitor by 41 bugs on average. We also identified 47 unique bugs across all trials. Moreover, LLAMAFUZZ demonstrated consistent performance on both bugs triggered and bugs reached. Compared to AFL++, LLAMAFUZZ covered 27.19% more branches on average across the real-world program sets. We also present a case study to explain how LLMs enhance the fuzzing process in terms of code coverage.
要約:
Modern confidential computing executes sensitive computation in an abstraction called confidential VMs and protects them from the hypervisor, host OS, and other co-resident VMs. It has been shown that an attacker can inject malicious interrupts to break the confidentiality and integrity of confidential VMs. We present Devlore, a device interrupt isolation mechanism that protects confidential VMs from interrupt manipulation attacks. Our design employs a delegate-but-check strategy by offloading interrupt management to the hypervisor, but adds correctness checks in the trusted software. We prototype our design on Arm Confidential Computing Architecture (CCA). We evaluate it on Arm FVP to demonstrate four diverse devices attached to confidential VMs and report costs on a Rock5b board. Our case studies show the feasibility of real-world use cases and that Devlore incurs minimal overheads of 0.06% for typical integrated GPU applications.
要約:
Large Language Models (LLMs) have emerged as powerful tools for automating programming tasks, including security-related ones. However, they can also introduce vulnerabilities during code generation, fail to detect existing vulnerabilities, or report nonexistent ones. This systematic literature review investigates the security benefits and drawbacks of using LLMs for code-related tasks. In particular, it focuses on the types of vulnerabilities introduced by LLMs when generating code. Moreover, it analyzes the capabilities of LLMs to detect and fix vulnerabilities, and examines how prompting strategies impact these tasks. Finally, it examines how data poisoning attacks impact LLM performance in the aforementioned tasks.
要約:
As Advanced Persistent Threat (APT) complexity increases, provenance data is increasingly used for detection. Anomaly-based systems are gaining attention due to their attack-knowledge-agnostic nature and ability to counter zero-day vulnerabilities. However, traditional detection paradigms, which train on offline, limited-size data, often overlook concept drift, the unpredictable change of the streaming data distribution over time, which leads to high false positive rates. We propose incremental learning as a new paradigm to mitigate this issue. However, we identify four challenges in integrating incremental learning as a new paradigm. First, the long-running incremental system must combat catastrophic forgetting (C1) and avoid learning malicious behaviors (C2). Then, the system needs to achieve precise alerts (C3) and reconstruct attack scenarios (C4). We present METANOIA, the first lifelong detection system that mitigates the high false positives due to concept drift. It connects pseudo edges to combat catastrophic forgetting, transfers suspicious states to avoid learning malicious behaviors, filters nodes at the path-level to achieve precise alerts, and constructs mini-graphs to reconstruct attack scenarios. Using state-of-the-art benchmarks, we demonstrate that METANOIA improves precision at the window-level, graph-level, and node-level by 30%, 54%, and 29%, respectively, compared to previous approaches.
要約:
Secure multi-party computation is an area in cryptography which studies how multiple parties can compare their private information without revealing it. Besides digital protocols, many unconventional protocols for secure multi-party computation using physical objects have also been developed. The vast majority of them use playing cards as the main tools. In 2024, Kaneko et al. introduced the use of a balance scale and coins in zero-knowledge proof protocols for pencil puzzles. In this paper, we extend the use of these tools to secure multi-party computation. In particular, we develop four protocols that can securely compute any $n$-variable Boolean function using a balance scale and coins.
要約:
UNO is a popular multiplayer card game. In each turn, a player has to play a card from their hand that has the same number or color as the most recently played card. When there are few players, adding virtual players is easy in UNO video games. However, this is a challenging task for physical UNO without computers. In this paper, we propose an unconventional protocol that can simulate virtual players using nothing but physical UNO cards. In particular, our protocol can select a valid card to play uniformly at random from each virtual player's hand, or report that none exists, without revealing the rest of the hand. The protocol can also be applied to simulate virtual players in other turn-based card or tile games where each player has to select a valid card or tile to play in each turn.
要約:
Fully Homomorphic Encryption (FHE) promises the ability to compute over encrypted data without revealing sensitive contents. Yet, integrating it into real-world relational databases remains elusive due to prohibitive performance overhead and the structural mismatch between mutable database records and static ciphertexts. This paper presents Hermes, a system that enables homomorphically encrypted vectorized relational queries directly inside a standard SQL engine. To bridge the relational and algebraic abstractions, Hermes introduces a SIMD-aware data model that packs multiple records per ciphertext. By embedding precomputed aggregate statistics alongside data slots, the system supports efficient rotation-free aggregations. Furthermore, to overcome ciphertext immutability, we develop data-oblivious homomorphic algorithms based on slot masking and shifting, enabling secure in-place record modifications. Hermes is implemented as native loadable functions in MySQL, marking the first practical integration of FHE into an industrial-grade relational database engine. Extensive evaluations across diverse datasets demonstrate an over 3400x increase in encryption throughput, an over 4000x speedup for tuple insertions, and a 300x acceleration for deletions when compared to conventional scalar FHE implementations.
要約:
Large Language Models (LLMs) are increasingly adopted across domains such as education, healthcare, and finance. In healthcare, LLMs support tasks including disease diagnosis, abnormality classification, and clinical decision-making. Among these, multi-abnormality classification of radiology reports is critical for clinical workflow automation and biomedical research. Leveraging strong natural language processing capabilities, LLMs enable efficient processing of unstructured medical text and reduce the administrative burden of manual report analysis. To improve performance, LLMs are often fine-tuned on private, institution-specific datasets such as radiology reports. However, this raises significant privacy concerns: LLMs may memorize training data and become vulnerable to data extraction attacks, while sharing fine-tuned models risks exposing sensitive patient information. Despite growing interest in LLMs for medical text classification, privacy-preserving fine-tuning for multi-abnormality classification remains underexplored. To address this gap, we propose a differentially private (DP) fine-tuning framework for multi-abnormality classification from free-text radiology reports. Our approach integrates differential privacy with Low-Rank Adaptation (LoRA) to efficiently fine-tune LLMs on sensitive clinical data while mitigating leakage risks. We further employ labels generated by a larger LLM to train smaller models, enabling efficient inference under strong privacy guarantees. Experiments on MIMIC-CXR and CT-RATE demonstrate the effectiveness of our DP-LoRA framework across varying privacy regimes. On MIMIC-CXR, our method achieves weighted F1-scores up to 0.89 under moderate privacy budgets, approaching non-private LoRA (0.90) and full fine-tuning (0.96), confirming that strong privacy can be achieved with only modest performance trade-offs.
要約:
With the rise of fifth-generation (5G) networks in critical applications, it is urgent to move from detection of malicious activity to systems capable of providing a reliable verdict suitable for mitigation. In this regard, understanding and interpreting machine learning (ML) models' security alerts is crucial for enabling actionable incident response orchestration. Explainable Artificial Intelligence (XAI) techniques are expected to enhance trust by providing insights into why alerts are raised. Under the umbrella of XAI, interpretability of outcomes is crucially dependent on understanding the influence of specific inputs, referred to as feature attribution. A dominant approach to feature attribution statistically associates feature sets that can be correlated to a given alert. This paper investigates its merits against the backdrop of criticism from recent literature, in comparison with feature attribution based on logic. We extensively study two methods, SHAP and VoTE-XAI, as representatives of each feature attribution approach by analyzing their interpretations of alerts generated by an XGBoost model across three 5G-relevant datasets (5G-NIDD, MSA, and PFCP) covering multiple attack scenarios. We identify three metrics for assessing explanations: sparsity, how concise they are; stability, how consistent they are across samples from the same attack type; and efficiency, how fast an explanation is generated. Our results reveal that logic-based attributions are consistently more sparse and stable across alerts. More importantly, we found a significant divergence between features selected by SHAP and VoTE-XAI. However, none of the top-ranked features selected by SHAP were missed by VoTE-XAI. Finally, we analyze the efficiency of both methods, discussing their suitability for real-time security monitoring even in high-dimensional 5G environments (478 features).
要約:
While finetuning AI agents on interaction data -- such as web browsing or tool use -- improves their capabilities, it also introduces critical security vulnerabilities within the agentic AI supply chain. We show that adversaries can effectively poison the data collection pipeline at multiple stages to embed hard-to-detect backdoors that, when triggered, cause unsafe or malicious behavior. We formalize three realistic threat models across distinct layers of the supply chain: direct poisoning of finetuning data, pre-backdoored base models, and environment poisoning, a novel attack vector that exploits vulnerabilities specific to agentic training pipelines. Evaluated on two widely adopted agentic benchmarks, all three threat models prove effective: poisoning only a small number of demonstrations is sufficient to embed a backdoor that causes an agent to leak confidential user information with over 80\% success.
要約:
Fully homomorphic encryption (FHE) enables secure computation on encrypted data, mitigating privacy concerns in cloud and edge environments. However, due to its high compute and memory demands, extensive acceleration research has been pursued across diverse hardware platforms, especially GPUs. In this paper, we perform a microarchitectural analysis of CKKS, a popular FHE scheme, on modern GPUs. Focusing on the memory hierarchy, we demonstrate that dominant kernels remain bound by the on-chip L2 cache despite its high bandwidth, exposing a persistent inner memory wall beyond the conventional off-chip DRAM bottleneck. Further, we reveal that the overall CKKS throughput is constrained by low per-kernel hardware utilization, caused by insufficient intra-kernel parallelism. Motivated by these findings, we introduce Theodosian, a set of complementary, memory-aware optimizations that improve cache efficiency and reduce runtime overheads. Theodosian achieves 1.45--1.83x performance improvements over a highly optimized baseline, Cheddar, across representative CKKS workloads. On an RTX 5090, we reduce the bootstrapping latency for 32,768 complex numbers from 22.1ms to 15.2ms, and further to 12.8ms with additional algorithmic optimizations, establishing a new state-of-the-art GPU performance to the best of our knowledge.
要約:
Transport Layer Security (TLS) is fundamental to secure online communication, yet vulnerabilities in certificate validation that enable Man-in-the-Middle (MitM) attacks remain a pervasive threat in Android apps. Existing detection tools are hampered by low-coverage UI interaction, costly instrumentation, and a lack of scalable root-cause analysis. We present Okara, a framework that leverages foundation models to automate the detection and deep attribution of TLS MitM Vulnerabilities (TMVs). Okara's detection component, TMV-Hunter, employs foundation model-driven GUI agents to achieve high-coverage app interaction, enabling efficient vulnerability discovery at scale. Deploying TMV-Hunter on 37,349 apps from Google Play and a third-party store revealed 8,374 (22.42%) vulnerable apps. Our measurement shows these vulnerabilities are widespread across all popularity levels, affect critical functionalities like authentication and code delivery, and are highly persistent with a median vulnerable lifespan of over 1,300 days. Okara's attribution component, TMV-ORCA, combines dynamic instrumentation with a novel LLM-based classifier to locate and categorize vulnerable code according to a comprehensive new taxonomy. This analysis attributes 41% of vulnerabilities to third-party libraries and identifies recurring insecure patterns, such as empty trust managers and flawed hostname verification. We have initiated a large-scale responsible disclosure effort and will release our tools and datasets to support further research and mitigation.
要約:
Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting.
Large Language Models (LLMs) have the potential to create harmful content, such as generating sophisticated phishing emails and assisting in writing code for harmful computer viruses. Thus, it is crucial to ensure their safe and responsible response generation. To reduce the risk of generating harmful or irresponsible content, researchers have developed techniques such as reinforcement learning with human feedback to align LLMs' outputs with human values and preferences. However, it is still undetermined whether such measures are sufficient to prevent LLMs from generating harmful responses. In this study, we propose Amnesia, a lightweight activation-space adversarial attack that manipulates internal transformer states to bypass existing safety mechanisms in open-weight LLMs. Through experimental analysis on state-of-the-art, open-weight LLMs, we demonstrate that our attack effectively circumvents existing safeguards, enabling the generation of harmful content without the need for any fine-tuning or additional training. Our experiments on benchmark datasets show that the proposed attack can induce various antisocial behaviors in LLMs. These findings highlight the urgent need for more robust security measures in open-weight LLMs and underscore the importance of continued research to prevent their potential misuse.
要約:
CTI-REALM (Cyber Threat Real World Evaluation and LLM Benchmarking) is a benchmark designed to evaluate AI agents' ability to interpret cyber threat intelligence (CTI) and develop detection rules. The benchmark provides a realistic environment that replicates the security analyst workflow. This enables agents to examine CTI reports, execute queries, understand schema structures, and construct detection rules. Evaluation involves emulated attacks of varying complexity across Linux systems, cloud platforms, and Azure Kubernetes Service (AKS), with ground truth data for accurate assessment. Agent performance is measured through both final detection results and trajectory-based rewards that capture decision-making effectiveness. This work demonstrates the potential of AI agents to support labor-intensive aspects of detection engineering. Our comprehensive evaluation of 16 frontier models shows that Claude Opus 4.6 (High) achieves the highest overall reward (0.637), followed by Claude Opus 4.5 (0.624) and the GPT-5 family. An ablation study confirms that CTI-specific tools significantly improve agent performance, and a variance analysis across repeated runs demonstrates result stability. Finally, a memory augmentation study shows that seeded context can close 33\% of the performance gap between smaller and larger models.
要約:
In [4], Camps-Moreno et al. treated (relative) generalized Hamming weights of codes from extended norm-trace curves and gave examples of resulting good asymmetric quantum error-correcting codes employing information on the relative distances. In the present paper we study ramp secret sharing schemes, which are objects that require an analysis of higher relative weights, and we show that not only do schemes defined from one-point algebraic geometric codes from extended norm-trace curves have good parameters, they also possess a second layer of security along the lines of [11]. It is left undecided in [4, page 2889] whether the ``footprint-like approach'' employed by Camps-Moreno et al. therein is strictly better for codes related to extended norm-trace codes than the general approach for treating one-point algebraic geometric codes and their likes as presented in [12]. We demonstrate that the method used in [4] to estimate (relative) generalized Hamming weights of codes from extended norm-trace curves can be viewed as a clever application of the enhanced Goppa bound in [12] rather than a competing approach.
要約:
In this paper, we investigate the problem of distributed learning (DL) in the presence of Byzantine attacks. For this problem, various robust bounded aggregation (RBA) rules have been proposed at the central server to mitigate the impact of Byzantine attacks. However, current DL methods apply RBA rules for the local gradients from the honest devices and the disruptive information from Byzantine devices, and the learning performance degrades significantly when the local gradients of different devices vary considerably from each other. To overcome this limitation, we propose a new DL method to cope with Byzantine attacks based on coded robust aggregation (CRA-DL). Before training begins, the training data are allocated to the devices redundantly. During training, in each iteration, the honest devices transmit coded gradients to the server computed from the allocated training data, and the server then aggregates the information received from both honest and Byzantine devices using RBA rules. In this way, the global gradient can be approximately recovered at the server to update the global model. Compared with current DL methods applying RBA rules, the improvement of CRA-DL is attributed to the fact that the coded gradients sent by the honest devices are closer to each other. This closeness enhances the robustness of the aggregation against Byzantine attacks, since Byzantine messages tend to be significantly different from those of honest devices in this case. We theoretically analyze the convergence performance of CRA-DL. Finally, we present numerical results to verify the superiority of the proposed method over existing baselines, showing its enhanced learning performance under Byzantine attacks.
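The claim that redundancy brings honest messages closer together can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration rather than the paper's construction: the cyclic allocation, plain averaging as the "code", coordinate-wise median as the RBA rule, and the synthetic Gaussian gradients.

```python
import random
import statistics


def coded_gradient(device, partition_grads, redundancy):
    """Average the gradients of the partitions cyclically assigned to `device`."""
    n = len(partition_grads)
    assigned = [(device + j) % n for j in range(redundancy)]
    return [
        sum(partition_grads[p][k] for p in assigned) / redundancy
        for k in range(len(partition_grads[0]))
    ]


def coordinate_median(messages):
    """Coordinate-wise median, used here as a stand-in robust aggregation rule."""
    return [statistics.median(column) for column in zip(*messages)]


def spread(messages):
    """Sum of per-coordinate population variances across the messages."""
    return sum(statistics.pvariance(column) for column in zip(*messages))


def corrupt(messages, byzantine_ids, value=100.0):
    """Replace the messages of Byzantine devices with a constant outlier."""
    return [
        [value] * len(m) if i in byzantine_ids else list(m)
        for i, m in enumerate(messages)
    ]


rng = random.Random(0)
num_devices, dim, redundancy = 10, 3, 5
# Hypothetical heterogeneous gradients, one data partition per device.
partition_grads = [[rng.gauss(1.0, 2.0) for _ in range(dim)] for _ in range(num_devices)]
true_mean = [sum(column) / num_devices for column in zip(*partition_grads)]

uncoded = [list(g) for g in partition_grads]
coded = [coded_gradient(i, partition_grads, redundancy) for i in range(num_devices)]

byzantine_ids = {0, 1}
agg_uncoded = coordinate_median(corrupt(uncoded, byzantine_ids))
agg_coded = coordinate_median(corrupt(coded, byzantine_ids))

print("spread of uncoded honest messages:", spread(uncoded))
print("spread of coded honest messages:  ", spread(coded))
print("true mean gradient:       ", true_mean)
print("median aggregate, uncoded:", agg_uncoded)
print("median aggregate, coded:  ", agg_coded)
```

With cyclic averaging, the coded messages always have a smaller spread than the raw per-partition gradients unless those are all identical, and setting `redundancy` to `num_devices` makes every honest message equal to the true mean gradient. The price is more computation per device.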
要約:
Safety evaluation of large language models (LLMs) increasingly relies on LLM-as-a-Judge frameworks, but the high cost of frontier models limits scalability. We propose a cost-efficient multi-agent judging framework that employs Small Language Models (SLMs) through structured debates among critic, defender, and judge agents. To rigorously assess safety judgments, we construct HAJailBench, a large-scale human-annotated jailbreak benchmark comprising 12,000 adversarial interactions across diverse attack methods and target models. The dataset provides fine-grained, expert-labeled ground truth for evaluating both safety robustness and judge reliability. Our SLM-based framework achieves agreement comparable to GPT-4o judges on HAJailBench while substantially reducing inference cost. Ablation results show that three rounds of debate yield the optimal balance between accuracy and efficiency. These findings demonstrate that structured, value-aligned debate enables SLMs to capture semantic nuances of jailbreak attacks and that HAJailBench offers a reliable foundation for scalable LLM safety evaluation.
要約:
GitHub Actions is a widely used platform to automate the build and deployment of software projects through configurable workflows. As the platform's popularity grows, it also becomes a target of choice for software supply chain attacks. These attacks exploit excessive permissions, ambiguous versions or the absence of artifact integrity checks to compromise the workflows. In response to these attacks, several security scanners have emerged to help developers harden their workflows. In this paper, we perform the first systematic comparison of 9 GitHub Actions Workflows security scanners. We compare them regarding scope (which security weaknesses they target), detection capabilities (how many weaknesses they detect), and performance (how long they take to scan a workflow). In order to compare the scanners on a common ground, we first establish a classification of 10 common security weaknesses that can be found in GitHub Actions Workflows. Then, we run the scanners against a curated set of 2722 workflows. Our study reveals that the landscape of GitHub Actions Workflows security scanners is very diverse, with both general purpose and focused scanners. More importantly, we provide evidence that these scanners implement fundamentally different analysis strategies, leading to major gaps regarding the nature and the number of reported security weaknesses. Based on this empirical evidence, we make actionable recommendations for developers to harden their GitHub Actions Workflows.
要約:
We introduce MetaDOAR, a lightweight meta-controller that augments the Double Oracle / PSRO paradigm with a learned, partition-aware filtering layer and Q-value caching to enable scalable multi-agent reinforcement learning on very large cyber-network environments. MetaDOAR learns a compact state projection from per-node structural embeddings to rapidly score and select a small subset of devices (a top-k partition) on which a conventional low-level actor performs focused beam search utilizing a critic agent. Selected candidate actions are evaluated with batched critic forwards and stored in an LRU cache keyed by a quantized state projection and local action identifiers, dramatically reducing redundant critic computation while preserving decision quality via conservative k-hop cache invalidation. Empirically, MetaDOAR attains higher player payoffs than SOTA baselines on large network topologies, without significant scaling issues in terms of memory usage or training time. This contribution provides a practical, theoretically motivated path to efficient hierarchical policy learning for large-scale networked decision problems.
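The caching idea can be sketched as a small LRU cache whose keys combine a quantized state projection with an action identifier. This is a minimal illustration, not MetaDOAR's implementation: the class name `QValueCache`, the bin width `step`, and the action identifier strings are hypothetical, and the k-hop invalidation described above is not modelled.

```python
from collections import OrderedDict


class QValueCache:
    """LRU cache for critic outputs, keyed by (quantized projection, action id)."""

    def __init__(self, capacity, step=0.1):
        self.capacity = capacity
        self.step = step  # quantization bin width (hypothetical parameter)
        self._store = OrderedDict()

    def _key(self, projection, action_id):
        # Nearby projections fall into the same bin and share a cache entry.
        quantized = tuple(round(x / self.step) for x in projection)
        return (quantized, action_id)

    def get(self, projection, action_id):
        key = self._key(projection, action_id)
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, projection, action_id, q_value):
        key = self._key(projection, action_id)
        self._store[key] = q_value
        self._store.move_to_end(key)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used entry

    def __len__(self):
        return len(self._store)


cache = QValueCache(capacity=2)
cache.put([0.12, 0.51], "isolate:7", 1.5)
# A slightly different projection lands in the same bins, so this is a cache hit.
hit = cache.get([0.13, 0.49], "isolate:7")
print("cached Q-value:", hit)
```

A coarser `step` raises the hit rate at the cost of reusing Q-values for states that differ more, which is the trade-off that conservative invalidation is meant to keep in check.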