Abstract:
Artificial intelligence (AI) agents are increasingly capable of initiating financial transactions on behalf of users or other agents. This evolution introduces a fundamental challenge: verifying both the authenticity of an autonomous agent and the true intent behind its transactions in a decentralized, trustless environment. Traditional payment systems assume human authorization, but autonomous, agent-led payments remove that safeguard. This paper presents a blockchain-based framework that cryptographically authenticates and verifies the intent of every AI-initiated transaction. The proposed system leverages decentralized identity (DID) standards and verifiable credentials to establish agent identities, on-chain intent proofs to record user authorization, and zero-knowledge proofs (ZKPs) to preserve privacy while ensuring policy compliance. Additionally, secure execution environments (TEE-based attestations) guarantee the integrity of agent reasoning and execution. The hybrid on-chain/off-chain architecture provides an immutable audit trail linking user intent to payment outcome. Through qualitative analysis, the framework demonstrates strong resistance to impersonation, unauthorized transactions, and misalignment of intent. This work lays the foundation for secure, auditable, and intent-aware autonomous economic agents, enabling a future of verifiable trust and accountability in AI-driven financial ecosystems.
Abstract:
This paper presents a comprehensive comparative analysis of two dominant blockchain consensus mechanisms, Proof of Work (PoW) and Proof of Stake (PoS), evaluated across seven critical metrics: energy use, security, transaction speed, scalability, centralization risk, environmental impact, and transaction fees. Utilizing recent academic research and real-world blockchain data, the study highlights that PoW offers robust, time-tested security but suffers from high energy consumption, slower throughput, and centralization through mining pools. In contrast, PoS demonstrates improved scalability and efficiency, significantly reduced environmental impact, and more stable transaction fees; however, it raises concerns over validator centralization and long-term security maturity. The findings underscore the trade-offs inherent in each mechanism and suggest that hybrid designs may combine PoW's security with PoS's efficiency and sustainability. The study aims to inform future blockchain infrastructure development by striking a balance between decentralization, performance, and ecological responsibility.
Abstract:
This work presents AnonLFI 2.0, a modular pseudonymization framework for CSIRTs that uses HMAC SHA256 to generate strong and reversible pseudonyms, preserves XML and JSON structures, and integrates OCR and technical recognizers for PII and security artifacts. In two case studies involving OCR applied to PDF documents and an OpenVAS XML report, the system achieved perfect precision and F1 scores of 76.5 and 92.13, demonstrating its effectiveness for securely preparing complex cybersecurity datasets.
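As a rough illustration of keyed pseudonymization with HMAC-SHA256, the sketch below derives a deterministic pseudonym from a secret key and a value. HMAC itself is one-way, so reversal here relies on a lookup table kept by the key holder; that table, the class name `Pseudonymizer`, the truncation length, and the sample key are all assumptions made for illustration and are not taken from AnonLFI 2.0.

```python
import hashlib
import hmac


class Pseudonymizer:
    """Hypothetical helper: keyed pseudonyms via HMAC-SHA256 plus a reversal table."""

    def __init__(self, key: bytes, length: int = 16):
        self._key = key
        self._length = length   # hex characters kept from the digest (assumed value)
        self._reverse = {}      # pseudonym -> original value, held by the key owner

    def pseudonymize(self, value: str) -> str:
        # The same key and value always yield the same pseudonym.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        pseudonym = digest[: self._length]
        self._reverse[pseudonym] = value
        return pseudonym

    def reverse(self, pseudonym: str) -> str:
        # Reversal is a table lookup; the HMAC output cannot be inverted directly.
        return self._reverse[pseudonym]


p = Pseudonymizer(key=b"example-secret-key")
token = p.pseudonymize("192.168.0.10")
original = p.reverse(token)
```

Because the pseudonym depends only on the key and the value, repeated occurrences of the same artifact map to the same token, which keeps cross-references inside a report consistent.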
Abstract:
This paper proposes an automated LLM-based method to extract and structure vulnerabilities from OpenVAS and Tenable WAS scanner reports, converting unstructured data into a standardized format for risk management. In an evaluation using a report with 34 vulnerabilities, GPT-4.1 and DeepSeek achieved the highest similarity to the baseline (ROUGE-L greater than 0.7). The method demonstrates feasibility in transforming complex reports into usable datasets, enabling effective prioritization and future anonymization of sensitive data.
Abstract:
Retrieval-augmented generation (RAG) systems have become widely used for enhancing large language model capabilities, but they introduce significant security vulnerabilities through prompt injection attacks. We present a comprehensive benchmark for evaluating prompt injection risks in RAG-enabled AI agents and propose a multi-layered defense framework. Our benchmark includes 847 adversarial test cases across five attack categories: direct injection, context manipulation, instruction override, data exfiltration, and cross-context contamination. We evaluate three defense mechanisms: content filtering with embedding-based anomaly detection, hierarchical system prompt guardrails, and multi-stage response verification, across seven state-of-the-art language models. Our combined framework reduces successful attack rates from 73.2% to 8.7% while maintaining 94.3% of baseline task performance. We release our benchmark dataset and defense implementation to support future research in AI agent security.
Abstract:
The rapid adoption of multi-cloud environments has amplified risks associated with privileged access mismanagement. Traditional Privileged Access Management (PAM) solutions based on Attribute-Based Access Control (ABAC) exhibit cubic O(n^3) complexity, rendering real-time privilege analysis intractable at enterprise scale. We present a novel PAM framework integrating NIST's Next Generation Access Control (NGAC) with hypergraph semantics to address this scalability crisis. Our approach leverages hypergraphs with labeled hyperedges to model complex, multi-dimensional privilege relationships, achieving sub-linear O(sqrt n) traversal complexity and O(n log n) detection time, rigorously proven through formal complexity analysis. We introduce a 3-Dimensional Privilege Analysis framework encompassing Attack Surface, Attack Window, and Attack Identity to systematically identify privilege vulnerabilities. Experimental validation on AWS-based systems with 200-4000 users demonstrates 10x improvement over ABAC and 4x improvement over standard NGAC-DAG, enabling sub-second privilege detection at scale. Real-world use cases validate detection of privilege escalation chains, over-privileged users, and lateral movement pathways in multi-cloud infrastructures.
Abstract:
Directed Acyclic Graph (DAG)-based Byzantine Fault-Tolerant (BFT) protocols have emerged as promising solutions for high-throughput blockchains. By decoupling data dissemination from transaction ordering and constructing a well-connected DAG in the mempool, these protocols enable zero-message ordering and implicit view changes. However, we identify a fundamental liveness vulnerability: an adversary can trigger mempool explosions to prevent transaction commitment, ultimately compromising the protocol's liveness.
In response, this work presents Lifefin, a generic and self-stabilizing protocol designed to integrate seamlessly with existing DAG-based BFT protocols and circumvent such vulnerabilities. Lifefin leverages the Agreement on Common Subset (ACS) mechanism, allowing nodes to escape mempool explosions by committing transactions with bounded resource usage even in adverse conditions. As a result, Lifefin imposes (almost) zero overhead in typical cases while effectively eliminating liveness vulnerabilities.
To demonstrate the effectiveness of Lifefin, we integrate it into two state-of-the-art DAG-based BFT protocols, Sailfish and Mysticeti, resulting in two enhanced variants: Sailfish-Lifefin and Mysticeti-Lifefin. We implement these variants and compare them with the original Sailfish and Mysticeti systems. Our evaluation demonstrates that Lifefin achieves comparable transaction throughput while introducing only minimal additional latency to resist similar attacks.
Abstract:
Digital agriculture is transforming the way we grow food by utilizing technology to make farming more efficient, sustainable, and productive. This modern approach to agriculture generates a wealth of valuable data that could help address global food challenges, but farmers are hesitant to share it due to privacy concerns. This limits the extent to which researchers can learn from this data to inform improvements in farming. This paper presents the Digital Agriculture Sandbox, a secure online platform that solves this problem. The platform enables farmers (with limited technical resources) and researchers to collaborate on analyzing farm data without exposing private information. We employ specialized techniques such as federated learning, differential privacy, and data analysis methods to safeguard the data while maintaining its utility for research purposes. The system enables farmers to identify similar farmers in a simplified manner without needing extensive technical knowledge or access to computational resources. Similarly, it enables researchers to learn from the data and build helpful tools without the sensitive information ever leaving the farmer's system. This creates a safe space where farmers feel comfortable sharing data, allowing researchers to make important discoveries. Our platform helps bridge the gap between maintaining farm data privacy and utilizing that data to address critical food and farming challenges worldwide.
Abstract:
Generative AI is reshaping offensive cybersecurity by enabling autonomous red team agents that can plan, execute, and adapt during penetration tests. However, existing approaches face trade-offs between generality and specialization, and practical deployments reveal challenges such as hallucinations, context limitations, and ethical concerns. In this work, we introduce a novel command & control (C2) architecture leveraging the Model Context Protocol (MCP) to coordinate distributed, adaptive reconnaissance agents covertly across networks. Notably, we find that our architecture not only improves goal-directed behavior of the system as a whole, but also eliminates key host and network artifacts that can be used to detect and prevent command & control behavior altogether. We begin with a comprehensive review of state-of-the-art generative red teaming methods, from fine-tuned specialist models to modular or agentic frameworks, analyzing their automation capabilities against task-specific accuracy. We then detail how our MCP-based C2 can overcome current limitations by enabling asynchronous, parallel operations and real-time intelligence sharing without periodic beaconing. We furthermore explore advanced adversarial capabilities of this architecture, its detection-evasion techniques, and address dual-use ethical implications, proposing defensive measures and controlled evaluation in lab settings. Experimental comparisons with traditional C2 show drastic reductions in manual effort and detection footprint. We conclude with future directions for integrating autonomous exploitation, defensive LLM agents, predictive evasive maneuvers, and multi-agent swarms. The proposed MCP-enabled C2 framework demonstrates a significant step toward realistic, AI-driven red team operations that can simulate advanced persistent threats while informing the development of next-generation defensive systems.
Abstract:
The rapid growth of quantum computing poses a threat to the cryptographic foundations of digital systems, requiring the development of secure and scalable electronic voting (e-voting) frameworks. We introduce a post-quantum-secure e-voting architecture that integrates Falcon lattice-based digital signatures, biometric authentication via MobileNetV3 and AdaFace, and a permissioned blockchain for tamper-proof vote storage. Voter registration involves capturing facial embeddings, which are digitally signed using Falcon and stored on-chain to ensure integrity and non-repudiation. During voting, real-time biometric verification is performed using anti-spoofing techniques and cosine-similarity matching. The system demonstrates low latency and robust spoof detection, monitored through Prometheus and Grafana for real-time auditing. The average classification error rates (ACER) are below 3.5% on the CelebA Spoof dataset and under 8.2% on the Wild Face Anti-Spoofing (WFAS) dataset. Blockchain anchoring incurs minimal gas overhead, approximately 3.3% for registration and 0.15% for voting, supporting system efficiency, auditability, and transparency. The experimental results confirm the system's scalability, efficiency, and resilience under concurrent loads. This approach offers a unified solution to address key challenges in voter authentication, data integrity, and quantum-resilient security for digital systems.
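The cosine-similarity matching step mentioned above can be sketched as follows. This is a minimal illustration only: the function names, the toy three-dimensional embeddings, and the 0.6 threshold are assumptions, and real face embeddings (for example from AdaFace) have far more dimensions and a threshold tuned on validation data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)


def is_match(enrolled, probe, threshold=0.6):
    # threshold is a hypothetical value, not one reported in the abstract
    return cosine_similarity(enrolled, probe) >= threshold


enrolled = [1.0, 2.0, 3.0]       # toy embedding stored at registration
same_person = [1.1, 2.0, 2.9]    # slightly different capture of the same face
other_person = [3.0, -1.0, 0.2]  # unrelated embedding

accepted = is_match(enrolled, same_person)
rejected = not is_match(enrolled, other_person)
```

Cosine similarity ignores vector magnitude, so two captures that differ mainly in overall scale still compare as close.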
Abstract:
Traditional threat modeling remains reactive, focused on known TTPs and past incident data, while threat prediction and forecasting frameworks are often disconnected from operational or architectural artifacts. This creates a fundamental weakness: the most serious cyber threats often do not arise from what is known, but from what is assumed, overlooked, or not yet conceived. They frequently originate from the future, in areas such as artificial intelligence, information warfare, and supply chain attacks, where adversaries continuously develop new exploits that can bypass defenses built on current knowledge. To address this mental gap, this paper introduces the theory and methodology of Future-Back Threat Modeling (FBTM). This predictive approach begins with envisioned future threat states and works backward to identify assumptions, gaps, blind spots, and vulnerabilities in the current defense architecture, providing a clearer and more accurate view of impending threats so that we can anticipate their emergence and shape the future we want through actions taken now. The proposed methodology further aims to reveal known unknowns and unknown unknowns, including tactics, techniques, and procedures that are emerging, anticipated, and plausible. This enhances the predictability of adversary behavior, particularly under future uncertainty, helping security leaders make informed decisions today that shape more resilient security postures for the future.
Abstract:
The growing misuse of Vision-Language Models (VLMs) has led providers to deploy multiple safeguards, including alignment tuning, system prompts, and content moderation. However, the real-world robustness of these defenses against adversarial attacks remains underexplored. We introduce Multi-Faceted Attack (MFA), a framework that systematically exposes general safety vulnerabilities in leading defense-equipped VLMs such as GPT-4o, Gemini-Pro, and Llama-4. The core component of MFA is the Attention-Transfer Attack (ATA), which hides harmful instructions inside a meta task with competing objectives. We provide a theoretical perspective based on reward hacking to explain why this attack succeeds. To improve cross-model transferability, we further introduce a lightweight transfer-enhancement algorithm combined with a simple repetition strategy that jointly bypasses both input-level and output-level filters without model-specific fine-tuning. Empirically, we show that adversarial images optimized for one vision encoder transfer broadly to unseen VLMs, indicating that shared visual representations create a cross-model safety vulnerability. Overall, MFA achieves a 58.5% success rate and consistently outperforms existing methods. On state-of-the-art commercial models, MFA reaches a 52.8% success rate, surpassing the second-best attack by 34%. These results challenge the perceived robustness of current defense mechanisms and highlight persistent safety weaknesses in modern VLMs. Code: https://github.com/cure-lab/MultiFacetedAttack
Abstract:
As Law Enforcement Agencies advance in cryptocurrency forensics, criminal actors aiming to conceal illicit fund movements increasingly turn to "mixing" services or privacy-based cryptocurrencies. Monero stands out as a leading choice due to its strong privacy-preserving and untraceability properties, which make conventional blockchain analysis ineffective. Understanding the behavior and operational patterns of criminal actors within Monero is therefore challenging, yet essential to support future investigative strategies and disrupt illicit activities. In this work, we propose a case study in which we leverage a novel graph-based methodology to extract structural and temporal patterns from Monero transactions linked to already discovered criminal activities. By building Address-Ring-Transaction graphs from flagged transactions, we extract structural and temporal features and use them to train Machine Learning models capable of detecting similar behavioral patterns that could highlight criminal modus operandi. This represents a first partial step toward developing analytical tools that support investigative efforts in privacy-preserving blockchain ecosystems.
Abstract:
System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive information, making them a prime target for extraction attacks. Adversarial queries can successfully elicit these hidden instructions, posing significant security and privacy risks. Existing defense mechanisms frequently rely on heuristics, incur substantial computational overhead, or are inapplicable to models accessed via black-box APIs. This paper introduces a novel framework for hardening system prompts through shield appending, a lightweight approach that adds a protective textual layer to the original prompt. Our core contribution is the formalization of prompt hardening as a utility-constrained optimization problem. We leverage an LLM-as-optimizer to search the space of possible SHIELDs, seeking to minimize a leakage metric derived from a suite of adversarial attacks, while simultaneously preserving task utility above a specified threshold, measured by semantic fidelity to baseline outputs. This black-box, optimization-driven methodology is lightweight and practical, requiring only API access to the target and optimizer LLMs. We demonstrate empirically that our optimized SHIELDs significantly reduce prompt leakage against a comprehensive set of extraction attacks, outperforming established baseline defenses without compromising the model's intended functionality. Our work presents a paradigm for developing robust, utility-aware defenses in the escalating landscape of LLM security. The code is made public on the following link: https://github.com/psm-defense/psm
Abstract:
Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in cross-modal understanding, but remain vulnerable to adversarial attacks through visual inputs despite robust textual safety mechanisms. These vulnerabilities arise from two core weaknesses: the continuous nature of visual representations, which allows for gradient-based attacks, and the inadequate transfer of text-based safety mechanisms to visual content. We introduce Q-MLLM, a novel architecture that integrates two-level vector quantization to create a discrete bottleneck against adversarial attacks while preserving multimodal reasoning capabilities. By discretizing visual representations at both pixel-patch and semantic levels, Q-MLLM blocks attack pathways and bridges the cross-modal safety alignment gap. Our two-stage training methodology ensures robust learning while maintaining model utility. Experiments demonstrate that Q-MLLM achieves a significantly better defense success rate against both jailbreak attacks and toxic image attacks than existing approaches. Notably, Q-MLLM achieves a perfect defense success rate (100%) against jailbreak attacks except in one arguable case, while maintaining competitive performance on multiple utility benchmarks with minimal inference overhead. This work establishes vector quantization as an effective defense mechanism for secure multimodal AI systems without requiring expensive safety-specific fine-tuning or detection overhead. Code is available at https://github.com/Amadeuszhao/QMLLM.
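To make the idea of a discrete bottleneck concrete, the following sketch shows plain single-level vector quantization: each continuous vector is replaced by its nearest codebook entry. It is not the two-level scheme of Q-MLLM; the function name `quantize`, the tiny codebook, and the sample vectors are hypothetical and chosen only to show that small input perturbations collapse to the same code.

```python
def quantize(vector, codebook):
    """Return (index, codeword) of the nearest codebook entry by squared L2 distance."""

    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    index = min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


# Hypothetical 2-D codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]

clean = [0.9, 1.2]
perturbed = [1.05, 0.95]  # a small perturbation of a nearby input

clean_index, clean_code = quantize(clean, codebook)
perturbed_index, perturbed_code = quantize(perturbed, codebook)
```

Both inputs map to the same index, so a downstream model that only sees the code cannot distinguish them. The lookup is also piecewise constant, which is why gradients do not pass through it without additional tricks.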
Abstract:
As LLMs become more common, non-expert users can pose risks, prompting extensive research into jailbreak attacks. However, most existing black-box jailbreak attacks rely on hand-crafted heuristics or narrow search spaces, which limit scalability. In contrast to prior attacks, we propose Game-Theory Attack (GTA), a scalable black-box jailbreak framework. Concretely, we formalize the attacker's interaction against safety-aligned LLMs as a finite-horizon, early-stoppable sequential stochastic game, and reparameterize the LLM's randomized outputs via quantal response. Building on this, we introduce a behavioral conjecture, the "template-over-safety flip": by reshaping the LLM's effective objective through game-theoretic scenarios, the original safety preference may shift to maximizing scenario payoffs within the template, which weakens safety constraints in specific contexts. We validate this mechanism with classical games such as the disclosure variant of the Prisoner's Dilemma, and we further introduce an Attacker Agent that adaptively escalates pressure to increase the ASR. Experiments across multiple protocols and datasets show that GTA achieves over 95% ASR on LLMs such as DeepSeek-R1, while maintaining efficiency. Ablations over components, decoding, multilingual settings, and the Agent's core model confirm effectiveness and generalization. Moreover, scenario scaling studies further establish scalability. GTA also attains high ASR on other game-theoretic scenarios, and one-shot LLM-generated variants that keep the model mechanism fixed while varying background achieve comparable ASR. Paired with a Harmful-Words Detection Agent that performs word-level insertions, GTA maintains high ASR while lowering detection under prompt-guard models. Beyond benchmarks, GTA jailbreaks real-world LLM applications, and we report longitudinal safety monitoring of popular HuggingFace LLMs.
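For readers unfamiliar with quantal response, the sketch below implements the standard logit form, in which a player chooses each action with probability proportional to the exponential of its scaled payoff. The abstract does not state which parameterization GTA uses, so the logit form, the function name, the `rationality` parameter, and the two-action payoffs are assumptions for illustration.

```python
import math


def quantal_response(payoffs, rationality):
    """Logit quantal response: P(a) is proportional to exp(rationality * payoff(a))."""
    # Subtract the maximum payoff before exponentiating for numerical stability.
    top = max(payoffs)
    weights = [math.exp(rationality * (u - top)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical payoffs for two actions, for example "refuse" and "comply".
payoffs = [1.0, 2.0]

uniform = quantal_response(payoffs, rationality=0.0)   # payoffs ignored
soft = quantal_response(payoffs, rationality=1.0)      # noisy best response
sharp = quantal_response(payoffs, rationality=20.0)    # close to a strict best response
```

With `rationality` at zero the choice is uniformly random, and as it grows the distribution concentrates on the highest-payoff action, so raising the payoff of an action inside a scenario raises the probability that it is selected.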
Abstract:
Integrated sensing and communication (ISAC) will be central to 6G-enabled transportation, providing both seamless connectivity and high-precision sensing. However, this tight integration exposes attack points not encountered in pure sensing and communication systems. In this article, we identify unique ISAC-induced security challenges and opportunities in three interrelated domains: cyber-physical (where manipulation of sensors and actuators can mislead perception and control), physical-layer (where over-the-air signals are vulnerable to spoofing and jamming) and protocol (where complex cryptographic protocols cannot detect lower-layer attacks). Building on these insights, we put forward a multi-domain security vision for 6G transportation and propose an integrated security framework that unifies protection across domains.
Abstract:
The adoption of Vision-Language Models (VLMs) in embodied AI agents, while effective, brings safety concerns such as jailbreaking. Prior work has explored the possibility of directly jailbreaking embodied agents through elaborate multi-modal prompts. However, no prior work has studied or even reported indirect jailbreaks in embodied AI, where a black-box attacker induces a jailbreak without issuing direct prompts to the embodied agent. In this paper, we propose, for the first time, indirect environmental jailbreak (IEJ), a novel attack that jailbreaks embodied AI via indirect prompts injected into the environment, such as malicious instructions written on a wall. Our key insight is that embodied AI does not "think twice" about the instructions provided by the environment -- a blind trust that attackers can exploit to jailbreak the embodied agent. We further design and implement open-source prototypes of two fully-automated frameworks: SHAWSHANK, the first automatic attack generation framework for the proposed attack IEJ; and SHAWSHANK-FORGE, the first automatic benchmark generation framework for IEJ. Then, using SHAWSHANK-FORGE, we automatically construct SHAWSHANK-BENCH, the first benchmark for indirectly jailbreaking embodied agents. Together, our two frameworks and one benchmark answer the questions of what content can be used for malicious IEJ instructions, where they should be placed, and how IEJ can be systematically evaluated. Evaluation results show that SHAWSHANK outperforms eleven existing methods across 3,957 task-scene combinations and compromises all six tested VLMs. Furthermore, current defenses only partially mitigate our attack, and we have responsibly disclosed our findings to all affected VLM vendors.
Abstract:
Blockchain interoperability is increasingly recognized as the centerpiece for robust interactions among decentralized services. Blockchain ledgers are generally tamper-proof and thus enforce non-repudiation for transactions recorded within the same network. However, such a guarantee does not hold for cross-blockchain transactions. When disruptions occur due to malicious activities or system failures within one blockchain network, foreign networks can take advantage by denying legitimate claims or mounting fraudulent liabilities against the defenseless network. In response, this paper introduces InterSnap, a novel blockchain snapshot archival methodology that enables auditability of cross-blockchain transactions and enforces non-repudiation. InterSnap introduces cross-chain transaction receipts that ensure their irrefutability. Snapshots of ledger data along with these receipts are utilized as non-repudiable proof of bilateral agreements among different networks. InterSnap enhances system resilience through a distributed snapshot generation process, a need-based snapshot scheduling process, and archival storage and sharing via decentralized platforms. Through a prototype implementation based on Hyperledger Fabric, we conducted experiments using on-premise machines, AWS public cloud instances, and a private cloud infrastructure. We establish that InterSnap can recover from malicious attacks while preserving cross-chain transaction receipts. Additionally, our proposed solution demonstrates adaptability to increasing loads while securely transferring snapshot archives with minimal overhead.
Abstract:
In the era of digital communication, steganography allows covert embedding of data within media files. Adaptive Pixel Value Differencing (APVD) is a steganographic method valued for its high embedding capacity and invisibility, posing challenges for traditional steganalysis. This paper proposes a deep learning-based approach for detecting APVD steganography and performing reverse steganalysis, which reconstructs the hidden payload. We present a Convolutional Neural Network (CNN) with an attention mechanism and two output heads for simultaneous stego detection and payload recovery. Trained and validated on 10,000 images from the BOSSbase and UCID datasets, our model achieves a detection accuracy of 96.2 percent. It also reconstructs embedded payloads with up to 93.6 percent recovery at lower embedding densities. Results indicate a strong inverse relationship between payload size and recovery accuracy. This study reveals a vulnerability in adaptive steganography and provides a tool for digital forensic analysis, while encouraging reassessment of data security in the age of AI-driven techniques.
Abstract:
Risks associated with the use of AI, ranging from algorithmic bias to model hallucinations, have received much attention and extensive research across the AI community, from researchers to end-users. However, a gap exists in the systematic assessment of supply chain risks associated with the complex web of data sources, pre-trained models, agents, services, and other systems that contribute to the output of modern AI systems. This gap is particularly problematic when AI systems are used in critical applications, such as the food supply, healthcare, utilities, law, insurance, and transport.
We survey the current state of AI risk assessment and management, with a focus on the supply chain of AI and risks relating to the behavior and outputs of the AI system. We then present a proposed taxonomy specifically for categorizing AI supply chain entities. This taxonomy helps stakeholders, especially those without extensive AI expertise, to "consider the right questions" and systematically inventory dependencies across their organization's AI systems. Our contribution bridges a gap between the current state of AI governance and the urgent need for actionable risk assessment and management of AI use in critical applications.
Abstract:
We introduce cyclic diagram monoids, a generalisation of classical diagram monoids that adds elements of arbitrary period by including internal components, with a view towards cryptography. We classify their simple representations and compute their dimensions in terms of the underlying diagram algebra. These go towards showing that cyclic diagram monoids possess representation gaps of exponential growth, which quantify their resistance as platforms against linear attacks on cryptographic protocols that exploit small dimensional representations.
Abstract:
An important proof technique in the random oracle model involves reprogramming it on hard-to-predict inputs and arguing that an attacker cannot detect that this occurred. In the quantum setting, a particularly challenging version of this considers adaptive reprogramming wherein the points to be reprogrammed (or the output values they should be programmed to) are dependent on choices made by the adversary. Some quantum frameworks for analyzing adaptive reprogramming were given by Unruh (CRYPTO 2014, EUROCRYPT 2015), Grilo-Hövelmanns-Hülsing-Majenz (ASIACRYPT 2021), and Pan-Zeng (PKC 2024). We show, counterintuitively, that these adaptive results follow from the nonadaptive one-way to hiding theorem of Ambainis-Hamburg-Unruh (CRYPTO 2019). These implications contradict beliefs (whether stated explicitly or implicitly) that some properties of the adaptive frameworks cannot be provided by the Ambainis-Hamburg-Unruh result.
Abstract:
We investigate how to optimally design local differential privacy (LDP) mechanisms that reduce data unfairness and thereby improve fairness in downstream classification. We first derive a closed-form optimal mechanism for binary sensitive attributes and then develop a tractable optimization framework that yields the corresponding optimal mechanism for multi-valued attributes. As a theoretical contribution, we establish that for discrimination-accuracy optimal classifiers, reducing data unfairness necessarily leads to lower classification unfairness, thus providing a direct link between privacy-aware pre-processing and classification fairness. Empirically, we demonstrate that our approach consistently outperforms existing LDP mechanisms in reducing data unfairness across diverse datasets and fairness metrics, while maintaining accuracy close to that of non-private models. Moreover, compared with leading pre-processing and post-processing fairness methods, our mechanism achieves a more favorable accuracy-fairness trade-off while simultaneously preserving the privacy of sensitive attributes. Taken together, these results highlight LDP as a principled and effective pre-processing fairness intervention technique.
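As background for the LDP setting above, the sketch below shows classic binary randomized response, the textbook epsilon-LDP mechanism for a single binary attribute, together with the usual debiasing of the aggregate. It is not the optimal mechanism derived in the paper; the function names, the epsilon value, and the simulated population are assumptions for illustration.

```python
import math
import random


def truth_probability(epsilon):
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def randomized_response(bit, epsilon, rng=random):
    # Keep the true bit with probability e^eps / (e^eps + 1), otherwise flip it.
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit


def debiased_mean(reports, epsilon):
    """Unbiased estimate of the true fraction of ones from the noisy reports."""
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


rng = random.Random(0)                   # fixed seed so the sketch is repeatable
true_bits = [1] * 6000 + [0] * 14000     # simulated population, 30% ones
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimate = debiased_mean(reports, 1.0)
```

Each individual report is plausibly deniable, since the likelihood ratio between the two possible true values is bounded by e^epsilon, while the population-level fraction can still be recovered approximately from many reports.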
Abstract:
This paper proposes an optimization of Quantum Key Distribution (QKD) networks using a Graph Neural Network (GNN) framework. Today, the development of quantum computers threatens the security systems of classical cryptography. Moreover, although QKD networks are designed to protect secret communication, they suffer from multiple operational difficulties: adaptation to dynamic conditions, optimization over multiple parameters, and effective resource utilization. To overcome these obstacles, we propose a GNN-based framework that models QKD networks as dynamic graphs and extracts exploitable characteristics from the structure of these networks. The graph contains not only topological information but also specific characteristics associated with quantum communication (the number of edges between nodes, etc.). Experimental results demonstrate that the GNN-optimized QKD network achieves a substantial increase in total key rate (from 27.1 Kbits/s to 470 Kbits/s), a reduced average QBER (from 6.6% to 6.0%), and maintains path integrity with a slight reduction in average transmission distance (from 7.13 km to 6.42 km). Furthermore, we analyze network performance across varying scales (10 to 250 nodes), showing improved link prediction accuracy and enhanced key generation rate in medium-sized networks. This work introduces a novel operation mode for QKD networks, shifting the paradigm of network optimization through adaptive and scalable quantum communication systems that enhance security and performance.
要約:
Side-channel analysis (SCA) poses a real-world threat by exploiting unintentional physical signals to extract secret information from secure devices. Evaluation labs also use the same techniques to certify device security. In recent years, deep learning has emerged as a prominent method for SCA, achieving state-of-the-art attack performance at the cost of interpretability. Understanding how neural networks extract secrets is crucial for security evaluators aiming to defend against such attacks, as only by understanding the attack can one propose better countermeasures.
In this work, we apply mechanistic interpretability to neural networks trained for SCA, revealing \textit{how} models exploit \textit{what} leakage in side-channel traces. We focus on sudden jumps in performance to reverse engineer learned representations, ultimately recovering secret masks and moving the evaluation process from black-box to white-box. Our results show that mechanistic interpretability can scale to realistic SCA settings, even when relevant inputs are sparse, model accuracies are low, and side-channel protections prevent standard input interventions.
要約:
The rapid growth of the blockchain ecosystem and the increasing value locked in smart contracts necessitate robust security measures. While languages like Solidity and Move aim to improve smart contract security, vulnerabilities persist. This paper presents Smartify, a novel multi-agent framework leveraging Large Language Models (LLMs) to automatically detect and repair vulnerabilities in Solidity and Move smart contracts. Unlike traditional methods that rely solely on vast pre-training datasets, Smartify employs a team of specialized agents working on different specially fine-tuned LLMs to analyze code based on underlying programming concepts and language-specific security principles. We evaluated Smartify on a dataset for Solidity and a curated dataset for Move, demonstrating its effectiveness in fixing a wide range of vulnerabilities. Our results show that Smartify (Gemma2+codegemma) achieves state-of-the-art performance, surpassing existing LLMs and enhancing the capabilities of general-purpose models such as Llama 3.1. Notably, Smartify can incorporate language-specific knowledge, such as the nuances of Move, without requiring massive language-specific pre-training datasets. This work offers a detailed analysis of various LLMs' performance on smart contract repair, highlighting the strengths of our multi-agent approach and providing a blueprint for developing more secure and reliable decentralized applications in the growing blockchain landscape. We also provide a detailed recipe for extending this approach to other similar use cases.
要約:
Large Language Models (LLMs) face threats from jailbreak prompts. Existing methods for defending against jailbreak attacks are primarily based on auxiliary models. These strategies, however, often require extensive data collection or training. We propose LightDefense, a lightweight defense mechanism targeted at white-box models, which uses a safety-oriented direction to adjust the probabilities of tokens in the vocabulary so that safety disclaimers appear among the top tokens when tokens are sorted by probability in descending order. We further leverage the LLM's uncertainty about a prompt to measure its harmfulness and adaptively adjust the defense strength, effectively balancing safety and helpfulness. The effectiveness of LightDefense in defending against 5 attack methods across 2 target LLMs, without compromising helpfulness on benign user queries, highlights its potential as a novel and lightweight defense mechanism that enhances the security of LLMs.
要約:
The proliferation of UAVs has enabled a wide range of mission-critical applications and made UAVs a cornerstone of low-altitude networks, supporting smart cities, emergency response, and more. However, the open wireless environment, dynamic topology, and resource constraints of UAVs expose low-altitude networks to severe DoS threats. Traditional defense approaches, which rely on fixed configurations or centralized decision-making, cannot effectively respond to the rapidly changing conditions in UAV swarm environments. To address these challenges, we propose a novel federated multi-agent deep reinforcement learning (FMADRL)-driven moving target defense (MTD) framework for proactive DoS mitigation in low-altitude networks. Specifically, we design lightweight and coordinated MTD mechanisms, including leader switching, route mutation, and frequency hopping, to disrupt attacker efforts and enhance network resilience. The defense problem is formulated as a multi-agent partially observable Markov decision process, capturing the uncertain nature of UAV swarms under attack. Each UAV is equipped with a policy agent that autonomously selects MTD actions based on partial observations and local experiences. By employing a policy gradient-based algorithm, UAVs collaboratively optimize their policies via reward-weighted aggregation. Extensive simulations demonstrate that our approach significantly outperforms state-of-the-art baselines, achieving up to a 34.6% improvement in attack mitigation rate, a reduction in average recovery time of up to 94.6%, and decreases in energy consumption and defense cost of as much as 29.3% and 98.3%, respectively, under various DoS attack strategies. These results highlight the potential of intelligent, distributed defense mechanisms to protect low-altitude networks, paving the way for a reliable and scalable low-altitude economy.
要約:
Despite extensive alignment efforts, Large Vision-Language Models (LVLMs) remain vulnerable to jailbreak attacks, posing serious safety risks. To address this, existing detection methods either learn attack-specific parameters, which hinders generalization to unseen attacks, or rely on heuristically sound principles, which limit accuracy and efficiency. To overcome these limitations, we propose Learning to Detect (LoD), a general framework that accurately detects unknown jailbreak attacks by shifting the focus from attack-specific learning to task-specific learning. This framework includes a Multi-modal Safety Concept Activation Vector module for safety-oriented representation learning and a Safety Pattern Auto-Encoder module for unsupervised attack classification. Extensive experiments show that our method achieves consistently higher detection AUROC on diverse unknown attacks while improving efficiency. The code is available at https://anonymous.4open.science/r/Learning-to-Detect-51CB.
要約:
Large vision-language models (LVLMs) enable autonomous mobile agents to operate smartphone user interfaces, yet vulnerabilities in their perception and interaction remain critically understudied. Existing research often relies on conspicuous overlays, elevated permissions, or unrealistic threat assumptions, limiting stealth and real-world feasibility. In this paper, we introduce a practical and stealthy jailbreak attack framework, which comprises three key components: (i) non-privileged perception compromise, which injects visual payloads into the application interface without requiring elevated system permissions; (ii) agent-attributable activation, which leverages input attribution signals to distinguish agent from human interactions and limits prompt exposure to transient intervals to preserve stealth from end users; and (iii) efficient one-shot jailbreak, a heuristic iterative deepening search algorithm (HG-IDA*) that performs keyword-level detoxification to bypass built-in safety alignment of LVLMs. Moreover, we developed three representative Android applications and curated a prompt-injection dataset for mobile agents. We evaluated our attack across multiple LVLM backends, including closed-source services and representative open-source models, and observed high planning and execution hijack rates (e.g., GPT-4o: 82.5% planning / 75.0% execution), exposing a fundamental security vulnerability in current mobile agents and underscoring critical implications for autonomous smartphone operation.
要約:
Traditional security scanners fail when facing new attack patterns they haven't seen before. They rely on fixed rules and predetermined signatures, making them blind to novel threats. We present a fundamentally different approach: instead of memorizing specific attack patterns, we learn what makes systems genuinely secure. Our key insight is simple yet powerful: context determines vulnerability. A SQL query that's safe in one environment becomes dangerous in another. By modeling this context-vulnerability relationship, we achieve something remarkable: our system detects attacks it has never seen before. We introduce context-aware verification that learns from genuine system behavior. Through reconstruction learning on secure systems, we capture their essential characteristics. When an unknown attack deviates from these patterns, our system recognizes it, even without prior knowledge of that specific attack type. We prove this capability theoretically, showing detection rates improve exponentially with context information I(W;C). Our framework combines three components: (1) reconstruction learning that models secure behavior, (2) multi-scale graph reasoning that aggregates contextual clues, and (3) attention mechanisms guided by reconstruction differences. Extensive experiments validate our approach: detection accuracy jumps from 58 percent to 82 percent with full context, unknown attack detection improves by 31 percent, and our system maintains above 90 percent accuracy even against completely novel attack vectors.
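The quantity I(W;C) mentioned above is a mutual information. The abstract does not define W and C, so the following is only a minimal sketch that assumes both are discrete random variables described by a joint probability table; the function name and the table layout are illustrative choices, not part of the described system.

```python
import math


def mutual_information(joint):
    """Return I(W;C) in bits for a joint probability table joint[w][c].

    Assumption: W and C are discrete and the table entries sum to 1.
    """
    total = sum(sum(row) for row in joint)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("joint probabilities must sum to 1")

    # Marginal distributions of W (rows) and C (columns).
    p_w = [sum(row) for row in joint]
    p_c = [sum(col) for col in zip(*joint)]

    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_w[i] * p_c[j]))
    return mi


if __name__ == "__main__":
    independent = [[0.25, 0.25], [0.25, 0.25]]  # context carries no information
    dependent = [[0.5, 0.0], [0.0, 0.5]]        # context fully determines W
    print(mutual_information(independent))
    print(mutual_information(dependent))
```

In this toy setting, a context that is independent of W yields zero bits, while a context that fully determines a binary W yields one bit.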
要約:
The advent of Large Language Models (LLMs) promised to resolve the long-standing paradox in honeypot design: achieving high-fidelity deception with low operational risk. Since late 2022, a flurry of research has demonstrated steady progress from ideation to prototype implementation. While promising, evaluations show only incremental progress in real-world deployments, and the field still lacks a cohesive understanding of the emerging architectural patterns, core challenges, and evaluation paradigms. To fill this gap, this Systematization of Knowledge (SoK) paper provides the first comprehensive overview and analysis of this new domain. We survey and systematize the field by focusing on three critical, intersecting research areas: first, we provide a taxonomy of honeypot detection vectors, structuring the core problems that LLM-based realism must solve; second, we synthesize the emerging literature on LLM-powered honeypots, identifying a canonical architecture and key evaluation trends; and third, we chart the evolutionary path of honeypot log analysis, from simple data reduction to automated intelligence generation. We synthesize these findings into a forward-looking research roadmap, arguing that the true potential of this technology lies in creating autonomous, self-improving deception systems to counter the emerging threat of intelligent, automated attackers.
要約:
LLMs are now an integral part of information retrieval. As such, their role as question answering chatbots raises significant concerns due to their demonstrated vulnerability to adversarial man-in-the-middle (MitM) attacks. Here, we propose the first principled attack evaluation on LLM factual memory under prompt injection via Xmera, our novel, theory-grounded MitM framework. By perturbing the input given to "victim" LLMs in three closed-book and fact-based QA settings, we undermine the correctness of the responses and assess the uncertainty of their generation process. Surprisingly, trivial instruction-based attacks achieve the highest success rate (up to ~85.3%) while simultaneously exhibiting high uncertainty for incorrectly answered questions. To provide a simple defense mechanism against Xmera, we train Random Forest classifiers on the response uncertainty levels to distinguish between attacked and unattacked queries (average AUC of up to ~96%). We believe that signaling users to be cautious about the answers they receive from black-box and potentially corrupt LLMs is a first checkpoint toward user cyberspace safety.
要約:
Model merging has emerged as an efficient technique for expanding large language models (LLMs) by integrating specialized expert models. However, it also introduces a new threat: model merging stealing, where free-riders exploit models through unauthorized model merging. Unfortunately, existing defense mechanisms fail to provide effective protection. Specifically, we identify three critical protection properties that existing methods fail to simultaneously satisfy: (1) proactively preventing unauthorized merging; (2) ensuring compatibility with general open-source settings; (3) achieving high security with negligible performance loss. To address the above issues, we propose MergeBarrier, a plug-and-play defense that proactively prevents unauthorized merging. The core design of MergeBarrier is to disrupt the Linear Mode Connectivity (LMC) between the protected model and its homologous counterparts, thereby eliminating the low-loss path required for effective model merging. Extensive experiments show that MergeBarrier effectively prevents model merging stealing with negligible accuracy loss.
要約:
Random selection, leader election, and collective coin flipping are fundamental tasks in fault-tolerant distributed computing. We study these problems in the full-information model, where, despite decades of study, key gaps remain in our understanding of the trade-offs between round complexity, communication per player in each round, and adversarial resilience. We make progress by proving improved bounds for these problems.
We first show that any $k$-round coin flipping protocol over $\ell$ players, each player sending one bit per round, can be biased by $O(\ell/\log^{(k)}(\ell))$ bad players. We obtain a similar lower bound for leader election. This strengthens prior best bounds [RSZ, SICOMP 2002] of $O(\ell/\log^{(2k-1)}(\ell))$ for coin flipping protocols and $O(\ell/\log^{(2k+1)}(\ell))$ for leader election protocols. Our result implies that any (1-bit per player) protocol tolerating a linear fraction of bad players requires at least $\log^* \ell$ rounds, showing that existing protocols [RZ, JCSS 2001; F, FOCS 1999] are near-optimal.
We next initiate the study of one-round, (1-bit per player) random selection. We construct a protocol resilient to $\ell / (\log \ell)^2$ bad players that outputs $(\log \ell)^2 / (\log \log \ell)^2$ uniform random bits. This implies a one-round leader election protocol resilient to $\ell / (\log \ell)^2$ bad players, improving the prior best protocol [RZ, JCSS 2001] which was resilient to $\ell / (\log \ell)^3$ bad players. Our resilience matches that of the best one-round coin flipping protocol by Ajtai & Linial. We also obtain an almost matching lower bound: any protocol outputting $(\log \ell)^2 / (\log \log \ell)^2$ bits can be corrupted by $\ell (\log \log \ell)^2 / (\log \ell)^2$ bad players. To obtain our lower bound, we introduce multi-output influence, an extension of influence of boolean functions to the multi-output setting.
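To give a sense of scale for the bounds above, the following small sketch evaluates the iterated logarithm $\log^{(k)}(\ell)$, the function $\log^* \ell$, and the one-round resilience expressions $\ell / (\log \ell)^2$ and $\ell / (\log \ell)^3$ numerically. The choice of base 2 and all function names are assumptions made for illustration; the asymptotic statements hide constants, so the numbers only indicate growth rates.

```python
import math


def iterated_log(x: float, k: int) -> float:
    """Apply log2 to x exactly k times, i.e. log^{(k)}(x) (base 2 assumed)."""
    for _ in range(k):
        if x <= 0:
            raise ValueError("iterated logarithm is undefined for this input")
        x = math.log2(x)
    return x


def log_star(x: float) -> int:
    """Count how many times log2 must be applied before the value is <= 1."""
    count = 0
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


def one_round_resilience(ell: int, exponent: int) -> float:
    """Evaluate ell / (log2 ell)^exponent, ignoring hidden constants."""
    return ell / (math.log2(ell) ** exponent)


if __name__ == "__main__":
    ell = 2 ** 16
    for k in (1, 2, 3):
        print(f"log^({k})({ell}) = {iterated_log(ell, k)}")
    print(f"log*({ell}) = {log_star(ell)}")
    print(f"ell/(log ell)^2 = {one_round_resilience(ell, 2)}")
    print(f"ell/(log ell)^3 = {one_round_resilience(ell, 3)}")
```

For $\ell = 2^{16}$ the iterated logarithm shrinks to 2 after three applications, which illustrates why replacing $\log^{(2k-1)}$ by $\log^{(k)}$ in the denominator is a meaningful strengthening.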
要約:
Retrieval-augmented generation (RAG) is instrumental for inhibiting hallucinations in large language models (LLMs) through the use of a factual knowledge base (KB). Although PDF documents are prominent sources of knowledge, text-based RAG pipelines are ineffective at capturing their rich multi-modal information. In contrast, visual document RAG (VD-RAG) uses screenshots of document pages as the KB, which has been shown to achieve state-of-the-art results. However, by introducing the image modality, VD-RAG introduces new attack vectors for adversaries to disrupt the system by injecting malicious documents into the KB. In this paper, we demonstrate the vulnerability of VD-RAG to poisoning attacks targeting both retrieval and generation. We define two attack objectives and demonstrate that both can be realized by injecting only a single adversarial image into the KB. Firstly, we introduce a targeted attack against one or a group of queries with the goal of spreading targeted disinformation. Secondly, we present a universal attack that, for any potential user query, influences the response to cause a denial-of-service in the VD-RAG system. We investigate the two attack objectives under both white-box and black-box assumptions, employing a multi-objective gradient-based optimization approach as well as prompting state-of-the-art generative models. Using two visual document datasets and a diverse set of state-of-the-art retrievers (embedding models) and generators (vision language models), we show that VD-RAG is vulnerable to poisoning attacks in both the targeted and universal settings, although it remains robust to black-box attacks in the universal setting.
要約:
Video diffusion models can generate realistic and temporally consistent videos. This raises concerns about provenance, ownership, and integrity. Watermarking can help address these issues by embedding metadata directly into the content. To work well, a watermark needs enough capacity for meaningful metadata. It must also stay imperceptible and remain robust to common video manipulations. Existing methods struggle with limited capacity, extra inference cost, or reduced visual quality. We introduce VidStamp, a watermarking framework that embeds frame-level messages through the decoder of a latent video diffusion model. The decoder is fine-tuned in two stages. The first stage uses static image datasets to encourage spatial message separation. The second stage uses synthesized video sequences to restore temporal consistency. This approach enables high-capacity watermarks with minimal perceptual impact. VidStamp also supports dynamic watermarking through a control signal that selects message templates during inference. This adds flexibility and creates a second channel for communication. We evaluate VidStamp on Stable Video Diffusion (I2V), OpenSora, and Wan (T2V). The system embeds 48 bits per frame while preserving visual quality and staying robust to common distortions. Compared with VideoSeal, VideoShield, and RivaGAN, it achieves lower log P-values and stronger detectability. Its frame-wise watermarking design also enables precise temporal tamper localization, with an accuracy of 0.96, which exceeds the VideoShield baseline. Code: https://github.com/SPIN-UMass/VidStamp
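A common way to score a multi-bit watermark is a one-sided binomial test: under the null hypothesis that a frame is unwatermarked, each decoded bit matches the embedded message with probability 1/2. The sketch below computes the resulting log P-value for a 48-bit message and flags frames with weak evidence. It is an illustration under these assumptions, not necessarily the test used for VidStamp; the function names and the threshold of -6 are hypothetical.

```python
import math


def log10_p_value(matches: int, n_bits: int = 48) -> float:
    """log10 of P[X >= matches] for X ~ Binomial(n_bits, 1/2).

    A more negative value means stronger evidence that the watermark is present.
    """
    if not 0 <= matches <= n_bits:
        raise ValueError("matches must lie between 0 and n_bits")
    # Exact integer tail count, converted to a log-probability at the end.
    tail = sum(math.comb(n_bits, k) for k in range(matches, n_bits + 1))
    return math.log10(tail) - n_bits * math.log10(2)


def flag_suspect_frames(match_counts, n_bits: int = 48, log10_threshold: float = -6.0):
    """Return indices of frames whose log10 P-value is above the threshold.

    The threshold is a hypothetical value chosen only for this example.
    """
    return [
        i
        for i, m in enumerate(match_counts)
        if log10_p_value(m, n_bits) > log10_threshold
    ]


if __name__ == "__main__":
    per_frame_matches = [48, 47, 25, 48]  # made-up counts of matching bits per frame
    print([round(log10_p_value(m), 2) for m in per_frame_matches])
    print(flag_suspect_frames(per_frame_matches))
```

Because each frame carries its own message, a per-frame score of this kind is what makes it possible to point at the specific frames that fail verification.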
要約:
With rapid advancements in quantum computing, it is widely believed that quantum hardware capable of compromising classical cryptography, and hence the internet and the current information security infrastructure, will exist within the coming decade. This is mainly due to the operational realization of quantum algorithms such as Grover's and Shor's, to which current classical encryption protocols are vulnerable. Blockchains, i.e., blockchain data structures and their data, rely heavily on classical cryptography. One approach to securing blockchains is to pursue information-theoretic security by defining the blockchain on quantum technologies. There have been two conceptualizations of blockchains on quantum registers: the time-entangled Greenberger-Horne-Zeilinger (GHZ) state blockchain and the quantum hypergraph blockchain. We attempt to conceptualize a new quantum blockchain protocol that combines features of both schemes, aiming for the absolute security of the time-entangled GHZ blockchain together with the scalability and efficiency of the quantum hypergraph blockchain.
要約:
Recent studies have demonstrated the vulnerability of Automatic Speech Recognition (ASR) systems to adversarial examples, which can deceive these systems into misinterpreting input speech commands. While previous research has primarily focused on white-box attacks with constrained optimization and on transferability-based black-box attacks against commercial ASR devices, this paper explores cost-efficient white-box attacks and black-box adversarial attacks that do not rely on transferability, drawing insights from approaches such as the Fast Gradient Sign Method and Zeroth-Order Optimization. Further, the paper shows how poisoning attacks can degrade the performance of state-of-the-art models, leading to misinterpretation of audio signals. Through experimentation and analysis, we illustrate how hybrid models can generate subtle yet impactful adversarial examples with very little perturbation, at a signal-to-noise ratio (SNR) of 35 dB, within a minute. These vulnerabilities of state-of-the-art open-source models have practical security implications and emphasize the need for adversarial security.
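The signal-to-noise ratio quoted above can be made concrete with a short sketch. It assumes the usual definition, 10·log10 of the ratio between the power of the clean audio and the power of the added perturbation, which the abstract does not spell out; the function names and the toy waveform are illustrative only.

```python
import math


def snr_db(signal, perturbation):
    """SNR in dB: 10 * log10(signal power / perturbation power)."""
    if len(signal) != len(perturbation):
        raise ValueError("signal and perturbation must have the same length")
    p_signal = sum(s * s for s in signal)
    p_noise = sum(d * d for d in perturbation)
    if p_signal == 0:
        raise ValueError("signal has zero power")
    if p_noise == 0:
        return math.inf  # no perturbation at all
    return 10.0 * math.log10(p_signal / p_noise)


def scale_to_snr(signal, perturbation, target_db):
    """Rescale the perturbation so that the resulting SNR equals target_db.

    Scaling the perturbation by f changes the SNR by -20 * log10(f).
    """
    current = snr_db(signal, perturbation)
    factor = 10.0 ** ((current - target_db) / 20.0)
    return [d * factor for d in perturbation]


if __name__ == "__main__":
    # Toy waveform and perturbation (made-up values, not real audio).
    clean = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
    noise = [0.1 * math.cos(2 * math.pi * 17 * t / 100) for t in range(100)]
    quiet_noise = scale_to_snr(clean, noise, 35.0)
    print(round(snr_db(clean, noise), 2))
    print(round(snr_db(clean, quiet_noise), 2))
```

At 35 dB the perturbation carries roughly 1/3000 of the power of the clean signal, which is why such adversarial examples are hard to hear.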