要約:
Traditional Differential Privacy (DP) mechanisms are typically tailored to specific analysis tasks, which limits the reusability of protected data. DP tabular data synthesis overcomes this by generating synthetic datasets that can be shared for arbitrary downstream tasks. However, existing synthesis methods predominantly assume centralized or local settings and overlook the more practical horizontal federated scenario. Naively synthesizing data locally or perturbing individual records either produces biased mixtures or introduces excessive noise, especially under heterogeneous data distributions across participants.
We propose HeteroFedSyn, the first DP tabular data synthesis framework designed specifically for the horizontal federated setting. Built upon the PrivSyn paradigm of 2-way marginal-based synthesis, HeteroFedSyn introduces three key innovations for distributed marginal selection: (i) an L2-based dependency metric with random projection for noise-efficient correlation measurement, (ii) an unbiased estimator to correct multiplicative noise, and (iii) an adaptive selection strategy that dynamically updates dependency scores to avoid redundancy. Extensive experiments on range queries, Wasserstein fidelity, and machine learning tasks show that, despite the increased noise inherent to federated execution, HeteroFedSyn achieves utility comparable to centralized synthesis. Our code is open-sourced via the link.
要約:
Deep learning (DL)-based Network Intrusion Detection System (NIDS) has demonstrated great promise in detecting malicious network traffic. However, they face significant security risks due to their vulnerability to adversarial examples (AEs). Most existing adversarial attacks maliciously perturb data to maximize misclassification errors. Among AEs, natural adversarial examples (NAEs) are particularly difficult to detect because they closely resemble real data, making them challenging for both humans and machine learning models to distinguish from legitimate inputs. Creating NAEs is crucial for testing and strengthening NIDS defenses. This paper proposes NetDiffuser1, a novel framework for generating NAEs capable of deceiving NIDS. NetDiffuser consists of two novel components. First, a new feature categorization algorithm is designed to identify relatively independent features in network traffic. Perturbing these features minimizes changes while preserving network flow validity. The second component is a novel application of diffusion models to inject semantically consistent perturbations for generating NAEs. NetDiffuser performance was extensively evaluated using three benchmark NIDS datasets across various model architectures and state-of-the-art adversarial detectors. Our experimental results show that NetDiffuser achieves up to a 29.93% higher attack success rate and reduces AE detection performance by at least 0.267 (in some cases up to 0.534) in the Area under the Receiver Operating Characteristic Curve (AUC-ROC) score compared to the baseline attacks.
要約:
Multi-agent artificial intelligence systems or MAS are systems of autonomous agents that exercise delegated tool authority, share persistent memory, and coordinate via inter-agent communication. MAS introduces qualitatively distinct security vulnerabilities from those documented for singular AI models. Existing security and governance frameworks were not designed for these emerging attack surfaces. This study systematically characterizes the threat landscape of MAS and quantitatively evaluates 16 security frameworks for AI against it. A four-phase methodology is proposed: constructing a deep technical knowledge base of production multi-agent architectures; conducting generative AI-assisted threat modeling scoped to MAS cybersecurity risks and validated by domain experts; structuring survey plans at individual-threat granularity; and scoring each framework on a three-point scale against the cybersecurity risks. The risks were organized into 193 distinct main threat items across nine risk categories. The expected minimal average score is 2. No reviewed framework achieves majority coverage of any single category. Non-Determinism (mean score 1.231 across all 16 frameworks) and Data Leakage (1.340) are the most under-addressed domains. The OWASP Agentic Security Initiative leads overall at 65.3\% coverage and in the design phase; the CDAO Generative AI Responsible AI Toolkit leads in development and operational coverage. These results provide the first empirical cross-framework comparison for MAS security and offer evidence-based guidance for framework selection.
要約:
Enterprises increasingly rely on cloud-based applications to process highly sensitive data artifacts. Although cloud adoption improves agility and scalability, it also introduces new security challenges such as expanded attack surfaces, a wider radius of attack from credential compromise, and challenges maintaining strict access controls across users, services, and workflows. These challenges are especially acute for applications that handle privileged data and execute security-critical analysis, where traditional trust boundaries and ad hoc safeguards are insufficient. This paper presents Lockbox; a Zero Trust architecture designed for secure processing of sensitive cloud workloads under strict enterprise security and governance requirements. Lockbox applies explicit trust verification, strong isolation, least-privilege access, and policy-driven enforcement throughout the entire application lifecycle, from user authentication and document ingestion to analysis execution and result storage. The system incorporates modern cloud security primitives including; role-based access control, centralized key management, encryption in transit and at rest, and controlled integration with cloud-based data processing services, ensuring that sensitive artifacts remain protected and accessible only to authorized users. We discuss the usage of Lockbox in processing highly sensitive cybersecurity reports and demonstrate how this architecture enables organizations to safely adopt advanced capabilities, including AI-assisted processing, without weakening their security posture.
要約:
The weaponization of LLMs for automated malware generation poses an existential threat to conventional detection paradigms. AI-generated malware exhibits polymorphic, metamorphic, and context-aware evasion capabilities that render signature-based and shallow heuristic defenses obsolete. This paper introduces a novel hybrid analysis framework that synergistically combines \emph{concolic execution} with \emph{LLM-augmented path prioritization} and \emph{deep-learning-based vulnerability classification} to detect zero-day AI-generated malware with provable guarantees. We formalize the detection problem within a first-order temporal logic over program execution traces, define a lattice-theoretic abstraction for path constraint spaces, and prove both the \emph{soundness} and \emph{relative completeness} of our detection algorithm, assuming classifier correctness. The framework introduces three novel algorithms: (i) an LLM-guided concolic exploration strategy that reduces the average number of explored paths by 73.2\% compared to depth-first search while maintaining equivalent malicious-path coverage; (ii) a transformer-based path-constraint classifier trained on symbolic execution traces; and (iii) a feedback loop that iteratively refines the LLM's prioritization policy using reinforcement learning from detection outcomes. We provide a comprehensive implementation built upon \texttt{angr} 9.2, \texttt{Z3} 4.12, Hugging Face Transformers 4.38, and PyTorch 2.2, with configuration details enabling reproducibility. Experimental evaluation on the EMBER, Malimg, SOREL-20M, and a novel AI-Gen-Malware benchmark comprising 2{,}500 LLM-synthesized samples demonstrates that achieves 98.7\% accuracy on conventional malware and 97.5\% accuracy on AI-generated threats, outperforming ClamAV, YARA, MalConv, and EMBER-GBDT baselines by margins of 8.4--52.2 percentage points on AI-generated samples.
要約:
Device-side Large Language Models (LLMs) have witnessed explosive growth, offering higher privacy and availability compared to cloud-side LLMs. During LLM inference, both model weights and user data are valuable, and attackers may even compromise the OS kernel to steal them. ARM TrustZone is the de facto hardware-based isolation technology on mobile devices, used to protect sensitive applications from a compromised OS. However, protecting LLM inference with TrustZone incurs significant overhead due to its inflexible isolation of memory and the NPU. To address these challenges, this paper introduces FlexServe, a fast and secure LLM serving system for mobile devices. It first introduces a Flexible Resource Isolation mechanism to construct Flexible Secure Memory (Flex-Mem) and Flexible Secure NPU (Flex-NPU). Both memory pages and the NPU can be efficiently switched between unprotected and protected modes. Based on these mechanisms, FlexServe designs a fast and secure LLM inference framework within TrustZone's secure world. The LLM-Aware Memory Management and Secure Inference Pipeline are introduced to accelerate inference. A Multi-Model Scheduler is proposed to optimize multi-model workflows. We implement a prototype of FlexServe and compare it with two TrustZone-based strawman designs. The results show that FlexServe achieves an average $10.05\times$ speedup in Time to First Token (TTFT) compared to the strawman, and an average $2.44\times$ TTFT speedup compared to an optimized strawman with pipeline and secure NPU enabled. For multi-model agent workflows, the end-to-end speedup is up to $24.30\times$ and $4.05\times$ compared to the strawman and optimized strawman, respectively.
要約:
Multi-agent systems (MAS) powered by LLMs promise adaptive, reasoning-driven enterprise workflows, yet granting agents autonomous control over tools, memory, and communication introduces attack surfaces absent from deterministic pipelines. While current research largely addresses prompt-level exploits and narrow individual vectors, it lacks a holistic architectural model for enterprise-grade security. We introduce AgenticCyOps (Securing Multi-Agentic AI Integration in Enterprise Cyber Operations), a framework built on a systematic decomposition of attack surfaces across component, coordination, and protocol layers, revealing that documented vectors consistently trace back to two integration surfaces: tool orchestration and memory management. Building on this observation, we formalize these integration surfaces as primary trust boundaries and define five defensive principles: authorized interfaces, capability scoping, verified execution, memory integrity & synchronization, and access-controlled data isolation; each aligned with established compliance standards (NIST, ISO 27001, GDPR, EU AI Act). We apply the framework to a Security Operations Center (SOC) workflow, adopting the Model Context Protocol (MCP) as the structural basis, with phase-scoped agents, consensus validation loops, and per-organization memory boundaries. Coverage analysis, attack path tracing, and trust boundary assessment confirm that the design addresses the documented attack vectors with defense-in-depth, intercepts three of four representative attack chains within the first two steps, and reduces exploitable trust boundaries by a minimum of 72% compared to a flat MAS, positioning AgenticCyOps as a foundation for securing enterprise-grade integration.
要約:
A common problem in private data analysis is the partition selection problem, where each user holds a set of partitions (e.g. keys in a GROUP BY operation) from a possibly unbounded set. The challenge here is in maximizing the set of released partitions while respecting a differential privacy constraint. Previous work [Desfontaines et al., PoPETS 2022] presented an optimal $(\varepsilon, \delta)$-DP algorithm when each user submits only a single partition. We generalize this approach to find the optimal algorithm under $\delta$-approximate $(\alpha, \varepsilon)$-R\'enyi differential privacy (RDP), which allows much tighter analysis under composition. Motivated by the non-existence of a general optimality result in the case where users submit multiple partitions each, we present an extension of our optimal algorithm tuned for $L^2$ bounded weighted partition selection which can be used as a drop-in improvement over the Gaussian mechanism any time the partition frequency is not also needed. We show that our primitive can be easily plugged into state of the art partition selection algorithms (PolicyGaussian from [Gopi et al., ICML 2020] and MAD2R from [Chen et al., ICML 2025]), improving performance both for parallel and sequential adaptive algorithms. Finally, we show that there is an inherent cost to algorithms which do support releasing the frequency as well as the partitions. Specifically, we formulate a basic notion of optimal approximate RDP algorithm for partition selection using additive noise, and show that there is a numerical separation between additive and non-additive noise mechanisms for this problem.
要約:
Private Information Retrieval (PIR) allows a client to privately access a database without revealing which element is accessed. Initial PIR protocols based on Ring Learning with Errors (RLWE) demonstrated the practicality of PIR, but achieve limited throughput. Alternatively, high-throughput protocols leverage an offline phase that requires substantial client-side storage (e.g., hints in SimplePIR) or involve prohibitive communication costs during the offline phase (e.g., Piano). These limitations conflict with the practical constraints of resource-limited clients and are further exacerbated by dynamic databases, where updates necessitate costly regeneration and retransmission of hints.
To address these challenges, we propose ZipPIR, a high-throughput PIR protocol that compresses LWE ciphertexts into significantly smaller Paillier ciphertexts. ZipPIR leverages the offline phase to obtain this size reduction without incurring the associated computational cost in the online phase. Moreover, under computational assumptions, ZipPIR features an almost silent offline phase, requiring no communication beyond an initial public key, enabling the server to independently generate and update hints during idle times without client interaction. ZipPIR achieves over 2 GB/s of throughput - comparable to state-of-the-art protocols such as SimplePIR - without the need for a large client-stored hint. For PIR over a 1 GB database, ZipPIR has up to 10x higher throughput than existing protocols with no client-side storage, while requiring less than 200 KB of server-side storage per client, significantly enhancing scalability for practical deployments. While prior PIR protocols using Paillier are very inefficient, ZipPIR is the first PIR protocol using Paillier that achieves throughput that is competitive with state-of-the-art PIR protocols.
要約:
Large Vision-Language Models (LVLMs) undergo safety alignment to suppress harmful content. However, current defenses predominantly target explicit malicious patterns in the input representation, often overlooking the vulnerabilities inherent in compositional reasoning. In this paper, we identify a systemic flaw where LVLMs can be induced to synthesize harmful logic from benign premises. We formalize this attack paradigm as \textit{Reasoning-Oriented Programming}, drawing a structural analogy to Return-Oriented Programming in systems security. Just as ROP circumvents memory protections by chaining benign instruction sequences, our approach exploits the model's instruction-following capability to orchestrate a semantic collision of orthogonal benign inputs. We instantiate this paradigm via \tool{}, an automated framework that optimizes for \textit{semantic orthogonality} and \textit{spatial isolation}. By generating visual gadgets that are semantically decoupled from the harmful intent and arranging them to prevent premature feature fusion, \tool{} forces the malicious logic to emerge only during the late-stage reasoning process. This effectively bypasses perception-level alignment. We evaluate \tool{} on SafeBench and MM-SafetyBench across 7 state-of-the-art 0.LVLMs, including GPT-4o and Claude 3.7 Sonnet. Our results demonstrate that \tool{} consistently circumvents safety alignment, outperforming the strongest existing baseline by an average of 4.67\% on open-source models and 9.50\% on commercial models.
要約:
Entropy--a measure of randomness--is compulsory for the generation of secure cryptographic keys; however, Internet of Things (IoT) devices that are small or constrained often struggle to collect suf ficient entropy. In this article, we solve the entropy provisioning problem for a fleet of IoT devices that can generate a limited amount of entropy. We employ a Trusted Execution Environment (TEE) based on RISC-V to create an external entropy service for a fleet of IoT devices. A small measure of true entropy or pre-installed keys can establish initial secure communication. Once connected, devices can request cryptographically strong entropy from a TEE-backed server. RISC-V offers True Random Number Generators (TRNGs) and a TEE for devices to attest that they are receiving reliable entropy. In addition, this solution can be expanded by adding IoT devices with sensors that produce high-quality entropy as additional entropy sources for the RISC-V entropy provider. Our open-source implementation shows that building trusted entropy infrastructure for IoT is both feasible and effective on open RISC-V platforms.
要約:
Tor enables anonymous web browsing and access to anonymous onion websites. Prior work has focused on crawling and content analysis rather than on what users actually try to access. Our honeypot approach measures engagement across onion-site categories, revealing behavioral interest rather than inferred popularity. In March--April 2025, we deployed honeypot onion websites and seeded neutral-looking links via three channels -- the Ahmia Tor search engine, Stronghold paste onion "paste" service, and pastebin.com -- to observe discovery and subsequent interaction events (CAPTCHA solves; registration/login attempts). We observe that, almost without exception, human users originate from Ahmia.fi; after removing the honeypot links from the Ahmia.fi search results, visits dropped to nearly zero and no users solved CAPTCHAs. The honeypot landing front pages represent different forums for cybercrime activities -- child sexual abuse, violence, malware, stolen goods, illegal firearms, illegal drugs, and forgery items -- and, as a baseline comparison, an unclear forum. Within that set, the CSAM-themed honeypot drew markedly higher engagement than the other honeypots. When identical sites were offered in multiple languages, interaction events occurred most often on the English-language versions.
要約:
We propose a robust and provably secure image steganography framework based on latent-space iterative optimization. Within this framework, the receiver treats the transmitted image as a fixed reference and iteratively refines a latent variable to minimize the reconstruction error, thereby improving message extraction accuracy. Unlike prior methods, our approach preserves the provable security of the embedding while markedly enhancing robustness under various compression and image processing scenarios. On benchmark datasets, the experimental results demonstrate that the proposed iterative optimization not only improves robustness against image compression while preserving provable security, but can also be applied as an independent module to further reinforce robustness in other provably secure steganographic schemes. This highlights the practicality and promise of latent-space optimization for building reliable, robust, and secure steganographic systems.
要約:
Advanced Persistent Threats (APTs) pose critical challenges to modern cybersecurity due to their multi-stage and stealthy nature. While provenance-based detection approaches show promise in capturing causal attack semantics, current threat provenance practices face two paradoxical issues: (1) expert skepticism, where human analysts doubt the capability of traditional detection models to identify complex attacks; and (2) expert dependence, as analysts cannot manually process large-scale raw logs to detect threats without these models. Consequently, collaboration between humans and traditional models remains the prevailing paradigm. However, this renders investigation quality contingent upon human expertise and frequently results in alert fatigue. To address these challenges, we present ProvAgent, a framework that evolves the threat provenance paradigm from human-model collaboration to a novel collaboration between multi-agent systems and traditional models. ProvAgent leverages the speed and cost-efficiency of traditional models for initial anomaly screening over large-scale logs. By enforcing fine-grained identity-behavior consistency via graph contrastive learning, it profiles entities based on specific attributes to generate high-fidelity alerts. With these alerts serving as investigation entry points, ProvAgent achieves in-depth autonomous investigation through a hypothesis-verification multi-agent framework. Evaluations with real-world datasets demonstrate that ProvAgent outperforms six state-of-the-art (SOTA) baselines in anomaly detection. Through automated investigation, ProvAgent reconstructs near-complete attack processes at a minimum cost of \$0.06 per day.
著者: Abdullah Ghani (Lahore University of Management Sciences), Yash Vekaria (University of California, Davis), Zubair Shafiq (University of California, Davis)
要約:
Tracking pixels are used to optimize online ad campaigns through personalization, re-targeting, and conversion tracking. Past research has primarily focused on detecting the prevalence of tracking pixels on the web, with limited attention to how they are configured across websites. A tracking pixel may be configured differently on different websites. In this paper, we present a differential analysis framework: PixelConfig, to reverse-engineer the configurations of Meta Pixel deployments across the web. Using this framework, we investigate three types of Meta Pixel configurations: activity tracking (i.e., what a user is doing on a website), identity tracking (i.e., who a user is or who the device is associated with), and tracking restrictions (i.e., mechanisms to limit the sharing of potentially sensitive information).
Using data from the Internet Archive's Wayback Machine, we analyze and compare Meta Pixel configurations on 18K health-related websites with a control group of the top 10K websites from 2017 to 2024. We find that activity tracking features, such as automatic events that collect button clicks and page metadata, and identity tracking features, such as first-party cookies that are unaffected by third-party cookie blocking, reached adoption rates of up to 98.4%, largely driven by the Pixel's default settings. We also find that the Pixel is being used to track potentially sensitive information, such as user interactions related to booking medical appointments and button clicks associated with specific medical conditions (e.g., erectile dysfunction) on health-related websites. Tracking restriction features, such as Core Setup, are configured on up to 34.3% of health websites and 8.7% of control websites. However, even when enabled, these tracking restriction features provide limited protection and can be circumvented in practice.
要約:
The growth in the adoption of the WebAssembly (WASM) standard has given rise to a rapidly increasing landscape of binary applications that are natively ported to the environment of websites. The flexibility of WASM has made it the preferred way to run fast and resource-heavy applications, replacing a field that JavaScript previously monopolized. Despite its success, researchers have raised concerns over the security implementations of WASM, demonstrating that binary vulnerabilities, such as Buffer Overflows and Use After Free, remain a present danger for WASM binaries. Our work aims to demonstrate that such vulnerabilities, when occurring on a WebAssembly module, can affect the behavior of a web application in unexpected ways, enabling an attacker to exploit vulnerabilities that are typical of the web security landscape. We provide several scenarios to provide examples of how each binary vulnerability might lead to a web security vulnerability, such as SQL Injections, XS-Leaks, and SSTI. Our results show that binary vulnerabilities can invalidate common security mechanisms that web developer implement in their applications, demonstrating how the security of WASM modules remains a problem that needs to be addressed. We also provide a list of best practices and defensive strategies that developers can implement to mitigate the risks associated with running unsafe WASM modules in their web applications.
要約:
Analyzing Open Source Intelligence (OSINT) from large volumes of data is critical for drafting and publishing comprehensive CTI reports. This process usually follows a three-stage workflow -- triage, deep search and TI drafting. While Large Language Models (LLMs) offer a promising route toward automation, existing benchmarks still have limitations. These benchmarks often consist of tasks that do not reflect real-world analyst workflows. For example, human analysts rarely receive tasks in the form of multiple-choice questions. Also, existing benchmarks often rely on model-centric metrics that emphasize lexical overlap rather than actionable, detailed insights essential for security analysts. Moreover, they typically fail to cover the complete three-stage workflow. To address these issues, we introduce CyberThreat-Eval, which is collected from the daily CTI workflow of a world-leading company. This expert-annotated benchmark assesses LLMs on practical tasks across all three stages as mentioned above. It utilizes analyst-centric metrics that measure factual accuracy, content quality, and operational costs. Our evaluation using this benchmark reveals important insights into the limitations of current LLMs. For example, LLMs often lack the nuanced expertise required to handle complex details and struggle to distinguish between correct and incorrect information. To address these challenges, the CTI workflow incorporates both external ground-truth databases and human expert knowledge. TRA allows human experts to iteratively provide feedback for continuous improvement. The code is available at \href{https://github.com/xschen-beb/CyberThreat-Eval}{\texttt{GitHub}} and \href{https://huggingface.co/datasets/xse/CyberThreat-Eval}{\texttt{HuggingFace}}.
要約:
Diffusion models have made substantial advances in recent years, enabling high-quality image synthesis; however, the widespread dissemination and reuse of their outputs have introduced new challenges in intellectual property protection and content provenance. Image watermarking offers a solution to these challenges, and recent work has increasingly explored Noise-as-Watermark (NaW) approaches that integrate watermarking directly into the diffusion process. However, existing NaW methods fail to balance robustness and diversity. We attribute this weakness to value encoding, which encodes watermark bits into individual sampled values. It is extremely fragile in practical application scenarios. To address this, we encode watermark bits into the structured noise pattern, so that the watermark is preserved even when individual values are perturbed. To further ensure generation diversity, we introduce a dedicated randomization design that reshuffles the positions of noise elements without changing their values, preventing the watermark from inducing fixed noise patterns or spatial locations. Extensive experiments demonstrate that our method achieves state-of-the-art robustness while maintaining high generation quality across a wide range of lossy scenarios.
要約:
Software compartmentalization breaks down an application into compartments isolated from each other: an attacker taking over a compartment will be confined to it, limiting the damage they can cause to the rest of the application. Despite the security promises of this approach, recent studies have shown that most existing compartmentalized software is plagued by vulnerabilities at cross-compartment interfaces, allowing an attacker taking over a compartment to escape its confinement and negate the security guarantees expected from compartmentalization. In that context, securing cross-compartment interfaces is notoriously difficult and engineering-intensive.
In light of recent advances in Automated Program Repair (APR), notably through the use of Large Language Models (LLMs), this paper presents a work in progress investigating the suitability of LLM-based APR at securing cross-compartment interfaces as automatically as possible. We observe that existing APR approaches and general purpose/code-centric LLMs used as is are unfit for this task, and present the design, implementation, and early results of a new APR framework dedicated to compartment interface safety. The framework integrates into a feedback loop 1) a specialized fuzzer uncovering cross-compartment interface vulnerabilities; 2) a patch generation component bridging the lack of compartmentalization awareness of existing LLMs with a series of analysis techniques; and 3) a patch validation component assessing the effectiveness of generated vulnerability fixes. We validate our framework over a sample interface vulnerability, comparing it to a naive use of general-purpose LLMs, and discuss future research avenues.
要約:
Outsourcing encrypted data to the cloud creates a fundamental tension between data privacy and functional searchability. Current Searchable Symmetric Encryption (SSE) solutions frequently have significant limitations, such as excessive metadata leakage, or a lack of fine-grained access control. These issues restrict the scalability of secure searches in real-world applications where multiple clients require different levels of authorization. Our paper proposes MASSE, a dynamic multi-client SSE scheme incorporating attribute-based access control, which expands the OXT framework. With MASSE, clients are restricted sto searching for keywords authorized by their specific attribute sets, and the server remains unaware of the keywords and attributes. MASSE supports practical dynamic updates to documents, and client authorizations, including revocation, without requiring reencryption of the database or indices, or a large number of interactions. We formally prove the security of MASSE, that is, forward and backward privacy under a well-defined leakage profile, and token unforgeability. An experimental evaluation in a database containing 100 keywords, each associated with 150 documents, demonstrates the practical efficiency of MASSE. It takes less than two seconds to generate 10 to 100 keyword queries and 14 seconds to retrieve 50 matching documents. Theoretical results show that MASSE outperforms competing solutions, including OXT, and can be scaled to large encrypted databases. MASSE is also suitable for dynamic cloud deployments.
Keywords: Searchable Encryption, SSE, Multi-Client, Attribute Based SSE, Access Control, Revocation, OXT
要約:
The rapid expansion of Internet use has increased system exposure to cyber threats, with advanced persistent threats (APTs) being especially challenging due to their stealth, prolonged duration, and multi-stage attacks targeting high-value assets. In this study, we model APT evolution as a strategic interaction between an attacker and a defender on an attack graph. With limited information about the attacker's position and progress, the defender acts at random intervals by deploying intrusion detection sensors across the network. Once a compromise is detected, affected components are immediately secured through measures such as backdoor removal, patching, or system reconfiguration. Meanwhile, the attacker begins with reconnaissance and then proceeds through the network, exploiting vulnerabilities and installing backdoors to maintain persistent access and adaptive movement. Furthermore, the attacker may take several steps between consecutive defensive operations, resulting in an asymmetric temporal dynamic. The defender's goal is to reduce the likelihood that the attacker will gain access to a critical asset, whereas the attacker's purpose is to increase this likelihood. We investigate this interaction under three informational regimes, reflecting varying levels of attacker knowledge prior to action: (i) a Stackelberg scenario, in which the attacker has full knowledge of the defender's strategy and can optimize accordingly; (ii) a blind regime, where the attacker has no information and assumes uniform beliefs about defensive deployments; and (iii) a belief-based framework, where the attacker holds accurate probabilistic beliefs about the defender's actions. For each regime, we derive optimal defensive strategies by solving the corresponding optimization problems.
要約:
Benchmarking presence-only passive reconnaissance in smart-grid communications is challenging because the adversary is receive-only, yet nearby observers can still alter propagation through additional shadowing and multipath that reshapes channel coherence. Public smart-grid cybersecurity datasets largely target active protocol- or measurement-layer attacks and rarely provide propagation-driven observables with tiered topology context, which limits reproducible evaluation under strictly passive threat models. This paper introduces an IEEE-inspired, literature-anchored benchmark dataset generator for passive reconnaissance over a tiered Home Area Network (HAN), Neighborhood Area Network (NAN), and Wide Area Network (WAN) communication graph with heterogeneous wireless and wireline links. Node-level time series are produced through a physically consistent channel-to-metrics mapping where channel state information (CSI) is represented via measurement-realistic amplitude and phase proxies that drive inferred signal-to-noise ratio (SNR), packet error behavior, and delay dynamics. Passive attacks are modeled only as windowed excess attenuation and coherence degradation with increased channel innovation, so reliability and latency deviations emerge through the same causal mapping without labels or feature shortcuts. The release provides split-independent realizations with burn-in removal, strictly causal temporal descriptors, adjacency-weighted neighbor aggregates and deviation features, and federated-ready per-node train, validation, and test partitions with train-only normalization metadata. Baseline federated experiments highlight technology-dependent detectability and enable standardized benchmarking of graph-temporal and federated detectors for passive reconnaissance.
要約:
As AI assistants become widely used, privacy-aware platforms like Anthropic's Clio have been introduced to generate insights from real-world AI use. Clio's privacy protections rely on layering multiple heuristic techniques together, including PII redaction, clustering, filtering, and LLM-based privacy auditing. In this paper, we put these claims to the test by presenting CLIOPATRA, the first privacy attack against "privacy-preserving" LLM insight systems. The attack involves a realistic adversary that carefully designs and inserts malicious chats into the system to break multiple layers of privacy protections and induce the leakage of sensitive information from a target user's chat.
We evaluated CLIOPATRA on synthetically generated medical target chats, demonstrating that an adversary who knows only the basic demographics of a target user and a single symptom can successfully extract the user's medical history in 39% of cases by just inspecting Clio's output. Furthermore, CLIOPATRA can reach close to 100% when Clio is configured with other state-of-the-art models and the adversary's knowledge of the target user is increased. We also show that existing ad hoc mitigations, such as LLM-based privacy auditing, are unreliable and fail to detect major leaks. Our findings indicate that even when layered, current heuristic protections are insufficient to adequately protect user data in LLM-based analysis systems.
要約:
News about massive online breaches is increasingly common. But there has been little good data on how exposed people are because of these breaches. We combine data from a large, representative sample of adult Americans (n = 5,000) with data from \textit{Have I Been Pwned} to estimate the lower bound of the average number of breached online accounts per person. We find that at least 82.84% of Americans have had their accounts breached. And that on average Americans' accounts have been breached at least three times. Better educated, the middle-aged, women, and Whites are more likely to have had their accounts breached than the complementary groups.
要約:
This paper contributes to the nascent debate around safety cases for frontier AI systems. Safety cases are structured, defensible arguments that a system is acceptably safe to deploy in a given context. Historically, they have been used in safety-critical industries, such as aerospace, nuclear or automotive. As a result, safety cases for frontier AI have risen in prominence, both in the safety policies of leading frontier developers and in international research agendas proposed by leaders in generative AI, such as the Singapore Consensus on Global AI Safety Research Priorities and the International AI Safety Report. This paper appraises this work. We note that research conducted within the alignment community which draws explicitly on lessons from the assurance community has significant limitations. We therefore aim to rethink existing approaches to alignment safety cases. We offer lessons from existing methodologies within safety assurance and outline the limitations involved in the alignment community's current approach. Building on this foundation, we present a case study for a safety case focused on Deceptive Alignment and CBRN capabilities, drawing on existing, theoretical safety case "sketches" created by the alignment safety case community. Overall, we contribute holistic insights from the field of safety assurance via rigorous theory and methodologies that have been applied in safety-critical contexts. We do so in order to create a better foundational framework for robust, defensible and useful safety case methodologies which can help to assure the safety of frontier AI systems.
要約:
Analyzing large volumes of sensor network data, such as electricity consumption measurements from smart meters, is essential for modern applications but raises significant privacy concerns. Privacy-enhancing technologies like z-anonymity offer efficient anonymization for continuous data streams by suppressing rare values that could lead to re-identification, making it particularly suited for resource-constrained environments. Originally designed for centralized architectures, z-anonymity assumes a trusted central entity. In this paper, we introduce deZent, a decentralized implementation of z-anonymity that minimizes trust in the central entity by realizing local z-anonymity with lightweight coordination. We develop deZent using a stochastic counting structure and secure sum to coordinate private anonymization across the network. Our results show that deZent achieves comparable performance to centralized z-anonymity in terms of publication ratio, while reducing the communication overhead towards the central entity. Thus, deZent presents a promising approach for enhancing privacy in sensor networks while preserving system efficiency.
要約:
Genomic language models (GLMs) have emerged as powerful tools for learning representations of DNA sequences, enabling advances in variant prediction, regulatory element identification, and cross-task transfer learning. However, as these models are increasingly trained or fine-tuned on sensitive genomic cohorts, they risk memorizing specific sequences from their training data, raising serious concerns around privacy, data leakage, and regulatory compliance. Despite growing awareness of memorization risks in general-purpose language models, little systematic evaluation exists for these risks in the genomic domain, where data exhibit unique properties such as a fixed nucleotide alphabet, strong biological structure, and individual identifiability. We present a comprehensive, multi-vector privacy evaluation framework designed to quantify memorization risks in GLMs. Our approach integrates three complementary risk assessment methodologies: perplexity-based detection, canary sequence extraction, and membership inference. These are combined into a unified evaluation pipeline that produces a worst-case memorization risk score. To enable controlled evaluation, we plant canary sequences at varying repetition rates into both synthetic and real genomic datasets, allowing precise quantification of how repetition and training dynamics influence memorization. We evaluate our framework across multiple GLM architectures, examining the relationship between sequence repetition, model capacity, and memorization risk. Our results establish that GLMs exhibit measurable memorization and that the degree of memorization varies across architectures and training regimes. These findings reveal that no single attack vector captures the full scope of memorization risk, underscoring the need for multi-vector privacy auditing as a standard practice for genomic AI systems.
要約:
System prompts for LLM-based coding agents are software artifacts that govern agent behavior, yet lack the testing infrastructure applied to conventional software. We present Arbiter, a framework combining formal evaluation rules with multi-model LLM scouring to detect interference patterns in system prompts. Applied to three major coding agent system prompts: Claude Code (Anthropic), Codex CLI (OpenAI), and Gemini CLI (Google), we identify 152 findings across the undirected scouring phase and 21 hand-labeled interference patterns in directed analysis of one vendor. We show that prompt architecture (monolithic, flat, modular) strongly correlates with observed failure class but not with severity, and that multi-model evaluation discovers categorically different vulnerability classes than single-model analysis. One scourer finding was structural data loss in Gemini CLI's memory system was consistent with an issue filed and patched by Google, which addressed the symptom without addressing the schema-level root cause identified by the scourer. Total cost of cross-vendor analysis: \$0.27 USD.
要約:
Given a dataset of $n$ user-contributed strings, each of length at most $\ell$, a key problem is how to identify all frequent substrings while preserving each user's privacy. Recent work by Bernardini et al. (PODS'25) introduced a $\varepsilon$-differentially private algorithm achieving near-optimal error, but at the prohibitive cost of $O(n^2\ell^4)$ space and processing time. In this work, we present a new $\varepsilon$-differentially private algorithm that retains the same near-optimal error guarantees while reducing space complexity to $O(n \ell+ |\Sigma| )$ and time complexity to $O(n \ell\log |\Sigma| + |\Sigma| )$, for input alphabet $\Sigma$. Our approach builds on a top-down exploration of candidate substrings but introduces two new innovations: (i) a refined candidate-generation strategy that leverages the structural properties of frequent prefixes and suffixes, and (ii) pruning of the search space guided by frequency relations. These techniques eliminate the quadratic blow-ups inherent in prior work, enabling scalable frequent substring mining under differential privacy.
要約:
Dataset condensation (DC) learns a compact synthetic dataset that enables models to match the performance of full-data training, prioritising utility over distributional fidelity. While typically explored for computational efficiency, DC also holds promise for healthcare data democratisation, especially when paired with differential privacy, allowing synthetic data to serve as a safe alternative to real records. However, existing DC methods rely on differentiable neural networks, limiting their compatibility with widely used clinical models such as decision trees and Cox regression. We address this gap using a differentially private, zero-order optimisation framework that extends DC to non-differentiable models using only function evaluations. Empirical results across six datasets, including both classification and survival tasks, show that the proposed method produces condensed datasets that preserve model utility while providing effective differential privacy guarantees - enabling model-agnostic data sharing for clinical prediction tasks without exposing sensitive patient information.
要約:
Delegated quantum computation enables a client with limited quantum capabilities to outsource computations to a more powerful quantum server while preserving correctness and privacy. Verification is crucial in this setting to ensure that the untrusted quantum server performs the computation honestly and returns correct results. A common verification method is the quantum cut-and-choose technique. Inspired by classical verification methods for two-party computation, the client uses the majority of the delegated rounds to test the server's honesty, while keeping the remaining ones for the actual computation. Combining this technique with other methods, such as quantum error correction, could help achieve negligible cheating probabilities for the server; however, such methods can impose significant overheads making implementations unfeasible for the near-term future. In this work, we investigate whether cut-and-choose can yield efficient and secure verifiable quantum computation without additional costly techniques. We find that verifiable delegated quantum computation protocols relying solely on cut-and-choose techniques cannot be secure and efficient at the same time.
要約:
We establish the randomized distributed function computation (RDFC) framework, in which a sender transmits just enough information for a receiver to generate a randomized function of the input data. Describing RDFC as a form of semantic communication, which can be essentially seen as a generalized remote-source-coding problem, we show that security and privacy constraints naturally fit this model, as they generally require a randomization step. Using strong coordination metrics, we ensure (local differential) privacy for every input sequence and prove that such guarantees can be met even when no common randomness is shared between the transmitter and receiver.
This work provides lower bounds on Wyner's common information (WCI), which is the communication cost when common randomness is absent, and proposes numerical techniques to evaluate the other corner point of the RDFC rate region for continuous-alphabet random variables with unlimited shared randomness. Experiments illustrate that a sufficient amount of common randomness can reduce the semantic communication rate by up to two orders of magnitude compared to the WCI point, while RDFC without any shared randomness still outperforms lossless transmission by a large margin. A finite blocklength analysis further confirms that the privacy parameter gap between the asymptotic and non-asymptotic RDFC methods closes exponentially fast with input length. Our results position RDFC as an energy-efficient semantic communication strategy for privacy-aware distributed computation systems.
要約:
Current backdoor defenses assume that neutralizing a known trigger removes the backdoor. We show this trigger-centric view is incomplete: \emph{alternative triggers}, patterns perceptually distinct from training triggers, reliably activate the same backdoor. We estimate the alternative trigger backdoor direction in feature space by contrasting clean and triggered representations, and then develop a feature-guided attack that jointly optimizes target prediction and directional alignment. First, we theoretically prove that alternative triggers exist and are an inevitable consequence of backdoor training. Then, we verify this empirically. Additionally, defenses that remove training triggers often leave backdoors intact, and alternative triggers can exploit the latent backdoor feature-space. Our findings motivate defenses targeting backdoor directions in representation space rather than input-space triggers.
要約:
The temporal assumptions underpinning conventional Identity and Access Management collapse under agentic execution regimes. A sixty-second revocation window permits on the order of $6 \times 10^3$ unauthorized API calls at 100 ops/tick; at AWS Lambda scale, the figure approaches $6 \times 10^5$. This is a coherence problem, not merely a latency problem. We define a Capability Coherence System (CCS) and construct a state-mapping $\varphi : \Sigma_{\rm MESI} \to \Sigma_{\rm auth}$ preserving transition structure under bounded-staleness semantics. A safety theorem bounds unauthorized operations for the execution-count Release Consistency-directed Coherence (RCC) strategy at $D_{\rm rcc} \leq n$, independent of agent velocity $v$ -- a qualitative departure from the $O(v \cdot \mathrm{TTL})$ scaling of time-bounded strategies. Tick-based discrete event simulation across three business-contextualised scenarios (four strategies, ten deterministic seeds each) confirms: RCC achieves a $120\times$ reduction versus TTL-based lease in the high-velocity scenario (50 vs. 6,000 unauthorized operations), and $184\times$ under anomaly-triggered revocation. Zero bound violations across all 120 runs confirm the per-capability safety guarantee. Simulation code: https://github.com/hipvlady/prizm
要約:
We explore the use of level structures to generalize the SQIsign signature scheme. We give a general framework where, given the public key and the commitment, the challenge is to exhibit an isogeny between them with an additional requirement, namely to map a chosen level structure to another. We then instantiate the framework using 1-dimensional and 2-dimensional isogenies. In doing that we provide a new explicit Deuring correspondence for supersingular elliptic curves with level structures and solve new constrained norm equations.
要約:
Role classification involves grouping hosts into related roles. It exposes the logical structure of a network, simplifies network management tasks such as policy checking and network segmentation, and can be used to improve the accuracy of network monitoring and analysis algorithms such as intrusion detection. This paper defines the role classification problem and introduces two practical algorithms that group hosts based on observed connection patterns while dealing with changes in these patterns over time. The algorithms have been implemented in a commercial network monitoring and analysis product for enterprise networks. Results from grouping two enterprise networks show that the number of groups identified by our algorithms can be two orders of magnitude smaller than the number of hosts and that the way our algorithms group hosts highly reflects the logical structure of the networks.
要約:
Large Language Models (LLMs) undergo continuous updates to improve user experience. However, prior research on the security and safety implications of LLMs has primarily focused on their specific versions, overlooking the impact of successive LLM updates. This prompts the need for a holistic understanding of the risks in these different versions of LLMs. To fill this gap, in this paper, we conduct a longitudinal study to examine the adversarial robustness -- specifically misclassification, jailbreak, and hallucination -- of three prominent LLM families: GPT, Llama, and Qwen. Our study reveals that LLM updates do not consistently improve adversarial robustness as expected. For instance, a later version of GPT-3.5 degrades regarding misclassification and hallucination despite its improved resilience against jailbreaks. GPT-4 and GPT-4o demonstrate (incrementally) higher robustness overall. Larger Llama and Qwen models do not uniformly exhibit improved robustness across all three aspects studied. In addition, larger model sizes do not necessarily yield improved robustness. Minor updates lacking substantial robustness improvements can exacerbate existing issues rather than resolve them. We hope our study can offer valuable insights into navigating model updates and informed decisions in model development and usage.
要約:
The Learning with Errors (\LWE) problem has been widely utilized as a foundation for numerous cryptographic tools over the years. In this study, we focus on an algebraic variant of the \LWE problem called \emph{Group ring} \LWE ($\GRLWE$). We select group rings (or their direct summands) that underlie specific families of finite groups constructed by taking the semi-direct product of two cyclic groups. Unlike the Ring-\LWE problem described in \cite{lyubashevsky2010ideal}, the multiplication operation in the group rings considered here is non-commutative. As an extension of Ring-$\LWE$, it maintains computational hardness and can be potentially applied in many cryptographic scenarios. In this paper, we present two polynomial-time quantum reductions. Firstly, we provide a quantum reduction from the worst-case shortest independent vectors problem (\SIVP) in ideal lattices with polynomial approximate factor to the search version of $\GRLWE$. This reduction requires that the underlying group ring possesses certain mild properties; Secondly, we present another quantum reduction for two types of group rings, where the worst-case \SIVP problem is directly reduced to the (average-case) decision $\GRLWE$ problem. The pseudorandomness of $\GRLWE$ samples guaranteed by this reduction can be consequently leveraged to construct semantically secure public-key cryptosystems.
要約:
Text-to-visualization (text-to-vis) models for tabular data have become essential tools in the era of big data, enabling users to generate visualizations and make data-driven decisions through natural language queries (NLQs). Despite their growing adoption, the security vulnerabilities of these models remain largely unexplored. To address this gap, we propose VisPoison, a backdoor attack framework that realistically simulates three types of attacks on text-to-vis models via data poisoning: data exposure, misleading visualizations, and denial-of-service (DoS). Specifically, VisPoison introduces two types of stealthy triggers to enable both proactive and passive backdoor activations. Proactive triggers are deliberately inserted by attackers using rare-word patterns to extract sensitive information, whereas passive triggers are unintentionally activated by users through first-word prompts, resulting in visualization errors or DoS failures. To support these triggers, we craft specialized payloads for visualization queries that allow compromised models to function normally on benign inputs while producing malicious outputs in the presence of triggers. Extensive evaluations on both trainable and in-context learning (ICL)-based text-to-vis models show that VisPoison achieves attack success rates exceeding 90\%, exposing serious vulnerabilities. Additionally, existing defense strategies reveal limited effectiveness against VisPoison, underscoring the urgent need for more robust and security-aware text-to-vis systems to safeguard human-data interaction.
要約:
Ensuring the privacy of votes in an election is crucial for the integrity of a democratic process. Often, voting power is delegated to representatives (e.g., in congress) who subsequently vote on behalf of voters on specific issues. This delegation model is also widely used in Decentralized Autonomous Organizations (DAOs). Although several existing voting systems used in DAOs support private voting, they only offer public delegation. In this paper, we introduce Kite, a new protocol that enables $\textit{private}$ delegation of voting power for DAO members. Voters can freely delegate, revoke, and re-delegate their power without revealing any information about who they delegated to. Even the delegate does not learn who delegated to them. The only information that is recorded publicly is that the voter delegated or re-delegated their vote to someone. Kite accommodates both public and private voting for the delegates themselves. We analyze the security of our protocol within the Universal Composability (UC) framework. We implement Kite as an extension to the existing Governor Bravo smart contract on the Ethereum blockchain, that is widely used for DAO governance. Furthermore, we provide an evaluation of our implementation that demonstrates the practicality of the protocol. The most expensive operation is delegation due to the required zero-knowledge proofs. On a consumer-grade laptop, delegation takes between 7 and 167 seconds depending on the requested level of privacy.
要約:
Large Language Models (LLMs) are increasingly augmented with external tools through standardized interfaces like the Model Context Protocol (MCP). However, current MCP implementations face critical limitations: they typically require local process execution through STDIO transports, making them impractical for resource-constrained environments like mobile devices, web browsers, and edge computing. We present MCP Bridge, a lightweight RESTful proxy that connects to multiple MCP servers and exposes their capabilities through a unified API. Unlike existing solutions, MCP Bridge is fully LLM-agnostic, supporting any backend regardless of vendor. The system implements a risk-based execution model with three security levels-standard execution, confirmation workflow, and Docker isolation-while maintaining backward compatibility with standard MCP clients. However, reliable execution within this framework requires models that can strictly adhere to protocol schemas. To this end, we also fine-tuned the Qwen3 4B and 8B model family on the Agent-Ark/Toucan-1.5M dataset using four Reinforcement Learning techniques: Group Relative Policy Optimization (GRPO), Dr. GRPO, Beta Normalization Policy Optimization (BNPO), and Decoupled Clip and Dynamic sAmpling Policy Optimization (DAPO). Evaluated on the MCPToolBench++ benchmark, our optimized model achieves an F1 score of 73.0% that outperforms GPT-OSS-120B (62.17%) and remains competitive with the 70B+ parameter baselines. Evaluation demonstrates that MCP Bridge successfully addresses the constraints of direct MCP connections while providing enhanced security controls and cross-platform compatibility, enabling sophisticated LLM-powered applications in previously inaccessible environments.
要約:
Website fingerprinting (WF) attacks remain a significant threat to encrypted traffic, prompting the development of a wide range of defenses. Among these, two prominent classes are regularization-based defenses, which shape traffic using fixed padding rules, and supersequence-based approaches, which conceal traces among predefined patterns. In this work, we present a unified framework for designing an adaptive WF defense that combines the effectiveness of regularization with the provable security of supersequence-style grouping. The scheme first extracts behavioural patterns from traces and clusters them into (k,l)-diverse anonymity sets; an early-time-series classifier (adapted from ECDIRE) then switches from a conservative global set of regularization parameters to the lighter, set-specific parameters. We instantiate the design as Adaptive Tamaraw, a variant of Tamaraw that assigns padding parameters on a per-cluster basis while retaining its original information-theoretic guarantee. Comprehensive experiments on public real-world datasets confirm the benefits. By tuning k, operators can trade privacy for efficiency: in its high-privacy mode Adaptive Tamaraw pushes the bound on any attacker's accuracy below 30%, whereas in efficiency-centred settings it cuts total overhead by 99% compared with classic Tamaraw.
要約:
We prove the conjecture stated in Appendix F.3 of [Zhu et al. (2022)]: among all conversion rules that map a R\'enyi Differential Privacy (RDP) profile $\tau \mapsto \rho(\tau)$ to a valid hypothesis-testing trade-off $f$, the rule based on the intersection of single-order RDP privacy regions is optimal. This optimality holds simultaneously for all valid RDP profiles and for all Type I error levels $\alpha$. Concretely, we show that in the space of trade-off functions, the tightest possible bound is $f_{\rho(\cdot)}(\alpha) = \sup_{\tau \geq 0.5} f_{\tau,\rho(\tau)}(\alpha)$: the pointwise maximum of the single-order bounds for each RDP privacy region. Our proof unifies and sharpens the insights of [Balle et al. (2019)], [Asoodeh et al. (2021)], and [Zhu et al. (2022)]. Our analysis relies on a precise geometric characterization of the RDP privacy region, leveraging its convexity and the fact that its boundary is determined exclusively by Bernoulli mechanisms. Our results establish that the "intersection-of-RDP-privacy-regions" rule is not only valid, but optimal: no other black-box conversion can uniformly dominate it in the Blackwell sense, marking the fundamental limit of what can be inferred about a mechanism's privacy solely from its RDP guarantees.
要約:
Autonomous agents powered by large language models introduce a class of execution-layer vulnerabilities -- prompt injection, retrieval poisoning, and uncontrolled tool invocation -- that existing guardrails fail to address systematically. In this work, we propose the Layered Governance Architecture (LGA), a four-layer framework comprising execution sandboxing (L1), intent verification (L2), zero-trust inter-agent authorization (L3), and immutable audit logging (L4). To evaluate LGA, we construct a bilingual benchmark (Chinese original, English via machine translation) of 1,081 tool-call samples -- covering prompt injection, RAG poisoning, and malicious skill plugins -- and apply it to OpenClaw, a representative open-source agent framework. Experimental results on Layer 2 intent verification with four local LLM judges (Qwen3.5-4B, Llama-3.1-8B, Qwen3.5-9B, Qwen2.5-14B) and one cloud judge (GPT-4o-mini) show that all five LLM judges intercept 93.0-98.5% of TC1/TC2 malicious tool calls, while lightweight NLI baselines remain below 10%. TC3 (malicious skill plugins) proves harder at 75-94% IR among judges with meaningful precision-recall balance, motivating complementary enforcement at Layers 1 and 3. Qwen2.5-14B achieves the best local balance (98% IR, approximately 10-20% FPR); a two-stage cascade (Qwen3.5-9B->GPT-4o-mini) achieves 91.9-92.6% IR with 1.9-6.7% FPR; a fully local cascade (Qwen3.5-9B->Qwen2.5-14B) achieves 94.7-95.6% IR with 6.0-9.7% FPR for data-sovereign deployments. An end-to-end pipeline evaluation (n=100) demonstrates that all four layers operate in concert with 96% IR and a total P50 latency of approximately 980 ms, of which the non-judge layers contribute only approximately 18 ms. Generalization to the external InjecAgent benchmark yields 99-100% interception, confirming robustness beyond our synthetic data.
要約:
Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to respond to user instructions with low latency. While existing research on GUI-agent security mainly focuses on manipulating action correctness, the security risks related to response efficiency remain largely unexplored. In this paper, we introduce SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents. The key idea is to manipulate response latency by inducing excessively long reasoning chains under specific trigger patterns. To achieve this, we propose a two-stage reward-level backdoor injection (RBI) strategy that first aligns the long-response format and then learns trigger-aware activation through reinforcement learning. In addition, we design realistic pop-up windows as triggers that naturally appear in GUI environments, improving the stealthiness of the attack. Extensive experiments across multiple datasets and baselines demonstrate that SlowBA can significantly increase response length and latency while largely preserving task accuracy. The attack remains effective even with a small poisoning ratio and under several defense settings. These findings reveal a previously overlooked security vulnerability in GUI agents and highlight the need for defenses that consider both action correctness and response efficiency. Code can be found in https://github.com/tu-tuing/SlowBA.
要約:
Federated learning (FL) enables collaborative training without pooling raw data, but standard FL relies on a central coordinator, which introduces a single point of failure and concentrates trust in the orchestration infrastructure. Decentralized federated learning (DFL) removes the coordinator and replaces client-server orchestration with peer-to-peer coordination, making learning dynamics topology-dependent and reshaping the associated security, privacy, and systems trade-offs. This survey systematically reviews DFL methods from 2018 through early 2026 and organizes them into two architectural families: traditional distributed FL and blockchain-based FL. We then propose a unified, challenge-driven taxonomy that maps both families to the core bottlenecks they primarily address, and we summarize prevailing evaluation practices and their limitations, exposing gaps in the literature. Finally, we distill lessons learned and outline research directions, emphasizing topology-aware threat models, privacy notions that reflect decentralized exposure, incentive mechanisms robust to manipulation, and the need to explicitly define whether the objective is a single global model or personalized solutions in decentralized settings.
要約:
Large Language Models (LLMs) are trained with safety alignment to prevent generating malicious content. Although some attacks have highlighted vulnerabilities in these safety-aligned LLMs, they typically have limitations, such as necessitating access to the model weights or the generation process. Since proprietary models through API-calling do not grant users such permissions, these attacks find it challenging to compromise them. In this paper, we propose Jailbreaking Using LLM Introspection (JULI), which jailbreaks LLMs by manipulating the token log probabilities, using a tiny plug-in block, BiasNet. JULI relies solely on the knowledge of the target LLM's predicted token log probabilities. It can effectively jailbreak API-calling LLMs under a black-box setting and knowing only top-$5$ token log probabilities. Our approach demonstrates superior effectiveness, outperforming existing state-of-the-art (SOTA) approaches across multiple metrics.
要約:
There are various notions of quantum pseudorandomness, such as pseudorandom unitaries (PRUs), pseudorandom state generators (PRSGs) and pseudorandom function-like state generators (PRFSGs). Unlike classical pseudorandomness, where different notions are known to be existentially equivalent, the relations between quantum pseudorandomness notions have yet to be fully established. We present evidence suggesting that some forms of quantum pseudorandomness are unlikely to be constructed from others, indicating that quantum pseudorandomness behaves quite differently from its classical counterpart.
Our main result is a unitary oracle separation where log-length output PRFSGs exist but quantum-computable pseudorandom generators (QPRGs) with negligible correctness error do not. This suggests that the inverse-polynomial error in state-of-the-art constructions of QPRGs from log-length PRSGs is inherent. To achieve this, we prove a novel geometric barrier theorem for the product Haar measure on quantum states, replacing usual concentration inequalities by certifying a non-negligible gap between two large trace-separated sets.
As further evidence that quantum pseudorandomness does not collapse to a single assumption, we obtain separations showing limitations of: (i) deriving ancilla-free PRUs from PRFSGs, and (ii) a natural way of constructing super-log-length PRSGs from log-length PRFSGs. These results highlight technical difficulties when dealing with ancillary registers, measurements, and adaptivity in the quantum setting. We also show an intriguing gentle behavior of intermediate measurements in algorithms producing high-purity outcome states, which may be of independent interest.
All our results are based on (variants of) oracles outputting Haar random quantum states per bit string - a quantum analogue of the random oracle model.