要約:
Smart Voice assistants (SVAs) are widely adopted by youth, yet privacy decision-making in these environments is often characterized by competing considerations rather than clear-cut preferences. While our prior research has examined privacy risks, benefits, trust, and self-efficacy as distinct predictors of behavior, less attention has been paid to how these factors combine into higher-level tension that shapes privacy outcomes. This study introduces a negotiation-based framework for understanding youth privacy decision-making with SVAs by operationalizing two composite indices: the Risk-Benefit Tension Index (RBTI) and the Control-Acceptance Tension Index (CATI), using survey data from 469 Canadian youth aged 16-24. We examine the distribution of these indices and their relationship with privacy-protective behavior and SVA usage. Results show that both indices are meaningfully associated with protective action. Frequent SVA usage exhibits more benefit-dominant and acceptance-leaning negotiation profiles, suggesting that convenience-driven engagement may come at the expense of perceived control. By reframing privacy decision-making as a process of negotiation rather than inconsistency, this study offers a complementary perspective on the privacy paradox and provides a compact measurement approach for capturing how youth navigate competing privacy pressures in voice-enabled ecosystems.
要約:
Verifying the success of computer use agent (CUA) trajectories is a critical challenge: without reliable verification, neither evaluation nor training signal can be trusted. In this paper, we present lessons learned from building a best-in-class verifier for web tasks we call the Universal Verifier. We design the Universal Verifier around four key principles: 1) constructing rubrics with meaningful, non-overlapping criteria to reduce noise; 2) separating process and outcome rewards that yield complementary signals, capturing cases where an agent follows the right steps but gets blocked or succeeds through an unexpected path; 3) distinguishing between controllable and uncontrollable failures scored via a cascading-error-free strategy for finer-grained failure understanding; and 4) a divide-and-conquer context management scheme that attends to all screenshots in a trajectory, improving reliability on longer task horizons. We validate these findings on CUAVerifierBench, a new set of CUA trajectories with both process and outcome human labels, showing that our Universal Verifier agrees with humans as often as humans agree with each other. We report a reduction in false positive rates to near zero compared to baselines like WebVoyager ($\geq$ 45\%) and WebJudge ($\geq$ 22\%). We emphasize that these gains stem from the cumulative effect of the design choices above. We also find that an auto-research agent achieves 70\% of expert quality in 5\% of the time, but fails to discover all strategies required to replicate the Universal Verifier. We open-source our Universal Verifier system along with CUAVerifierBench; available at https://github.com/microsoft/fara.
著者: Jepson Taylor (VEOX Research Group), Chris Brousseau (VEOX Research Group), Jordan Hildebrandt (VEOX Research Group), Kelli Quinn (VEOX Research Group)
要約:
AI IDEs and coding agents compress discovery, fetch, workspace open, installation, and execution into one low-observability loop. Existing defenses such as provenance frameworks, package and repository firewalls, runtime protection, and tool-approval prompts each cover part of that path, but they often leave the final consumer-side execution decision implicit. ZitPit is a 100% open-source Rust system that argues for a stricter boundary: first-seen external artifacts should become durable policy events before they gain execution rights on protected developer or CI hosts. The current public evidence is intentionally narrow and explicit. It includes repeated Git smart-HTTP intake measurements showing that approved artifacts can remain faster than unmanaged public fetch, plus implemented protected-session and governed-egress proof families. The broader contribution is architectural rather than universal-coverage-by-assertion: ZitPit unifies artifact admission, repo-open state, capability-scoped execution, and durable policy records at the consumer execution boundary for agentic workflows.
要約:
Large Language Models (LLMs) and Vision-Language Models (VLMs) remain highly vulnerable to textual and visual jailbreaks, as well as prompt injections (arXiv:2307.15043, Greshake et al., 2023, arXiv:2306.13213). Existing defenses often degrade performance through complex input transformations or treat multimodal threats as isolated problems (arXiv:2309.00614, arXiv:2310.03684, Zhang et al., 2025). To address the critical gap for a unified, modal-agnostic defense that mitigates both textual and visual threats simultaneously without degrading performance or requiring architectural modifications, we introduce SALLIE (Safeguarding Against Latent Language & Image Exploits), a lightweight runtime detection framework rooted in mechanistic interpretability (Lindsey et al., 2025, Ameisen et al., 2025). By integrating seamlessly into standard token-level fusion pipelines (arXiv:2306.13549), SALLIE extracts robust signals directly from the model's internal activations. At inference, SALLIE defends via a three-stage architecture: (1) extracting internal residual stream activations, (2) calculating layer-wise maliciousness scores using a K-Nearest Neighbors (k-NN) classifier, and (3) aggregating these predictions via a layer ensemble module. We evaluate SALLIE on compact, open-source architectures - Phi-3.5-vision-instruct (arXiv:2404.14219), SmolVLM2-2.2B-Instruct (arXiv:2504.05299), and gemma-3-4b-it (arXiv:2503.19786) - prioritized for practical inference times and real-world deployment costs. Our comprehensive evaluation pipeline spans over ten datasets and more than five strong baseline methods from the literature, and SALLIE consistently outperforms these baselines across a wide range of experimental settings.
要約:
The exponential growth of Common Vulnerabilities and Exposures (CVE) disclosures poses significant challenges for enterprise security management, necessitating automated and quantitative risk assessment methodologies. Existing vulnerability analysis approaches suffer from three critical limitations: (1) lack of systematic severity quantification models that integrate heterogeneous attack attributes, (2) insufficient exploration of latent correlations among risk factors, and (3) absence of cumulative risk distribution analysis for prioritized remediation. To address these challenges, we propose MVRAF (Multi-dimensional Vulnerability Risk Assessment Framework), a comprehensive data-driven framework for large-scale CVE security analysis. Our framework introduces three key innovations: (1) a Vulnerability Severity Quantification Model that transforms CVSS attributes into normalized risk metrics through weighted aggregation of exploitability and CIA impact scores, (2) a Risk Factor Correlation Analysis module that captures statistical dependencies among attack vectors, complexity, and privilege requirements via correlation matrices, and (3) an Empirical Risk Distribution mechanism that enables cumulative threat assessment for resource allocation optimization. Extensive experiments on 1,314 real-world CVE records from the National Vulnerability Database demonstrate that our framework effectively identifies risk hotspots, with 46.2% of network-based vulnerabilities classified as high-risk and strong correlations observed between CIA impacts and overall severity scores.
要約:
With the rapid growth of interconnected devices in Industrial and Medical Internet of Things (IIoT and MIoT) ecosystems, ensuring timely and accurate detection of cyber threats has become a critical challenge. This study presents an advanced intrusion detection framework based on a hybrid Squeeze-and-Excitation Attention Vision Transformer-Bidirectional Long Short-Term Memory (SE ViT-BiLSTM) architecture. In this design, the traditional multi-head attention mechanism of the Vision Transformer is replaced with Squeeze-and-Excitation attention, and integrated with BiLSTM layers to enhance detection accuracy and computational efficiency. The proposed model was trained and evaluated on two real-world benchmark datasets; EdgeIIoT and CICIoMT2024; both before and after data balancing using the Synthetic Minority Over-sampling Technique (SMOTE) and RandomOverSampler. Experimental results demonstrate that the SE ViT-BiLSTM model outperforms existing approaches across multiple metrics. Before balancing, the model achieved accuracies of 99.11% (FPR: 0.0013%, latency: 0.00032 sec/inst) on EdgeIIoT and 96.10% (FPR: 0.0036%, latency: 0.00053 sec/inst) on CICIoMT2024. After balancing, performance further improved, reaching 99.33% accuracy with 0.00035 sec/inst latency on EdgeIIoT and 98.16% accuracy with 0.00014 sec/inst latency on CICIoMT2024.
要約:
Software-Defined Networking (SDN) improves network flexibility but also increases the need for reliable and interpretable intrusion detection. Large Language Models (LLMs) have recently been explored for cybersecurity tasks due to their strong representation learning capabilities; however, their lack of transparency limits their practical adoption in security-critical environments. Understanding how LLMs make decisions is therefore essential. This paper presents an attribution-driven analysis of encoder-based LLMs for network intrusion detection using flow-level traffic features. Attribution analysis demonstrates that model decisions are driven by meaningful traffic behavior patterns, improving transparency and trust in transformer-based SDN intrusion detection. These patterns align with established intrusion detection principles, indicating that LLMs learn attack behavior from traffic dynamics. This work demonstrates the value of attribution methods for validating and trusting LLM-based security analysis.
要約:
The Zero-trust (ZT) model is an increasingly popular model that relies on the idea that no trust should be granted to any entity (network, persons, devices) by default. ZT model is gaining attention from both research and practice, with various levels of adequation between research developed and real-life applications. NIST provided a standard to fulfill requirements of ZT architecture of network core but many practical aspects remain unspecified, some of them requiring solving first research challenges in order to be implemented efficiently. An example of such an unspecified field is the integration of IoT/Smart Peripheral Devices (SPD). Various reasons explain this gap: specificities of such resources (possibly lower energy/computation power), their lifecycle, and their use, strongly depending on the use of the whole platform IoT devices are part of. Moreover, additional difficulty to have a good understanding is induced by the fact that both Zero Trust and IoT are identified as promising trends in cybersecurity: many vendors/researchers tag their solutions as IoT integration into the ZT model, with little to no effective compliance to ZT model or standard. Industry is providing many practice-oriented literature, that has to be compared to academic work and standards, in order to consolidate the current state of knowledge and solutions offered to realize this integration. In this paper, we conduct a literature review of non-academic publications, in order to consolidate current knowledge, trends, and future challenges for the industrial integration of IoT devices in ZT architecture.
要約:
In recent years, the pace of development of information technology in various areas has increased drastically, forcing cybersecurity specialists to constantly review existing processes in order to prevent unauthorized access to confidential information. Using Ukraine as a primary case study, this paper explores the integration of international best practices, specifically ISO/IEC 27001 and the NIST Cybersecurity Framework, into national regulatory systems. A focus is placed on the transition from traditional compliance models to risk-based approaches, exemplified by the recent adoption of the Ukrainian normative documents. Furthermore, we propose a methodology for automating the development of target security profiles using Large Language Models (LLMs) enhanced by RetrievalAugmented Generation (RAG). By integrating a vector database of national regulations and organizational policies, the proposed RAG-based advisor reduces manual complexity, minimizes human error, and ensures alignment between technical controls and legal requirements. This study contributes to the field by providing a structured workflow for AI-assisted cybersecurity management in environments characterized by high-intensity hybrid threats.
要約:
Autonomous AI agents powered by Large Language Models can reason, plan, and execute complex tasks, but their ability to autonomously retrieve information and run code introduces significant security risks. Existing approaches attempt to regulate agent behavior through training or prompting, which does not offer fundamental security guarantees. We present ClawLess, a security framework that enforces formally verified policies on AI agents under a worst-case threat model where the agent itself may be adversarial. ClawLess formalizes a fine-grained security model over system entities, trust scopes, and permissions to express dynamic policies that adapt to agents' runtime behavior. These policies are translated into concrete security rules and enforced through a user-space kernel augmented with BPF-based syscall interception. This approach bridges the formal security model with practical enforcement, ensuring security regardless of the agent's internal design.
要約:
Vision-Language Models (VLMs) have become essential for tasks such as image synthesis, captioning, and retrieval by aligning textual and visual information in a shared embedding space. Yet, this flexibility also makes them vulnerable to malicious prompts designed to produce unsafe content, raising critical safety concerns. Existing defenses either rely on blacklist filters, which are easily circumvented, or on heavy classifier-based systems, both of which are costly and fragile under embedding-level attacks. We address these challenges with two complementary components: Hyperbolic Prompt Espial (HyPE) and Hyperbolic Prompt Sanitization (HyPS). HyPE is a lightweight anomaly detector that leverages the structured geometry of hyperbolic space to model benign prompts and detect harmful ones as outliers. HyPS builds on this detection by applying explainable attribution methods to identify and selectively modify harmful words, neutralizing unsafe intent while preserving the original semantics of user prompts. Through extensive experiments across multiple datasets and adversarial scenarios, we prove that our framework consistently outperforms prior defenses in both detection accuracy and robustness. Together, HyPE and HyPS offer an efficient, interpretable, and resilient approach to safeguarding VLMs against malicious prompt misuse.
要約:
In this paper, we analyze and improve the adversarial robustness of a convolutional neural network (CNN) that assists crystal-collimator alignment at CERN's Large Hadron Collider (LHC) by classifying a beam-loss monitor (BLM) time series during crystal rotation. We formalize a local robustness property for this classifier under an adversarial threat model based on real-world plausibility. Building on established parameterized input-transformation patterns used for transformation- and semantic-perturbation robustness, we instantiate a preprocessing-aware wrapper for our deployed time-series pipeline: we encode time-series normalization, padding constraints, and structured perturbations as a lightweight differentiable wrapper in front of the CNN, so that existing gradient-based robustness frameworks can operate on the deployed pipeline. For formal verification, data-dependent preprocessing such as per-window z-normalization introduces nonlinear operators that require verifier-specific abstractions. We therefore focus on attack-based robustness estimates and pipeline-checked validity by benchmarking robustness with the frameworks Foolbox and ART. Adversarial fine-tuning of the resulting CNN improves robust accuracy by up to 18.6 % without degrading clean accuracy. Finally, we extend robustness on time-series data beyond single windows to sequence-level robustness for sliding-window classification, introduce adversarial sequences as counterexamples to a temporal robustness requirement over full scans, and observe attack-induced misclassifications that persist across adjacent windows.
要約:
Given the growing reliance on private data in training Large Language Models (LLMs), Federated Learning (FL) combined with Parameter-Efficient Fine-Tuning (PEFT) has garnered significant attention for enhancing privacy and efficiency. Despite FL's privacy benefits, prior studies have shown that private data can still be extracted from shared gradients. However, these studies, mainly on full-parameter model training, are limited to reconstructing small batches, short input sequences, and specific model architectures, such as encoder-based or decoder-based models. The reconstruction quality becomes even worse when dealing with gradients from PEFT methods. To fully understand the practical attack surface of federated LLMs, this paper proposes FedSpy-LLM, a scalable and generalizable data reconstruction attack designed to reconstruct training data with larger batch sizes and longer sequences while generalizing across diverse model architectures, even when PEFT methods are deployed for training. At the core of FedSpy-LLM is a novel gradient decomposition strategy that exploits the rank deficiency and subspace structure of gradients, enabling efficient token extraction while preserving key signal components at scale. This approach further mitigates the reconstruction challenges introduced by PEFT's substantial null space, ensuring robustness across encoder-based, decoder-based, and encoder-decoder model architectures. Additionally, by iteratively aligning each token's partial-sequence gradient with the full-sequence gradient, FedSpy-LLM ensures accurate token ordering in reconstructed sequences.
要約:
The rapid evolution of intelligent networks under the Internet of Everything (IoE) paradigm is transforming connectivity by integrating people, processes, data, and things. This ecosystem includes domains such as the Internet of Things (IoT), Internet of Healthcare (IoH), Internet of Vehicles (IoV), and cyber-physical and human-machine systems. While enabling efficiency and automation, this interconnectivity also exposes critical infrastructures to increasingly sophisticated cyber threats, creating an urgent need for advanced security solutions. This chapter examines the integration of Blockchain and Artificial Intelligence (AI) as complementary approaches for securing intelligent networks. Blockchain provides decentralized, immutable, and transparent mechanisms that strengthen data integrity, trust, and accountability. In parallel, AI offers predictive analytics, anomaly detection, and adaptive defense capabilities to enable proactive threat identification and mitigation. The chapter discusses how Blockchain supports security in cyber-physical systems, how AI enables proactive security operations, and how their combination creates robust, adaptive, and trustworthy security frameworks. The chapter also explores the emerging role of large language models in threat intelligence and analyzes how controlled agentic AI can support bounded security workflows such as alert triage, evidence collection, and policy-aware response planning. Representative case studies illustrate the potential of these technologies to enhance cyber resilience. Finally, challenges related to scalability, energy efficiency, and ethical considerations are addressed, along with reported mitigation strategies and future research directions. Overall, this chapter provides researchers, practitioners, and policymakers with insights to design secure, resilient, and adaptable intelligent networks.
要約:
Web agents automate browser tasks, ranging from simple form completion to complex workflows like ordering groceries. While current benchmarks evaluate general-purpose performance~(e.g., WebArena) or safety against malicious actions~(e.g., SafeArena), no existing framework assesses an agent's ability to successfully execute user-facing website security and privacy tasks, such as managing cookie preferences, configuring privacy-sensitive account settings, or revoking inactive sessions. To address this gap, we introduce WebSP-Eval, an evaluation framework for measuring web agent performance on website security and privacy tasks. WebSP-Eval comprises 1) a manually crafted task dataset of 200 task instances across 28 websites; 2) a robust agentic system supporting account and initial state management across runs using a custom Google Chrome extension; and 3) an automated evaluator. We evaluate a total of 8 web agent instantiations using state-of-the-art multimodal large language models, conducting a fine-grained analysis across websites, task categories, and UI elements. Our evaluation reveals that current models suffer from limited autonomous exploration capabilities to reliably solve website security and privacy tasks, and struggle with specific task categories and websites. Crucially, we identify stateful UI elements such as toggles and checkboxes are a primary reason for agent failure, failing at a rate of more than 45\% in tasks containing these elements across many models.
要約:
LLM agents increasingly draft messages on behalf of users, yet users routinely overshare sensitive information and disagree on what counts as private. Existing systems support only suppression (omitting sensitive information) and generalization (replacing information with an abstraction), and are typically evaluated on single isolated messages, leaving both the strategy space and evaluation setting incomplete. We formalize privacy-preserving LLM communication as an \textbf{Information Sufficiency (IS)} task, introduce \textbf{free-text pseudonymization} as a third strategy that replaces sensitive attributes with functionally equivalent alternatives, and propose a \textbf{conversational evaluation protocol} that assesses strategies under realistic multi-turn follow-up pressure. Across 792 scenarios spanning three power-relation types (institutional, peer, intimate) and three sensitivity categories (discrimination risk, social cost, boundary), we evaluate seven frontier LLMs on privacy at two granularities, covertness, and utility. Pseudonymization yields the strongest privacy\textendash utility tradeoff overall, and single-message evaluation systematically underestimates leakage, with generalization losing up to 16.3 percentage points of privacy under follow-up.
要約:
CubeSats have revolutionized access to space by providing affordable and accessible platforms for research and education. However, their reliance on Commercial Off-The-Shelf (COTS) components and open-source software has introduced significant cybersecurity vulnerabilities. Ensuring the cybersecurity of CubeSats is vital as they play increasingly important roles in space missions. Traditional security measures, such as intrusion detection systems (IDS), are impractical for CubeSats due to resource constraints and unique operational environments. This paper provides an in-depth review of current cybersecurity practices for CubeSats, highlighting limitations and identifying gaps in existing methods. Additionally, it explores non-cyber anomaly detection techniques that offer insights into adaptable algorithms and deployment strategies suitable for CubeSat constraints. Open research problems are identified, including the need for resource-efficient intrusion detection mechanisms, evaluation of IDS solutions under realistic mission scenarios, development of autonomous response systems, and creation of cybersecurity frameworks. The addition of TinyML into CubeSat systems is explored as a promising solution to address these challenges, offering resource-efficient, real-time intrusion detection capabilities. Future research directions are proposed, such as integrating cybersecurity with health monitoring systems, and fostering collaboration between cybersecurity researchers and space domain experts.
要約:
We prove that no continuous, utility-preserving wrapper defense-a function $D: X\to X$ that preprocesses inputs before the model sees them-can make all outputs strictly safe for a language model with connected prompt space, and we characterize exactly where every such defense must fail. We establish three results under successively stronger hypotheses: boundary fixation-the defense must leave some threshold-level inputs unchanged; an $\epsilon$-robust constraint-under Lipschitz regularity, a positive-measure band around fixed boundary points remains near-threshold; and a persistent unsafe region under a transversality condition, a positive-measure subset of inputs remains strictly unsafe. These constitute a defense trilemma: continuity, utility preservation, and completeness cannot coexist. We prove parallel discrete results requiring no topology, and extend to multi-turn interactions, stochastic defenses, and capacity-parity settings. The results do not preclude training-time alignment, architectural changes, or defenses that sacrifice utility. The full theory is mechanically verified in Lean 4 and validated empirically on three LLMs.
要約:
Symbolic execution detects vulnerabilities with precision, but applying it to large codebases requires harnesses that set up symbolic state, model dependencies, and specify assertions. Writing these harnesses has traditionally been a manual process requiring expert knowledge, which significantly limits the scalability of the technique. We present Static Analysis Informed and LLM-Orchestrated Symbolic Execution (SAILOR), which automates symbolic execution harness construction by combining static analysis with LLM-based synthesis. SAILOR operates in three phases: (1) static analysis identifies candidate vulnerable locations and generates vulnerability specifications; (2) an LLM uses vulnerability specifications and orchestrates harness synthesis by iteratively refining drivers, stubs, and assertions against compiler and symbolic execution feedback; symbolic execution then detects vulnerabilities using the generated harness, and (3) concrete replay validates the symbolic execution results against the unmodified project source. This design combines the scalability of static analysis, the code reasoning of LLMs, the path precision of symbolic execution, and the ground truth produced by concrete execution. We evaluate SAILOR on 10 open-source C/C++ projects totaling 6.8 M lines of code. SAILOR discovers 379 distinct, previously unknown memory-safety vulnerabilities (421 confirmed crashes). The strongest of five baselines we compare SAILOR to (agentic vulnerability detection using Claude Code with full codebase access and unlimited interaction), finds only 12 vulnerabilities. Each phase of SAILOR is critical: Without static analysis targeting confirmed vulnerabilities drop 12.2X; without iterative LLM synthesis zero vulnerabilities are confirmed; and without symbolic execution no approach can detect more than 12 vulnerabilities.
要約:
OpenClaw's ClawHub marketplace hosts over 13,000 community-contributed agent skills, and between 13% and 26% of them contain security vulnerabilities according to recent audits. Regex scanners miss obfuscated payloads; formal static analyzers cannot read the natural language instructions in SKILL.md files where prompt injection and social engineering attacks hide. Neither approach handles both modalities. SkillSieve is a three-layer detection framework that applies progressively deeper analysis only where needed. Layer 1 runs regex, AST, and metadata checks through an XGBoost-based feature scorer, filtering roughly 86% of benign skills in under 40ms on average at zero API cost. Layer 2 sends suspicious skills to an LLM, but instead of asking one broad question, it splits the analysis into four parallel sub-tasks (intent alignment, permission justification, covert behavior detection, cross-file consistency), each with its own prompt and structured output. Layer 3 puts high-risk skills before a jury of three different LLMs that vote independently and, if they disagree, debate before reaching a verdict. We evaluate on 49,592 real ClawHub skills and adversarial samples across five evasion techniques, running the full pipeline on a 440 ARM single-board computer. On a 400-skill labeled benchmark, SkillSieve achieves 0.800 F1, outperforming ClawVet's 0.421, at an average cost of 0.006 per skill. Code, data, and benchmark are open-sourced.
要約:
Concept drift and adversarial evasion are two major challenges for deploying machine learning-based malware detectors. While both have been studied separately, their combination, the adversarial robustness of drift-adaptive detectors, remains unexplored. We address this problem with AdvDA, a recent malware detector that uses adversarial domain adaptation to align a labeled source domain with a target domain with limited labels. The distribution shift between domains poses a unique challenge: robustness learned on the source may not transfer to the target, and existing defenses assume a fixed distribution. To address this, we propose a universal robustification framework that fine-tunes a pretrained AdvDA model on adversarially transformed inputs, agnostic to the attack type and choice of transformations. We instantiate it with five defense variants spanning two threat models: white-box PGD attacks in the feature space and black-box MalGuise attacks that modify malware binaries via functionality-preserving control-flow mutations. Across nine defense configurations, five monthly adaptation windows on Windows malware, and three false-positive-rate operating points, we find the undefended AdvDA completely vulnerable to PGD (100% attack success) and moderately to MalGuise (13%). Our framework reduces these rates to as low as 3.2% and 5.1%, respectively, but the optimal strategy differs: source adversarial training is essential for PGD defenses yet counterproductive for MalGuise defenses, where target-only training suffices. Furthermore, robustness does not transfer across these two threat models. We provide deployment recommendations that balance robustness, detection accuracy, and computational cost.
要約:
While recent approaches leverage large language models (LLMs) and multi-agent pipelines to automatically generate proof-of-concept (PoC) exploits from vulnerability reports, existing systems often suffer from two fundamental limitations: unreliable validation based on surface-level execution signals and high operational cost caused by extensive trial-and-error during exploit generation. In this paper, we present PoC-Adapt, an end-to-end framework for automated PoC generation and verification, architected upon a foundation semantic runtime validation and adaptive policy learning. At the core of PoC-Adapt is a Semantic Oracle that validates exploits by comparing structured pre- and post-execution system states, enabling reliable distinction between true vulnerability exploitation and incidental behavioral changes. To reduce exploration cost, we further introduce an Adaptive Policy Learning mechanism that learns an exploitation policy over semantic states and actions, guiding the exploit agent toward effective strategies with fewer failed attempts. PoC-Adapt is implemented as a multi-agent system comprising specialized agents for root cause analysis, environment building, exploit generation, and semantic validation, coordinated through structured feedback loops. Experimenting on the CWE-Bench-Java and PrimeVul benchmarks shows that PoC-Adapt significantly improves verification reliability by 25% and reduces exploit generation cost compared to prior LLM-based systems, highlighting the importance of semantic validation and learned action policies in automated vulnerability reproduction. Applied to the latest CVE corpus, PoC-Adapt confirmed 12 verified PoC out of 80 reproduce attempts at a cost of $0.42 per generated exploit
要約:
Recent advancements in Large Language Models (LLMs) have sparked interest in their application to Static Application Security Testing (SAST), primarily due to their superior contextual reasoning capabilities compared to traditional symbolic or rule-based methods. However, existing LLM-based approaches typically attempt to replace human experts directly without integrating effectively with existing SAST tools. This lack of integration results in ineffectiveness, including high rates of false positives, hallucinations, limited reasoning depth, and excessive token usage, making them impractical for industrial deployment. To overcome these limitations, we present a paradigm shift that reorchestrates the SAST workflow from current LLM-assisted structure to a new LLM-centered workflow. We introduce Argus (Agentic and Retrieval-Augmented Guarding System), the first multi-agent framework designed specifically for vulnerability detection. Argus incorporates three key novelties: comprehensive supply chain analysis, collaborative multi-agent workflows, and the integration of state-of-the-art techniques such as Retrieval-Augmented Generation (RAG) and ReAct to minimize hallucinations and enhance reasoning. Extensive empirical evaluation demonstrates that Argus significantly outperforms existing methods by detecting a higher volume of true vulnerabilities while simultaneously reducing false positives and operational costs. Notably, Argus has identified several critical zero-day vulnerabilities with CVE assignments.
要約:
Effective detection of unknown network security threats in multi-class imbalanced environments is critical for maintaining cyberspace security. Current methods focus on learning class representations but face challenges with unknown threat detection, class imbalance, and lack of interpretability, limiting their practical use. To address this, we propose RPM-Net, a novel framework that introduces reciprocal point mechanism to learn "non-class" representations for each known attack category, coupled with adversarial margin constraints that provide geometric interpretability for unknown threat detection. RPM-Net++ further enhances performance through Fisher discriminant regularization. Experimental results show that RPM-Net achieves superior performance across multiple metrics including F1-score, AUROC, and AUPR-OUT, significantly outperforming existing methods and offering practical value for real-world network security applications. Our code is available at:https://github.com/chiachen-chang/RPM-Net
要約:
Recent standards such as RSL address AI content policy declaration -- telling AI systems what the licensing terms are. However, no existing system provides audit infrastructure -- tamper-evident licensing transaction records with independently verifiable proofs that those records have not been retroactively modified. We describe Aegon, a protocol that extends standard JWT tokens with content-specific licensing claims and maintains a Certificate Transparency-style Merkle tree over an append-only transaction ledger, enabling third-party auditors to independently verify that specific content licensing transactions were recorded and have not been retroactively modified. Publishers validate tokens at the edge using standard JWKS with no broker dependency in the content delivery path. A signed provenance event log tracks content through AI transformation stages (chunking, embedding, retrieval, citation), bound to ledger entries by transaction ID. We further describe hardware-attested compliance receipts for on-device Android AI agents using StrongBox secure element attestation -- to our knowledge, the first application of hardware-attested compliance receipts to AI content licensing. Existing DRM systems use hardware-backed keys for content decryption but do not produce verifiable compliance receipts for audit trails. We describe a reference architecture and an evaluation methodology for measuring protocol overhead. The protocol runs entirely over standard HTTPS and is designed to complement existing licensing standards rather than replace them.
要約:
Quantum computing simulators form the classical software foundation on which virtually all quantum algorithm research depends. We present Broken Quantum, the first comprehensive formal security audit of the open-source quantum computing simulator ecosystem. Applying COBALT QAI -- a four-module static analysis engine backed by the Z3 SMT solver -- we analyze 45 open-source quantum simulation frameworks from 22 organizations spanning 12 countries. We identify 547 security findings (40 CRITICAL, 492 HIGH, 15 MEDIUM) across four vulnerability classes: CWE-125/190 (C++ memory corruption), CWE-400 (Python resource exhaustion), CWE-502/94 (unsafe deserialization and code injection), and CWE-77/22 (QASM injection -- a novel, quantum-specific attack vector with no classical analog). All 13 vulnerability patterns are formally verified via Z3 satisfiability proofs (13/13 SAT). The 32-qubit boundary emerges as a consistent formal threshold in both C++ and Python vulnerability chains. Supply chain analysis identifies the first documented case of vulnerability transfer from a commercial quantum framework into US national laboratory infrastructure (IBM Qiskit Aer to XACC/Oak Ridge National Laboratory). Nine frameworks score 100/100 under all four scanners; Qiskit Aer,Cirq, tequila, PennyLane, and 5 others score 0/100.
要約:
In video conferencing, human faces serve as the primary visual focal points, playing multifaceted roles that enhance visual communication and emotional connection. However, we argue that a human face is also a side channel, which can unwittingly leak on-screen information through online video feeds. To demonstrate this, we conduct feasibility studies, which reveal that, illuminated by both ambient light and light emitted from displays, the human face can reflect optical variations of different on-screen content. The paper then proposes FaceTell, a novel side-channel attack system that eavesdrops on fine-grained application activities from pervasive yet subtle facial reflections during video conferencing. We implement FaceTell in a real-world testbed with three different brands of laptops and four mainstream video conferencing platforms. FaceTell is then evaluated with 24 human subjects across 13 unique indoor environments. With more than 12 hours of video data, FaceTell achieves a high accuracy of 99.32% for eavesdropping on 28 popular applications and is resilient to many practical impact factors. Finally, potential countermeasures are proposed to mitigate this new attack.
要約:
The lead marketing ecosystem enables collection, sale, and use of personal data submitted via web forms to deliver personalized quotes in high-value verticals such as insurance. Despite its scale and sensitivity of the collected data, this ecosystem remains largely unexplored by the research community. We present the first empirical study of privacy and spam risks in lead marketing, developing an end-to-end measurement framework to trace data flows from data collection to consumer contact. Our setup instruments over 100 health-related lead-generation websites and monitors 200 controlled phone numbers and email addresses to understand downstream marketing practices. We observe sharing of highly personal and sensitive health information to more than 70 distinct third parties on these lead generation websites. By purchasing our own and other organic leads from three major lead platforms, we uncover deceptive brokerage practices, where consumer data is sold to unvetted buyers and often augmented or fabricated with attributes such as health status and weight. We received a total of over 8,000 telemarketing phone calls, 600 text messages, and 200 emails, where calls often began within seconds of form submission. Many campaigns relied on VoIP-based neighbor spoofing and high-frequency dialing, at times rendering phones unusable. Our experiments with phone and email opt-outs suggest phone-based opt-outs to help the most, although all were ineffective at completely stopping marketing communications. Analysis of 7,432 Better Business Bureau (BBB) complaints and reviews corroborates these findings from the consumer perspective. Overall, our results reveal a highly interconnected and non-compliant lead marketing ecosystem that aggressively monetizes sensitive consumer data.
要約:
Security Information and Event Management (SIEM) systems make it possible for detecting intrusion anomalies in real-time manner by their applied security rules. However, the heterogeneity of vendor-specific rules (e.g., Splunk SPL, Microsoft KQL, IBM AQL, Google YARA-L, and RSA ESA) makes cross-platform rule reuse extremely difficult, requiring deep domain knowledge for reliable conversion. As a result, an autonomous and accurate rule conversion framework can significantly lead to effort savings, preserving the value of existing rules. In this paper, we propose ARuleCon, an agentic SIEM-rule conversion approach. Using ARuleCon, the security professionals do not need to distill the source rules' logic, the documentation of the target rules and ARuleCon can purposely convert to the target vendors without more intervention. To achieve this, ARuleCon is equipped with conversion/schema mismatches, and Python-based consistency check that running both source and target rules in controlled test environments to mitigate subtle semantic drifts. We present a comprehensive evaluation of ARuleCon ranging from textual alignment and the execution success, showcasing ARuleCon can convert rules with high fidelity, outperforming the baseline LLM model by 15% averagely. Finally, we perform case studies and interview with our industry collaborators in Singtel Singapore, which showcases that ARuleCon can significantly save expert's time on understanding cross-SIEM's documentation and remapping logic.
要約:
Skill-based agent systems tackle complex tasks by composing reusable skills, improving modularity and scalability while introducing a largely unexamined security attack surface. We propose SkillTrojan, a backdoor attack that targets skill implementations rather than model parameters or training data. SkillTrojan embeds malicious logic inside otherwise plausible skills and leverages standard skill composition to reconstruct and execute an attacker-specified payload. The attack partitions an encrypted payload across multiple benign-looking skill invocations and activates only under a predefined trigger. SkillTrojan also supports automated synthesis of backdoored skills from arbitrary skill templates, enabling scalable propagation across skill-based agent ecosystems. To enable systematic evaluation, we release a dataset of 3,000+ curated backdoored skills spanning diverse skill patterns and trigger-payload configurations. We instantiate SkillTrojan in a representative code-based agent setting and evaluate both clean-task utility and attack success rate. Our results show that skill-level backdoors can be highly effective with minimal degradation of benign behavior, exposing a critical blind spot in current skill-based agent architectures and motivating defenses that explicitly reason about skill composition and execution. Concretely, on EHR SQL, SkillTrojan attains up to 97.2% ASR while maintaining 89.3% clean ACC on GPT-5.2-1211-Global.
要約:
Current LLM-based services typically require users to submit raw text regardless of its sensitivity. While intuitive, such practice introduces substantial privacy risks, as unauthorized access may expose personal, medical, or legal information. Although prior defenses strived to mitigate these risks, they often incur substantial computational overhead and degrade model performance. To overcome this privacy-efficiency trade-off, we introduce Privacy-Preserving Fine-Tuning (PPFT), a novel training pipeline that eliminates the need for transmitting raw prompt text while maintaining a favorable balance between privacy preservation and model utility for both clients and service providers. Our approach operates in two stages: first, we train a client-side encoder together with a server-side projection module and LLM, enabling the server to condition on k-pooled prompt embeddings instead of raw text; second, we fine-tune the projection module and LLM on private, domain-specific data using noise-injected embeddings, allowing effective adaptation without exposing plain text prompts and requiring access to the decoder's internal parameters. Extensive experiments on domain-specific and general benchmarks demonstrate that PPFT achieves a striking balance between privacy and utility, maintaining competitive performance with minimal degradation compared to noise-free upper bounds.
要約:
As high quality public data becomes scarce, Federated Learning (FL) provides a vital pathway to leverage valuable private user data while preserving privacy. However, real-world client data often contains toxic or unsafe information. This leads to a critical issue we define as unintended data poisoning, which can severely damage the safety alignment of global models during federated alignment. To address this, we propose FedDetox, a robust framework tailored for Small Language Models (SLMs) on resource-constrained edge devices. We first employ knowledge distillation to transfer sophisticated safety alignment capabilities from large scale safety aligned teacher models into light weight student classifiers suitable for resource constrained edge devices. Specifically, during federated learning for human preference alignment, the edge client identifies unsafe samples at the source and replaces them with refusal templates, effectively transforming potential poisons into positive safety signals. Experiments demonstrate that our approach preserves model safety at a level comparable to centralized baselines without compromising general utility.
要約:
While Chain-of-Thought (CoT) prompting has become a standard paradigm for eliciting complex reasoning capabilities in Large Language Models, it inadvertently exposes a new attack surface for backdoor attacks. Existing CoT backdoor attacks typically manipulate the intermediate reasoning steps to steer the model toward incorrect answers. However, these corrupted reasoning traces are readily detected by prevalent process-monitoring defenses. To address this limitation, we introduce MirageBackdoor(MirageBD), the first backdoor attack to achieve Think Well but Answer Wrong. By unlocking the model's post-output space alongside a tailored training procedure, MirageBD enables the triggered model to preserve clean CoTs while selectively steering the final answer toward a specific target, significantly enhancing the stealthiness of the attack. Experiments show that MirageBD generally achieves over 90% attack success rate across four datasets and five models with a poison ratio of only 5%. Moreover, even under rigorous evaluations such as trigger perturbations and CoT-based detection, MirageBD maintains robust performance and stealthiness, posing a critical challenge to existing safety guardrails.
要約:
Data leakage is the inadvertent transfer of information between training and evaluation datasets that poses a subtle, yet critical, risk to the reliability of machine learning (ML) models in safety-critical systems such as automotive perception. While leakage is widely recognized in research, little is known about how industrial practitioners actually perceive and manage it in practice. This study investigates practitioners' knowledge, experiences, and mitigation strategies around data leakage through ten semi-structured interviews with system design, development, and verification engineers working on automotive perception functions development. Using reflexive thematic analysis, we identify that knowledge of data leakage is widespread and fragmented along role boundaries: ML engineers conceptualize it as a data-splitting or validation issue, whereas design and verification roles interpret it in terms of representativeness and scenario coverage. Detection commonly arises through generic considerations and observed performance anomalies rather than implying specific tools. However, data leakage prevention is more commonly practiced, which depends mostly on experience and knowledge sharing. These findings suggest that leakage control is a socio-technical coordination problem distributed across roles and workflows. We discuss implications for ML reliability engineering, highlighting the need for shared definitions, traceable data practices, and continuous cross-role communication to institutionalize data leakage awareness within automotive ML development.
要約:
Ensuring ciphertext indistinguishability is fundamental to cryptographic security, but empirically validating this property in real implementations and hybrid settings presents practical challenges. The transition to post-quantum cryptography (PQC), with its hybrid constructions combining classical and quantum-resistant primitives, makes empirical validation approaches increasingly valuable. By modeling IND-CPA games as binary classification tasks and training on labeled ciphertext data with BCE loss, we study deep neural network (DNN) distinguishers for ciphertext indistinguishability. We apply this methodology to PQC KEMs. We specifically test the public-key encryption (PKE) schemes used to construct examples such as ML-KEM, BIKE, and HQC. Moreover, a novel extension of this DNN modeling for empirical distinguishability testing of hybrid KEMs is presented. We implement and test this on combinations of PQC KEMs with plain RSA, RSA-OAEP, and plaintext. Finally, methodological generality is illustrated by applying the DNN IND-CPA classification framework to cascade symmetric encryption, where we test combinations of AES-CTR, AES-CBC, AES-ECB, ChaCha20, and DES-ECB. In our experiments on PQC algorithms, KEM combiners, and cascade encryption, no algorithm or combination of algorithms demonstrates a significant advantage (two-sided binomial test, significance level $\alpha = 0.01$), consistent with theoretical guarantees that hybrids including at least one IND-CPA-secure component preserve indistinguishability, and with the absence of exploitable patterns under the considered DNN adversary model. These illustrate the potential of using deep learning as an adaptive, practical, and versatile empirical estimator for indistinguishability in more general IND-CPA settings, allowing data-driven validation of implementations and compositions and complementing the analytical security analysis.
要約:
Software vulnerabilities continue to pose significant threats to modern information systems, requiring a timely and accurate risk assessment. Public repositories, such as the National Vulnerability Database and CVE details, are regularly updated, but predominantly utilize relational data models that lack native support for representing complex, interconnected structures. To address this, recent research has proposed graph-based vulnerability models. However, these systems often require complex setup procedures, lack real-time multi-source integration, and offer limited accessibility for direct data retrieval and analysis. We present VulGD, a dynamic open-access vulnerability graph database that continuously aggregates cybersecurity data from authoritative repositories. Designed for both expert and non-expert users, VulGD provides a unified web interface and a public API for interactive graph exploration and automated data access. Additionally, VulGD integrates embeddings from large language models (LLMs) to enrich vulnerability description representations, facilitating more accurate vulnerability risk assessment and threat prioritization. VulGD represents a practical and extensible platform for cybersecurity research and decision-making. The live system is publicly accessible at http://34.129.186.158/.
要約:
With the rapid advancement of decentralized applications, smart contract security faces severe challenges, particularly regarding atomicity violations in complex logic such as Oracle and NFT contracts. Rigid rule sets often limit traditional static analyzers and lack deep contextual awareness, leading to high false-positive and false-negative rates when identifying vulnerabilities that depend on intermediate state inconsistencies. To address these limitations, this paper proposes PSR\textsuperscript{2}, a novel collaborative static analysis framework that integrates structural path searching with deterministic semantic reasoning. PSR\textsuperscript{2} utilizes a Graph Structure Analysis Module (GSAM) to identify suspicious execution sequences in control flow graphs and a Semantic Context Analysis Module (SCAM) to extract data dependencies and state facts from abstract syntax trees. A Fusion Decision Module (FDM) then performs formal cross validation to confirm vulnerabilities based on a unified atomicity inconsistency model. Experimental results on 1,600 contract samples demonstrate that PSR\textsuperscript{2} significantly outperforms pattern-matching baselines, achieving an F1-score of 94.69\% in complex ERC-721 scenarios compared to 51.86\% for existing tools. Ablation studies further confirm that our fusion logic effectively reduces the false-positive rate by nearly half compared to single module analysis.
要約:
This article presents DDP-SA, a scalable privacy-preserving federated learning framework that jointly leverages client-side local differential privacy (LDP) and full-threshold additive secret sharing (ASS) for secure aggregation. Unlike existing methods that rely solely on differential privacy or on secure multi-party computation (MPC), DDP-SA integrates both techniques to deliver stronger end-to-end privacy guarantees while remaining computationally practical. The framework introduces a two-stage protection mechanism: clients first perturb their local gradients with calibrated Laplace noise, then decompose the noisy gradients into additive secret shares that are distributed across multiple intermediate servers. This design ensures that (i) no single compromised server or communication channel can reveal any information about individual client updates, and (ii) the parameter server reconstructs only the aggregated noisy gradient, never any client-specific contribution. Extensive experiments show that DDP-SA achieves substantially higher model accuracy than standalone LDP while providing stronger privacy protection than MPC-only approaches. The proposed framework scales linearly with the number of participants and offers a practical, privacy-preserving solution for federated learning applications with controllable computational and communication overhead.
要約:
As large language models (LLMs) evolve from static chatbots into autonomous agents, the primary vulnerability surface shifts from final outputs to intermediate execution traces. While safety guardrails are well-benchmarked for natural language responses, their efficacy remains largely unexplored within multi-step tool-use trajectories. To address this gap, we introduce TraceSafe-Bench, the first comprehensive benchmark specifically designed to assess mid-trajectory safety. It encompasses 12 risk categories, ranging from security threats (e.g., prompt injection, privacy leaks) to operational failures (e.g., hallucinations, interface inconsistencies), featuring over 1,000 unique execution instances. Our evaluation of 13 LLM-as-a-guard models and 7 specialized guardrails yields three critical findings: 1) Structural Bottleneck: Guardrail efficacy is driven more by structural data competence (e.g., JSON parsing) than semantic safety alignment. Performance correlates strongly with structured-to-text benchmarks ($\rho=0.79$) but shows near-zero correlation with standard jailbreak robustness. 2) Architecture over Scale: Model architecture influences risk detection performance more significantly than model size, with general-purpose LLMs consistently outperforming specialized safety guardrails in trajectory analysis. 3) Temporal Stability: Accuracy remains resilient across extended trajectories. Increased execution steps allow models to pivot from static tool definitions to dynamic execution behaviors, actually improving risk detection performance in later stages. Our findings suggest that securing agentic workflows requires jointly optimizing for structural reasoning and safety alignment to effectively mitigate mid-trajectory risks.
要約:
Operating LEO mega-constellations requires translating high-level operator intents ("reroute financial traffic away from polar links under 80 ms") into low-level routing constraints -- a task that demands both natural language understanding and network-domain expertise. We present an end-to-end system comprising three components: (1) a GNN cost-to-go router that distills Dijkstra-quality routing into a 152K-parameter graph attention network achieving 99.8% packet delivery ratio with 17x inference speedup; (2) an LLM intent compiler that converts natural language to a typed constraint intermediate representation using few-shot prompting with a verifier-feedback repair loop, achieving 98.4% compilation rate and 87.6% full semantic match on feasible intents in a 240-intent benchmark (193 feasible, 47 infeasible); and (3) an 8-pass deterministic validator with constructive feasibility certification that achieves 0% unsafe acceptance on all 47 infeasible intents (30 labeled + 17 discovered by Pass 8), with 100% corruption detection across 240 structural corruption tests and 100% on 15 targeted adversarial attacks. End-to-end evaluation across four constrained routing scenarios confirms zero constraint violations with both routers. We further demonstrate that apparent performance gaps in polar-avoidance scenarios are largely explained by topological reachability ceilings rather than routing quality, and that the LLM compiler outperforms a rule-based baseline by 46.2 percentage points on compositional intents. Our system bridges the semantic gap between operator intent and network configuration while maintaining the safety guarantees required for operational deployment.
要約:
This study introduces a hybrid deep learning model for intrusion detection in Industrial IoT (IIoT) systems, combining ResNet-1D, BiGRU, and Multi-Head Attention (MHA) for effective spatial-temporal feature extraction and attention-based feature weighting. To address class imbalance, SMOTE was applied during training on the EdgeHoTset dataset. The model achieved 98.71% accuracy, a loss of 0.0417%, and low inference latency (0.0001 sec /instance), demonstrating strong real-time capability. To assess generalizability, the model was also tested on the CICIoV2024 dataset, where it reached 99.99% accuracy and F1-score, with a loss of 0.0028, 0 % FPR, and 0.00014 sec/instance inference time. Across all metrics and datasets, the proposed model outperformed existing methods, confirming its robustness and effectiveness for real-time IoT intrusion detection.
要約:
We study stochastic convex optimization (SCO) with heavy-tailed gradients under pure epsilon-differential privacy (DP). Instead of assuming a bound on the worst-case Lipschitz parameter of the loss, we assume only a bounded k-th moment. This assumption allows for unbounded, heavy-tailed stochastic gradient distributions, and can yield sharper excess risk bounds. The minimax optimal rate for approximate (epsilon, delta)-DP SCO is known in this setting, but the pure epsilon-DP case has remained open. We characterize the minimax optimal excess-risk rate for pure epsilon-DP heavy-tailed SCO up to logarithmic factors. Our algorithm achieves this rate in polynomial time with high probability. Moreover, it runs in polynomial time with probability 1 when the worst-case Lipschitz parameter is polynomially bounded. For important structured problem classes - including hinge/ReLU-type and absolute-value losses on Euclidean balls, ellipsoids, and polytopes - we achieve the same excess-risk guarantee in polynomial time with probability 1 even when the worst-case Lipschitz parameter is infinite. Our approach is based on a novel framework for privately optimizing Lipschitz extensions of the empirical loss. We complement our excess risk upper bound with a novel high probability lower bound.
要約:
The field of cybersecurity is confronted with two interrelated challenges: a worldwide deficit of qualified practitioners and ongoing human-factor weaknesses that account for the bulk of security incidents. To tackle these issues, we present SentinelSphere, a platform driven by artificial intelligence that unifies machine learning-based threat identification with security training powered by a Large Language Model (LLM). The detection module uses an Enhanced Deep Neural Network (DNN) trained on the CIC-IDS2017 and CIC-DDoS2019 benchmark datasets, enriched with novel HTTP-layer feature engineering that captures application level attack signatures. For the educational component, we deploy a quantised variant of Phi-4 model (Q4_K_M), fine-tuned for the cybersecurity domain, enabling deployment on commodity hardware requiring only 16 GB of RAM without dedicated GPU resources. Experimental results show that the Enhanced DNN attains high detection accuracy while substantially lowering false positives relative to baseline models, and maintains strong recall across critical attack categories such as DDoS, brute force, and web-based exploits. Validation workshops involving industry professionals and university students confirmed that the Traffic Light visualisation system and conversational AI assistant are both intuitive and effective for users without technical backgrounds. SentinelSphere illustrates that coupling intelligent threat detection with adaptive, LLM-driven security education can meaningfully address both technical and human-factor cybersecurity vulnerabilities within a single, cohesive framework.
要約:
Palmprint recognition is deployed in security-critical applications, including access control and palm-based payment, due to its contactless acquisition and highly discriminative ridge-and-crease textures. However, the robustness of deep palmprint recognition systems against physically realizable attacks remains insufficiently understood. Existing studies are largely confined to the digital setting and do not adequately account for the texture-dominant nature of palmprint recognition or the distortions introduced during physical acquisition. To address this gap, we propose CAAP, a capture-aware adversarial patch framework for palmprint recognition. CAAP learns a universal patch that can be reused across inputs while remaining effective under realistic acquisition variation. To match the structural characteristics of palmprints, the framework adopts a cross-shaped patch topology, which enlarges spatial coverage under a fixed pixel budget and more effectively disrupts long-range texture continuity. CAAP further integrates three modules: ASIT for input-conditioned patch rendering, RaS for stochastic capture-aware simulation, and MS-DIFE for feature-level identity-disruptive guidance. We evaluate CAAP on the Tongji, IITD, and AISEC datasets against generic CNN backbones and palmprint-specific recognition models. Experiments show that CAAP achieves strong untargeted and targeted attack performance with favorable cross-model and cross-dataset transferability. The results further show that, although adversarial training can partially reduce the attack success rate, substantial residual vulnerability remains. These findings indicate that deep palmprint recognition systems remain vulnerable to physically realizable, capture-aware adversarial patch attacks, underscoring the need for more effective defenses in practice. Code available at https://github.com/ryliu68/CAAP.
要約:
Touch-based authentication is widely deployed on mobile devices due to its convenience and seamless user experience. However, existing systems largely model touch interaction as a purely behavioral signal, overlooking its intrinsic multidimensional nature and limiting robustness against sophisticated adversarial behaviors and real-world variations. In this work, we present BioMoTouch, a multi-modal touch authentication framework on mobile devices grounded in a key empirical finding: during touch interaction, inertial sensors capture user-specific behavioral dynamics, while capacitive screens simultaneously capture physiological characteristics related to finger morphology and skeletal structure. Building upon this insight, BioMoTouch jointly models physiological contact structures and behavioral motion dynamics by integrating capacitive touchscreen signals with inertial measurements. Rather than combining independent decisions, the framework explicitly learns their coordinated interaction to form a unified representation of touch behavior. BioMoTouch operates implicitly during natural user interactions and requires no additional hardware, enabling practical deployment on commodity mobile devices. We evaluate BioMoTouch with 38 participants under realistic usage conditions. Experimental results show that BioMoTouch achieves a balanced accuracy of 99.71% and an equal error rate of 0.27%. Moreover, it maintains false acceptance rates below 0.90% under artificial replication, mimicry, and puppet attack scenarios, demonstrating strong robustness against partial-factor manipulation.
要約:
As large language models (LLMs) are increasingly trained on sensitive user data, understanding the fundamental cost of privacy in language learning becomes essential. We initiate the study of differentially private (DP) language identification and generation in the agnostic statistical setting, establishing algorithms and matching lower bounds that precisely quantify the cost of privacy. For both tasks, approximate $(\varepsilon, \delta)$-DP with constant $\varepsilon > 0$ recovers the non-private error rates: $\exp(-r(n))$ for identification (for any $r(n) = o(n)$) and $\exp(-\Omega(n))$ for generation. Under pure $\varepsilon$-DP, the exponents degrade by a multiplicative factor of $\min\{1, \varepsilon\}$, which we show is tight up to constants. Notably, for generation under pure DP with mild assumptions, the upper bound $\exp(-\min\{1,\varepsilon\} \cdot \Omega(n))$ matches the lower bound up to some constants, establishing an optimal rate. Our results show that the cost of privacy in language learning is surprisingly mild: absent entirely under approximate DP, and exactly a $\min\{1,\varepsilon\}$ factor in the exponent under pure DP.
要約:
In recent years, large language models have been responsible for great advances in the field of artificial intelligence (AI). ChatGPT in particular, an AI chatbot developed and recently released by OpenAI, has taken the field to the next level. The conversational model is able not only to process human-like text, but also to translate natural language into code. However, the safety of programs generated by ChatGPT should not be overlooked. In this paper, we perform an experiment to address this issue. Specifically, we ask ChatGPT to generate a number of program and evaluate the security of the resulting source code. We further investigate whether ChatGPT can be prodded to improve the security by appropriate prompts, and discuss the ethical aspects of using AI to generate code. Results suggest that ChatGPT is aware of potential vulnerabilities, but nonetheless often generates source code that are not robust to certain attacks.
要約:
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers. This raises significant privacy concerns. In response, we introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by: (i) decomposing the original prompt into smaller sub-prompts, and (ii) generating pseudo-prompts alongside the genuine sub-prompts, which are then sent to the LLM. The server responses are later recomposed by the user to reconstruct the final output. This approach offers key advantages over previous LLM privacy protection methods: (i) it integrates seamlessly with existing black-box LLMs, and (ii) it delivers a significantly improved privacy-utility trade-off compared to existing text perturbation methods. We also develop a $(\lambda, \mu, \rho)$-privacy model to formulate the requirements for a privacy-preserving group of prompts and provide a complexity analysis to justify the role of prompt decomposition. Our empirical evaluation shows that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques, while also reducing memory consumption compared to open-source LLMs.
要約:
The continuous monitoring of the interactions between cyber-physical components of any industrial control system (ICS) is required to secure automation of the system controls, and to guarantee plant processes are fail-safe and remain in an acceptably safe state. Safety is achieved by managing actuation (where electric signals are used to trigger physical movement), dependent on corresponding sensor readings; used as ground truth in decision making. Timely detection of anomalies (attacks, faults and unascertained states) in ICSs is crucial for the safe running of a plant, the safety of its personnel, and for the safe provision of any services provided. We propose an anomaly detection method that involves accurate linearization of the non-linear forms arising from sensor-actuator(s) relationships, primarily because solving linear models is easier and well understood. We accomplish this by using a well-known water treatment testbed as a use case. Our experiments show millisecond time response to detect anomalies, all of which are explainable and traceable; this simultaneous coupling of detection speed and explainability has not been achieved by other state of the art Artificial Intelligence (AI)/ Machine Learning (ML) models with eXplainable AI (XAI) used for the same purpose. Our methods explainability enables us to pin-point the sensor(s) and the actuation state(s) for which the anomaly was detected. The proposed algorithm showed an accuracy of 97.72% by flagging deviations within safe operation limits as non-anomalous; indicative that slower detectors with highest detection resolution is unnecessary, for systems whose safety boundaries provide leeway within safety limits.
要約:
The increasing sophistication of large vision-language models (LVLMs) has been accompanied by advances in safety alignment mechanisms designed to prevent harmful content generation. However, these defenses remain vulnerable to sophisticated adversarial attacks. Existing jailbreak methods typically rely on direct and semantically explicit prompts, overlooking subtle vulnerabilities in how LVLMs compose information over multiple reasoning steps. In this paper, we propose a novel and effective jailbreak framework inspired by Return-Oriented Programming (ROP) techniques from software security. Our approach decomposes a harmful instruction into a sequence of individually benign visual gadgets. A carefully engineered textual prompt directs the sequence of inputs, prompting the model to integrate the benign visual gadgets through its reasoning process to produce a coherent and harmful output. This makes the malicious intent emergent and difficult to detect from any single component. We validate our method through extensive experiments on established benchmarks including SafeBench and MM-SafetyBench, targeting popular LVLMs. Results show that our approach consistently and substantially outperforms existing baselines on state-of-the-art models, achieving near-perfect attack success rates (over 0.90 on SafeBench) and improving ASR by up to 0.39. Our findings reveal a critical and underexplored vulnerability that exploits the compositional reasoning abilities of LVLMs, highlighting the urgent need for defenses that secure the entire reasoning process.
要約:
Large vision-language models (LVLMs) enable autonomous mobile agents to operate smartphone user interfaces, yet vulnerabilities in their perception and interaction remain critically understudied. Existing research often relies on conspicuous overlays, elevated permissions, or unrealistic threat assumptions, limiting stealth and real-world feasibility. In this paper, we introduce a practical and stealthy jailbreak attack framework, which comprises three key components: (i) non-privileged perception compromise, which injects visual payloads into the application interface without requiring elevated system permissions; (ii) agent-attributable activation, which leverages input attribution signals to distinguish agent from human interactions and limits prompt exposure to transient intervals to preserve stealth from end users; and (iii) efficient one-shot jailbreak, a heuristic iterative deepening search algorithm (HG-IDA*) that performs keyword-level detoxification to bypass built-in safety alignment of LVLMs. Moreover, we developed three representative Android applications and curated a prompt-injection dataset for mobile agents. We evaluated our attack across multiple LVLM backends, including closed-source services and representative open-source models, and observed high planning and execution hijack rates (e.g., GPT-4o: 82.5% planning / 75.0% execution), exposing a fundamental security vulnerability in current mobile agents and underscoring critical implications for autonomous smartphone operation.
要約:
The growing complexity of modern system-on-chip (SoC) and IP designs is making security assurance difficult day by day. One of the fundamental steps in the pre-silicon security verification of a hardware design is the identification of security assets, as it substantially influences downstream security verification tasks, such as threat modeling, security property generation, and vulnerability detection. Traditionally, assets are determined manually by security experts, requiring significant time and expertise. To address this challenge, we present LAsset, a novel automated framework that leverages large language models (LLMs) to identify security assets from both hardware design specifications and register-transfer level (RTL) descriptions. The framework performs structural and semantic analysis to identify intra-module primary and secondary assets and derives inter-module relationships to systematically characterize security dependencies at the design level. Experimental results show that the proposed framework achieves high classification accuracy, reaching up to 90% recall rate in SoC design, and 93% recall rate in IP designs. This automation in asset identification significantly reduces manual overhead and supports a scalable path forward for secure hardware development.
要約:
Security operation centers (SOCs) often produce analysis reports on security incidents, and large language models (LLMs) will likely be used for this task in the near future. We postulate that a better understanding of how veteran analysts evaluate reports, including their feedback, can help produce analysis reports in SOCs. In this paper, we aim to leverage LLMs for analysis reports. To this end, we first construct a Analyst-wise checklist to reflect SOC practitioners' opinions for analysis report evaluation through literature review and user study with SOC practitioners. Next, we design a novel LLM-based conceptual framework, named MESSALA, by further introducing two new techniques, granularization guideline and multi-perspective evaluation. MESSALA can maximize report evaluation and provide feedback on veteran SOC practitioners' perceptions. When we conduct extensive experiments with MESSALA, the evaluation results by MESSALA are the closest to those of veteran SOC practitioners compared with the existing LLM-based methods. We then show two key insights. We also conduct qualitative analysis with MESSALA, and then identify that MESSALA can provide actionable items that are necessary for improving analysis reports.
要約:
Distributed energy trading and carbon asset management involve high-frequency, small-value settlements with strong audit requirements. Fully on-chain designs incur excessive cost, while purely off-chain approaches lack verifiable consistency. This paper presents a hybrid on-chain and off-chain settlement framework that anchors settlement commitments and key constraints on-chain and links off-chain records through deterministic digests and replayable auditing. Experiments under publicly constrained workloads show that the framework significantly reduces on-chain execution and storage cost while preserving audit trustworthiness.
要約:
Large language models (LLMs) have shown strong capabilities in multi-step decision-making, planning and actions, and are increasingly integrated into various real-world applications. It is concerning whether their strong problem-solving abilities may be misused for crimes. To address this gap, we propose VirtualCrime, a sandbox simulation framework based on a three-agent system to evaluate the criminal capabilities of models. Specifically, this framework consists of an attacker agent acting as the leader of a criminal team, a judge agent determining the outcome of each action, and a world manager agent updating the environment state and entities. Furthermore, we design 40 diverse crime tasks within this framework, covering 11 maps and 13 crime objectives such as theft, robbery, kidnapping, and riot. We also introduce a human player baseline for reference to better interpret the performance of LLM agents. We evaluate 8 strong LLMs and find (1) All agents in the simulation environment compliantly generate detailed plans and execute intelligent crime processes, with some achieving relatively high success rates; (2) In some cases, agents take severe action that inflicts harm to NPCs to achieve their goals. Our work highlights the need for safety alignment when deploying agentic AI in real-world settings.
要約:
GitHub and GitLab are widely used collaborative platforms whose issue-tracking systems contain large volumes of unstructured text, including logs, code snippets, and configuration examples. This creates a significant risk of accidental secret exposure, such as API keys and credentials, yet these platforms provide no mechanism to warn users before submission. We present \textsc{IssueGuard}, a tool for real-time detection and prevention of secret leaks in issue reports. Implemented as a Chrome extension, \textsc{IssueGuard} analyzes text as users type and combines regex-based candidate extraction with a fine-tuned CodeBERT model for contextual classification. This approach effectively separates real secrets from false positives and achieves an F1-score of 92.70\% on a benchmark dataset, outperforming traditional regex-based scanners. \textsc{IssueGuard} integrates directly into the web interface and continuously analyzes the issue editor, presenting clear visual warnings to help users avoid submitting sensitive data. The source code is publicly available at \href{https://github.com/disa-lab/IssueGuard}{https://github.com/disa-lab/IssueGuard} , and a demonstration video is available at \href{https://youtu.be/kvbWA8rr9cU}{https://youtu.be/kvbWA8rr9cU} .
要約:
Connectivity does not guarantee control reliability for UAV command-and-control (C2) over 5G standalone (SA). We identify a mismatch between network-level guarantees and the requirements of cyber-physical control. A 5G network can preserve attachment and IP connectivity, while Unmanned Aerial Vehicle (UAV) operation depends on the timeliness, availability, and integrity of control messages. Using a reproducible Open5GS and UERANSIM (open source state-of-the-art 5G UE and RAN simulator) testbed that carries MAVLink over the 5G User Plane, we show that common attacker positions can violate these properties without necessarily causing disconnection. A co-tenant User Equipment (UE) can induce queue buildup, delaying C2 traffic. An insider with service-based interface reachability can disrupt core Control Plane continuity during mobility, including via previously undocumented implementation issues for which we requested CVE assignment. A compromised gNodeB can tamper with cleartext C2 payloads after radio-layer decryption. These conditions can trigger UAV failsafe actions or cause execution of incorrect commands while connectivity remains. Connectivity alone is insufficient to guarantee safe operation in 5G-connected cyber-physical systems.
要約:
The Legendre Pseudorandom Function (PRF) is a highly efficient cryptographic primitive built upon the Legendre symbol, valued for its low multiplicative complexity in Multi-Party Computation (MPC) and Zero-Knowledge Proof (ZKP) protocols. While its security over prime fields $\mathbb{F}_p$ is well-documented, recent interest has shifted toward instantiations over extension fields $\mathbb{F}_{p^r}$. This paper presents the first comprehensive cryptanalysis of the single-degree Legendre PRF operating over $\mathbb{F}_{p^r}$.
First, we analyze polynomial input encoding under a standard passive threat model (sequential additive counter queries). We demonstrate that while the absence of polynomial carry-overs causes an asynchronous "no-carry fracture" that neutralizes classical sliding-window collision attacks, the fracture itself is deterministically periodic. By introducing a novel "Differential Signature" bucketing technique, we prove that an adversary can systematically group fractured sequences by their structural shapes to bypass this defense, recovering the secret key in $\mathcal{O}(U \cdot p^r/M)$ operations, where $U$ is the unicity distance.
Second, we evaluate the PRF under an active Chosen-Query threat model. We demonstrate that an adversary can circumvent the additive fracture by evaluating the PRF along a geometric sequence generated by a primitive polynomial. This structure invokes strict multiplicative homomorphism over $\mathbb{F}^*_{p^r}$, permitting a direct generalization of state-of-the-art table collision attacks to extract the key in $\mathcal{O}(p^r/M)$ operations. Finally, we establish the cryptographic boundaries of these attacks, formally proving the necessity of higher-degree key variants ($d \ge 2$) to achieve exponential security against structural reduction in extension fields.
要約:
AI coding assistants are now used to generate production code in
security-sensitive domains, yet the exploitability of their outputs remains
unquantified. We address this gap with Broken by Default: a formal
verification study of 3,500 code artifacts generated by seven widely-deployed LLMs
across 500 security-critical prompts (five CWE categories, 100 prompts each).
Each artifact is subjected to the Z3 SMT solver via the COBALT analysis
pipeline, producing mathematical satisfiability witnesses rather than
pattern-based heuristics. Across all models, 55.8% of artifacts contain at
least one COBALT-identified vulnerability; of these, 1,055 are formally proven
via Z3 satisfiability witnesses. GPT-4o leads at 62.4% (grade F); Gemini 2.5
Flash performs best at 48.4% (grade D). No model achieves a grade better than
D. Six of seven representative findings are confirmed with runtime crashes
under GCC AddressSanitizer. Three auxiliary experiments show: (1) explicit
security instructions reduce the mean rate by only 4 points; (2) six industry
tools combined miss 97.8% of Z3-proven findings; and (3) models identify their
own vulnerable outputs 78.7% of the time in review mode yet generate them at
55.8% by default.
著者: Janine Schneider, Florian Ramming, Maximilian Eichhorn, Gaston Pugliese, Chris Hargreaves, Jan Gruber, Joschua Schilling, Julian Geus, Kevin Mayer, Lea Uhlenbrock, Lena Voigt, Frank Breitinger
要約:
Anti-forensics includes a growing set of techniques designed to obstruct forensic analysis. While cybercriminals increasingly rely on these methods, they also help researchers identify and remedy weaknesses in forensic tools, advancing the overall robustness of digital forensics. Despite repeated efforts to define it, anti-forensics remains vague and inconsistent in its use. It also poses ethical challenges regarding the appropriateness of research practices and the legitimacy of the field itself. This article presents a systematic analysis of 123 publications on anti-forensics, combining qualitative and quantitative methods. We quantify the main techniques and attack vectors, examine their occurrence in different digital forensic subdomains, and identify typical research methods, motivations, and applications. This work also discusses what these findings mean for future research and proposes directions for building a more coherent and ethically grounded understanding of anti-forensics.
要約:
Best Arm Identification (BAI) problems are progressively used for data-sensitive applications, such as designing adaptive clinical trials, tuning hyper-parameters, and conducting user studies. Motivated by the data privacy concerns invoked by these applications, we study the problem of BAI with fixed confidence in both the local and central models, i.e. $\epsilon$-local and $\epsilon$-global Differential Privacy (DP). First, to quantify the cost of privacy, we derive lower bounds on the sample complexity of any $\delta$-correct BAI algorithm satisfying $\epsilon$-global DP or $\epsilon$-local DP. Our lower bounds suggest the existence of two privacy regimes. In the high-privacy regime, the hardness depends on a coupled effect of privacy and novel information-theoretic quantities involving the Total Variation. In the low-privacy regime, the lower bounds reduce to the non-private lower bounds. We propose $\epsilon$-local DP and $\epsilon$-global DP variants of a Top Two algorithm, namely CTB-TT and AdaP-TT*, respectively. For $\epsilon$-local DP, CTB-TT is asymptotically optimal by plugging in a private estimator of the means based on Randomised Response. For $\epsilon$-global DP, our private estimator of the mean runs in arm-dependent adaptive episodes and adds Laplace noise to ensure a good privacy-utility trade-off. By adapting the transportation costs, the expected sample complexity of AdaP-TT* reaches the asymptotic lower bound up to multiplicative constants.
要約:
With the escalating proliferation of artificial intelligence technologies, AI-generated content (AIGC) has progressively permeated across diverse domains. However, this explosive application has also sparked widespread public discussion about the copyright of AIGC. Existing copyright legal frameworks, originally designed around human creators, now face a paradigm shift. As human involvement in the generation of AIGC diminishes, where creative expression increasingly hinges on AI. This discrepancy has introduced multifaceted complexities and challenges in determining the copyright ownership of AIGC within established legal boundaries. Given this, meticulous recording and auditing of contributions from all parties in AIGC generation becomes imperative. Blockchain, with its decentralized storage, offers a robust technical foundation for AIGC copyright management. Yet existing blockchain-based solutions have clear limitations: most only focus on certifying final generated products, ignoring the management of critical intermediate data across the full lifecycle, thus failing to meet the needs of core scenarios like copyright confirmation and multi-party profit distribution. For this purpose, this paper introduces AIGC-Chain, a trustworthy AIGC copyright management system. It conducts a comprehensive recording of intermediate data generated across the full lifecycle of AIGC. Such data is deposited into a decentralized blockchain for secure multi-party auditing, thereby constructing a trustworthy management for AIGC copyright. In copyright dispute scenarios, auditors can retrieve critical proof from the blockchain, facilitating precise determination of the copyright ownership of AIGC products. Both theoretical and experimental analyses confirm that this scheme shows exceptional performance and security in AIGC copyright management.
要約:
Graph neural network (GNN) have demonstrated exceptional performance in solving critical problems across diverse domains yet remain susceptible to backdoor attacks. Existing studies on backdoor attack for graph classification are limited to single target attack using subgraph replacement based mechanism where the attacker implants only one trigger into the GNN model. In this paper, we introduce the first multi-targeted backdoor attack for graph classification task, where multiple triggers simultaneously redirect predictions to different target labels. Instead of subgraph replacement, we propose subgraph injection which preserves the structure of the original graphs while poisoning the clean graphs. Extensive experiments demonstrate the efficacy of our approach, where our attack achieves high attack success rates for all target labels with minimal impact on the clean accuracy. Experimental results on five dataset demonstrate the superior performance of our attack framework compared to the conventional subgraph replacement-based attack. Our analysis on four GNN models confirms the generalization capability of our attack which is effective regardless of the GNN model architectures and training parameters settings. We further investigate the impact of the attack design parameters including injection methods, number of connections, trigger sizes, trigger edge density and poisoning ratios. Additionally, our evaluation against state-of-the-art defenses (randomized smoothing and fine-pruning) demonstrates the robustness of our proposed multi-target attacks. This work highlights the GNN vulnerability against multi-targeted backdoor attack in graph classification task. Our source codes will be available at https://github.com/SiSL-URI/Multi-Targeted-Graph-Backdoor-Attack.
要約:
Diffusion models generate high-quality images but pose serious risks like copyright violation and disinformation. Watermarking is a key defense for tracing and authenticating AI-generated content. However, existing methods rely on threshold-based detection, which only supports fuzzy matching and cannot recover structured watermark data bit-exactly, making them unsuitable for offline verification or applications requiring lossless metadata (e.g., licensing instructions). To address this problem, in this paper, we propose Gaussian Shannon, a watermarking framework that treats the diffusion process as a noisy communication channel and enables both robust tracing and exact bit recovery. Our method embeds watermarks in the initial Gaussian noise without fine-tuning or quality loss. We identify two types of channel interference, namely local bit flips and global stochastic distortions, and design a cascaded defense combining error-correcting codes and majority voting. This ensures reliable end-to-end transmission of semantic payloads. Experiments across three Stable Diffusion variants and seven perturbation types show that Gaussian Shannon achieves state-of-the-art bit-level accuracy while maintaining a high true positive rate, enabling trustworthy rights attribution in real-world deployment. The source code have been made available at: https://github.com/Rambo-Yi/Gaussian-Shannon