cs.CR updates on arXiv.org

更新日時: Thu, 22 Jan 2026 05:00:18 +0000
論文数: 58件
0件選択中

📋 論文タイトル一覧

1. Guardrails for trust, safety, and ethical development and deployment of Large Language Models (LLM)
2. Predicting Tail-Risk Escalation in IDS Alert Time Series
3. DDSA: Dual-Domain Strategic Attack for Spatial-Temporal Efficiency in Adversarial Robustness Testing
4. An Optimized Decision Tree-Based Framework for Explainable IoT Anomaly Detection
5. CORVUS: Red-Teaming Hallucination Detectors via Internal Signal Camouflage in Large Language Models
6. Tracing the Data Trail: A Survey of Data Provenance, Transparency and Traceability in LLMs
7. SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models backdoor
8. Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs backdoor
9. Rethinking On-Device LLM Reasoning: Why Analogical Mapping Outperforms Abstract Thinking for IoT DDoS Detection
10. A Survey of Security Challenges and Solutions for Advanced Air Mobility and eVTOL Aircraft
11. Uma Prova de Conceito para a Verifica\c{c}\~ao Formal de Contratos Inteligentes
12. European digital identity: A missed opportunity?
13. Uncovering and Understanding FPR Manipulation Attack in Industrial IoT Networks
14. Towards Transparent Malware Detection With Granular Explainability: Backtracking Meta-Coarsened Explanations Onto Assembly Flow Graphs With Graph Neural Networks
15. LLM Security and Safety: Insights from Homotopy-Inspired Prompt Obfuscation
16. AI Agents vs. Human Investigators: Balancing Automation, Security, and Expertise in Cyber Forensic Analysis agent
17. WebAssembly Based Portable and Secure Sensor Interface for Internet of Things
18. Automatically Tightening Access Control Policies with Restricter
19. IntelliSA: An Intelligent Static Analyzer for IaC Security Smell Detection Using Symbolic Rules and Neural Inference
20. Holmes: An Evidence-Grounded LLM Agent for Auditable DDoS Investigation in Cloud Networks agent
21. An LLM Agent-based Framework for Whaling Countermeasures agent
22. Towards Cybersecurity Superintelligence: from AI-guided humans to human-guided AI
23. NeuroFilter: Privacy Guardrails for Conversational LLM Agents privacyagent
24. STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model diffusion
25. On Implementing Hybrid Post-Quantum End-to-End Encryption
26. Interoperable Architecture for Digital Identity Delegation for AI Agents with Blockchain Integration agent
27. On the Effectiveness of Mempool-based Transaction Auditing
28. SpooFL: Spoofing Federated Learning
29. Dynamic Management of a Deep Learning-Based Anomaly Detection System for 5G Networks
30. Lightweight LLMs for Network Attack Detection in IoT Networks
31. GCG Attack On A Diffusion LLM diffusion
32. The Limits of Lognormal: Assessing Cryptocurrency Volatility and VaR using Geometric Brownian Motion
33. Gradient Structure Estimation under Label-Only Oracles via Spectral Sensitivity
34. Unpacking Security Scanners for GitHub Actions Workflows
35. Constructing Multi-label Hierarchical Classification Models for MITRE ATT&CK Text Tagging
36. Agent Identity URI Scheme: Topology-Independent Naming and Capability-Based Discovery for Multi-Agent Systems agent
37. Optimality of Staircase Mechanisms for Vector Queries under Differential Privacy privacy
38. Beyond Denial-of-Service: The Puppeteer's Attack for Fine-Grained Control in Ranking-Based Federated Learning
39. SAGA: Detecting Security Vulnerabilities Using Static Aspect Analysis
40. A Measurement of Genuine Tor Traces for Realistic Website Fingerprinting
41. Neural Honeytrace: Plug&Play Watermarking Framework against Model Extraction Attacks model extractionintellectual property
42. Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
43. GenPTW: Latent Image Watermarking for Provenance Tracing and Tamper Localization intellectual property
44. SPECTRE: Conditional System Prompt Poisoning to Hijack LLMs backdoor
45. On the Reliability and Stability of Selective Methods in Malware Classification Tasks
46. Decentralized COVID-19 Health System Leveraging Blockchain
47. Towards Effective Prompt Stealing Attack against Text-to-Image Diffusion Models diffusion
48. "Abuse Risks are Often Inherent to Product Features": Exploring AI Vendors' Bug Bounty and Responsible Disclosure Policies
49. Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation
50. PrivTune: Efficient and Privacy-Preserving Fine-Tuning of Large Language Models via Device-Cloud Collaboration privacy
51. DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection intellectual property
52. Hidden-in-Plain-Text: A Benchmark for Social-Web Indirect Prompt Injection in RAG
53. KinGuard: Hierarchical Kinship-Aware Fingerprinting to Defend Against Large Language Model Stealing model extraction
54. Formal Power Series on Algebraic Cryptanalysis
55. Connecting Kani's Lemma and path-finding in the Bruhat-Tits tree to compute supersingular endomorphism rings
56. The Good, the Bad and the Ugly: Meta-Analysis of Watermarks, Transferable Attacks and Adversarial Defenses intellectual property
57. Training-Free In-Context Forensic Chain for Image Manipulation Detection and Localization
58. DRGW: Learning Disentangled Representations for Robust Graph Watermarking intellectual property
📄 論文詳細
著者: Anjanava Biswas, Wrick Talukdar
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The AI era has ushered in Large Language Models (LLM) to the technological forefront, which has been much of the talk in 2023, and is likely to remain as such for many years to come. LLMs are the AI models that are the power house behind generative AI applications such as ChatGPT. These AI models, fueled by vast amounts of data and computational prowess, have unlocked remarkable capabilities, from human-like text generation to assisting with natural language understanding (NLU) tasks. They have quickly become the foundation upon which countless applications and software services are being built, or at least being augmented with. However, as with any groundbreaking innovations, the rise of LLMs brings forth critical safety, privacy, and ethical concerns. These models are found to have a propensity to leak private information, produce false information, and can be coerced into generating content that can be used for nefarious purposes by bad actors, or even by regular users unknowingly. Implementing safeguards and guardrailing techniques is imperative for applications to ensure that the content generated by LLMs are safe, secure, and ethical. Thus, frameworks to deploy mechanisms that prevent misuse of these models via application implementations is imperative. In this study, wepropose a Flexible Adaptive Sequencing mechanism with trust and safety modules, that can be used to implement safety guardrails for the development and deployment of LLMs.
著者: Ambarish Gurjar, L Jean Camp
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Network defenders face a steady stream of attacks, observed as raw Intrusion Detection System (IDS) alerts. The sheer volume of alerts demands prioritization, typically based on high-level risk classifications. This work expands the scope of risk measurement by examining alerts not only through their technical characteristics but also by examining and classifying their temporal patterns. One critical issue in responding to intrusion alerts is determining whether an alert is part of an escalating attack pattern or an opportunistic scan. To identify the former, we apply extreme-regime forecasting methods from financial modeling to IDS data. Extreme-regime forecasting is designed to identify likely future high-impact events or significant shifts in system behavior. Using these methods, we examine attack patterns by computing per-minute alert intensity, volatility, and a short-term momentum measure derived from weighted moving averages. We evaluate the efficacy of a supervised learning model for forecasting future escalation patterns using these derived features. The trained model identifies future high-intensity attacks and demonstrates strong predictive performance, achieving approximately 91\% accuracy, 89\% recall, and 98\% precision. Our contributions provide a temporal measurement framework for identifying future high-intensity attacks and demonstrate the presence of predictive early-warning signals within the temporal structure of IDS alert streams. We describe our methods in sufficient detail to enable reproduction using other IDS datasets. In addition, we make the trained models openly available to support further research. Finally, we introduce an interpretable visualization that enables defenders to generate early predictive warnings of elevated volumetric arrival risk.
著者: Jinwei Hu, Shiyuan Meng, Yi Dong, Xiaowei Huang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Image transmission and processing systems in resource-critical applications face significant challenges from adversarial perturbations that compromise mission-specific object classification. Current robustness testing methods require excessive computational resources through exhaustive frame-by-frame processing and full-image perturbations, proving impractical for large-scale deployments where massive image streams demand immediate processing. This paper presents DDSA (Dual-Domain Strategic Attack), a resource-efficient adversarial robustness testing framework that optimizes testing through temporal selectivity and spatial precision. We introduce a scenario-aware trigger function that identifies critical frames requiring robustness evaluation based on class priority and model uncertainty, and employ explainable AI techniques to locate influential pixel regions for targeted perturbation. Our dual-domain approach achieves substantial temporal-spatial resource conservation while maintaining attack effectiveness. The framework enables practical deployment of comprehensive adversarial robustness testing in resource-constrained real-time applications where computational efficiency directly impacts mission success.
著者: Ashikuzzaman, Md. Shawkat Hossain, Jubayer Abdullah Joy, Md Zahid Akon, Md Manjur Ahmed, Md. Naimul Islam
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The increase in the number of Internet of Things (IoT) devices has tremendously increased the attack surface of cyber threats thus making a strong intrusion detection system (IDS) with a clear explanation of the process essential towards resource-constrained environments. Nevertheless, current IoT IDS systems are usually traded off with detection quality, model elucidability, and computational effectiveness, thus the deployment on IoT devices. The present paper counteracts these difficulties by suggesting an explainable AI (XAI) framework based on an optimized Decision Tree classifier with both local and global importance methods: SHAP values that estimate feature attribution using local explanations, and Morris sensitivity analysis that identifies the feature importance in a global view. The proposed system attains the state of art on the test performance with 99.91% accuracy, F1-score of 99.51% and Cohen Kappa of 0.9960 and high stability is confirmed by a cross validation mean accuracy of 98.93%. Efficiency is also enhanced in terms of computations to provide faster inferences compared to those that are generalized in ensemble models. SrcMac has shown as the most significant predictor in feature analyses according to SHAP and Morris methods. Compared to the previous work, our solution eliminates its major drawback lack because it allows us to apply it to edge devices and, therefore, achieve real-time processing, adhere to the new regulation of transparency in AI, and achieve high detection rates on attacks of dissimilar classes. This combination performance of high accuracy, explainability, and low computation make the framework useful and reliable as a resource-constrained IoT security problem in real environments.
著者: Nay Myat Min, Long H. Pham, Hongyu Zhang, Jun Sun
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Single-pass hallucination detectors rely on internal telemetry (e.g., uncertainty, hidden-state geometry, and attention) of large language models, implicitly assuming hallucinations leave separable traces in these signals. We study a white-box, model-side adversary that fine-tunes lightweight LoRA adapters on the model while keeping the detector fixed, and introduce CORVUS, an efficient red-teaming procedure that learns to camouflage detector-visible telemetry under teacher forcing, including an embedding-space FGSM attention stress test. Trained on 1,000 out-of-distribution Alpaca instructions (<0.5% trainable parameters), CORVUS transfers to FAVA-Annotation across Llama-2, Vicuna, Llama-3, and Qwen2.5, and degrades both training-free detectors (e.g., LLM-Check) and probe-based detectors (e.g., SEP, ICR-probe), motivating adversary-aware auditing that incorporates external grounding or cross-model evidence.
著者: Richard Hohensinner, Belgin Mutlu, Inti Gabriel Mendoza Estrada, Matej Vukovic, Simone Kopeinik, Roman Kern
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Large language models (LLMs) are deployed at scale, yet their training data life cycle remains opaque. This survey synthesizes research from the past ten years on three tightly coupled axes: (1) data provenance, (2) transparency, and (3) traceability, and three supporting pillars: (4) bias \& uncertainty, (5) data privacy, and (6) tools and techniques that operationalize them. A central contribution is a proposed taxonomy defining the field's domains and listing corresponding artifacts. Through analysis of 95 publications, this work identifies key methodologies concerning data generation, watermarking, bias measurement, data curation, data privacy, and the inherent trade-off between transparency and opacity.
backdoor
著者: Bingxin Xu, Yuzhang Shang, Binghui Wang, Emilio Ferrara
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities remain underexplored. We identify a fundamental security flaw in modern VLA systems: the combination of action chunking and delta pose representations creates an intra-chunk visual open-loop. This mechanism forces the robot to execute K-step action sequences, allowing per-step perturbations to accumulate through integration. We propose SILENTDRIFT, a stealthy black-box backdoor attack exploiting this vulnerability. Our method employs the Smootherstep function to construct perturbations with guaranteed C2 continuity, ensuring zero velocity and acceleration at trajectory boundaries to satisfy strict kinematic consistency constraints. Furthermore, our keyframe attack strategy selectively poisons only the critical approach phase, maximizing impact while minimizing trigger exposure. The resulting poisoned trajectories are visually indistinguishable from successful demonstrations. Evaluated on the LIBERO, SILENTDRIFT achieves a 93.2% Attack Success Rate with a poisoning rate under 2%, while maintaining a 95.3% Clean Task Success Rate.
backdoor
著者: Yiyang Lu, Jinwen He, Yue Zhao, Kai Chen, Ruigang Liang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Large Language Models (LLMs) are widely integrated into interactive systems such as dialogue agents and task-oriented assistants. This growing ecosystem also raises supply-chain risks, where adversaries can distribute poisoned models that degrade downstream reliability and user trust. Existing backdoor attacks and defenses are largely prompt-centric, focusing on user-visible triggers while overlooking structural signals in multi-turn conversations. We propose Turn-based Structural Trigger (TST), a backdoor attack that activates from dialogue structure, using the turn index as the trigger and remaining independent of user inputs. Across four widely used open-source LLM models, TST achieves an average attack success rate (ASR) of 99.52% with minimal utility degradation, and remains effective under five representative defenses with an average ASR of 98.04%. The attack also generalizes well across instruction datasets, maintaining an average ASR of 99.19%. Our results suggest that dialogue structure constitutes an important and under-studied attack surface for multi-turn LLM systems, motivating structure-aware auditing and mitigation in practice.
著者: William Pan, Guiran Liu, Binrong Zhu, Qun Wang, Yingzhou Lu, Beiyu Lin, Rose Qingyang Hu
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The rapid expansion of IoT deployments has intensified cybersecurity threats, notably Distributed Denial of Service (DDoS) attacks, characterized by increasingly sophisticated patterns. Leveraging Generative AI through On-Device Large Language Models (ODLLMs) provides a viable solution for real-time threat detection at the network edge, though limited computational resources present challenges for smaller ODLLMs. This paper introduces a novel detection framework that integrates Chain-of-Thought (CoT) reasoning with Retrieval-Augmented Generation (RAG), tailored specifically for IoT edge environments. We systematically evaluate compact ODLLMs, including LLaMA 3.2 (1B, 3B) and Gemma 3 (1B, 4B), using structured prompting and exemplar-driven reasoning strategies. Experimental results demonstrate substantial performance improvements with few-shot prompting, achieving macro-average F1 scores as high as 0.85. Our findings highlight the significant advantages of incorporating exemplar-based reasoning, underscoring that CoT and RAG approaches markedly enhance small ODLLMs' capabilities in accurately classifying complex network attacks under stringent resource constraints.
著者: Mahyar Ghazanfari, Iman Sharifi, Peng Wei, Noah Dahle, Abel Diaz Gonzalez, Austin Coursey, Bryce Bjorkman, Cailani Lemieux-Mack, Robert Canady, Abenezer Taye, Bryan C. Ward, Xenofon Koutsoukos, Gautam Biswas, Maheed H. Ahmed, Hyeong Tae Kim, Mahsa Ghasemi, Vijay Gupta, Filippos Fotiadis, Ufuk Topcu, Junchi Lu, Alfred Chen, Abdul Kareem Ras, Nischal Aryal, Amer Ibrahim, Amir Shirkhodaie, Heber Herencia-Zapana, Saqib Hasan, Isaac Amundson
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
This survey reviews the existing and envisioned security vulnerabilities and defense mechanisms relevant to Advanced Air Mobility (AAM) systems, with a focus on electric vertical takeoff and landing (eVTOL) aircraft. Drawing from vulnerabilities in the avionics in commercial aviation and the automated unmanned aerial systems (UAS), the paper presents a taxonomy of attacks, analyzes mitigation strategies, and proposes a secure system architecture tailored to the future AAM ecosystem. The paper also highlights key threat vectors, including Global Positioning System (GPS) jamming/spoofing, ATC radio frequency misuse, attacks on TCAS and ADS-B, possible backdoor via Electronic Flight Bag (EFB), new vulnerabilities introduced by aircraft automation and connectivity, and risks from flight management system (FMS) software, database and cloud services. Finally, this paper describes emerging defense techniques against these attacks, and open technical problems to address toward better defense mechanisms.
著者: Murilo de Souza Neves, Adilson Luiz Bonifacio
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Smart contracts are tools with self-execution capabilities that provide enhanced security compared to traditional contracts; however, their immutability makes post-deployment fault correction extremely complex, highlighting the need for a verification layer prior to this stage. Although formalisms such as Contract Language (CL) enable logical analyses, they prove limited in attributing responsibilities within complex multilateral scenarios. This work presents a proof of concept using the Relativized Contract Language (RCL) and the RECALL tool for the specification and verification of a purchase and sale contract involving multiple agents. The study demonstrates the tool's capability to detect normative conflicts during the modeling phase. After correcting logical inconsistencies, the contract was translated into Solidity and functionally validated within the Remix IDE environment, confirming that prior formal verification is fundamental to ensuring the reliability and security of the final code.
著者: Wouter Termont (IDLab), Beatriz Esteves (IDLab)
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Recent European efforts around digital identity -- the EUDI regulation and its OpenID architecture -- aim high, but start from a narrow and ill-defined conceptualization of authentication. Based on a broader, more grounded understanding of the term, in we identify several issues in the design of OpenID4VCI and OpenID4VP: insecure practices, static, and subject-bound credential types, and a limited query language restrict their application to classic scenarios of credential exchange -- already supported by existing solutions like OpenID Connect, SIOPv2, OIDC4IDA, and OIDC Claims Aggregation -- barring dynamic, asynchronous, or automated use cases. We also debunk OpenID's 'paradigm-shifting' trust-model, which -- when compared to existing decentralized alternatives -- does not deliver any significant increase in control, privacy, and portability of personal information. Not only the technical choices limit the capabilities of the EUDI framework; also the legislation itself cannot accommodate the promise of self-sovereign identity. In particular, we criticize the introduction of institutionalized trusted lists, and discuss their economical and political risks. Their potential to decline into an exclusory, re-centralized ecosystem endangers the vision of a user-oriented identity management in which individuals are in charge. Instead, the consequences might severely restrict people in what they can do with their personal information, and risk increased linkability and monitoring. In anticipation of revisions to the EUDI regulations, we suggest several technical alternatives that overcome some of the issues with the architecture of OpenID. In particular, OAuth's UMA extension and its A4DS profile, as well as their integration in GNAP, are worth looking into. Future research into uniform query (meta-)languages is needed to address the heterogeneity of attestations and providers.
著者: Mohammad Shamim Ahsan, Peng Liu
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
In the network security domain, due to practical issues -- including imbalanced data and heterogeneous legitimate network traffic -- adversarial attacks in machine learning-based NIDSs have been viewed as attack packets misclassified as benign. Due to this prevailing belief, the possibility of (maliciously) perturbed benign packets being misclassified as attack has been largely ignored. In this paper, we demonstrate that this is not only theoretically possible, but also a particular threat to NIDS. In particular, we uncover a practical cyberattack, FPR manipulation attack (FPA), especially targeting industrial IoT networks, where domain-specific knowledge of the widely used MQTT protocol is exploited and a systematic simple packet-level perturbation is performed to alter the labels of benign traffic samples without employing traditional gradient-based or non-gradient-based methods. The experimental evaluations demonstrate that this novel attack results in a success rate of 80.19% to 100%. In addition, while estimating impacts in the Security Operations Center, we observe that even a small fraction of false positive alerts, irrespective of different budget constraints and alert traffic intensities, can increase the delay of genuine alerts investigations up to 2 hr in a single day under normal operating conditions. Furthermore, a series of relevant statistical and XAI analyses is conducted to understand the key factors behind this remarkable success. Finally, we explore the effectiveness of the FPA packets to enhance models' robustness through adversarial training and investigate the changes in decision boundaries accordingly.
著者: Griffin Higgins, Roozbeh Razavi-Far, Hossein Shokouhinejad, Ali A. Ghorbani
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
As malware continues to become increasingly sophisticated, threatening, and evasive, malware detection systems must keep pace and become equally intelligent, powerful, and transparent. In this paper, we propose Assembly Flow Graph (AFG) to comprehensively represent the assembly flow of a binary executable as graph data. Importantly, AFG can be used to extract granular explanations needed to increase transparency for malware detection using Graph Neural Networks (GNNs). However, since AFGs may be large in practice, we also propose a Meta-Coarsening approach to improve computational tractability via graph reduction. To evaluate our proposed approach we consider several novel and existing metrics to quantify the granularity and quality of explanations. Lastly, we also consider several hyperparameters in our proposed Meta-Coarsening approach that can be used to control the final explanation size. We evaluate our proposed approach using the CIC-DGG-2025 dataset. Our results indicate that our proposed AFG and Meta-Coarsening approach can provide both increased explainability and inference performance at certain coarsening levels. However, most importantly, to the best of our knowledge, we are the first to consider granular explainability in malware detection using GNNs.
著者: Luis Lazo, Hamed Jelodar, Roozbeh Razavi-Far
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
In this study, we propose a homotopy-inspired prompt obfuscation framework to enhance understanding of security and safety vulnerabilities in Large Language Models (LLMs). By systematically applying carefully engineered prompts, we demonstrate how latent model behaviors can be influenced in unexpected ways. Our experiments encompassed 15,732 prompts, including 10,000 high-priority cases, across LLama, Deepseek, KIMI for code generation, and Claude to verify. The results reveal critical insights into current LLM safeguards, highlighting the need for more robust defense mechanisms, reliable detection strategies, and improved resilience. Importantly, this work provides a principled framework for analyzing and mitigating potential weaknesses, with the goal of advancing safe, responsible, and trustworthy AI technologies.
agent
著者: Sneha Sudhakaran, Naresh Kshetri
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
In an era where cyber threats are rapidly evolving, the reliability of cyber forensic analysis has become increasingly critical for effective digital investigations and cybersecurity responses. AI agents are being adopted across digital forensic practices due to their ability to automate processes such as anomaly detection, evidence classification, and behavioral pattern recognition, significantly enhancing scalability and reducing investigation timelines. However, the characteristics that make AI indispensable also introduce notable risks. AI systems, often trained on biased or incomplete datasets, can produce misleading results, including false positives and false negatives, thereby jeopardizing the integrity of forensic investigations. This study presents a meticulous comparative analysis of the effectiveness of the most used AI agent, ChatGPT, and human forensic investigators in the realm of cyber forensic analysis. Our research reveals critical limitations within AI-driven approaches, demonstrating scenarios in which sophisticated or novel cyber threats remain undetected due to the rigid pattern-based nature of AI systems. Conversely, our analysis highlights the crucial role that human forensic investigators play in mitigating these risks. Through adaptive decision-making, ethical reasoning, and contextual understanding, human investigators effectively identify subtle anomalies and threats that may evade automated detection systems. To reinforce our findings, we conducted comprehensive reliability testing of forensic techniques using multiple cyber threat scenarios. These tests confirmed that while AI agents significantly improve the efficiency of routine analyses, human oversight remains crucial in ensuring accuracy and comprehensiveness of the results.
著者: Botong Ou, Baijian Yang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
As the expansion of IoT connectivity continues to provide quality-of-life improvements around the world, they simultaneously introduce increasing privacy and security concerns. The lack of a clear definition in managing shared and protected access to IoT sensors offer channels by which devices can be compromised and sensitive data can be leaked. In recent years, WebAssembly has received considerable attention for its efficient application sandboxing suitable for embedded systems, making it a prime candidate for exploring a secure and portable sensor interface. This paper introduces the first WebAssembly System Interface (WASI) extension offering a secure, portable, and low-footprint sandbox enabling multi-tenant access to sensor data across heterogeneous embedded devices. The runtime extensions provide application memory isolation, ensure appropriate resource privileges by intercepting sensor access, and offer an MQTT-SN interface enabling in-network access control. When targeting the WebAssembly byte-code with the associated runtime extensions implemented atop the Zephyr RTOS, our evaluation of sensor access indicates a latency overhead of 6% with an additional memory footprint of 5% when compared to native execution. As MQTT-SN requests are dominated by network delays, the WASI-SN implementation of MQTT-SN introduces less than 1% additional latency with similar memory footprint.
著者: Ka Lok Wu, Christa Jenkins, Scott D. Stolle, Omar Chowdhury
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Robust access control is a cornerstone of secure software, systems, and networks. An access control mechanism is as effective as the policy it enforces. However, authoring effective policies that satisfy desired properties such as the principle of least privilege is a challenging task even for experienced administrators, as evidenced by many real instances of policy misconfiguration. In this paper, we set out to address this pain point by proposing Restricter, which automatically tightens each (permit) policy rule of a policy with respect to an access log, which captures some already exercised access requests and their corresponding access decisions (i.e., allow or deny). Restricter achieves policy tightening by reducing the number of access requests permitted by a policy rule without sacrificing the functionality of the underlying system it is regulating. We implement Restricter for Amazon's Cedar policy language and demonstrate its effectiveness through two realistic case studies.
著者: Qiyue Mei, Michael Fu
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Infrastructure as Code (IaC) enables automated provisioning of large-scale cloud and on-premise environments, reducing the need for repetitive manual setup. However, this automation is a double-edged sword: a single misconfiguration in IaC scripts can propagate widely, leading to severe system downtime and security risks. Prior studies have shown that IaC scripts often contain security smells--bad coding patterns that may introduce vulnerabilities--and have proposed static analyzers based on symbolic rules to detect them. Yet, our preliminary analysis reveals that rule-based detection alone tends to over-approximate, producing excessive false positives and increasing the burden of manual inspection. In this paper, we present IntelliSA, an intelligent static analyzer for IaC security smell detection that integrates symbolic rules with neural inference. IntelliSA applies symbolic rules to over-approximate potential smells for broad coverage, then employs neural inference to filter false positives. While an LLM can effectively perform this filtering, reliance on LLM APIs introduces high cost and latency, raises data governance concerns, and limits reproducibility and offline deployment. To address the challenges, we adopt a knowledge distillation approach: an LLM teacher generates pseudo-labels to train a compact student model--over 500x smaller--that learns from the teacher's knowledge and efficiently classifies false positives. We evaluate IntelliSA against two static analyzers and three LLM baselines (Claude-4, Grok-4, and GPT-5) using a human-labeled dataset including 241 security smells across 11,814 lines of real-world IaC code. Experimental results show that IntelliSA achieves the highest F1 score (83%), outperforming baselines by 7-42%. Moreover, IntelliSA demonstrates the best cost-effectiveness, detecting 60% of security smells while inspecting less than 2% of the codebase.
agent
著者: Haodong Chen, Ziheng Zhang, Jinghui Jiang, Qiang Su, Qiao Xiang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Cloud environments face frequent DDoS threats due to centralized resources and broad attack surfaces. Modern cloud-native DDoS attacks further evolve rapidly and often blend multi-vector strategies, creating an operational dilemma: defenders need wire-speed monitoring while also requiring explainable, auditable attribution for response. Existing rule-based and supervised-learning approaches typically output black-box scores or labels, provide limited evidence chains, and generalize poorly to unseen attack variants; meanwhile, high-quality labeled data is often difficult to obtain in cloud settings. We present Holmes (DDoS Detective), an LLM-based DDoS detection agent that reframes the model as a virtual SRE investigator rather than an end-to-end classifier. Holmes couples a funnel-like hierarchical workflow (counters/sFlow for continuous sensing and triage; PCAP evidence collection triggered only on anomaly windows) with an Evidence Pack abstraction that converts binary packets into compact, reproducible, high-signal structured evidence. On top of this evidence interface, Holmes enforces a structure-first investigation protocol and strict JSON/quotation constraints to produce machine-consumable reports with auditable evidence anchors. We evaluate Holmes on CICDDoS2019 reflection/amplification attacks and script-triggered flooding scenarios. Results show that Holmes produces attribution decisions grounded in salient evidence anchors across diverse attack families, and when errors occur, its audit logs make the failure source easy to localize, demonstrating the practicality of an LLM agent for cost-controlled and traceable DDoS investigation in cloud operations.
agent
著者: Daisuke Miyamoto, Takuji Iimura, Narushige Michishita
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
With the spread of generative AI in recent years, attacks known as Whaling have become a serious threat. Whaling is a form of social engineering that targets important high-authority individuals within organizations and uses sophisticated fraudulent emails. In the context of Japanese universities, faculty members frequently hold positions that combine research leadership with authority within institutional workflows. This structural characteristic leads to the wide public disclosure of high-value information such as publications, grants, and detailed researcher profiles. Such extensive information exposure enables the construction of highly precise target profiles using generative AI. This raises concerns that Whaling attacks based on high-precision profiling by generative AI will become prevalent. In this study, we propose a Whaling countermeasure framework for university faculty members that constructs personalized defense profiles and uses large language model (LLM)-based agents. We design agents that (i) build vulnerability profiles for each target from publicly available information on faculty members, (ii) identify potential risk scenarios relevant to Whaling defense based on those profiles, (iii) construct defense profiles corresponding to the vulnerabilities and anticipated risks, and (iv) analyze Whaling emails using the defense profiles. Furthermore, we conduct a preliminary risk-assessment experiment. The results indicate that the proposed method can produce judgments accompanied by explanations of response policies that are consistent with the work context of faculty members who are Whaling targets. The findings also highlight practical challenges and considerations for future operational deployment and systematic evaluation.
著者: V\'ictor Mayoral-Vilches, Stefan Rass, Martin Pinzger, Endika Gil-Uriarte, Unai Ayucar-Carbajo, Jon Ander Ruiz-Alcalde, Maite del Mundo de Torres, Luis Javier Navarrete-Lozano, Mar\'ia Sanz-G\'omez, Francesco Balassone, Crist\'obal R. J. Veas-Chavez, Vanesa Turiel, Alfonso Glera-Pic\'on, Daniel S\'anchez-Prieto, Yuri Salvatierra, Paul Zabalegui-Landa, Ruffino Reydel Cabrera-\'Alvarez, Patxi Mayoral-Pizarroso
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Cybersecurity superintelligence -- artificial intelligence exceeding the best human capability in both speed and strategic reasoning -- represents the next frontier in security. This paper documents the emergence of such capability through three major contributions that have pioneered the field of AI Security. First, PentestGPT (2023) established LLM-guided penetration testing, achieving 228.6% improvement over baseline models through an architecture that externalizes security expertise into natural language guidance. Second, Cybersecurity AI (CAI, 2025) demonstrated automated expert-level performance, operating 3,600x faster than humans while reducing costs 156-fold, validated through #1 rankings at international competitions including the $50,000 Neurogrid CTF prize. Third, Generative Cut-the-Rope (G-CTR, 2026) introduces a neurosymbolic architecture embedding game-theoretic reasoning into LLM-based agents: symbolic equilibrium computation augments neural inference, doubling success rates while reducing behavioral variance 5.2x and achieving 2:1 advantage over non-strategic AI in Attack & Defense scenarios. Together, these advances establish a clear progression from AI-guided humans to human-guided game-theoretic cybersecurity superintelligence.
privacyagent
著者: Saswat Das, Ferdinando Fioretto
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
This work addresses the computational challenge of enforcing privacy for agentic Large Language Models (LLMs), where privacy is governed by the contextual integrity framework. Indeed, existing defenses rely on LLM-mediated checking stages that add substantial latency and cost, and that can be undermined in multi-turn interactions through manipulation or benign-looking conversational scaffolding. Contrasting this background, this paper makes a key observation: internal representations associated with privacy-violating intent can be separated from benign requests using linear structure. Using this insight, the paper proposes NeuroFilter, a guardrail framework that operationalizes contextual integrity by mapping norm violations to simple directions in the model's activation space, enabling detection even when semantic filters are bypassed. The proposed filter is also extended to capture threats arising during long conversations using the concept of activation velocity, which measures cumulative drift in internal representations across turns. A comprehensive evaluation across over 150,000 interactions and covering models from 7B to 70B parameters, illustrates the strong performance of NeuroFilter in detecting privacy attacks while maintaining zero false positives on benign prompts, all while reducing the computational inference cost by several orders of magnitude when compared to LLM-based agentic privacy defenses.
diffusion
著者: Yuang Qi, Na Zhao, Qiyi Yao, Benlong Wu, Weiming Zhang, Nenghai Yu, Kejiang Chen
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Recent provably secure linguistic steganography (PSLS) methods rely on mainstream autoregressive language models (ARMs) to address historically challenging tasks, that is, to disguise covert communication as ``innocuous'' natural language communication. However, due to the characteristic of sequential generation of ARMs, the stegotext generated by ARM-based PSLS methods will produce serious error propagation once it changes, making existing methods unavailable under an active tampering attack. To address this, we propose a robust, provably secure linguistic steganography with diffusion language models (DLMs). Unlike ARMs, DLMs can generate text in a partially parallel manner, allowing us to find robust positions for steganographic embedding that can be combined with error-correcting codes. Furthermore, we introduce error correction strategies, including pseudo-random error correction and neighborhood search correction, during steganographic extraction. Theoretical proof and experimental results demonstrate that our method is secure and robust. It can resist token ambiguity in stegotext segmentation and, to some extent, withstand token-level attacks of insertion, deletion, and substitution.
著者: Aditi Gandhi, Aakankshya Das, Aswani Kumar Cherukuri
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The emergence of quantum computing poses a fundamental threat to current public key cryptographic systems. This threat is necessitating a transition to quantum resistant cryptographic alternatives in all the applications. In this work, we present the implementation of a practical hybrid end-to-end encryption system that combines classical and post-quantum cryptographic primitives to achieve both security and efficiency. Our system employs CRYSTALS-Kyber, a NIST-standardized lattice-based key encapsulation mechanism, for quantum-safe key exchange, coupled with AES-256-GCM for efficient authenticated symmetric encryption and SHA-256 for deterministic key derivation. The architecture follows a zero-trust model where a relay server facilitates communication without accessing plaintext messages or cryptographic keys. All encryption and decryption operations occur exclusively at client endpoints. The system demonstrates that NIST standardized post-quantum cryptography can be effectively integrated into practical messaging systems with acceptable performance characteristics, offering protection against both classical and quantum adversaries. As our focus is on implementation rather than on novelty, we also provide an open-source implementation to facilitate reproducibility and further research in post quantum secure communication systems.
agent
著者: David Ricardo Saavedra
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Verifiable delegation in digital identity systems remains unresolved across centralized, federated, and self-sovereign identity (SSI) environments, particularly where both human users and autonomous AI agents must exercise and transfer authority without exposing primary credentials or private keys. We introduce a unified framework that enables bounded, auditable, and least-privilege delegation across heterogeneous identity ecosystems. The framework includes four key elements: Delegation Grants (DGs), first-class authorization artefacts that encode revocable transfers of authority with enforced scope reduction; a Canonical Verification Context (CVC) that normalizes verification requests into a single structured representation independent of protocols or credential formats; a layered reference architecture that separates trust anchoring, credential and proof validation, policy evaluation, and protocol mediation via a Trust Gateway; and an explicit treatment of blockchain anchoring as an optional integrity layer rather than a structural dependency. Together, these elements advance interoperable delegation and auditability and provide a foundation for future standardization, implementation, and integration of autonomous agents into trusted digital identity infrastructures.
著者: Jannik Albrecht, Ghassan Karame
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
While the literature features a number of proposals to defend against transaction manipulation attacks, existing proposals are still not integrated within large blockchains, such as Bitcoin, Ethereum, and Cardano. Instead, the user community opted to rely on more practical but ad-hoc solutions (such as Mempool.space) that aim at detecting censorship and transaction displacement attacks by auditing discrepancies in the mempools of so-called observers. In this paper, we precisely analyze, for the first time, the interplay between mempool auditing and the ability to detect censorship and transaction displacement attacks by malicious miners in Bitcoin and Ethereum. Our analysis shows that mempool auditing can result in mis-accusations against miners with a probability larger than 25% in some settings. On a positive note, however, we show that mempool auditing schemes can successfully audit the execution of any two transactions (with an overwhelming probability of 99.9%) if they are consistently received by all observers and sent at least 30 seconds apart from each other. As a direct consequence, our findings show, for the first time, that batch-order fair-ordering schemes can offer only strong fairness guarantees for a limited subset of transactions in real-world deployments.
著者: Isaac Baglin, Xiatian Zhu, Simon Hadfield
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Traditional defenses against Deep Leakage (DL) attacks in Federated Learning (FL) primarily focus on obfuscation, introducing noise, transformations or encryption to degrade an attacker's ability to reconstruct private data. While effective to some extent, these methods often still leak high-level information such as class distributions or feature representations, and are frequently broken by increasingly powerful denoising attacks. We propose a fundamentally different perspective on FL defense: framing it as a spoofing problem.We introduce SpooFL (Figure 1), a spoofing-based defense that deceives attackers into believing they have recovered the true training data, while actually providing convincing but entirely synthetic samples from an unrelated task. Unlike prior synthetic-data defenses that share classes or distributions with the private data and thus still leak semantic information, SpooFL uses a state-of-the-art generative model trained on an external dataset with no class overlap. As a result, attackers are misled into recovering plausible yet completely irrelevant samples, preventing meaningful data leakage while preserving FL training integrity. We implement the first example of such a spoofing defense, and evaluate our method against state-of-the-art DL defenses and demonstrate that it successfully misdirects attackers without compromising model performance significantly.
著者: Lorenzo Fern\'andez Maim\'o, Alberto Huertas Celdr\'an, Manuel Gil P\'erez, F\'elix J. Garc\'ia Clemente, Gregorio Mart\'inez P\'erez
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Fog and mobile edge computing (MEC) will play a key role in the upcoming fifth generation (5G) mobile networks to support decentralized applications, data analytics and management into the network itself by using a highly distributed compute model. Furthermore, increasing attention is paid to providing user-centric cybersecurity solutions, which particularly require collecting, processing and analyzing significantly large amount of data traffic and huge number of network connections in 5G networks. In this regard, this paper proposes a MEC-oriented solution in 5G mobile networks to detect network anomalies in real-time and in autonomic way. Our proposal uses deep learning techniques to analyze network flows and to detect network anomalies. Moreover, it uses policies in order to provide an efficient and dynamic management system of the computing resources used in the anomaly detection process. The paper presents relevant aspects of the deployment of the proposal and experimental results to show its performance.
著者: Piyumi Bhagya Sudasinghe, Kushan Sudheera Kalupahana Liyanage, Harsha S. Gardiyawasam Pussewalage
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The rapid growth of Internet of Things (IoT) devices has increased the scale and diversity of cyberattacks, exposing limitations in traditional intrusion detection systems. Classical machine learning (ML) models such as Random Forest and Support Vector Machine perform well on known attacks but require retraining to detect unseen or zero-day threats. This study investigates lightweight decoder-only Large Language Models (LLMs) for IoT attack detection by integrating structured-to-text conversion, Quantized Low-Rank Adaptation (QLoRA) fine-tuning, and Retrieval-Augmented Generation (RAG). Network traffic features are transformed into compact natural-language prompts, enabling efficient adaptation under constrained hardware. Experiments on the CICIoT2023 dataset show that a QLoRA-tuned LLaMA-1B model achieves an F1-score of 0.7124, comparable to the Random Forest (RF) baseline (0.7159) for known attacks. With RAG, the system attains 42.63% accuracy on unseen attack types without additional training, demonstrating practical zero-shot capability. These results highlight the potential of retrieval-enhanced lightweight LLMs as adaptable and resource-efficient solutions for next-generation IoT intrusion detection.
diffusion
著者: Ruben Neyroud, Sam Corley
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
While most LLMs are autoregressive, diffusion-based LLMs have recently emerged as an alternative method for generation. Greedy Coordinate Gradient (GCG) attacks have proven effective against autoregressive models, but their applicability to diffusion language models remains largely unexplored. In this work, we present an exploratory study of GCG-style adversarial prompt attacks on LLaDA (Large Language Diffusion with mAsking), an open-source diffusion LLM. We evaluate multiple attack variants, including prefix perturbations and suffix-based adversarial generation, on harmful prompts drawn from the AdvBench dataset. Our study provides initial insights into the robustness and attack surface of diffusion language models and motivates the development of alternative optimization and evaluation strategies for adversarial analysis in this setting.
著者: Ekleen Kaur
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The integration of cryptocurrencies into institutional portfolios necessitates the adoption of robust risk modeling frameworks. This study is a part of a series of subsequent works to fine-tune model risk analysis for cryptocurrencies. Through this first research work, we establish a foundational benchmark by applying the traditional industry-standard Geometric Brownian Motion (GBM) model. Popularly used for non-crypto financial assets, GBM assumes Lognormal return distributions for a multi-asset cryptocurrency portfolio (XRP, SOL, ADA). This work utilizes Maximum Likelihood Estimation and a correlated Monte Carlo Simulation incorporating the Cholesky decomposition of historical covariance. We present our stock portfolio model as a Minimum Variance Portfolio (MVP). We observe the model's structural shift within the heavy-tailed, non-Gaussian cryptocurrency environment. The results reveal limitations of the Lognormal assumption: the calculated Value-at-Risk at the 5% confidence level over the one-year horizon. For baselining our results, we also present a holistic comparative analysis with an equity portfolio (AAPL, TSLA, NVDA), demonstrating a significantly lower failure rate. This performance provides conclusive evidence that the GBM model is fundamentally the perfect benchmark for our subsequent works. Results from this novel work will be an indicator for the success criteria in our future model for crypto risk management, rigorously motivating the development and application of advanced models.
著者: Jun Liu, Leo Yu Zhang, Fengpeng Li, Isao Echizen, Jiantao Zhou
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Hard-label black-box settings, where only top-1 predicted labels are observable, pose a fundamentally constrained yet practically important feedback model for understanding model behavior. A central challenge in this regime is whether meaningful gradient information can be recovered from such discrete responses. In this work, we develop a unified theoretical perspective showing that a wide range of existing sign-flipping hard-label attacks can be interpreted as implicitly approximating the sign of the true loss gradient. This observation reframes hard-label attacks from heuristic search procedures into instances of gradient sign recovery under extremely limited feedback. Motivated by this first-principles understanding, we propose a new attack framework that combines a zero-query frequency-domain initialization with a Pattern-Driven Optimization (PDO) strategy. We establish theoretical guarantees demonstrating that, under mild assumptions, our initialization achieves higher expected cosine similarity to the true gradient sign compared to random baselines, while the proposed PDO procedure attains substantially lower query complexity than existing structured search approaches. We empirically validate our framework through extensive experiments on CIFAR-10, ImageNet, and ObjectNet, covering standard and adversarially trained models, commercial APIs, and CLIP-based models. The results show that our method consistently surpasses SOTA hard-label attacks in both attack success rate and query efficiency, particularly in low-query regimes. Beyond image classification, our approach generalizes effectively to corrupted data, biomedical datasets, and dense prediction tasks. Notably, it also successfully circumvents Blacklight, a SOTA stateful defense, resulting in a $0\%$ detection rate. Our code will be released publicly soon at https://github.com/csjunjun/DPAttack.git.
著者: Madjda Fares, Yogya Gamage, Benoit Baudry
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
GitHub Actions is a widely used platform that allows developers to automate the build and deployment of their projects through configurable workflows. As the platform's popularity continues to grow, it has become a target of choice for recent software supply chain attacks. These attacks exploit excessive permissions, ambiguous versions, or the absence of artifact integrity checks to compromise workflows. In response to these attacks, several security scanners have emerged to help developers harden their workflows. In this paper, we perform the first systematic comparison of 9 GitHub Actions workflow security scanners. We compare them in terms of scope (which security weaknesses they target), detection capabilities (how many weaknesses they detect), and usability (how long they take to scan a workflow). To compare scanners on a common ground, we first establish a taxonomy of 10 security weaknesses that can occur in GitHub Actions workflows. Then, we run the scanners against a curated set of 596 workflows. Our study reveals that the landscape of GitHub Actions workflow security scanners is diverse, with both broad-scope tools and very focused ones. More importantly, we show that scanners interpret security weaknesses differently, leading to significant differences in the type and number of reported weaknesses. Based on this empirical evidence, we make actionable recommendations for developers to harden their GitHub Actions workflows.
著者: Andrew Crossman, Jonah Dodd, Viralam Ramamurthy Chaithanya Kumar, Riyaz Mohammed, Andrew R. Plummer, Chandra Sekharudu, Deepak Warrier, Mohammad Yekrangian
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
MITRE ATT&amp;CK is a cybersecurity knowledge base that organizes threat actor and cyber-attack information into a set of tactics describing the reasons and goals threat actors have for carrying out attacks, with each tactic having a set of techniques that describe the potential methods used in these attacks. One major application of ATT&amp;CK is the use of its tactic and technique hierarchy by security specialists as a framework for annotating cyber-threat intelligence reports, vulnerability descriptions, threat scenarios, inter alia, to facilitate downstream analyses. To date, the tagging process is still largely done manually. In this technical note, we provide a stratified "task space" characterization of the MITRE ATT&amp;CK text tagging task for organizing previous efforts toward automation using AIML methods, while also clarifying pathways for constructing new methods. To illustrate one of the pathways, we use the task space strata to stage-wise construct our own multi-label hierarchical classification models for the text tagging task via experimentation over general cyber-threat intelligence text -- using shareable computational tools and publicly releasing the models to the security community (via https://github.com/jpmorganchase/MITRE_models). Our multi-label hierarchical approach yields accuracy scores of roughly 94% at the tactic level, as well as accuracy scores of roughly 82% at the technique level. The models also meet or surpass state-of-the-art performance while relying only on classical machine learning methods -- removing any dependence on LLMs, RAG, agents, or more complex hierarchical approaches. Moreover, we show that GPT-4o model performance at the tactic level is significantly lower (roughly 60% accuracy) than our own approach. We also extend our baseline model to a corpus of threat scenarios for financial applications produced by subject matter experts.
agent
著者: Roland R. Rodriguez Jr
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Multi-agent systems face a fundamental architectural flaw: agent identity is bound to network location. When agents migrate between providers, scale across instances, or federate across organizations, URI-based identity schemes break references, fragment audit trails, and require centralized coordination. We propose the agent:// URI scheme, which decouples identity from topology through three orthogonal components: a trust root establishing organizational authority, a hierarchical capability path enabling semantic discovery, and a sortable unique identifier providing stable reference. The scheme enables capability-based discovery through DHT key derivation, where queries return agents by what they do rather than where they are. Trust-root scoping prevents cross-organization pollution while permitting federation when desired. Cryptographic attestation via PASETO tokens binds capability claims to agent identity, enabling verification without real-time contact with the issuing authority. We evaluate the scheme across four dimensions: capability expressiveness (100% coverage on 369 production tools with zero collision), discovery precision (F1=1.0 across 10,000 agents), identity stability (formal proofs of migration invariance), and performance (all operations under 5 microseconds). The agent:// URI scheme provides a formally-specified, practically-evaluated foundation for decentralized agent identity and capability-based discovery.
privacy
著者: James Melbourne, Mario Diaz, Shahab Asoodeh
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
We study the optimal design of additive mechanisms for vector-valued queries under $\epsilon$-differential privacy (DP). Given only the sensitivity of a query and a norm-monotone cost function measuring utility loss, we ask which noise distribution minimizes expected cost among all additive $\epsilon$-DP mechanisms. Using convex rearrangement theory, we show that this infinite-dimensional optimization problem admits a reduction to a one-dimensional compact and convex family of radially symmetric distributions whose extreme points are the staircase distributions. As a consequence, we prove that for any dimension, any norm, and any norm-monotone cost function, there exists an $\epsilon$-DP staircase mechanism that is optimal among all additive mechanisms. This result resolves a conjecture of Geng, Kairouz, Oh, and Viswanath, and provides a geometric explanation for the emergence of staircase mechanisms as extremal solutions in differential privacy.
著者: Zhihao Chen, Zirui Gong, Jianting Ning, Yanjun Zhang, Leo Yu Zhang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Federated Rank Learning (FRL) is a promising Federated Learning (FL) paradigm designed to be resilient against model poisoning attacks due to its discrete, ranking-based update mechanism. Unlike traditional FL methods that rely on model updates, FRL leverages discrete rankings as a communication parameter between clients and the server. This approach significantly reduces communication costs and limits an adversary's ability to scale or optimize malicious updates in the continuous space, thereby enhancing its robustness. This makes FRL particularly appealing for applications where system security and data privacy are crucial, such as web-based auction and bidding platforms. While FRL substantially reduces the attack surface, we demonstrate that it remains vulnerable to a new class of local model poisoning attack, i.e., fine-grained control attacks. We introduce the Edge Control Attack (ECA), the first fine-grained control attack tailored to ranking-based FL frameworks. Unlike conventional denial-of-service (DoS) attacks that cause conspicuous disruptions, ECA enables an adversary to precisely degrade a competitor's accuracy to any target level while maintaining a normal-looking convergence trajectory, thereby avoiding detection. ECA operates in two stages: (i) identifying and manipulating Ascending and Descending Edges to align the global model with the target model, and (ii) widening the selection boundary gap to stabilize the global model at the target accuracy. Extensive experiments across seven benchmark datasets and nine Byzantine-robust aggregation rules (AGRs) show that ECA achieves fine-grained accuracy control with an average error of only 0.224%, outperforming the baseline by up to 17x. Our findings highlight the need for stronger defenses against advanced poisoning attacks. Our code is available at: https://github.com/Chenzh0205/ECA
著者: Yoann Marquer, Domenico Bianculli, Lionel C. Briand
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Python is one of the most popular programming languages; as such, projects written in Python involve an increasing number of diverse security vulnerabilities. However, existing state-of-the-art analysis tools for Python only support a few vulnerability types. Hence, there is a need to detect a large variety of vulnerabilities in Python projects. In this paper, we propose the SAGA approach to detect and locate vulnerabilities in Python source code in a versatile way. SAGA includes a source code parser able to extract control- and data-flow information and to represent it as a symbolic control-flow graph, as well as a domain-specific language defining static aspects of the source code and their evolution during graph traversals. We have leveraged this language to define a library of static aspects for integrity, confidentiality, and other security-related properties. We have evaluated SAGA on a dataset of 108 vulnerabilities, obtaining 100% sensitivity and 99.15% specificity, with only one false positive, while outperforming four common security analysis tools. This analysis was performed in less than 31 seconds, i.e., between 2.5 and 512.1 times faster than the baseline tools.
著者: Rob Jansen, Ryan Wails, Aaron Johnson
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Website fingerprinting (WF) is a dangerous attack on web privacy because it enables an adversary to predict the website a user is visiting, despite the use of encryption, VPNs, or anonymizing networks such as Tor. Previous WF work almost exclusively uses synthetic datasets to evaluate the performance and estimate the feasibility of WF attacks despite evidence that synthetic data misrepresents the real world. In this paper we present GTT23, the first WF dataset of genuine Tor traces, which we obtain through a large-scale measurement of the Tor network and which is intended especially for WF. It represents real Tor user behavior better than any existing WF dataset, is larger than any existing WF dataset by at least an order of magnitude, and will help ground the future study of realistic WF attacks and defenses. In a detailed evaluation, we survey 28 WF datasets published since 2008 and compare their characteristics to those of GTT23. We discover common deficiencies of synthetic datasets that make them inferior to GTT23 for drawing meaningful conclusions about the effectiveness of WF attacks directed at real Tor users. We have made GTT23 available to promote reproducible research and to help inspire new directions for future work.
model extractionintellectual property
著者: Yixiao Xu, Binxing Fang, Rui Wang, Yinghai Zhou, Yuan Liu, Mohan Li, Zhihong Tian
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Triggerable watermarking enables model owners to assert ownership against model extraction attacks. However, most existing approaches require additional training, which limits post-deployment flexibility, and the lack of clear theoretical foundations makes them vulnerable to adaptive attacks. In this paper, we propose Neural Honeytrace, a plug-and-play watermarking framework that operates without retraining. We redefine the watermark transmission mechanism from an information perspective, designing a training-free multi-step transmission strategy that leverages the long-tailed effect of backdoor learning to achieve efficient and robust watermark embedding. Extensive experiments demonstrate that Neural Honeytrace reduces the average number of queries required for a worst-case t-test-based ownership verification to as low as $2\%$ of existing methods, while incurring zero training cost.
著者: Jean-Charles Noirot Ferrand, Yohan Beugin, Eric Pauley, Ryan Sheatsley, Patrick McDaniel
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Alignment in large language models (LLMs) is used to enforce guidelines such as safety. Yet, alignment fails in the face of jailbreak attacks that modify inputs to induce unsafe outputs. In this paper, we introduce and evaluate a new technique for jailbreak attacks. We observe that alignment embeds a safety classifier in the LLM responsible for deciding between refusal and compliance, and seek to extract an approximation of this classifier: a surrogate classifier. To this end, we build candidate classifiers from subsets of the LLM. We first evaluate the degree to which candidate classifiers approximate the LLM's safety classifier in benign and adversarial settings. Then, we attack the candidates and measure how well the resulting adversarial inputs transfer to the LLM. Our evaluation shows that the best candidates achieve accurate agreement (an F1 score above 80%) using as little as 20% of the model architecture. Further, we find that attacks mounted on the surrogate classifiers can be transferred to the LLM with high success. For example, a surrogate using only 50% of the Llama 2 model achieved an attack success rate (ASR) of 70% with half the memory footprint and runtime -- a substantial improvement over attacking the LLM directly, where we only observed a 22% ASR. These results show that extracting surrogate classifiers is an effective and efficient means for modeling (and therein addressing) the vulnerability of aligned models to jailbreaking attacks. The code is available at https://github.com/jcnf0/targeting-alignment.
intellectual property
著者: Zhenliang Gan, Chunya Liu, Yichao Tang, Binghao Wang, Shiwen Cui, Weiqiang Wang, Xinpeng Zhang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The proliferation of generative image models has revolutionized AIGC creation while amplifying concerns over content provenance and manipulation forensics. Existing methods are typically either unable to localize tampering or restricted to specific generative settings, limiting their practical utility. We propose \textbf{GenPTW}, a \textbf{Gen}eral watermarking framework that unifies \textbf{P}rovenance tracing and \textbf{T}amper localization in latent space. It supports both in-generation and post-generation embedding without altering the generative process, and is plug-and-play compatible with latent diffusion models (LDMs) and visual autoregressive (VAR) models. To achieve precise provenance tracing and tamper localization, we embed the watermark using two complementary mechanisms: cross-attention fusion aligned with latent semantics and spatial fusion providing explicit spatial guidance for edit sensitivity. A tamper-aware extractor jointly conducts provenance tracing and tamper localization by leveraging watermark features together with high-frequency features. Experiments show that GenPTW maintains high visual fidelity and strong robustness against diverse AIGC-editing.
backdoor
著者: Viet Pham, Thai Le
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Large Language Models (LLMs) are increasingly deployed via third-party system prompts downloaded from public marketplaces. We identify a critical supply-chain vulnerability: conditional system prompt poisoning, where an adversary injects a ``sleeper agent'' into a benign-looking prompt. Unlike traditional jailbreaks that aim for broad refusal-breaking, our proposed framework, SPECTRE, optimizes system prompts to trigger LLMs to output targeted, compromised responses only for specific queries (e.g., ``Who should I vote for the US President?'') while maintaining high utility on benign inputs. Operating in a strict black-box setting without model weight access, SPECTRE utilizes a two-stage optimization including a global semantic search followed by a greedy lexical refinement. Tested on open-source models and commercial APIs (GPT-4o-mini, GPT-3.5), SPECTRE achieves up to 70% F1 reduction on targeted queries with minimal degradation to general capabilities. We further demonstrate that these poisoned prompts evade standard defenses, including perplexity filters and typo-correction, by exploiting the natural noise found in real-world system prompts. Our code and data are available at https://github.com/vietph34/CAIN. WARNING: Our paper contains examples that might be sensitive to the readers!
著者: Alexander Herzog, Aliai Eusebi, Lorenzo Cavallaro
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The performance figures of modern drift-adaptive malware classifiers appear promising, but does this translate to genuine operational reliability? The standard evaluation paradigm primarily focuses on baseline performance metrics, neglecting confidence-error alignment and operational stability. While prior works established the importance of temporal evaluation and introduced selective classification in malware classification tasks, we take a complementary direction by investigating whether malware classifiers maintain reliable and stable confidence estimates under distribution shifts and exploring the tensions between scientific advancement and practical impacts when they do not. We propose Aurora, a framework to evaluate malware classifiers based on their confidence quality and operational resilience. Aurora subjects the confidence profile of a given model to verification to assess the reliability of its estimates. Unreliable confidence estimates erode operational trust, waste valuable annotation budgets on non-informative samples for active learning, and leave error-prone instances undetected in selective classification. Aurora is further complemented by a set of metrics designed to go beyond point-in-time performance, striving towards a more holistic assessment of operational stability throughout temporal evaluation periods. The fragility we observe in SOTA frameworks across datasets of varying drift severity suggests it may be time to revisit the underlying assumptions.
著者: Lingsheng Chen, Shipeng Ye, Xiaoqi Li
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
With the development of the Internet, the amount of data generated by the medical industry each year has grown exponentially. The Electronic Health Record (EHR) manages the electronic data generated during the user's treatment process. Typically, an EHR data manager belongs to a medical institution. This traditional centralized data management model has many unreasonable or inconvenient aspects, such as difficulties in data sharing, and it is hard to verify the authenticity and integrity of the data. The decentralized, non-forgeable, data unalterable and traceable features of blockchain are in line with the application requirements of EHR. This paper takes the most common COVID-19 as the application scenario and designs a COVID-19 health system based on blockchain, which has extensive research and application value. Considering that the public and transparent nature of blockchain violates the privacy requirements of some health data, in the system design stage, from the perspective of practical application, the data is divided into public data and private data according to its characteristics. For private data, data encryption methods are adopted to ensure data privacy. The searchable encryption technology is combined with blockchain technology to achieve the retrieval function of encrypted data. Then, the proxy re-encryption technology is used to realize authorized access to data. In the system implementation part, based on the Hyperledger Fabric architecture, some functions of the system design are realized, including data upload, retrieval of the latest data and historical data. According to the environment provided by the development architecture, Go language chaincode (smart contract) is written to implement the relevant system functions.
diffusion
著者: Shiqian Zhao, Chong Wang, Yiming Li, Yihao Huang, Wenjie Qu, Siew-Kei Lam, Yi Xie, Kangjie Chen, Jie Zhang, Tianwei Zhang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Text-to-Image (T2I) models, represented by DALL$\cdot$E and Midjourney, have gained huge popularity for creating realistic images. The quality of these images relies on the carefully engineered prompts, which have become valuable intellectual property. While skilled prompters showcase their AI-generated art on markets to attract buyers, this business incidentally exposes them to \textit{prompt stealing attacks}. Existing state-of-the-art attack techniques reconstruct the prompts from a fixed set of modifiers (i.e., style descriptions) with model-specific training, which exhibit restricted adaptability and effectiveness to diverse showcases (i.e., target images) and diffusion models. To alleviate these limitations, we propose Prometheus, a training-free, proxy-in-the-loop, search-based prompt-stealing attack, which reverse-engineers the valuable prompts of the showcases by interacting with a local proxy model. It consists of three innovative designs. First, we introduce dynamic modifiers, as a supplement to static modifiers used in prior works. These dynamic modifiers provide more details specific to the showcases, and we exploit NLP analysis to generate them on the fly. Second, we design a contextual matching algorithm to sort both dynamic and static modifiers. This offline process helps reduce the search space of the subsequent step. Third, we interact with a local proxy model to invert the prompts with a greedy search algorithm. Based on the feedback guidance, we refine the prompt to achieve higher fidelity. The evaluation results show that Prometheus successfully extracts prompts from popular platforms like PromptBase and AIFrog against diverse victim models, including Midjourney, Leonardo.ai, and DALL$\cdot$E, with an ASR improvement of 25.0\%. We also validate that Prometheus is resistant to extensive potential defenses, further highlighting its severity in practice.
著者: Yangheran Piao (University of Edinburgh), Jingjie Li (University of Edinburgh), Daniel W. Woods (University of Edinburgh)
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
As vendors adopt AI technologies, security researchers are working to uncover and fix related vulnerabilities, which is important given AI systems handle sensitive data and critical functions. This process relies on vendors receiving and rewarding AI vulnerability reports. To assess current practices, we analyzed the vulnerability disclosure policies of 264 AI vendors. We employed a mixed-methods approach, combining snapshot and longitudinal qualitative analysis, as well as comparing alignment with 320 AI incidents and 260 academic articles. Our analysis reveals that 36% of AI vendors have no established policy, and only 18% mention AI risks. Data access, authorization, and model extraction vulnerabilities are most consistently declared in-scope. Jailbreaking and hallucination are most commonly declared out-of-scope. We identify three profiles that reflect vendors' different positions toward AI vulnerabilities: proactive clarification (n = 46), silent (n = 115), and restrictive (n = 103). Our alignment results suggest that vendors may address AI vulnerability disclosure later than academic research and real-world incidents.
著者: Shuo Shao, Yiming Li, Hongwei Yao, Yifei Chen, Yuchen Yang, Zhan Qin
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The substantial investment required to develop Large Language Models (LLMs) makes them valuable intellectual property, raising significant concerns about copyright protection. LLM fingerprinting has emerged as a key technique to address this, which aims to verify a model's origin by extracting an intrinsic, unique signature (a "fingerprint") and comparing it to that of a source model to identify illicit copies. However, existing black-box fingerprinting methods often fail to generate distinctive LLM fingerprints. This ineffectiveness arises because black-box methods typically rely on model outputs, which lose critical information about the model's unique parameters due to the usage of non-linear functions. To address this, we first leverage Fisher Information Theory to formally demonstrate that the gradient of the model's input is a more informative feature for fingerprinting than the output. Based on this insight, we propose ZeroPrint, a novel method that approximates these information-rich gradients in a black-box setting using zeroth-order estimation. ZeroPrint overcomes the challenge of applying this to discrete text by simulating input perturbations via semantic-preserving word substitutions. This operation allows ZeroPrint to estimate the model's Jacobian matrix as a unique fingerprint. Experiments on the standard benchmark show ZeroPrint achieves a state-of-the-art effectiveness and robustness, significantly outperforming existing black-box methods.
privacy
著者: Yi Liu, Weixiang Han, Chengjun Cai, Xingliang Yuan, Cong Wang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
With the rise of large language models, service providers offer language models as a service, enabling users to fine-tune customized models via uploaded private datasets. However, this raises concerns about sensitive data leakage. Prior methods, relying on differential privacy within device-cloud collaboration frameworks, struggle to balance privacy and utility, exposing users to inference attacks or degrading fine-tuning performance. To address this, we propose PrivTune, an efficient and privacy-preserving fine-tuning framework via Split Learning (SL). The key idea of PrivTune is to inject crafted noise into token representations from the SL bottom model, making each token resemble the $n$-hop indirect neighbors. PrivTune formulates this as an optimization problem to compute the optimal noise vector, aligning with defense-utility goals. On this basis, it then adjusts the parameters (i.e., mean) of the $d_\chi$-Privacy noise distribution to align with the optimization direction and scales the noise according to token importance to minimize distortion. Experiments on five datasets (covering both classification and generation tasks) against three embedding inversion and three attribute inference attacks show that, using RoBERTa on the Stanford Sentiment Treebank dataset, PrivTune reduces the attack success rate to 10% with only a 3.33% drop in utility performance, outperforming state-of-the-art baselines.
intellectual property
著者: Zhenhua Xu, Yiran Zhao, Mengting Zhong, Dezhang Kong, Changting Lin, Tong Qiao, Meng Han
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare tokens -- leading to high-perplexity inputs susceptible to filtering -- or use fixed trigger-response mappings that are brittle to leakage and post-hoc adaptation. We propose \textsc{Dual-Layer Nested Fingerprinting} (DNF), a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility. Compared with existing methods, it uses lower-perplexity triggers, remains undetectable under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging. These results position DNF as a practical, stealthy, and resilient solution for LLM ownership verification and intellectual property protection.
著者: Haoze Guo, Ziqi Wei
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Retrieval-augmented generation (RAG) systems put more and more emphasis on grounding their responses in user-generated content found on the Web, amplifying both their usefulness and their attack surface. Most notably, indirect prompt injection and retrieval poisoning attack the web-native carriers that survive ingestion pipelines and are very concerning. We provide OpenRAG-Soc, a compact, reproducible benchmark-and-harness for web-facing RAG evaluation under these threats, in a discrete data package. The suite combines a social corpus with interchangeable sparse and dense retrievers and deployable mitigations - HTML/Markdown sanitization, Unicode normalization, and attribution-gated answered. It standardizes end-to-end evaluation from ingestion to generation and reports attacks time of one of the responses at answer time, rank shifts in both sparse and dense retrievers, utility and latency, allowing for apples-to-apples comparisons across carriers and defenses. OpenRAG-Soc targets practitioners who need fast, and realistic tests to track risk and harden deployments.
model extraction
著者: Zhenhua Xu, Xiaoning Tian, Wenjun Zeng, Wenpeng Xing, Tianliang Lu, Gaolei Li, Chaochao Chen, Meng Han
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Protecting the intellectual property of large language models requires robust ownership verification. Conventional backdoor fingerprinting, however, is flawed by a stealth-robustness paradox: to be robust, these methods force models to memorize fixed responses to high-perplexity triggers, but this targeted overfitting creates detectable statistical artifacts. We resolve this paradox with KinGuard, a framework that embeds a private knowledge corpus built on structured kinship narratives. Instead of memorizing superficial triggers, the model internalizes this knowledge via incremental pre-training, and ownership is verified by probing its conceptual understanding. Extensive experiments demonstrate KinGuard's superior effectiveness, stealth, and resilience against a battery of attacks including fine-tuning, input perturbation, and model merging. Our work establishes knowledge-based embedding as a practical and secure paradigm for model fingerprinting.
著者: Shuhei Nakamura
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
In the complexity estimation for an attack that reduces a cryptosystem to solving a system of polynomial equations, the degree of regularity and an upper bound of the first fall degree are often used in cryptanalysis. While the degree of regularity can be easily computed using a univariate formal power series under the semi-regularity assumption, determining an upper bound of the first fall degree requires investigating the concrete syzygies of an input system. In this paper, we investigate an upper bound of the first fall degree for a polynomial system over a sufficiently large field. In this case, we prove that the first fall degree of a non-semi-regular system is bounded above by the degree of regularity, and that the first fall degree of a multi-graded polynomial system is bounded above by a certain value determined from a multivariate formal power series. Moreover, we provide a theoretical assumption for computing the first fall degree of a polynomial system over a sufficiently large field.
著者: Kirsten Eisentraeger, Gabrielle Scullard
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
We give a deterministic polynomial time algorithm to compute the endomorphism ring of a supersingular elliptic curve in characteristic p, provided that we are given two noncommuting endomorphisms and the factorization of the discriminant of the ring $\mathcal{O}_0$ they generate. At each prime $q$ for which $\mathcal{O}_0$ is not maximal, we compute the endomorphism ring locally by computing a q-maximal order containing it and, when $q \neq p$, recovering a path to $\text{End}(E) \otimes \mathbb{Z}_q$ in the Bruhat-Tits tree. We use techniques of higher-dimensional isogenies to navigate towards the local endomorphism ring. Our algorithm improves on a previous algorithm which requires a restricted input and runs in subexponential time under certain heuristics. Page and Wesolowski give a probabilistic polynomial time algorithm to compute the endomorphism ring on input of a single non-scalar endomorphism. Beyond using techniques of higher-dimensional isogenies to divide endomorphisms by a scalar, our methods are completely different.
intellectual property
著者: Grzegorz G{\l}uch, Berkant Turan, Sai Ganesh Nagarajan, Sebastian Pokutta
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
We formalize and analyze the trade-off between backdoor-based watermarks and adversarial defenses, framing it as an interactive protocol between a verifier and a prover. While previous works have primarily focused on this trade-off, our analysis extends it by identifying transferable attacks as a third, counterintuitive, but necessary option. Our main result shows that for all learning tasks, at least one of the three exists: a watermark, an adversarial defense, or a transferable attack. By transferable attack, we refer to an efficient algorithm that generates queries indistinguishable from the data distribution and capable of fooling all efficient defenders. Using cryptographic techniques, specifically fully homomorphic encryption, we construct a transferable attack and prove its necessity in this trade-off. Finally, we show that tasks of bounded VC-dimension allow adversarial defenses against all attackers, while a subclass allows watermarks secure against fast adversaries.
著者: Rui Chen, Bin Liu, Changtao Miao, Xinghao Wang, Yi Li, Tao Gong, Qi Chu, Nenghai Yu
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Advances in image tampering pose serious security threats, underscoring the need for effective image manipulation localization (IML). While supervised IML achieves strong performance, it depends on costly pixel-level annotations. Existing weakly supervised or training-free alternatives often underperform and lack interpretability. We propose the In-Context Forensic Chain (ICFC), a training-free framework that leverages multi-modal large language models (MLLMs) for interpretable IML tasks. ICFC integrates an objectified rule construction with adaptive filtering to build a reliable knowledge base and a multi-step progressive reasoning pipeline that mirrors expert forensic workflows from coarse proposals to fine-grained forensics results. This design enables systematic exploitation of MLLM reasoning for image-level classification, pixel-level localization, and text-level interpretability. Across multiple benchmarks, ICFC not only surpasses state-of-the-art training-free methods but also achieves competitive or superior performance compared to weakly and fully supervised approaches.
intellectual property
著者: Jiasen Li, Yanwei Liu, Zhuoyi Shang, Xiaoyan Gu, Weiping Wang
公開日: Thu, 22 Jan 2026 00:00:00 -0500
要約:
Graph-structured data is foundational to numerous web applications, and watermarking is crucial for protecting their intellectual property and ensuring data provenance. Existing watermarking methods primarily operate on graph structures or entangled graph representations, which compromise the transparency and robustness of watermarks due to the information coupling in representing graphs and uncontrollable discretization in transforming continuous numerical representations into graph structures. This motivates us to propose DRGW, the first graph watermarking framework that addresses these issues through disentangled representation learning. Specifically, we design an adversarially trained encoder that learns an invariant structural representation against diverse perturbations and derives a statistically independent watermark carrier, ensuring both robustness and transparency of watermarks. Meanwhile, we devise a graph-aware invertible neural network to provide a lossless channel for watermark embedding and extraction, guaranteeing high detectability and transparency of watermarks. Additionally, we develop a structure-aware editor that resolves the issue of latent modifications into discrete graph edits, ensuring robustness against structural perturbations. Experiments on diverse benchmark datasets demonstrate the superior effectiveness of DRGW.
生成日時: 2026-01-22 18:00:03