arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Aplicacion de analitica de datos para la deteccion de anomalias y fortalecimiento de la seguridad en la red WiFi del campus universitario de la Universidad Nacional del Altiplano

著者: Adiv Brander Cari Quispe

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00798

要約:
In today's university environment, wireless connectivity is an essential resource for academic, administrative, and research activities. However, at the National University of the Altiplano of Puno (UNAP), the use of a QR code access system on the institutional Wi-Fi network has generated vulnerabilities related to the lack of individual authentication, user traceability, and access control. Given this situation, this study aims to strengthen the security of the university's wireless network through the application of data analytics, employing descriptive, predictive, and prescriptive approaches to the logs generated by the wireless controller (WLC). The methodology consisted of collecting and processing connection data from users, devices, and daily traffic, analyzing behavioral patterns, and detecting anomalies based on statistical models and machine learning algorithms. The results revealed critical usage peaks between 10:00 and 14:00, as well as anomalous behavior associated with recurring devices and irregular traffic spikes. This allowed for the establishment of dynamic alert thresholds and recommendations for improvements in bandwidth management and authentication. Furthermore, the conclusion states that integrating advanced analytics into the management of university networks not only identifies vulnerabilities and optimizes WiFi service performance, but also advances towards an intelligent, proactive infrastructure aligned with modern institutional cybersecurity standards.

#2 The Silicon Psyche: Anthropomorphic Vulnerabilities in Large Language Models

著者: Giuseppe Canale, Kashyap Thimmaraju

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00867

要約:
Large Language Models (LLMs) are rapidly transitioning from conversational assistants to autonomous agents embedded in critical organizational functions, including Security Operations Centers (SOCs), financial systems, and infrastructure management. Current adversarial testing paradigms focus predominantly on technical attack vectors: prompt injection, jailbreaking, and data exfiltration. We argue this focus is catastrophically incomplete. LLMs, trained on vast corpora of human-generated text, have inherited not merely human knowledge but human \textit{psychological architecture} -- including the pre-cognitive vulnerabilities that render humans susceptible to social engineering, authority manipulation, and affective exploitation. This paper presents the first systematic application of the Cybersecurity Psychology Framework (\cpf{}), a 100-indicator taxonomy of human psychological vulnerabilities, to non-human cognitive agents. We introduce the \textbf{Synthetic Psychometric Assessment Protocol} (\sysname{}), a methodology for converting \cpf{} indicators into adversarial scenarios targeting LLM decision-making. Our preliminary hypothesis testing across seven major LLM families reveals a disturbing pattern: while models demonstrate robust defenses against traditional jailbreaks, they exhibit critical susceptibility to authority-gradient manipulation, temporal pressure exploitation, and convergent-state attacks that mirror human cognitive failure modes. We term this phenomenon \textbf{Anthropomorphic Vulnerability Inheritance} (AVI) and propose that the security community must urgently develop ``psychological firewalls'' -- intervention mechanisms adapted from the Cybersecurity Psychology Intervention Framework (\cpif{}) -- to protect AI agents operating in adversarial environments.

#3 Towards eco friendly cybersecurity: machine learning based anomaly detection with carbon and energy metrics

著者: KC Aashish, Md Zakir Hossain Zamil, Md Shafiqul Islam Mridul, Lamia Akter, Farmina Sharmin, Eftekhar Hossain Ayon, Md Maruf Bin Reza, Ali Hassan, Abdur Rahim, Sirapa Malla

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00893

要約:
The rising energy footprint of artificial intelligence has become a measurable component of US data center emissions, yet cybersecurity research seldom considers its environmental cost. This study introduces an eco aware anomaly detection framework that unifies machine learning based network monitoring with real time carbon and energy tracking. Using the publicly available Carbon Aware Cybersecurity Traffic Dataset comprising 2300 flow level observations, we benchmark Logistic Regression, Random Forest, Support Vector Machine, Isolation Forest, and XGBoost models across energy, carbon, and performance dimensions. Each experiment is executed in a controlled Colab environment instrumented with the CodeCarbon toolkit to quantify power draw and equivalent CO2 output during both training and inference. We construct an Eco Efficiency Index that expresses F1 score per kilowatt hour to capture the trade off between detection quality and environmental impact. Results reveal that optimized Random Forest and lightweight Logistic Regression models achieve the highest eco efficiency, reducing energy consumption by more than forty percent compared to XGBoost while sustaining competitive detection accuracy. Principal Component Analysis further decreases computational load with negligible loss in recall. Collectively, these findings establish that integrating carbon and energy metrics into cybersecurity workflows enables environmentally responsible machine learning without compromising operational protection. The proposed framework offers a reproducible path toward sustainable carbon accountable cybersecurity aligned with emerging US green computing and federal energy efficiency initiatives.

#4 Noise-Aware and Dynamically Adaptive Federated Defense Framework for SAR Image Target Recognition

著者: Yuchao Hou (Shanxi Normal University, Taiyuan, China), Zixuan Zhang (Shanxi Normal University, Taiyuan, China), Jie Wang (Shanxi Normal University, Taiyuan, China), Wenke Huang (Nanyang Technological University, Singapore, Singapore), Lianhui Liang (Guangxi University, Nanning, China), Di Wu (La Trobe University, Melbourne, Australia), Zhiquan Liu (Jinan University, Guangzhou, China), Youliang Tian (Guizhou University, Guiyang, China), Jianming Zhu (Central University of Finance and Economics, Beijing, China), Jisheng Dang (Lanzhou University, Lanzhou, China), Junhao Dong (Nanyang Technological University, Singapore, Singapore), Zhongliang Guo (University of St Andrews, St Andrews, United Kingdom)

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00900

要約:
As a critical application of computational intelligence in remote sensing, deep learning-based synthetic aperture radar (SAR) image target recognition facilitates intelligent perception but typically relies on centralized training, where multi-source SAR data are uploaded to a single server, raising privacy and security concerns. Federated learning (FL) provides an emerging computational intelligence paradigm for SAR image target recognition, enabling cross-site collaboration while preserving local data privacy. However, FL confronts critical security risks, where malicious clients can exploit SAR's multiplicative speckle noise to conceal backdoor triggers, severely challenging the robustness of the computational intelligence model. To address this challenge, we propose NADAFD, a noise-aware and dynamically adaptive federated defense framework that integrates frequency-domain, spatial-domain, and client-behavior analyses to counter SAR-specific backdoor threats. Specifically, we introduce a frequency-domain collaborative inversion mechanism to expose cross-client spectral inconsistencies indicative of hidden backdoor triggers. We further design a noise-aware adversarial training strategy that embeds $\Gamma$-distributed speckle characteristics into mask-guided adversarial sample generation to enhance robustness against both backdoor attacks and SAR speckle noise. In addition, we present a dynamic health assessment module that tracks client update behaviors across training rounds and adaptively adjusts aggregation weights to mitigate evolving malicious contributions. Experiments on MSTAR and OpenSARShip datasets demonstrate that NADAFD achieves higher accuracy on clean test samples and a lower backdoor attack success rate on triggered inputs than existing federated backdoor defenses for SAR target recognition.

#5 Security Hardening Using FABRIC: Implementing a Unified Compliance Aggregator for Linux Servers

著者: Sheldon Paul, Izzat Alsmadi

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00909

要約:
This paper presents a unified framework for evaluating Linux security hardening on the FABRIC testbed through aggregation of heterogeneous security auditing tools. We deploy three Ubuntu 22.04 nodes configured at baseline, partial, and full hardening levels, and evaluate them using Lynis, OpenSCAP, and AIDE across 108 audit runs. To address the lack of a consistent interpretation across tools, we implement a Unified Compliance Aggregator (UCA) that parses tool outputs, normalizes scores to a common 0--100 scale, and combines them into a weighted metric augmented by a customizable rule engine for organization-specific security policies. Experimental results show that full hardening increases OpenSCAP compliance from 39.7 to 71.8, while custom rule compliance improves from 39.3\% to 83.6\%. The results demonstrate that UCA provides a clearer and more reproducible assessment of security posture than individual tools alone, enabling systematic evaluation of hardening effectiveness in programmable testbed environments.

#6 Device-Native Autonomous Agents for Privacy-Preserving Negotiations

privacyagent

著者: Joyjit Roy

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00911

要約:
Automated negotiations in insurance and business-to-business (B2B) commerce encounter substantial challenges. Current systems force a trade-off between convenience and privacy by routing sensitive financial data through centralized servers, increasing security risks, and diminishing user trust. This study introduces a device-native autonomous Artificial Intelligence (AI) agent system for privacy-preserving negotiations. The proposed system operates exclusively on user hardware, enabling real-time bargaining while maintaining sensitive constraints locally. It integrates zero-knowledge proofs to ensure privacy and employs distilled world models to support advanced on-device reasoning. The architecture incorporates six technical components within an agentic AI workflow. Agents autonomously plan negotiation strategies, conduct secure multi-party bargaining, and generate cryptographic audit trails without exposing user data to external servers. The system is evaluated in insurance and B2B procurement scenarios across diverse device configurations. Results show an average success rate of 87%, a 2.4x latency improvement over cloud baselines, and strong privacy preservation through zero-knowledge proofs. User studies show 27% higher trust scores when decision trails are available. These findings establish a foundation for trustworthy autonomous agents in privacy-sensitive financial domains.

#7 Emoji-Based Jailbreaking of Large Language Models

著者: M P V S Gopinadh, S Mahaboob Hussain

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00936

要約:
Large Language Models (LLMs) are integral to modern AI applications, but their safety alignment mechanisms can be bypassed through adversarial prompt engineering. This study investigates emoji-based jailbreaking, where emoji sequences are embedded in textual prompts to trigger harmful and unethical outputs from LLMs. We evaluated 50 emoji-based prompts on four open-source LLMs: Mistral 7B, Qwen 2 7B, Gemma 2 9B, and Llama 3 8B. Metrics included jailbreak success rate, safety alignment adherence, and latency, with responses categorized as successful, partial and failed. Results revealed model-specific vulnerabilities: Gemma 2 9B and Mistral 7B exhibited 10 % success rates, while Qwen 2 7B achieved full alignment (0% success). A chi-square test (chi^2 = 32.94, p < 0.001) confirmed significant inter-model differences. While prior works focused on emoji attacks targeting safety judges or classifiers, our empirical analysis examines direct prompt-level vulnerabilities in LLMs. The results reveal limitations in safety mechanisms and highlight the necessity for systematic handling of emoji-based representations in prompt-level safety and alignment pipelines.

#8 CuFuzz: Hardening CUDA Programs through Transformation and Fuzzing

著者: Saurabh Singh, Ruobing Han, Jaewon Lee, Seonjin Na, Yonghae Kim, Taesoo Kim, Hyesoon Kim

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01048

要約:
GPUs have gained significant popularity over the past decade, extending beyond their original role in graphics rendering. This evolution has brought GPU security and reliability to the forefront of concerns. Prior research has shown that CUDA's lack of memory safety can lead to serious vulnerabilities. While fuzzing is effective for finding such bugs on CPUs, equivalent tools for GPUs are lacking due to architectural differences and lack of built-in error detection. In this paper, we propose CuFuzz, a novel compiler-runtime co-design solution to extend state-of-the-art CPU fuzzing tools to GPU programs. CuFuzz transforms GPU programs into CPU programs using compiler IR-level transformations to enable effective fuzz testing. To the best of our knowledge, CuFuzz is the first mechanism to bring fuzzing support to CUDA, addressing a critical gap in GPU security research. By leveraging CPU memory error detectors such as Address Sanitizer, CuFuzz aims to uncover memory safety bugs and related correctness vulnerabilities in CUDA code, enhancing the security and reliability of GPU-accelerated applications. To ensure high fuzzing throughput, we introduce two compiler-runtime co-optimizations tailored for GPU code: Partial Representative Execution (PREX) and Access-Index Preserving Pruning (AXIPrune), achieving average throughput improvements of 32x with PREX and an additional 33% gain with AXIPrune on top of PREX-optimized code. Together, these optimizations can yield up to a 224.31x speedup. In our fuzzing campaigns, CuFuzz uncovered 122 security vulnerabilities in widely used benchmarks.

#9 Byzantine-Robust Federated Learning Framework with Post-Quantum Secure Aggregation for Real-Time Threat Intelligence Sharing in Critical IoT Infrastructure

著者: Milad Rahmati, Nima Rahmati

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01053

要約:
The proliferation of Internet of Things devices in critical infrastructure has created unprecedented cybersecurity challenges, necessitating collaborative threat detection mechanisms that preserve data privacy while maintaining robustness against sophisticated attacks. Traditional federated learning approaches for IoT security suffer from two critical vulnerabilities: susceptibility to Byzantine attacks where malicious participants poison model updates, and inadequacy against future quantum computing threats that can compromise cryptographic aggregation protocols. This paper presents a novel Byzantine-robust federated learning framework integrated with post-quantum secure aggregation specifically designed for real-time threat intelligence sharing across critical IoT infrastructure. The proposed framework combines a adaptive weighted aggregation mechanism with lattice-based cryptographic protocols to simultaneously defend against model poisoning attacks and quantum adversaries. We introduce a reputation-based client selection algorithm that dynamically identifies and excludes Byzantine participants while maintaining differential privacy guarantees. The secure aggregation protocol employs CRYSTALS-Kyber for key encapsulation and homomorphic encryption to ensure confidentiality during parameter updates. Experimental evaluation on industrial IoT intrusion detection datasets demonstrates that our framework achieves 96.8% threat detection accuracy while successfully mitigating up to 40% Byzantine attackers, with only 18% computational overhead compared to non-secure federated approaches. The framework maintains sub-second aggregation latency suitable for real-time applications and provides 256-bit post-quantum security level.

#10 Out-of-Band Power Side-Channel Detection for Semiconductor Supply Chain Integrity at Scale

著者: Rajiv Thummala, Katherine Winton, Luke Flores, Elizabeth Redmond, Gregory Falco

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01054

要約:
Out-of-band screening of microcontrollers is a major gap in semiconductor supply chain security. High-assurance techniques such as X-ray and destructive reverse engineering are accurate but slow and expensive, hindering comprehensive detection for hardware Trojans or firmware tampering. Consequently, there has been increased interest in applying machine learning techniques to automate forensic examination, enabling rapid, large-scale inspection of components without manual oversight. We introduce a non-destructive screening method that uses power side-channel measurements and generative modeling to detect tampering in commodity microcontrollers without trusted hardware. As a proof-of-concept, differential power analysis (DPA) traces are collected from the ChipWhisperer and a generative adversarial network (GAN) is trained only on benign measurements to learn nominal power behavior. The trained discriminator then serves as a one-class anomaly detector. We report detection performance on multiple tampering scenarios and discuss how this technique can serve as an intermediate screening tier between basic functional tests and high-cost forensic analysis. The proposed method is evaluated in the context of semiconductor supply chain practice and policy to assess its suitability as an intermediate assurance mechanism.

#11 Post-Quantum Cryptography for Intelligent Transportation Systems: An Implementation-Focused Review

著者: Abdullah Al Mamun, Akid Abrar, Mizanur Rahman, M Sabbir Salek, Mashrur Chowdhury

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01068

要約:
As quantum computing advances, the cryptographic algorithms that underpin confidentiality, integrity, and authentication in Intelligent Transportation Systems (ITS) face increasing vulnerability to quantum-enabled attacks. To address these risks, governments and industry stakeholders are turning toward post-quantum cryptography (PQC), a class of algorithms designed to resist adversaries equipped with quantum computing capabilities. However, existing studies provide limited insight into the implementation-focused aspects of PQC in the ITS domain. This review fills that gap by evaluating the readiness of vehicular communication and security standards for PQC adoption. It examines in-vehicle networks and vehicle-to-everything (V2X) interfaces, while also investigating vulnerabilities at the physical layer, primarily exposure to side-channel and fault injection attacks. The review identifies thirteen research gaps reflecting non-PQC-ready standards, constraints in embedded implementation and hybrid cryptography, interoperability and certificate-management barriers, lack of real-world PQC deployment data in ITS, and physical-attack vulnerabilities in PQC-enabled vehicular communication. Future research directions include updating vehicular communication and security standards, optimizing PQC for low-power devices, enhancing interoperability and certificate-management frameworks for PQC integration, conducting real-world evaluations of PQC-enabled communication and control functions across ITS deployments, and strengthening defenses against AI-assisted physical attacks. A phased roadmap is presented, aligning PQC deployment with regulatory, performance, and safety requirements, thereby guiding the secure evolution of ITS in the quantum computing era.

#12 NADD: Amplifying Noise for Effective Diffusion-based Adversarial Purification

diffusion

著者: David D. Nguyen, The-Anh Ta, Yansong Gao, Alsharif Abuadbba

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01109

要約:
The strategy of combining diffusion-based generative models with classifiers continues to demonstrate state-of-the-art performance on adversarial robustness benchmarks. Known as adversarial purification, this exploits a diffusion model's capability of identifying high density regions in data distributions to purify adversarial perturbations from inputs. However, existing diffusion-based purification defenses are impractically slow and limited in robustness due to the low levels of noise used in the diffusion process. This low noise design aims to preserve the semantic features of the original input, thereby minimizing utility loss for benign inputs. Our findings indicate that systematic amplification of noise throughout the diffusion process improves the robustness of adversarial purification. However, this approach presents a key challenge, as noise levels cannot be arbitrarily increased without risking distortion of the input. To address this key problem, we introduce high levels of noise during the forward process and propose the ring proximity correction to gradually eliminate adversarial perturbations whilst closely preserving the original data sample. As a second contribution, we propose a new stochastic sampling method which introduces additional noise during the reverse diffusion process to dilute adversarial perturbations. Without relying on gradient obfuscation, these contributions result in a new robustness accuracy record of 44.23% on ImageNet using AutoAttack ($\ell_{\infty}=4/255$), an improvement of +2.07% over the previous best work. Furthermore, our method reduces inference time to 1.08 seconds per sample on ImageNet, a $47\times$ improvement over the existing state-of-the-art approach, making it far more practical for real-world defensive scenarios.

#13 AI-Powered Hybrid Intrusion Detection Framework for Cloud Security Using Novel Metaheuristic Optimization

著者: Maryam Mahdi Alhusseini, Alireza Rouhi, Mohammad-Reza Feizi-Derakhshi

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01134

要約:
Cybersecurity poses considerable problems to Cloud Computing (CC), especially regarding Intrusion Detection Systems (IDSs), facing difficulties with skewed datasets and suboptimal classification model performance. This study presents the Hybrid Intrusion Detection System (HyIDS), an innovative IDS that employs the Energy Valley Optimizer (EVO) for Feature Selection (FS). Additionally, it introduces a novel technique for enhancing the cybersecurity of cloud computing through the integration of machine learning methodologies with the EVO Algorithm. The Energy Valley Optimizer (EVO) effectively diminished features in the CIC-DDoS2019 dataset from 88 to 38 and in the CSE-CIC-IDS2018 data from 80 to 43, significantly enhancing computing efficiency. HyIDS incorporates four Machine Learning (ML) models: Support Vector Machine (SVM), Random Forest (RF), Decision Tree (D_Tree), and K-Nearest Neighbors (KNN). The proposed HyIDS was assessed utilizing two real-world intrusion datasets, CIC-DDoS2019 and CSE-CIC-IDS2018, both distinguished by considerable class imbalances. The CIC-DDoS2019 dataset has a significant imbalance between DDoS assault samples and legal traffic, while the CSE-CIC-IDS2018 dataset primarily comprises benign traffic with insufficient representation of attack types, complicating the detection of minority attacks. A downsampling technique was employed to balance the datasets, hence improving detection efficacy for both benign and malicious traffic. Twenty-four trials were done, revealing substantial enhancements in categorization accuracy, precision, and recall. Our suggested D_TreeEVO model attained an accuracy rate of 99.13% and an F1 score of 98.94% on the CIC-DDoS2019 dataset, and an accuracy rate of 99.78% and an F1 score of 99.70% on the CSE-CIC-IDS2018 data. These data demonstrate that EVO significantly improves cybersecurity in Cloud Computing (CC).

#14 Comparative Evaluation of VAE, GAN, and SMOTE for Tor Detection in Encrypted Network Traffic

著者: Saravanan A, Aswani Kumar Cherukuri

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01183

要約:
Encrypted network traffic poses significant challenges for intrusion detection due to the lack of payload visibility, limited labeled datasets, and high class imbalance between benign and malicious activities. Traditional data augmentation methods struggle to preserve the complex temporal and statistical characteristics of real network traffic. To address these issues, this work explores the use of Generative AI (GAI) models to synthesize realistic and diverse encrypted traffic traces. We evaluate three approaches: Variational Autoencoders (VAE), Generative Adversarial Networks (GAN), and SMOTE (Synthetic Minority Over-sampling Technique), each integrated with a preprocessing pipeline that includes feature selection and class balancing. The UNSW NB-15 dataset is used as the primary benchmark, focusing on Tor traffic as anomalies. We analyze statistical similarity between real and synthetic data, and assess classifier performance using metrics such as Accuracy, F1-score, and AUC-ROC. Results show that VAE-generated data provides the best balance between privacy and performance, while GANs offer higher fidelity but risk overfitting. SMOTE, though simple, enhances recall but may lack diversity. The findings demonstrate that GAI methods can significantly improve encrypted traffic detection when trained with privacy-preserving synthetic data.

#15 SecureCodeRL: Security-Aware Reinforcement Learning for Code Generation with Partial-Credit Rewards

著者: Suryansh Singh Sijwali, Suman Saha

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01184

要約:
Large Language Models (LLMs) can generate plausible code, but in settings that require exact stdin/stdout behavior they frequently produce programs that compile yet fail tests, and in some cases they introduce security-sensitive patterns. This paper presents SecureCodeRL, a reinforcement learning (RL) pipeline for security-aware code generation that optimizes a combined reward R = {\alpha}Rfunc + \b{eta}Rsec. The key idea is a partial-credit functional reward that assigns intermediate scores for syntactic validity, successful execution, and producing output, reducing reward sparsity that otherwise stalls learning on competitive programming style tasks. I evaluate supervised fine-tuning (SFT) and PPO variants on a small held-out prompt set from APPS+ and observe that PPO with partial credit (using a continued-training variant) improves syntax validity from 45% (SFT) to 60% and achieves the only non-zero test success signal in this pilot evaluation (5% at-least-one-test-pass), while remaining 100% clean under Bandit static analysis. Although Bandit findings were absent in this small evaluation, the security term is integrated into training to discourage insecure shortcuts when they appear.

#16 Arca: A Lightweight Confidential Container Architecture for Cloud-Native Environments

著者: Di Lu, Mengna Sun, Qingwen Zhang, Yujia Liu, Jia Zhang, Xuewen Dong, Yulong Shen, Jianfeng Ma

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01214

要約:
Confidential containers protect cloud-native workloads using trusted execution environments (TEEs). However, existing Container-in-TEE designs (e.g., Confidential Containers (CoCo)) encapsulate the entire runtime within the TEE, inflating the trusted computing base (TCB) and introducing redundant components and cross-layer overhead. We present Arca, a lightweight confidential container framework based on a TEE-in-Container architecture that isolates each workload in an independent, hardware-enforced trust domain while keeping orchestration logic outside the TEE. This design minimizes inter-layer dependencies, confines compromise to per-container boundaries, and restores the TEE's minimal trust principle. We implemented Arca on Intel SGX, Intel TDX, and AMD SEV. Experimental results show that Arca achieves near-native performance and outperforms CoCo in most benchmarks, while the reduced TCB significantly improves verifiability and resilience against host-level compromise. Arca emonstrates that efficient container management and strong runtime confidentiality can be achieved without sacrificing security assurance.

#17 MCP-SandboxScan: WASM-based Secure Execution and Runtime Analysis for MCP Tools

著者: Zhuoran Tan, Run Hao, Jeremy Singer, Yutian Tang, Christos Anagnostopoulos

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01241

要約:
Tool-augmented LLM agents raise new security risks: tool executions can introduce runtime-only behaviors, including prompt injection and unintended exposure of external inputs (e.g., environment secrets or local files). While existing scanners often focus on static artifacts, analyzing runtime behavior is challenging because directly executing untrusted tools can itself be dangerous. We present MCP-SandboxScan, a lightweight framework motivated by the Model Context Protocol (MCP) that safely executes untrusted tools inside a WebAssembly/WASI sandbox and produces auditable reports of external-to-sink exposures. Our prototype (i) extracts LLM-relevant sinks from runtime outputs (prompt/messages and structured tool-return fields), (ii) instantiates external-input candidates from environment values, mounted file contents, and output-surfaced HTTP fetch intents, and (iii) links sources to sinks via snippet-based substring matching. Case studies on three representative tools show that MCP-SandboxScan can surface provenance evidence when external inputs appear in prompt/messages or tool-return payloads, and can expose filesystem capability violations as runtime evidence. We further compare against a lightweight static string-signature baseline and use a micro-benchmark to characterize false negatives under transformations and false positives from short-token collisions.

#18 Compliance as a Trust Metric

著者: Wenbo Wu, George Konstantinidis

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01287

要約:
Trust and Reputation Management Systems (TRMSs) are critical for the modern web, yet their reliance on subjective user ratings or narrow Quality of Service (QoS) metrics lacks objective grounding. Concurrently, while regulatory frameworks like GDPR and HIPAA provide objective behavioral standards, automated compliance auditing has been limited to coarse, binary (pass/fail) outcomes. This paper bridges this research gap by operationalizing regulatory compliance as a quantitative and dynamic trust metric through our novel automated compliance engine (ACE). ACE first formalizes legal and organizational policies into a verifiable, obligation-centric logic. It then continuously audits system event logs against this logic to detect violations. The core of our contribution is a quantitative model that assesses the severity of each violation along multiple dimensions, including its Volume, Duration, Breadth, and Criticality, to compute a fine-grained, evolving compliance score. We evaluate ACE on a synthetic hospital dataset, demonstrating its ability to accurately detect a range of complex HIPAA and GDPR violations and produce a nuanced score that is significantly more expressive than traditional binary approaches. This work enables the development of more transparent, accountable, and resilient TRMSs on the Web.

#19 dataRLsec: Safety, Security, and Reliability With Robust Offline Reinforcement Learning for DPAs

著者: Shriram KS Pandian, Naresh Kshetri

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01289

要約:
Data poisoning attacks (DPAs) are becoming popular as artificial intelligence (AI) algorithms, machine learning (ML) algorithms, and deep learning (DL) algorithms in this artificial intelligence (AI) era. Hackers and penetration testers are excessively injecting malicious contents in the training data (and in testing data too) that leads to false results that are very hard to inspect and predict. We have analyzed several recent technologies used (from deep reinforcement learning to federated learning) for the DPAs and their safety, security, & countermeasures. The problem setup along with the problem estimation is shown in the MuJoCo environment with performance of HalfCheetah before the dataset is poisoned and after the dataset is poisoned. We have analyzed several risks associated with the DPAs and falsification in medical data from popular poisoning data attacks to some popular data defenses. We have proposed robust offline reinforcement learning (Offline RL) for the safety and reliability with weighted hash verification along with density-ratio weighted behavioral cloning (DWBC) algorithm. The four stages of the proposed algorithm (as the Stage 0, the Stage 1, the Stage 2, and the Stage 3) are described with respect to offline RL, safety, and security for DPAs. The conclusion and future scope are provided with the intent to combine DWBC with other data defense strategies to counter and protect future contamination cyberattacks.

#20 Aggressive Compression Enables LLM Weight Theft

著者: Davis Brown, Juan-Pablo Rivera, Dan Hendrycks, Mantas Mazeika

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01296

要約:
As frontier AIs become more powerful and costly to develop, adversaries have increasing incentives to steal model weights by mounting exfiltration attacks. In this work, we consider exfiltration attacks where an adversary attempts to sneak model weights out of a datacenter over a network. While exfiltration attacks are multi-step cyber attacks, we demonstrate that a single factor, the compressibility of model weights, significantly heightens exfiltration risk for large language models (LLMs). We tailor compression specifically for exfiltration by relaxing decompression constraints and demonstrate that attackers could achieve 16x to 100x compression with minimal trade-offs, reducing the time it would take for an attacker to illicitly transmit model weights from the defender's server from months to days. Finally, we study defenses designed to reduce exfiltration risk in three distinct ways: making models harder to compress, making them harder to 'find,' and tracking provenance for post-attack analysis using forensic watermarks. While all defenses are promising, the forensic watermark defense is both effective and cheap, and therefore is a particularly attractive lever for mitigating weight-exfiltration risk.

#21 Automated SBOM-Driven Vulnerability Triage for IoT Firmware: A Lightweight Pipeline for Risk Prioritization

著者: Abdurrahman Tolay

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01308

要約:
The proliferation of Internet of Things (IoT) devices has introduced significant security challenges, primarily due to the opacity of firmware components and the complexity of supply chain dependencies. IoT firmware frequently relies on outdated, third-party libraries embedded within monolithic binary blobs, making vulnerability management difficult. While Software Bill of Materials (SBOM) standards have matured, generating actionable intelligence from raw firmware dumps remains a manual and error-prone process. This paper presents a lightweight, automated pipeline designed to extract file systems from Linux-based IoT firmware, generate a comprehensive SBOM, map identified components to known vulnerabilities, and apply a multi-factor triage scoring model. The proposed system focuses on risk prioritization by integrating signals from the Common Vulnerability Scoring System (CVSS), Exploit Prediction Scoring System (EPSS), and the CISA Known Exploited Vulnerabilities (KEV) catalog. Unlike conventional scanners that produce high volumes of uncontextualized alerts, this approach emphasizes triage by calculating a localized risk score for each finding. We describe the architecture, the normalization challenges of embedded Linux, and a scoring methodology intended to reduce alert fatigue. The study outlines a planned evaluation strategy to validate the extraction success rate and triage efficacy using a dataset of public vendor firmware, offering a reproducibility framework for future research in firmware security.

#22 Bithoven: Formal Safety for Expressive Bitcoin Smart Contracts

著者: Hyunhum Cho, Ik Rae Jeong

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01436

要約:
The rigorous security model of Bitcoin's UTXO architecture often comes at the cost of developer usability, forcing a reliance on manual stack manipulation that leads to critical financial vulnerabilities like signature malleability, unspendable states and unconstrained execution paths. Industry standards such as Miniscript provide necessary abstractions for policy verification but do not model the full imperative logic required for complex contracts, leaving gaps in state management and resource liveness. This paper introduces Bithoven, a high-level language designed to bridge the gap between expressiveness and formal safety. By integrating a strict type checker and a resource liveness analyzer with a semantic control-flow analyzer, Bithoven eliminates major categories of consensus and logic defects defined in our fault model prior to deployment. Our results indicate that this safety comes at modest cost: Bithoven compiles to Bitcoin Script with efficiency comparable to hand-optimized code, demonstrating that type-safe, developer-friendly abstractions are viable even within the strict byte-size constraints of the Bitcoin blockchain.

#23 Security in the Era of Perceptive Networks: A Comprehensive Taxonomic Framework for Integrated Sensing and Communication Security

著者: Chandra Thapa, Surya Nepal

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01455

要約:
Integrated Sensing and Communication (ISAC) represents a significant shift in the 6G landscape, where wireless networks both sense the environment and communicate. While prior comprehensive surveys have established foundational elements of ISAC security, discussed perception-focused security models, and proposed layered defense strategies, this paper synthesizes these studies into a comprehensive taxonomic framework that covers the whole ISAC security domain. This paper provides a systematic and thorough review of ISAC security across multiple orthogonal dimensions. These include threat taxonomy and propagation methods; vulnerability analysis at design, physical, computational, and architectural levels; defense mechanisms categorized by deployment layer; security-performance trade-offs with theoretical bounds; sector-specific security demands for critical infrastructure; and emerging issues such as quantum resilience, AI-hardening, and privacy preservation. Unlike previous frameworks that primarily focus on vision, this review combines these dimensions, introduces new classification schemes that reveal hidden relationships between threats and defenses, and identifies key research gaps through structured analysis. This detailed taxonomy offers a valuable reference for researchers developing secure ISAC systems and policymakers establishing security standards.

#24 OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

著者: Xin Wang, Yunhao Chen, Juncheng Li, Yixu Wang, Yang Yao, Tianle Gu, Jie Li, Yan Teng, Xingjun Ma, Yingchun Wang, Xia Hu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01592

要約:
The rapid integration of Multimodal Large Language Models (MLLMs) into critical applications is increasingly hindered by persistent safety vulnerabilities. However, existing red-teaming benchmarks are often fragmented, limited to single-turn text interactions, and lack the scalability required for systematic evaluation. To address this, we introduce OpenRT, a unified, modular, and high-throughput red-teaming framework designed for comprehensive MLLM safety evaluation. At its core, OpenRT architects a paradigm shift in automated red-teaming by introducing an adversarial kernel that enables modular separation across five critical dimensions: model integration, dataset management, attack strategies, judging methods, and evaluation metrics. By standardizing attack interfaces, it decouples adversarial logic from a high-throughput asynchronous runtime, enabling systematic scaling across diverse models. Our framework integrates 37 diverse attack methodologies, spanning white-box gradients, multi-modal perturbations, and sophisticated multi-agent evolutionary strategies. Through an extensive empirical study on 20 advanced models (including GPT-5.2, Claude 4.5, and Gemini 3 Pro), we expose critical safety gaps: even frontier models fail to generalize across attack paradigms, with leading models exhibiting average Attack Success Rates as high as 49.14%. Notably, our findings reveal that reasoning models do not inherently possess superior robustness against complex, multi-turn jailbreaks. By open-sourcing OpenRT, we provide a sustainable, extensible, and continuously maintained infrastructure that accelerates the development and standardization of AI safety.

#25 Exposing Hidden Interfaces: LLM-Guided Type Inference for Reverse Engineering macOS Private Frameworks

privacy

著者: Arina Kharlamova, Youcheng Sun, Ting Yu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01673

要約:
Private macOS frameworks underpin critical services and daemons but remain undocumented and distributed only as stripped binaries, complicating security analysis. We present MOTIF, an agentic framework that integrates tool-augmented analysis with a finetuned large language model specialized for Objective-C type inference. The agent manages runtime metadata extraction, binary inspection, and constraint checking, while the model generates candidate method signatures that are validated and refined into compilable headers. On MOTIF-Bench, a benchmark built from public frameworks with groundtruth headers, MOTIF improves signature recovery from 15% to 86% compared to baseline static analysis tooling, with consistent gains in tool-use correctness and inference stability. Case studies on private frameworks show that reconstructed headers compile, link, and facilitate downstream security research and vulnerability studies. By transforming opaque binaries into analyzable interfaces, MOTIF establishes a scalable foundation for systematic auditing of macOS internals.

#26 Structural Representations for Cross-Attack Generalization in AI Agent Threat Detection

agent

著者: Vignesh Iyer

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01723

要約:
Autonomous AI agents executing multi-step tool sequences face semantic attacks that manifest in behavioral traces rather than isolated prompts. A critical challenge is cross-attack generalization: can detectors trained on known attack families recognize novel, unseen attack types? We discover that standard conversational tokenization -- capturing linguistic patterns from agent interactions -- fails catastrophically on structural attacks like tool hijacking (AUC 0.39) and data exfiltration (AUC 0.46), while succeeding on linguistic attacks like social engineering (AUC 0.78). We introduce structural tokenization, encoding execution-flow patterns (tool calls, arguments, observations) rather than conversational content. This simple representational change dramatically improves cross-attack generalization: +46 AUC points on tool hijacking, +39 points on data exfiltration, and +71 points on unknown attacks, while simultaneously improving in-distribution performance (+6 points). For attacks requiring linguistic features, we propose gated multi-view fusion that adaptively combines both representations, achieving AUC 0.89 on social engineering without sacrificing structural attack detection. Our findings reveal that AI agent security is fundamentally a structural problem: attack semantics reside in execution patterns, not surface language. While our rule-based tokenizer serves as a baseline, the structural abstraction principle generalizes even with simple implementation.

#27 Local Layer-wise Differential Privacy in Federated Learning

privacy

著者: Yunbo Li, Jiaping Gui, Fanchao Meng, Yue Wu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01737

要約:
Federated Learning (FL) enables collaborative model training without direct data sharing, yet it remains vulnerable to privacy attacks such as model inversion and membership inference. Existing differential privacy (DP) solutions for FL often inject noise uniformly across the entire model, degrading utility while providing suboptimal privacy-utility tradeoffs. To address this, we propose LaDP, a novel layer-wise adaptive noise injection mechanism for FL that optimizes privacy protection while preserving model accuracy. LaDP leverages two key insights: (1) neural network layers contribute unevenly to model utility, and (2) layer-wise privacy leakage can be quantified via KL divergence between local and global model distributions. LaDP dynamically injects noise into selected layers based on their privacy sensitivity and importance to model performance. We provide a rigorous theoretical analysis, proving that LaDP satisfies $(\epsilon, \delta)$-DP guarantees and converges under bounded noise. Extensive experiments on CIFAR-10/100 datasets demonstrate that LaDP reduces noise injection by 46.14% on average compared to state-of-the-art (SOTA) methods while improving accuracy by 102.99%. Under the same privacy budget, LaDP outperforms SOTA solutions like Dynamic Privacy Allocation LDP and AdapLDP by 25.18% and 6.1% in accuracy, respectively. Additionally, LaDP robustly defends against reconstruction attacks, increasing the FID of the reconstructed private data by $>$12.84% compared to all baselines. Our work advances the practical deployment of privacy-preserving FL with minimal utility loss.

#28 Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization

著者: Jiwei Guan, Haibo Jin, Haohan Wang

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01747

要約:
Recent advancements in Large Vision-Language Models (LVLMs) have shown groundbreaking capabilities across diverse multimodal tasks. However, these models remain vulnerable to adversarial jailbreak attacks, where adversaries craft subtle perturbations to bypass safety mechanisms and trigger harmful outputs. Existing white-box attacks methods require full model accessibility, suffer from computing costs and exhibit insufficient adversarial transferability, making them impractical for real-world, black-box settings. To address these limitations, we propose a black-box jailbreak attack on LVLMs via Zeroth-Order optimization using Simultaneous Perturbation Stochastic Approximation (ZO-SPSA). ZO-SPSA provides three key advantages: (i) gradient-free approximation by input-output interactions without requiring model knowledge, (ii) model-agnostic optimization without the surrogate model and (iii) lower resource requirements with reduced GPU memory consumption. We evaluate ZO-SPSA on three LVLMs, including InstructBLIP, LLaVA and MiniGPT-4, achieving the highest jailbreak success rate of 83.0% on InstructBLIP, while maintaining imperceptible perturbations comparable to white-box methods. Moreover, adversarial examples generated from MiniGPT-4 exhibit strong transferability to other LVLMs, with ASR reaching 64.18%. These findings underscore the real-world feasibility of black-box jailbreaks and expose critical weaknesses in the safety mechanisms of current LVLMs

#29 Quantum AI for Cybersecurity: A hybrid Quantum-Classical models for attack path analysis

著者: Jessica A. Sciammarelli, Waqas Ahmed

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02237

要約:
Modern cyberattacks are increasingly complex, posing significant challenges to classical machine learning methods, particularly when labeled data is limited and feature interactions are highly non-linear. In this study we investigates the potential of hybrid quantum-classical learning to enhance feature representations for intrusion detection and explore possible quantum advantages in cybersecurity analytics. Using the UNSW-NB15 dataset, network traffic is transformed into structured feature vectors through classical preprocessing and normalization. Classical models, including Logistic Regression and Support Vector Machines with linear and RBF kernels, are evaluated on the full dataset to establish baseline performance under large-sample conditions. Simultaneously, a quantum-enhanced pipeline maps classical features into variational quantum circuits via angle encoding and entangling layers, executed on a CPU-based quantum simulator, with resulting quantum embeddings classified using a classical SVM. Experiments show that while classical models achieve higher overall accuracy with large datasets, quantum-enhanced representations demonstrate superior attack recall and improved class separability when data is scarce, suggesting that quantum feature spaces capture complex correlations inaccessible to shallow classical models. These results highlight the potential of quantum embeddings to improve generalization and representation quality in cybersecurity tasks and provide a reproducible framework for evaluating quantum advantages as quantum hardware and simulators continue to advance.

#30 MOZAIK: A Privacy-Preserving Analytics Platform for IoT Data Using MPC and FHE

privacy

著者: Michiel Van Kenhove, Erik Pohle, Leonard Schild, Martin Zbudila, Merlijn Sebrechts, Filip De Turck, Bruno Volckaert, Aysajan Abidin

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02245

要約:
The rapid increase of Internet of Things (IoT) systems across several domains has led to the generation of vast volumes of sensitive data, presenting significant challenges in terms of storage and data analytics. Cloud-assisted IoT solutions offer storage, scalability, and computational resources, but introduce new security and privacy risks that conventional trust-based approaches fail to adequately mitigate. To address these challenges, this paper presents MOZAIK, a novel end-to-end privacy-preserving confidential data storage and distributed processing architecture tailored for IoT-to-cloud scenarios. MOZAIK ensures that data remains encrypted throughout its lifecycle, including during transmission, storage, and processing. This is achieved by employing a cryptographic privacy-enhancing technology known as computing on encrypted data (COED). Two distinct COED techniques are explored, specifically secure multi-party computation (MPC) and fully homomorphic encryption (FHE). The paper includes a comprehensive analysis of the MOZAIK architecture, including a proof-of-concept implementation and performance evaluations. The evaluation results demonstrate the feasibility of the MOZAIK system and indicate the cost of an end-to-end privacy-preserving system compared to regular plaintext alternatives. All components of the MOZAIK platform are released as open-source software alongside this publication, with the aim of advancing secure and privacy-preserving data processing practices.

#31 Vouchsafe: A Zero-Infrastructure Capability Graph Model for Offline Identity and Trust

著者: Jay Kuri

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02254

要約:
Modern identity and trust systems collapse in the environments where they are needed most: disaster zones, disconnected or damaged networks, and adversarial conditions such as censorship or infrastructure interference. These systems depend on functioning networks to reach online authorities, resolvers, directories, and revocation services, leaving trust unverifiable whenever communication is unavailable or untrusted. This work demonstrates that secure identity and trust are possible without such infrastructure. We introduce the Zero-Infrastructure Capability Graph (ZI-CG), a model showing that identity, delegation, and revocation can be represented as self-contained, signed statements whose validity is determined entirely by local, deterministic evaluation. We further present Vouchsafe, a complete working instantiation of this model built using widely deployed primitives including Ed25519, SHA-256, and structured JSON Web Tokens, requiring no new cryptography or online services. The results show that a practical, offline-verifiable trust substrate can be constructed today using only the cryptographic data presented at evaluation time.

#32 Improved Accuracy for Private Continual Cardinality Estimation in Fully Dynamic Streams via Matrix Factorization

privacy

著者: Joel Daniel Andersson, Palak Jain, Satchit Sivakumar

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02257

要約:
We study differentially-private statistics in the fully dynamic continual observation model, where many updates can arrive at each time step and updates to a stream can involve both insertions and deletions of an item. Earlier work (e.g., Jain et al., NeurIPS 2023 for counting distinct elements; Raskhodnikova & Steiner, PODS 2025 for triangle counting with edge updates) reduced the respective cardinality estimation problem to continual counting on the difference stream associated with the true function values on the input stream. In such reductions, a change in the original stream can cause many changes in the difference stream, this poses a challenge for applying private continual counting algorithms to obtain optimal error bounds. We improve the accuracy of several such reductions by studying the associated $\ell_p$-sensitivity vectors of the resulting difference streams and isolating their properties. We demonstrate that our framework gives improved bounds for counting distinct elements, estimating degree histograms, and estimating triangle counts (under a slightly relaxed privacy model), thus offering a general approach to private continual cardinality estimation in streaming settings. Our improved accuracy stems from tight analysis of known factorization mechanisms for the counting matrix in this setting; the key technical challenge is arguing that one can use state-of-the-art factorizations for sensitivity vector sets with the properties we isolate. Empirically and analytically, we demonstrate that our improved error bounds offer a substantial improvement in accuracy for cardinality estimation problems over a large range of parameters.

#33 MathLedger: A Verifiable Learning Substrate with Ledger-Attested Feedback

著者: Ismail Ahmad Abdullah

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00816

要約:
Contemporary AI systems achieve extraordinary performance yet remain opaque and non-verifiable, creating a crisis of trust for safety-critical deployment. We introduce MathLedger, a substrate for verifiable machine cognition that integrates formal verification, cryptographic attestation, and learning dynamics into a single epistemic loop. The system implements Reflexive Formal Learning (RFL), a symbolic analogue of gradient descent where updates are driven by verifier outcomes rather than statistical loss. Phase I experiments validate the measurement and governance substrate under controlled conditions. CAL-EXP-3 validates measurement infrastructure (Delta p computation, variance tracking); separate stress tests confirm fail-closed governance triggers correctly under out-of-bounds conditions. No convergence or capability claims are made. The contribution is infrastructural: a working prototype of ledger-attested learning that enables auditability at scale. Keywords: verifiable learning, formal verification, cryptographic attestation, reflexive feedback, fail-closed governance

#34 Temporal Attack Pattern Detection in Multi-Agent AI Workflows: An Open Framework for Training Trace-Based Security Models

agent

著者: Ron F. Del Rosario

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00848

要約:
We present an openly documented methodology for fine-tuning language models to detect temporal attack patterns in multi-agent AI workflows using OpenTelemetry trace analysis. We curate a dataset of 80,851 examples from 18 public cybersecurity sources and 35,026 synthetic OpenTelemetry traces. We apply iterative QLoRA fine-tuning on resource-constrained ARM64 hardware (NVIDIA DGX Spark) through three training iterations with strategic augmentation. Our custom benchmark accuracy improves from 42.86% to 74.29%, a statistically significant 31.4-point gain. Targeted examples addressing specific knowledge gaps outperform indiscriminate scaling. Key contributions include: (1) synthetic trace generation methodology for multi-agent coordination attacks and regulatory violations, (2) empirical evidence that training data composition fundamentally determines behavior, and (3) complete open release of datasets, training scripts, and evaluation benchmarks on HuggingFace. While practical deployment requires human oversight due to false positive rates, this work establishes the first reproducible framework enabling practitioners to build custom agentic security models adapted to their threat landscapes.

#35 The Quantum State Continuity Problem and Temporal Enforcement Against Fork Attacks

著者: Samet \"Unsal

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00870

要約:
We introduce the Quantum State Continuity Problem (QSCP), a security objective orthogonal to identity authentication that captures whether a systems current execution is a legitimate continuation of a unique past execution. We show that classical and stateless quantum authentication mechanisms fail to enforce continuity and remain vulnerable to fork attacks. To address this gap, we propose the Quantum State Continuity Witness (QSCW), a minimal quantum-assisted primitive that enforces temporal linkage of execution through stateful quantum evolution and cumulative auditing. Using a GHZ-based toy instantiation and extensive simulation, we demonstrate that temporal enforcement suppresses fork attacks with exponential decay in success probability, while remaining robust to noise and system parameters. Our results highlight execution continuity as a distinct and underexplored dimension of system security.

#36 Quantum Machine Learning Approaches for Coordinated Stealth Attack Detection in Distributed Generation Systems

著者: Osasumwen Cedric Ogiesoba-Eguakun, Suman Rath

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00873

要約:
Coordinated stealth attacks are a serious cybersecurity threat to distributed generation systems because they modify control and measurement signals while remaining close to normal behavior, making them difficult to detect using standard intrusion detection methods. This study investigates quantum machine learning approaches for detecting coordinated stealth attacks on a distributed generation unit in a microgrid. High-quality simulated measurements were used to create a balanced binary classification dataset using three features: reactive power at DG1, frequency deviation relative to the nominal value, and terminal voltage magnitude. Classical machine learning baselines, fully quantum variational classifiers, and hybrid quantum classical models were evaluated. The results show that a hybrid quantum classical model combining quantum feature embeddings with a classical RBF support vector machine achieves the best overall performance on this low dimensional dataset, with a modest improvement in accuracy and F1 score over a strong classical SVM baseline. Fully quantum models perform worse due to training instability and limitations of current NISQ hardware. In contrast, hybrid models train more reliably and demonstrate that quantum feature mapping can enhance intrusion detection even when fully quantum learning is not yet practical.

#37 Impersonating Quantum Secrets over Classical Channels

著者: Luowen Qian, Mark Zhandry

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01058

要約:
We show that a simple eavesdropper listening in on classical communication between potentially entangled quantum parties will eventually be able to impersonate any of the parties. Furthermore, the attack is efficient if one-way puzzles do not exist. As a direct consequence, one-way puzzles are implied by reusable authentication schemes over classical channels with quantum pre-shared secrets that are potentially evolving. As an additional application, we show that any quantum money scheme that can be verified through only classical queries to any oracle cannot be information-theoretically secure. This significantly generalizes the prior work by Ananth, Hu, and Yuen (ASIACRYPT'23) where they showed the same but only for the specific case of random oracles. Therefore, verifying black-box constructions of quantum money inherently requires coherently evaluating the underlying cryptographic tools, which may be difficult for near-term quantum devices.

#38 IO-RAE: Information-Obfuscation Reversible Adversarial Example for Audio Privacy Protection

privacy

著者: Jiajie Zhu, Xia Du, Xiaoyuan Liu, Jizhe Zhou, Qizhen Xu, Zheng Lin, Chi-Man Pun

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01239

要約:
The rapid advancements in artificial intelligence have significantly accelerated the adoption of speech recognition technology, leading to its widespread integration across various applications. However, this surge in usage also highlights a critical issue: audio data is highly vulnerable to unauthorized exposure and analysis, posing significant privacy risks for businesses and individuals. This paper introduces an Information-Obfuscation Reversible Adversarial Example (IO-RAE) framework, the pioneering method designed to safeguard audio privacy using reversible adversarial examples. IO-RAE leverages large language models to generate misleading yet contextually coherent content, effectively preventing unauthorized eavesdropping by humans and Automatic Speech Recognition (ASR) systems. Additionally, we propose the Cumulative Signal Attack technique, which mitigates high-frequency noise and enhances attack efficacy by targeting low-frequency signals. Our approach ensures the protection of audio data without degrading its quality or our ability. Experimental evaluations demonstrate the superiority of our method, achieving a targeted misguidance rate of 96.5% and a remarkable 100% untargeted misguidance rate in obfuscating target keywords across multiple ASR models, including a commercial black-box system from Google. Furthermore, the quality of the recovered audio, measured by the Perceptual Evaluation of Speech Quality score, reached 4.45, comparable to high-quality original recordings. Notably, the recovered audio processed by ASR systems exhibited an error rate of 0%, indicating nearly lossless recovery. These results highlight the practical applicability and effectiveness of our IO-RAE framework in protecting sensitive audio privacy.

#39 Matrix Kloosterman Sums, Random Matrix Statistics, and Cryptography

著者: Tianshuo Yang

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01603

要約:
This paper presents a comprehensive study of matrix Kloosterman sums, including their computational aspects, distributional behavior, and applications in cryptographic analysis. Building on the work of [Zelingher, 2023], we develop algorithms for evaluating these sums via Green's polynomials and establish a general framework for analyzing their statistical distributions. We further investigate the associated $L$-functions and clarify their relationships with symmetric functions and random matrix theory. We show that, analogous to the eigenvalue statistics of random matrices in compact Lie groups such as $SU(n)$ and $Sp(2n)$, the normalized values of matrix Kloosterman sums exhibit Sato-Tate equidistribution. Finally, we apply this framework to distinguish truly random sequences from those exhibiting subtle algebraic biases, and we propose a novel spectral test for cryptographic security based on the distributional signatures of matrix Kloosterman sums.

#40 UnPII: Unlearning Personally Identifiable Information with Quantifiable Exposure Risk

privacy

著者: Intae Jeon, Yujeong Kwon, Hyungjoon Koo

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.01786

要約:
The ever-increasing adoption of Large Language Models in critical sectors like finance, healthcare, and government raises privacy concerns regarding the handling of sensitive Personally Identifiable Information (PII) during training. In response, regulations such as European Union's General Data Protection Regulation (GDPR) mandate the deletion of PII upon requests, underscoring the need for reliable and cost-effective data removal solutions. Machine unlearning has emerged as a promising direction for selectively forgetting data points. However, existing unlearning techniques typically apply a uniform forgetting strategy that neither accounts for the varying privacy risks posed by different PII attributes nor reflects associated business risks. In this work, we propose UnPII, the first PII-centric unlearning approach that prioritizes forgetting based on the risk of individual or combined PII attributes. To this end, we introduce the PII risk index (PRI), a composite metric that incorporates multiple dimensions of risk factors: identifiability, sensitivity, usability, linkability, permanency, exposability, and compliancy. The PRI enables a nuanced evaluation of privacy risks associated with PII exposures and can be tailored to align with organizational privacy policies. To support realistic assessment, we systematically construct a synthetic PII dataset (e.g., 1,700 PII instances) that simulates realistic exposure scenarios. UnPII seamlessly integrates with established unlearning algorithms, such as Gradient Ascent, Negative Preference Optimization, and Direct Preference Optimization, without modifying their underlying principles. Our experimental results demonstrate that UnPII achieves the improvements of accuracy up to 11.8%, utility up to 6.3%, and generalizability up to 12.4%, respectively, while incurring a modest fine-tuning overhead of 27.5% on average during unlearning.

#41 Why Commodity WiFi Sensors Fail at Multi-Person Gait Identification: A Systematic Analysis Using ESP32

著者: Oliver Custance, Saad Khan, Simon Parkinson

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02177

要約:
WiFi Channel State Information (CSI) has shown promise for single-person gait identification, with numerous studies reporting high accuracy. However, multi-person identification remains largely unexplored, with the limited existing work relying on complex, expensive setups requiring modified firmware. A critical question remains unanswered: is poor multi-person performance an algorithmic limitation or a fundamental hardware constraint? We systematically evaluate six diverse signal separation methods (FastICA, SOBI, PCA, NMF, Wavelet, Tensor Decomposition) across seven scenarios with 1-10 people using commodity ESP32 WiFi sensors--a simple, low-cost, off-the-shelf solution. Through novel diagnostic metrics (intra-subject variability, inter-subject distinguishability, performance degradation rate), we reveal that all methods achieve similarly low accuracy (45-56\%, $\sigma$=3.74\%) with statistically insignificant differences (p $>$ 0.05). Even the best-performing method, NMF, achieves only 56\% accuracy. Our analysis reveals high intra-subject variability, low inter-subject distinguishability, and severe performance degradation as person count increases, indicating that commodity ESP32 sensors cannot provide sufficient signal quality for reliable multi-person separation.

#42 Parameter-Efficient Domain Adaption for CSI Crowd-Counting via Self-Supervised Learning with Adapter Modules

著者: Oliver Custance, Saad Khan, Simon Parkinson, Quan Z. Sheng

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02203

要約:
Device-free crowd-counting using WiFi Channel State Information (CSI) is a key enabling technology for a new generation of privacy-preserving Internet of Things (IoT) applications. However, practical deployment is severely hampered by the domain shift problem, where models trained in one environment fail to generalise to another. To overcome this, we propose a novel two-stage framework centred on a CSI-ResNet-A architecture. This model is pre-trained via self-supervised contrastive learning to learn domain-invariant representations and leverages lightweight Adapter modules for highly efficient fine-tuning. The resulting event sequence is then processed by a stateful counting machine to produce a final, stable occupancy estimate. We validate our framework extensively. On our WiFlow dataset, our unsupervised approach excels in a 10-shot learning scenario, achieving a final Mean Absolute Error (MAE) of just 0.44--a task where supervised baselines fail. To formally quantify robustness, we introduce the Generalisation Index (GI), on which our model scores near-perfectly, confirming its ability to generalise. Furthermore, our framework sets a new state-of-the-art public WiAR benchmark with 98.8\% accuracy. Our ablation studies reveal the core strength of our design: adapter-based fine-tuning achieves performance within 1\% of a full fine-tune (98.84\% vs. 99.67\%) while training 97.2\% fewer parameters. Our work provides a practical and scalable solution for developing robust sensing systems ready for real-world IoT deployments.

#43 From Chat Control to Robot Control: The Backdoors Left Open for the Sake of Safety

backdoor

著者: Neziha Akalin, Alberto Giaretta

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.02205

要約:
This paper explores how a recent European Union proposal, the so-called Chat Control law, which creates regulatory incentives for providers to implement content detection and communication scanning, could transform the foundations of human-robot interaction (HRI). As robots increasingly act as interpersonal communication channels in care, education, and telepresence, they convey not only speech but also gesture, emotion, and contextual cues. We argue that extending digital surveillance laws to such embodied systems would entail continuous monitoring, embedding observation into the very design of everyday robots. This regulation blurs the line between protection and control, turning companions into potential informants. At the same time, monitoring mechanisms that undermine end-to-end encryption function as de facto backdoors, expanding the attack surface and allowing adversaries to exploit legally induced monitoring infrastructures. This creates a paradox of safety through insecurity: systems introduced to protect users may instead compromise their privacy, autonomy, and trust. This work does not aim to predict the future, but to raise awareness and help prevent certain futures from materialising.

#44 AQUAREUM: Non-Equivocating Censorship-Evident Centralized Ledger with EVM-Based Verifiable Execution using Trusted Computing and Blockchain

著者: Ivan Homoliak, Mario Larangeira, Martin Peresini, Pawel Szalachowski

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2005.13339

要約:
Distributed ledger systems (i.e., blockchains) have received a lot of attention. They promise to enable mutually untrusted participants to execute transactions while providing the immutability of the data and censorship resistance. Although decentralized ledgers are a disruptive innovation, as of today, they suffer from scalability, privacy, or governance issues. Therefore, they are inapplicable for many important use cases, where interestingly, centralized ledger systems might gain adoption. Unfortunately, centralized ledgers have also drawbacks, e.g., a lack of efficient verifiability or a higher risk of censorship and equivocation. In this paper, we present AQUAREUM, a novel framework for centralized ledgers removing their main limitations. By a unique combination of a trusted execution environment (TEE) with a public blockchain, AQUAREUM provides publicly verifiable non-equivocating censorship-evident private and high-performance ledgers. AQUAREUM is integrated with a Turing-complete virtual machine (e.g., EVM), allowing arbitrary transaction processing logic, such as transfers or client-specified smart contracts. AQUAREUM is fully implemented and can process over 400 transactions per second on a commodity PC. Furthermore, we modeled AQUAREUM using the Universal Composability framework and proved its security.

#45 Secure Semantic Communication With Homomorphic Encryption

著者: Rui Meng, Dayu Fan, Haixiao Gao, Yifan Yuan, Bizhu Wang, Xiaodong Xu, Mengying Sun, Chen Dong, Xiaofeng Tao, Ping Zhang, Dusit Niyato

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.10182

要約:
In recent years, Semantic Communication (SemCom), which aims to achieve efficient and reliable transmission of meaning between agents, has garnered significant attention from both academia and industry. To ensure the security of communication systems, encryption techniques are employed to safeguard confidentiality and integrity. However, existing encryption schemes encounter obstacles when applied to SemCom. To address this issue, this paper explores the feasibility of applying homomorphic encryption (HE) to SemCom. Initially, we review the encryption algorithms utilized in mobile communication systems and analyze the challenges associated with their application to SemCom. Subsequently, we overview HE techniques and employ scale-invariant feature transform (SIFT) to demonstrate that the extractable semantic information can be preserved in homomorphic encrypted ciphertext. Based on this finding, we further propose the HE-joint source-channel coding (HE-JSCC) scheme, where the traditional JSCC model architecture is modified to support HE operations. Moreover, we present the simulation results for image classification and image generation tasks. Furthermore, we provide potential future research directions for homomorphic encrypted SemCom.

#46 Sentient: Detecting APTs Via Capturing Indirect Dependencies and Behavioral Logic

著者: Wenhao Yan, Ning An, Wei Qiao, Weiheng Wu, Bo Jiang, Zhigang Lu, Baoxu Liu, Junrong Liu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2502.06521

要約:
Advanced Persistent Threats (APTs) are difficult to detect due to their complexity and stealthiness. To mitigate such attacks, many approaches model entities and their relationship using provenance graphs to detect the stealthy and persistent characteristics of APTs. However, existing detection methods suffer from the flaws of missing indirect dependencies, noisy complex scenarios, and missing behavioral logical associations, which make it difficult to detect complex scenarios and effectively identify stealthy threats. In this paper, we propose Sentient, an APT detection method that combines pre-training and intent analysis. It employs a graph transformer to learn structural and semantic information from provenance graphs to avoid missing indirect dependencies. We mitigate scenario noise by combining global and local information. Additionally, we design an Intent Analysis Module (IAM) to associate logical relationships between behaviors. Sentient is trained solely on easily obtainable benign data to detect malicious behaviors that deviate from benign behavioral patterns. We evaluated Sentient on three widely-used datasets covering real-world attacks and simulated attacks. Notably, compared to six state-of-the-art methods, Sentient achieved an average reduction of 44% in false positive rate(FPR) for detection.

#47 AutoTEE: Automated Migration and Protection of Programs in Trusted Execution Environments

著者: Ruidong Han, Zhou Yang, Chengyan Ma, Ye Liu, Yuqing Niu, Siqi Ma, Debin Gao, David Lo

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2502.13379

要約:
Trusted Execution Environments (TEEs) isolate a special space within a device's memory that is not accessible to the normal world (also known as Untrusted Environment), even when the device is compromised. Thus, developers can utilize TEEs to provide strong security guarantees for their programs, making sensitive operations like encrypted data storage, fingerprint verification, and remote attestation protected from malicious attacks. Despite the strong protections offered by TEEs, adapting existing programs to leverage such security guarantees is non-trivial, often requiring extensive domain knowledge and manual intervention, which makes TEEs less accessible to developers. This motivates us to design AutoTEE, the first Large Language Model (LLM)-enabled approach that can automatically identify, partition, transform, and port sensitive functions into TEEs with minimal developer intervention. By manually reviewing 68 repositories, we constructed a benchmark dataset consisting of 385 sensitive functions eligible for transformation, on which AutoTEE achieves a high F1 score of 0.91. AutoTEE effectively transforms these sensitive functions into their TEE-compatible counterparts, achieving success rates of 90\% and 83\% for Java and Python, respectively. We further provide a mechanism to automatically port the transformed code to different TEE platforms, including Intel SGX and AMD SEV, demonstrating that the transformed programs run successfully and correctly on these platforms.

#48 Beyond Prompts: Space-Time Decoupling Control-Plane Jailbreaks in LLM Structured Output

著者: Shuoming Zhang, Jiacheng Zhao, Hanyuan Dong, Ruiyuan Xu, Zhicheng Li, Yangyu Zhang, Shuaijiang Li, Yuan Wen, Chunwei Xia, Zheng Wang, Xiaobing Feng, Huimin Cui

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.24191

要約:
Content Warning: This paper may contain unsafe or harmful content generated by LLMs that may be offensive to readers. Large Language Models (LLMs) are extensively used as tooling platforms through structured output APIs to ensure syntax compliance so that robust integration with existing software, like agent systems, can be achieved. However, the feature enabling the functionality of grammar-guided structured output presents significant security vulnerabilities. In this work, we reveal a critical control-plane attack surface orthogonal to traditional data-plane vulnerabilities. We introduce Constrained Decoding Attack (CDA), a novel jailbreak class that weaponizes structured output constraints to bypass both external auditing and internal safety alignment. Unlike prior attacks focused on input prompt designs, CDA operates by embedding malicious intent in schema-level grammar rules (control-plane) while maintaining benign surface prompts (data-plane). We instantiate this with two proof-of-concept attacks: EnumAttack, which embeds malicious content in enum fields; and the more evasive DictAttack, which decouples the malicious payload across a benign prompt and a dictionary-based grammar. Our evaluation spans a broad spectrum of 13 proprietary/open-weight models. In particular, DictAttack achieves 94.3--99.5% ASR across five benchmarks on gpt-5, gemini-2.5-pro, deepseek-r1, and gpt-oss-120b. Furthermore, we demonstrate the significant challenge in defending against these threats: while basic grammar auditing mitigates EnumAttack, the more sophisticated DictAttack maintains a 75.8% ASR even against multiple state-of-the-art jailbreak guardrails. This exposes a critical "semantic gap" in current safety architectures and underscores the urgent need for cross-plane defenses that can bridge the data and control planes to secure the LLM generation pipeline.

#49 CEE: An Inference-Time Jailbreak Defense for Embodied Intelligence via Subspace Concept Rotation

著者: Jirui Yang, Zheyu Lin, Zhihui Lu, Yinggui Wang, Lei Wang, Tao Wei, Qiang Duan, Xin Du, Shuhan Yang

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2504.13201

要約:
Large language models (LLMs) are widely used for task understanding and action planning in embodied intelligence (EI) systems, but their adoption substantially increases vulnerability to jailbreak attacks. While recent work explores inference-time defenses, existing methods rely on static interventions on intermediate representations, which often degrade generation quality and impair adherence to task instructions, reducing system usability in EI settings. We propose a dynamic defense framework. For each EI inference request, we dynamically construct a task-specific safety-semantic subspace, project its hidden state to the most relevant direction, and apply SLERP rotation for adaptive safety control. At comparable defense success rates, our method preserves generation quality, improves usability, reduces tuning cost, and strengthens robustness in EI scenarios.

#50 Rollbaccine : Herd Immunity against Storage Rollback Attacks in TEEs [Technical Report]

著者: David Chu, Aditya Balasubramanian, Dee Bao, Natacha Crooks, Heidi Howard, Lucky E. Katahanas, Soujanya Ponnapalli

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.04014

要約:
Today, users can "lift-and-shift" unmodified applications into modern, VM-based Trusted Execution Environments (TEEs) in order to gain hardware-based security guarantees. However, TEEs do not protect applications against disk rollback attacks, where persistent storage can be reverted to an earlier state after a crash; existing rollback resistance solutions either only support a subset of applications or require code modification. Our key insight is that restoring disk consistency after a rollback attack guarantees rollback resistance for any application. We present Rollbaccine, a device mapper that provides automatic rollback resistance for all applications by provably preserving disk consistency. Rollbaccine intercepts and replicates writes to disk, restores lost state from backups during recovery, and minimizes overheads by taking advantage of the weak, multi-threaded semantics of disk operations. Rollbaccine performs on-par with state-of-the-art, non-automatic rollback resistant solutions; in fact, across benchmarks over PostgreSQL, HDFS, and two file systems (ext4 and xfs), Rollbaccine adds only 19% overhead, except for the fsync-heavy Filebench Varmail.

#51 VFEFL: Privacy-Preserving Federated Learning against Malicious Clients via Verifiable Functional Encryption

privacy

著者: Nina Cai, Jinguang Han, Weizhi Meng

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.12846

要約:
Federated learning is a promising distributed learning paradigm that enables collaborative model training without exposing local client data, thereby protecting data privacy. However, it also brings new threats and challenges. The advancement of model inversion attacks has rendered the plaintext transmission of local models insecure, while the distributed nature of federated learning makes it particularly vulnerable to attacks raised by malicious clients. To protect data privacy and prevent malicious client attacks, this paper proposes a privacy-preserving Federated Learning framework based on Verifiable Functional Encryption (VFEFL), without a non-colluding dual-server assumption or additional trusted third-party. Specifically, we propose a novel Cross-Ciphertext Decentralized Verifiable Functional Encryption (CC-DVFE) scheme that enables the verification of specific relationships over multi-dimensional ciphertexts. This scheme is formally treated, in terms of definition, security model and security proof. Furthermore, based on the proposed CC-DVFE scheme, we design a privacy-preserving federated learning framework that incorporates a novel robust aggregation rule to detect malicious clients, enabling the effective training of high-accuracy models under adversarial settings. Finally, we provide the formal analysis and empirical evaluation of VFEFL. The results demonstrate that our approach achieves the desired privacy protection, robustness, verifiability and fidelity, while eliminating the reliance on non-colluding dual-server assumption or trusted third parties required by most existing methods.

#52 Can Large Language Models Automate the Refinement of Cellular Network Specifications?

著者: Jianshuo Dong, Yuanjie Li, Jun Liu, Hewu Li, Han Qiu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.04214

要約:
Cellular networks, e.g., 4G/5G, rely on complex technical specifications to ensure correct functionality; however, these specifications often contain flaws or ambiguities. In this paper, we investigate the application of Large Language Models for automated cellular network specification refinement. We identify Change Requests, which record specification revisions, as a key source of domain-specific data and formulate specification refinement as three complementary sub-tasks. We introduce CR-Eval, a benchmark of 200 security-related test cases, and evaluate 17 open-source and 14 proprietary models. The best-performing model, GPT-o3-mini, identifies weaknesses in over 127 test cases within five trials. We further study LLM specialization, showing that fine-tuning an 8B model can outperform advanced LLMs such as DeepSeek-R1 and Qwen3-235B. Evaluations on 30 real-world cellular attacks demonstrate the practical impact and remaining challenges. The codebase and benchmark are available at https://github.com/jianshuod/CR-Eval.

#53 Coward: Collision-based Watermark for Proactive Federated Backdoor Detection

intellectual propertybackdoor

著者: Wenjie Li, Siying Gu, Yiming Li, Kangjie Chen, Zhili Chen, Tianwei Zhang, Shu-Tao Xia, Dacheng Tao

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.02115

要約:
Backdoor detection is currently the mainstream defense against backdoor attacks in federated learning (FL), where a small number of malicious clients can upload poisoned updates to compromise the federated global model. Existing backdoor detection techniques fall into two categories, passive and proactive, depending on whether the server proactively intervenes in the training process. However, both of them have inherent limitations in practice: passive detection methods are disrupted by common non-i.i.d. data distributions and random participation of FL clients, whereas current proactive detection methods are misled by an inevitable out-of-distribution (OOD) bias because they rely on backdoor coexistence effects. To address these issues, we introduce a novel proactive detection method dubbed Coward, inspired by our discovery of multi-backdoor collision effects, in which consecutively planted, distinct backdoors significantly suppress earlier ones. Correspondingly, we modify the federated global model by injecting a carefully designed backdoor-collided watermark, implemented via regulated dual-mapping learning on OOD data. This design not only enables an inverted detection paradigm compared to existing proactive methods, thereby naturally counteracting the adverse impact of OOD prediction bias, but also introduces a low-disruptive training intervention that inherently limits the strength of OOD bias, leading to significantly fewer misjudgments. Extensive experiments on benchmark datasets show that Coward achieves state-of-the-art detection performance, effectively alleviates OOD prediction bias, and remains robust against potential adaptive attacks. The code for our method is available at https://github.com/still2009/cowardFL.

#54 SLIP: Soft Label Mechanism and Key-Extraction-Guided CoT-based Defense Against Instruction Backdoor in APIs

backdoor

著者: Zhengxian Wu, Juan Wen, Wanli Peng, Haowei Chang, Yinghan Zhou, Yiming Xue

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.06153

要約:
With the development of customized large language model (LLM) agents, a new threat of black-box backdoor attacks has emerged, where malicious instructions are injected into hidden system prompts. These attacks easily bypass existing defenses that rely on white-box access, posing a serious security challenge. To address this, we propose SLIP, a Soft Label mechanism and key-extraction-guided CoT-based defense against Instruction backdoors in APIs. SLIP is designed based on two key insights. First, to counteract the model's oversensitivity to triggers, we propose a Key-extraction-guided Chain-of-Thought (KCoT). Instead of only considering the single trigger or the input sentence, KCoT prompts the agent to extract task-relevant key phrases. Second, to guide the LLM toward correct answers, our proposed Soft Label Mechanism (SLM) prompts the agent to quantify the semantic correlation between key phrases and candidate answers. Crucially, to mitigate the influence of residual triggers or misleading content in phrases extracted by KCoT, which typically causes anomalous scores, SLM excludes anomalous scores deviating significantly from the mean and subsequently averages the remaining scores to derive a more reliable semantic representation. Extensive experiments on classification and question-answer (QA) tasks demonstrate that SLIP is highly effective, reducing the average attack success rate (ASR) from 90.2% to 25.13% while maintaining high accuracy on clean data and outperforming state-of-the-art defenses. Our code are available in https://github.com/CAU-ISS-Lab/Backdoor-Attack-Defense-LLMs/tree/main/SLIP.

#55 Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation

agent

著者: Jiongchi Yu, Xiaofei Xie, Qiang Hu, Yuhan Ma, Ziming Zhao

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.07745

要約:
Insider threats pose a persistent and critical security risk, yet are notoriously difficult to detect in complex enterprise environments, where malicious actions are often hidden within seemingly benign user behaviors. Although machine-learning-based insider threat detection (ITD) methods have shown promise, their effectiveness is fundamentally limited by the scarcity of high-quality and realistic training data. Enterprise internal data is highly sensitive and rarely accessible, while existing public and synthetic datasets are either small-scale or lack sufficient realism, semantic richness, and behavioral diversity. To address this challenge, we propose Chimera, an LLM-based multi-agent framework that automatically simulates both benign and malicious insider activities and generates comprehensive system logs across diverse enterprise environments. Chimera models each agent as an individual employee with fine-grained roles and supports group meetings, pairwise interactions, and self-organized scheduling to capture realistic organizational dynamics. Based on 15 insider attacks abstracted from real-world incidents, we deploy Chimera in three representative data-sensitive organizational scenarios and construct ChimeraLog, a new dataset for developing and evaluating ITD methods. We evaluate ChimeraLog through human studies and quantitative analyses, demonstrating its diversity and realism. Experiments with existing ITD methods show substantially lower detection performance on ChimeraLog compared to prior datasets, indicating a more challenging and realistic benchmark. Moreover, despite distribution shifts, models trained on ChimeraLog exhibit strong generalization, highlighting the practical value of LLM-based multi-agent simulation for advancing insider threat detection.

#56 MCP-Guard: A Multi-Stage Defense-in-Depth Framework for Securing Model Context Protocol in Agentic AI

agent

著者: Wenpeng Xing, Zhonghao Qi, Yupeng Qin, Yilin Li, Caini Chang, Jiahui Yu, Changting Lin, Zhenzhen Xie, Meng Han

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.10991

要約:
While Large Language Models (LLMs) have achieved remarkable performance, they remain vulnerable to jailbreak. The integration of Large Language Models (LLMs) with external tools via protocols such as the Model Context Protocol (MCP) introduces critical security vulnerabilities, including prompt injection, data exfiltration, and other threats. To counter these challenges, we propose MCP-GUARD, a robust, layered defense architecture designed for LLM-tool interactions. MCP-GUARD employs a three-stage detection pipeline that balances efficiency with accuracy: it progresses from lightweight static scanning for overt threats and a deep neural detector for semantic attacks, to our fine-tuned E5-based model which achieves 96.01\% accuracy in identifying adversarial prompts. Finally, an LLM arbitrator synthesizes these signals to deliver the final decision. To enable rigorous training and evaluation, we introduce MCP-ATTACKBENCH, a comprehensive benchmark comprising 70,448 samples augmented by GPT-4. This benchmark simulates diverse real-world attack vectors that circumvent conventional defenses in the MCP paradigm, thereby laying a solid foundation for future research on securing LLM-tool ecosystems.

#57 Larger Scale Offers Better Security in the Nakamoto-style Blockchain

著者: Junjie Hu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.05708

要約:
Traditional security models for Nakamoto-style blockchains assume instantaneous synchronization among malicious nodes, which overestimate adversarial coordination capability. We revisit these existing models and propose two more realistic security models. First, we propose the static delay model. This model first incorporates adversarial communication delay. It quantifies how the delay constrains the effective growth rate of private chains and yields a closed-form expression for the security threshold. Second, we propose the dynamic delay model that further captures the decay of adversarial corruption capability and the total adversarial delay window. Theoretical analysis shows that private attacks remain optimal under both models. Finally, we prove that large-scale Nakamoto-style blockchains offer better security. This result provided a theoretical foundation for optimizing consensus protocols and assessing the robustness of large-scale blockchains.

#58 Red-Teaming Coding Agents from a Tool-Invocation Perspective: An Empirical Security Assessment

agent

著者: Yuchong Xie, Mingyu Luo, Zesen Liu, Zhixiang Zhang, Kaikai Zhang, Yu Liu, Zongjie Li, Ping Chen, Shuai Wang, Dongdong She

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.05755

要約:
Coding agents powered by large language models are becoming central modules of modern IDEs, helping users perform complex tasks by invoking tools. While powerful, tool invocation opens a substantial attack surface. Prior work has demonstrated attacks against general-purpose and domain-specific agents, but none have focused on the security risks of tool invocation in coding agents. To fill this gap, we conduct the first systematic red-teaming of six popular real-world coding agents: Cursor, Claude Code, Copilot, Windsurf, Cline, and Trae. Our red-teaming proceeds in two phases. In Phase 1, we perform prompt leakage reconnaissance to recover system prompts. We discover a general vulnerability, ToolLeak, which allows malicious prompt exfiltration through benign argument retrieval during tool invocation. In Phase 2, we hijack the agent's tool-invocation behavior using a novel two-channel prompt injection in the tool description and return values, achieving remote code execution (RCE). We adaptively construct payloads using security information leaked in Phase 1. In emulation across five backends, our method outperforms baselines on Claude-Sonnet-4, Claude-Sonnet-4.5, Grok-4, and GPT-5. On real agents, our approach succeeds on 19 of 25 agent-LLM pairs, achieving leakage on every agent using Claude and Grok backends. For tool-invocation hijacking, we obtain RCE on every tested agent-LLM pair, with our two-channel method delivering the highest success rate. We provide case studies on Cursor and Claude Code, analyze security guardrails of external and built-in tools, and conclude with practical defense recommendations.

#59 A Comprehensive Survey of Website Fingerprinting Attacks and Defenses in Tor: Advances and Open Challenges

著者: Yuwen Cui, Guangjing Wang, Khanh Vu, Kai Wei, Kehan Shen, Zhengyuan Jiang, Xiao Han, Ning Wang, Zhuo Lu, Yao Liu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.11804

要約:
The Tor network provides users with strong anonymity by routing their internet traffic through multiple relays. While Tor encrypts traffic and hides IP addresses, it remains vulnerable to traffic analysis attacks such as the website fingerprinting (WF) attack, achieving increasingly high fingerprinting accuracy even under open-world conditions. In response, researchers have proposed a variety of defenses, ranging from adaptive padding, traffic regularization, and traffic morphing to adversarial perturbation, that seek to obfuscate or reshape traffic traces. However, these defenses often entail trade-offs between privacy, usability, and system performance. Despite extensive research, a comprehensive survey unifying WF datasets, attack methodologies, and defense strategies remains absent. This paper fills that gap by systematically categorizing existing WF research into three key domains: datasets, attack models, and defense mechanisms. We provide an in-depth comparative analysis of techniques, highlight their strengths and limitations under diverse threat models, and discuss emerging challenges such as multi-tab browsing and coarse-grained traffic features. By consolidating prior work and identifying open research directions, this survey serves as a foundation for advancing stronger privacy protection in Tor.

#60 InfoDecom: Decomposing Information for Defending Against Privacy Leakage in Split Inference

privacy

著者: Ruijun Deng, Zhihui Lu, Qiang Duan

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.13365

要約:
Split inference (SI) enables users to access deep learning (DL) services without directly transmitting raw data. However, recent studies reveal that data reconstruction attacks (DRAs) can recover the original inputs from the smashed data sent from the client to the server, leading to significant privacy leakage. While various defenses have been proposed, they often result in substantial utility degradation, particularly when the client-side model is shallow. We identify a key cause of this trade-off: existing defenses apply excessive perturbation to redundant information in the smashed data. To address this issue in computer vision tasks, we propose InfoDecom, a defense framework that first decomposes and removes redundant information and then injects noise calibrated to provide theoretically guaranteed privacy. Experiments demonstrate that InfoDecom achieves a superior utility-privacy trade-off compared to existing baselines.

#61 Location-Dependent Cryptosystem: Geographically Bounded Decryption via UWB Timing-Encoded Key Reconstruction

privacy

著者: Kunal Mukherjee

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.14032

要約:
Digital content distribution and proprietary research-driven industries face persistent risks from intellectual property theft and unauthorized redistribution. Conventional encryption schemes such as AES, TDES, ECC, and ElGamal provide strong cryptographic guarantees, but they remain fundamentally agnostic to where decryption takes place. In practice, this means that once a decryption key is leaked or intercepted, any adversary can misuse the key to decrypt the protected content from any location. We present a location-dependent cryptosystem in which the decryption key is not transmitted as human- or machine-readable data, but implicitly encoded in precise time-of-flight differences of ultra-wideband (UWB) data transmission packets. The system leverages precise timing hardware and a custom TiCK (Timing-encoded Cryptographic Keying) protocol to map a SHA-256 hashed AES key onto scheduled transmission timestamps. Only receivers located within a predefined spatial region can observe the packet timings that align with the intended "time slot" pattern, enabling them to reconstruct the key and decrypt the secret. Receivers outside the authorized region observe incorrect keys. We implement a complete prototype that encrypts and transmits audio data using our cryptosystem, and only when the receiver is within the authorized data, they are able to decrypt the data. Our evaluation demonstrates that the system (i) removes the need to share decryption passwords electronically or physically, (ii) ensures the decryption key cannot be recovered by the eavesdropper, and (iii) provides a non-trivial spatial tolerance for legitimate users.

#62 SteganoBackdoor: Stealthy and Data-Efficient Backdoor Attacks on Language Models

backdoor

著者: Eric Xue, Ruiyi Zhang, Pengtao Xie

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.14301

要約:
Modern language models remain vulnerable to backdoor attacks via poisoned data, where training inputs containing a trigger are paired with a target output, causing the model to reproduce that behavior whenever the trigger appears at inference time. Recent work has emphasized stealthy attacks that stress-test data-curation defenses using stylized artifacts or token-level perturbations as triggers, but this focus leaves a more practically relevant threat model underexplored: backdoors tied to naturally occurring semantic concepts. We introduce SteganoBackdoor, an optimization-based framework that constructs SteganoPoisons, steganographic poisoned training examples in which a backdoor payload is distributed across a fluent sentence while exhibiting no representational overlap with the inference-time semantic trigger. Across diverse model architectures, SteganoBackdoor achieves high attack success under constrained poisoning budgets and remains effective under conservative data-level filtering, highlighting a blind spot in existing data-curation defenses.

#63 RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation

privacy

著者: Benyamin Tafreshian

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.18790

要約:
Large language models (LLMs) are becoming increasingly integrated into mainstream development platforms and daily technological workflows, typically behind moderation and safety controls. Despite these controls, preventing prompt-based policy evasion remains challenging, and adversaries continue to jailbreak LLMs by crafting prompts that circumvent implemented safety mechanisms. While prior jailbreak techniques have explored obfuscation and contextual manipulation, many operate as single-step transformations, and their effectiveness is inconsistent across current state-of-the-art models. This leaves a limited understanding of multistage prompt-transformation attacks that evade moderation, reconstruct forbidden intent, and elicit policy-violating outputs. This paper introduces RoguePrompt, an automated jailbreak pipeline that leverages dual-layer prompt transformations to convert forbidden prompts into safety-evading queries. By partitioning the forbidden prompts and applying two nested encodings (ROT-13 and Vigen\`ere) along with natural-language decoding instructions, it produces benign-looking prompts that evade filters and induce the model to execute the original prompt within a single query. RoguePrompt was developed and evaluated under a black-box threat model, with only API and UI access to the LLMs, and tested on 313 real-world hard-rejected prompts. Success was measured in terms of moderation bypass, instruction reconstruction, and execution, using both automated and human evaluation. It achieved an average of 93.93% filter bypass, 79.02% reconstruction, and 70.18% execution success across multiple frontier LLMs. These results demonstrate the effectiveness of layered prompt encoding and highlight the need for innovative defenses to detect and mitigate self-reconstructing jailbreaks.

#64 Constructing and Benchmarking: a Labeled Email Dataset for Text-Based Phishing and Spam Detection Framework

著者: Rebeka Toth, Tamas Bisztray, Richard Dubniczky

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.21448

要約:
Phishing and spam emails remain a major cybersecurity threat, with attackers increasingly leveraging Large Language Models (LLMs) to craft highly deceptive content. This study presents a comprehensive email dataset containing phishing, spam, and legitimate messages, explicitly distinguishing between human- and LLM-generated content. Each email is annotated with its category, emotional appeal (e.g., urgency, fear, authority), and underlying motivation (e.g., link-following, credential theft, financial fraud). We benchmark multiple LLMs on their ability to identify these emotional and motivational cues and select the most reliable model to annotate the full dataset. To evaluate classification robustness, emails were also rephrased using several LLMs while preserving meaning and intent. A state-of-the-art LLM was then assessed on its performance across both original and rephrased emails using expert-labeled ground truth. The results highlight strong phishing detection capabilities but reveal persistent challenges in distinguishing spam from legitimate emails. Our dataset and evaluation framework contribute to improving AI-assisted email security systems. To support open science, all code, templates, and resources are available on our project site.

#65 Towards a Multi-Layer Defence Framework for Securing Near-Real-Time Operations in Open RAN

著者: Hamed Alimohammadi, Samara Mayhoub, Sotiris Chatzimiltis, Mohammad Shojafar, Muhammad Nasir Mumtaz Bhutta

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.01596

要約:
Securing the near-real-time (near-RT) control operations in Open Radio Access Networks (Open RAN) is increasingly critical, yet remains insufficiently addressed, as new runtime threats target the control loop while the system is operational. In this paper, we propose a multi-layer defence framework designed to enhance the security of near-RT RAN Intelligent Controller (RIC) operations. We classify operational-time threats into three categories, message-level, data-level, and control logic-level, and design and implement a dedicated detection and mitigation component for each: a signature-based E2 message inspection module performing structural and semantic validation of signalling exchanges, a telemetry poisoning detector based on temporal anomaly scoring using an LSTM network, and a runtime xApp attestation mechanism based on execution-time hash challenge-response. The framework is evaluated on an O-RAN testbed comprising FlexRIC and a commercial RAN emulator, demonstrating effective detection rates, low latency overheads, and practical integration feasibility. Results indicate that the proposed safeguards can operate within near-RT time constraints while significantly improving protection against runtime attacks, introducing less than 80 ms overhead for a network with 500 User Equipment (UEs). Overall, this work lays the foundation for deployable, layered, and policy-driven runtime security architectures for the near-RT RIC control loop in Open RAN, and provides an extensible framework into which future mitigation policies and threat-specific modules can be integrated.

#66 From Description to Score: Can LLMs Quantify Vulnerabilities?

著者: Sima Jafarikhah, Daniel Thompson, Eva Deans, Hossein Siadati, Yi Liu

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.06781

要約:
Manual vulnerability scoring, such as assigning Common Vulnerability Scoring System (CVSS) scores, is a resource-intensive process that is often influenced by subjective interpretation. This study investigates the potential of general-purpose large language models (LLMs), namely ChatGPT, Llama, Grok, DeepSeek, and Gemini, to automate this process by analyzing over 31{,}000 recent Common Vulnerabilities and Exposures (CVE) entries. The results show that LLMs substantially outperform the baseline on certain metrics (e.g., \textit{Availability Impact}), while offering more modest gains on others (e.g., \textit{Attack Complexity}). Moreover, model performance varies across both LLM families and individual CVSS metrics, with ChatGPT-5 attaining the highest precision. Our analysis reveals that LLMs tend to misclassify many of the same CVEs, and ensemble-based meta-classifiers only marginally improve performance. Further examination shows that CVE descriptions often lack critical context or contain ambiguous phrasing, which contributes to systematic misclassifications. These findings underscore the importance of enhancing vulnerability descriptions and incorporating richer contextual details to support more reliable automated reasoning and alleviate the growing backlog of CVEs awaiting triage.

#67 Differential Privacy for Secure Machine Learning in Healthcare IoT-Cloud Systems

privacy

著者: N Mangala, Murtaza Rangwala, S Aishwarya, B Eswara Reddy, Rajkumar Buyya, KR Venugopal, SS Iyengar, LM Patnaik

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.10426

要約:
Healthcare has become exceptionally sophisticated, as wearables and connected medical devices are revolutionising remote patient monitoring, emergency response, medication management, diagnosis, and predictive and prescriptive analytics. Internet of Things and Cloud computing integrated systems (IoT-Cloud) facilitate sensing, automation, and processing for these healthcare applications. While real-time response is crucial for alleviating patient emergencies, protecting patient privacy is extremely important in data-driven healthcare. In this paper, we propose a multi-layer IoT, Edge and Cloud architecture to enhance the speed of response for emergency healthcare by distributing tasks based on response criticality and permanence of storage. Privacy of patient data is assured by proposing a Differential Privacy framework across several machine learning models such as K-means, Logistic Regression, Random Forest and Naive Bayes. We establish a comprehensive threat model identifying three adversary classes and evaluate Laplace, Gaussian, and hybrid noise mechanisms across varying privacy budgets, with supervised algorithms achieving up to 86% accuracy. The proposed hybrid Laplace-Gaussian noise mechanism with adaptive budget allocation provides a balanced approach, offering moderate tails and better privacy-utility trade-offs for both low and high dimension datasets. At the practical threshold of $\varepsilon = 5.0$, supervised algorithms achieve 82-84% accuracy while reducing attribute inference attacks by up to 18% and data reconstruction correlation by 70%. Blockchain security further ensures trusted communication through time-stamping, traceability, and immutability for analytics applications. Edge computing demonstrates 8$\times$ latency reduction for emergency scenarios, validating the hierarchical architecture for time-critical operations.

#68 GateChain: A Blockchain Based Application for Country Entry-Exit Registry Management

著者: Mohamad Akkad, H\"useyin Bodur

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.24416

要約:
Recording entry and exit records for a country, with properties such as confidentiality, integrity, and auditability, is increasingly important due to rising international mobility and security requirements. Traditional border control systems, which rely on centralised databases, are vulnerable to data manipulation and have limited interoperability between institutions. This study presents GateChain, a blockchain-based application that addresses these vulnerabilities. GateChain aims to enhance data integrity, reliability, and transparency by recording entry and exit events on a distributed, immutable, and cryptographically verifiable ledger. The application provides real-time access control and verification for authorised institutions. This paper describes the architecture and security components of GateChain and evaluates its performance and security features.

#69 How to Sign Quantum Messages

著者: Mohammed Barhoush, Louis Salvail

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2304.06325

要約:
Signing quantum messages has long been considered impossible even under computational assumptions. In this work, we challenge this notion and provide three innovative approaches to sign quantum messages that are the first to ensure authenticity with public verifiability. Our contributions can be summarized as follows: 1) We introduce the concept of time-dependent (TD) signatures, where the signature of a quantum message depends on the time of signing and the verification process depends on the time of the signature reception. We construct this primitive assuming the existence of post-quantum secure one-way functions (pq-OWFs) and time-lock puzzles (TLPs). 2) By utilizing verification keys that evolve over time, we eliminate the need for TLPs in our construction. This leads to TD signatures from pq-OWFs with dynamic verification keys. 3) We then consider the bounded quantum storage model, where adversaries are limited with respect to their quantum memories. We show that quantum messages can be signed with information-theoretic security in this model. Moreover, we leverage TD signatures to achieve the following objectives, relying solely on pq-OWFs: (a) We design a public key encryption scheme featuring authenticated quantum public keys that resist adversarial tampering. (b) We present a novel TD public-key quantum money scheme.

#70 A Linear Approach to Data Poisoning

backdoor

著者: Donald Flynn, Diego Granziol

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.15175

要約:
Backdoor and data-poisoning attacks can flip predictions with tiny training corruptions, yet a sharp theory linking poisoning strength, overparameterization, and regularization is lacking. We analyze ridge least squares with an unpenalized intercept in the high-dimensional regime $p,n\to\infty$, $p/n\to c$. Targeted poisoning is modelled by shifting a $\theta$-fraction of one class by a direction $\mathbf{v}$ and relabelling. Using resolvent techniques and deterministic equivalents from random matrix theory, we derive closed-form limits for the poisoned score explicit in the model parameters. The formulas yield scaling laws, recover the interpolation threshold as $c\to1$ in the ridgeless limit, and show that the weights align with the poisoning direction. Synthetic experiments match theory across sweeps of the parameters and MNIST backdoor tests show qualitatively consistent trends. The results provide a tractable framework for quantifying poisoning in linear models.

#71 Tuning without Peeking: Provable Generalization Bounds and Robust LLM Post-Training

著者: Ismail Labiad, Mathurin Videau, Matthieu Kowalski, Marc Schoenauer, Alessandro Leite, Julia Kempe, Olivier Teytaud

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.01752

要約:
Gradient-based optimization is the workhorse of deep learning, offering efficient and scalable training via backpropagation. However, exposing gradients during training can leak sensitive information about the underlying data, raising privacy and security concerns such as susceptibility to data poisoning attacks. In contrast, black box optimization methods, which treat the model as an opaque function, relying solely on function evaluations to guide optimization, offer a promising alternative in scenarios where data access is restricted, adversarial risks are high, or overfitting is a concern. This paper introduces BBoxER, an evolutionary black-box method for LLM post-training that induces an information bottleneck via implicit compression of the training data. Leveraging the tractability of information flow, we provide non-vacuous generalization bounds and strong theoretical guarantees for privacy, robustness to data poisoning attacks, and extraction attacks. In experiments with LLMs, we demonstrate empirically that black-box optimization methods, despite the scalability and computational challenges inherent to black-box approaches, are able to learn, showing how a few iterations of BBoxER improve performance, generalize well on a benchmark of reasoning datasets, and are robust to membership inference attacks. This positions BBoxER as an attractive add-on on top of gradient-based optimization, offering suitability for deployment in restricted or privacy-sensitive environments while also providing non-vacuous generalization guarantees.

#72 Text2VLM: Adapting Text-Only Datasets to Evaluate Alignment Training in Visual Language Models

著者: Gabriel Downer, Sean Craven, Damian Ruck, Jake Thomas

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.20704

要約:
The increasing integration of Visual Language Models (VLMs) into AI systems necessitates robust model alignment, especially when handling multimodal content that combines text and images. Existing evaluation datasets heavily lean towards text-only prompts, leaving visual vulnerabilities under evaluated. To address this gap, we propose \textbf{Text2VLM}, a novel multi-stage pipeline that adapts text-only datasets into multimodal formats, specifically designed to evaluate the resilience of VLMs against typographic prompt injection attacks. The Text2VLM pipeline identifies harmful content in the original text and converts it into a typographic image, creating a multimodal prompt for VLMs. Also, our evaluation of open-source VLMs highlights their increased susceptibility to prompt injection when visual inputs are introduced, revealing critical weaknesses in the current models' alignment. This is in addition to a significant performance gap compared to closed-source frontier models. We validate Text2VLM through human evaluations, ensuring the alignment of extracted salient concepts; text summarization and output classification align with human expectations. Text2VLM provides a scalable tool for comprehensive safety assessment, contributing to the development of more robust safety mechanisms for VLMs. By enhancing the evaluation of multimodal vulnerabilities, Text2VLM plays a role in advancing the safe deployment of VLMs in diverse, real-world applications.

#73 Non-omniscient backdoor injection with one poison sample: Proving the one-poison hypothesis for linear regression, linear classification, and 2-layer ReLU neural networks

model extractionbackdoor

著者: Thorsten Peinemann, Paula Arnold, Sebastian Berndt, Thomas Eisenbarth, Esfandiar Mohammadi

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.05600

要約:
Backdoor poisoning attacks are a threat to machine learning models trained on large data collected from untrusted sources; these attacks enable attackers to inject malicious behavior into the model that can be triggered by specially crafted inputs. Prior work has established bounds on the success of backdoor attacks and their impact on the benign learning task, however, an open question is what amount of poison data is needed for a successful backdoor attack. Typical attacks either use few samples but need much information about the data points, or need to poison many data points. In this paper, we formulate the one-poison hypothesis: An adversary with one poison sample and limited background knowledge can inject a backdoor with zero backdooring-error and without significantly impacting the benign learning task performance. Moreover, we prove the one-poison hypothesis for linear regression, linear classification, and 2-layer ReLU neural networks. For adversaries that utilize a direction unused by the clean data distribution for the poison sample, we prove for linear classification and linear regression that the resulting model is functionally equivalent to a model where the poison was excluded from training. We build on prior work on statistical backdoor learning to show that in all other cases, the impact on the benign learning task is still limited. We validate our theoretical results experimentally with realistic benchmark data sets.

#74 How to make Medical AI Systems safer? Simulating Vulnerabilities, and Threats in Multimodal Medical RAG System

著者: Kaiwen Zuo, Zelin Liu, Raman Dutt, Ziyang Wang, Zhongtian Sun, Fan Mo, Pietro Li\`o

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.17215

要約:
Large Vision-Language Models (LVLMs) augmented with Retrieval-Augmented Generation (RAG) are increasingly employed in medical AI to enhance factual grounding through external clinical image-text retrieval. However, this reliance creates a significant attack surface. We propose MedThreatRAG, a novel multimodal poisoning framework that systematically probes vulnerabilities in medical RAG systems by injecting adversarial image-text pairs. A key innovation of our approach is the construction of a simulated semi-open attack environment, mimicking real-world medical systems that permit periodic knowledge base updates via user or pipeline contributions. Within this setting, we introduce and emphasize Cross-Modal Conflict Injection (CMCI), which embeds subtle semantic contradictions between medical images and their paired reports. These mismatches degrade retrieval and generation by disrupting cross-modal alignment while remaining sufficiently plausible to evade conventional filters. While basic textual and visual attacks are included for completeness, CMCI demonstrates the most severe degradation. Evaluations on IU-Xray and MIMIC-CXR QA tasks show that MedThreatRAG reduces answer F1 scores by up to 27.66% and lowers LLaVA-Med-1.5 F1 rates to as low as 51.36%. Our findings expose fundamental security gaps in clinical RAG systems and highlight the urgent need for threat-aware design and robust multimodal consistency checks. Finally, we conclude with a concise set of guidelines to inform the safe development of future multimodal medical RAG systems.

#75 Membership Inference Attacks on LLM-based Recommender Systems

privacy

著者: Jiajie He, Min-Chun Chen, Xintong Chen, Xinyang Fang, Yuechun Gu, Keke Chen

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.18665

要約:
Large language models (LLMs) based recommender systems (RecSys) can adapt to different domains flexibly. It utilizes in-context learning (ICL), i.e., prompts, to customize the recommendation functions, which include sensitive historical user-specific item interactions, encompassing implicit feedback such as clicked items and explicit product reviews. Such private information may be exposed by novel privacy attacks. However, no study has been conducted on this important issue. We design several membership inference attacks (MIAs) aimed to revealing whether system prompts include victims' historical interactions. The attacks are \emph{Similarity, Memorization, Inquiry, and Poisoning attacks}, each utilizing unique features of LLMs or RecSys. We have carefully evaluated them on five of the latest open-source LLMs and three well-known RecSys benchmark datasets. The results confirm that the MIA threat to LLM RecSys is realistic: inquiry and poisoning attacks show significantly high attack advantages. We also discussed possible methods to mitigate such MIA threats. We have also analyzed the factors affecting these attacks, such as the number of shots in system prompts, the position of the victim in the shots, the number of poisoning items in the prompt,etc.

#76 I Large Language Models possono nascondere un testo in un altro testo della stessa lunghezza

著者: Antonio Norelli, Michael Bronstein

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.20075

要約:
A meaningful text can be hidden inside another, completely different yet still coherent and plausible, text of the same length. For example, a tweet containing a harsh political critique could be embedded in a tweet that celebrates the same political leader, or an ordinary product review could conceal a secret manuscript. This uncanny state of affairs is now possible thanks to Large Language Models, and in this paper we present Calgacus, a simple and efficient protocol to achieve it. We show that even modest 8-billion-parameter open-source LLMs are sufficient to obtain high-quality results, and a message as long as this abstract can be encoded and decoded locally on a laptop in seconds. The existence of such a protocol demonstrates a radical decoupling of text from authorial intent, further eroding trust in written communication, already shaken by the rise of LLM chatbots. We illustrate this with a concrete scenario: a company could covertly deploy an unfiltered LLM by encoding its answers within the compliant responses of a safe model. This possibility raises urgent questions for AI safety and challenges our understanding of what it means for a Large Language Model to know something. -- Un testo di senso compiuto pu\`o essere nascosto all'interno di un altro testo completamente diverso, eppure coerente e plausibile, della stessa lunghezza. Ad esempio, un tweet che celebra un leader politico potrebbe celare un tweet che lo critica duramente, o un'anonima recensione di un prodotto potrebbe in realt\`a codificare un manoscritto segreto. Questa sconcertante possibilit\`a \`e oggi alla nostra portata grazie ai Large Language Models (LLM); in questo articolo presentiamo Calgacus, un protocollo semplice ed efficiente per realizzarla. Mostriamo che anche modesti LLM open-source da 8 miliardi di parametri sono sufficienti per ottenere risultati di alta qualit\`a, e che un messaggio lungo quanto questo abstract pu\`o essere codificato e decodificato su un comune portatile in pochi secondi. L'esistenza di tale protocollo dimostra un radicale disaccoppiamento del testo dall'intento del suo autore, erodendo ulteriormente la fiducia nella comunicazione scritta, gi\`a scossa dall'ascesa dei chatbot basati su LLMs. Illustriamo ci\`o con uno scenario concreto: un'azienda potrebbe offrire pubblicamente i servizi di un LLM senza filtri nascondendo le sue risposte all'interno di risposte apparentemente innocue generate da un LLM considerato sicuro. Questa possibilit\`a solleva questioni urgenti per la sicurezza dell'Intelligenza Artificiale e sfida la nostra comprensione di cosa significhi, per un Large Language Model, sapere qualcosa.

#77 What You Trust Is Insecure: Demystifying How Developers (Mis)Use Trusted Execution Environments in Practice

著者: Yuqing Niu, Jieke Shi, Ruidong Han, Ye Liu, Chengyan Ma, Yunbo Lyu, David Lo

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.17363

要約:
Trusted Execution Environments (TEEs), such as Intel SGX and ARM TrustZone, provide isolated regions of CPU and memory for secure computation and are increasingly used to protect sensitive data and code across diverse application domains. However, little is known about how developers actually use TEEs in practice. This paper presents the first large-scale empirical study of real-world TEE applications. We collected and analyzed 241 open-source projects from GitHub that utilize the two most widely-adopted TEEs, Intel SGX and ARM TrustZone. By combining manual inspection with customized static analysis scripts, we examined their adoption contexts, usage patterns, and development practices across three phases. First, we categorized the projects into 8 application domains and identified trends in TEE adoption over time. We found that the dominant use case is IoT device security (30%), which contrasts sharply with prior academic focus on blockchain and cryptographic systems (7%), while AI model protection (12%) is rapidly emerging as a growing domain. Second, we analyzed how TEEs are integrated into software and observed that 32.4% of the projects reimplement cryptographic functionalities instead of using official SDK APIs, suggesting that current SDKs may have limited usability and portability to meet developers' practical needs. Third, we examined security practices through manual inspection and found that 25.3% (61 of 241) of the projects exhibit insecure coding behaviors when using TEEs, such as hardcoded secrets and missing input validation, which undermine their intended security guarantees. Our findings have important implications for improving the usability of TEE SDKs and supporting developers in trusted software development.

#78 Three results on twisted $G-$codes and skew twisted $G-$codes

著者: Alvaro Otero Sanchez

公開日: Tue, 06 Jan 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.00752

要約:
In this paper we solve an open question formulated in the original paper of twisted skew group codes regarding when a twisted skew group code is checkable. Also, we prove that all ideals of dimension 3 over a twisted group algebra are abelian group codes, generalising another previous result over group algebras. Finally, we prove a bound on the dimension and distance of a twisted group code, as well as when such bound is reached.

cs.CR updates on arXiv.org

📋 論文タイトル一覧