arXiv論文一覧 - cs.CR updates on arXiv.org

#1 Comparison of Multiple Classifiers for Android Malware Detection with Emphasis on Feature Insights Using CICMalDroid 2020 Dataset

著者: Md Min-Ha-Zul Abedin, Tazqia Mehrub

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00058

要約:
Accurate Android malware detection was critical for protecting users at scale. Signature scanners lagged behind fast release cycles on public app stores. We aimed to build a trustworthy detector by pairing a comprehensive dataset with a rigorous, transparent evaluation, and to identify interpretable drivers of decisions. We used CICMalDroid2020, which contained 17,341 apps across Benign, Adware, Banking, SMS malware, and Riskware. We extracted 301 static and 263 dynamic features into a 564 dimensional hybrid vector, then evaluated seven classifiers under three schemes, original features, principal component analysis, PCA, and linear discriminant analysis, LDA, with a 70 percent training and 30 percent test split. Results showed that gradient boosting on the original features performed best. XGBoost achieved 0.9747 accuracy, 0.9703 precision, 0.9731 recall, and 0.9716 F1, and the confusion matrix indicated rare benign labels for malicious apps. HistGradientBoosting reached 0.9741 accuracy and 0.9708 F1, while CatBoost and Random Forest were slightly lower at 0.9678 and 0.9687 accuracy with 0.9636 and 0.9637 F1. KNN and SVM lagged. PCA reduced performance for all models, with XGBoost dropping to 0.9164 accuracy and 0.8988 F1. LDA maintained mid 90s accuracy and clarified separable clusters in projections. A depth two surrogate tree highlighted package name, main activity, and target SDK as key drivers. These findings established high fidelity supervised baselines for Android malware detection and indicated that rich hybrid features with gradient boosting offered a practical and interpretable foundation for deployment.

#2 ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models

著者: Xiaogeng Liu, Xinyan Wang, Yechao Zhang, Sanjay Kariyappa, Chong Xiang, Muhao Chen, G. Edward Suh, Chaowei Xiao

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00154

要約:
Large reasoning models (LRMs) extend large language models with explicit multi-step reasoning traces, but this capability introduces a new class of prompt-induced inference-time denial-of-service (PI-DoS) attacks that exploit the high computational cost of reasoning. We first formalize inference cost for LRMs and define PI-DoS, then prove that any practical PI-DoS attack should satisfy three properties: (1) a high amplification ratio, where each query induces a disproportionately long reasoning trace relative to its own length; (ii) stealthiness, in which prompts and responses remain on the natural language manifold and evade distribution shift detectors; and (iii) optimizability, in which the attack supports efficient optimization without being slowed by its own success. Under this framework, we present ReasoningBomb, a reinforcement-learning-based PI-DoS framework that is guided by a constant-time surrogate reward and trains a large reasoning-model attacker to generate short natural prompts that drive victim LRMs into pathologically long and often effectively non-terminating reasoning. Across seven open-source models (including LLMs and LRMs) and three commercial LRMs, ReasoningBomb induces 18,759 completion tokens on average and 19,263 reasoning tokens on average across reasoning models. It outperforms the the runner-up baseline by 35% in completion tokens and 38% in reasoning tokens, while inducing 6-7x more tokens than benign queries and achieving 286.7x input-to-output amplification ratio averaged across all samples. Additionally, our method achieves 99.8% bypass rate on input-based detection, 98.7% on output-based detection, and 98.4% against strict dual-stage joint detection.

#3 First Steps, Lasting Impact: Platform-Aware Forensics for the Next Generation of Analysts

著者: Vinayak Jain, Sneha Sudhakaran, Saranyan Senthivel

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00160

要約:
The reliability of cyber forensic evidence acquisition is strongly influenced by the underlying operating systems, Windows, macOS, and Linux - due to inherent variations in file system structures, encryption protocols, and forensic tool compatibility. Disk forensics, one of the most widely used techniques in digital investigations, faces distinct obstacles on each platform. Windows, with its predominantly NTFS and FAT file systems, typically supports reliable disk imaging and analysis through established tools such as FTK Imager and Autopsy/Sleuth Kit. However, encryption features frequently pose challenges to evidence acquisition. Conversely, Linux environments, which rely on file systems like ext4 and XFS, generally offer greater transparency, yet the transient nature of log retention often complicates forensic analysis. In instances where anti-forensic strategies such as encryption and compression render traditional disk forensics insufficient, memory forensics becomes crucial. While memory forensic methodologies demonstrate robustness across Windows and Linux platforms forms through frameworks like Volatility, platform-specific difficulties persist. Memory analysis on Linux systems benefits from tools like LiME, snapshot utilities, and dd for memory acquisition; nevertheless, live memory acquisition on Linux can still present challenges. This research systematically assesses both disk and memory forensic acquisition techniques across samples representing Windows and Linux systems. By identifying effective combinations of forensic tools and configurations tailored to each operating system, the study aims to improve the accuracy and reliability of evidence collection. It further evaluates current forensic tools and highlights a persistent gap: consistently assuring forensic input reliability and footprint integrity.

#4 EigenAI: Deterministic Inference, Verifiable Results

著者: David Ribeiro Alves, Vishnu Patankar, Matheus Pereira, Jamie Stephens, Nima Vaziri, Sreeram Kannan

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00182

要約:
EigenAI is a verifiable AI platform built on top of the EigenLayer restaking ecosystem. At a high level, it combines a deterministic large-language model (LLM) inference engine with a cryptoeconomically secured optimistic re-execution protocol so that every inference result can be publicly audited, reproduced, and, if necessary, economically enforced. An untrusted operator runs inference on a fixed GPU architecture, signs and encrypts the request and response, and publishes the encrypted log to EigenDA. During a challenge window, any watcher may request re-execution through EigenVerify; the result is then deterministically recomputed inside a trusted execution environment (TEE) with a threshold-released decryption key, allowing a public challenge with private data. Because inference itself is bit-exact, verification reduces to a byte-equality check, and a single honest replica suffices to detect fraud. We show how this architecture yields sovereign agents -- prediction-market judges, trading bots, and scientific assistants -- that enjoy state-of-the-art performance while inheriting security from Ethereum's validator base.

#5 RPP: A Certified Poisoned-Sample Detection Framework for Backdoor Attacks under Dataset Imbalance

backdoor

著者: Miao Lin, Feng Yu, Rui Ning, Lusi Li, Jiawei Chen, Qian Lou, Mengxin Zheng, Chunsheng Xin, Hongyi Wu

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00183

要約:
Deep neural networks are highly susceptible to backdoor attacks, yet most defense methods to date rely on balanced data, overlooking the pervasive class imbalance in real-world scenarios that can amplify backdoor threats. This paper presents the first in-depth investigation of how the dataset imbalance amplifies backdoor vulnerability, showing that (i) the imbalance induces a majority-class bias that increases susceptibility and (ii) conventional defenses degrade significantly as the imbalance grows. To address this, we propose Randomized Probability Perturbation (RPP), a certified poisoned-sample detection framework that operates in a black-box setting using only model output probabilities. For any inspected sample, RPP determines whether the input has been backdoor-manipulated, while offering provable within-domain detectability guarantees and a probabilistic upper bound on the false positive rate. Extensive experiments on five benchmarks (MNIST, SVHN, CIFAR-10, TinyImageNet and ImageNet10) covering 10 backdoor attacks and 12 baseline defenses show that RPP achieves significantly higher detection accuracy than state-of-the-art defenses, particularly under dataset imbalance. RPP establishes a theoretical and practical foundation for defending against backdoor attacks in real-world environments with imbalanced data.

#6 Semantic-Aware Advanced Persistent Threat Detection Using Autoencoders on LLM-Encoded System Logs

著者: Waleed Khan Mohammed, Zahirul Arief Irfan Bin Shahrul Anuar, Mousa Sufian Mousa Mitani, Hezerul Abdul Karim, Nouar AlDahoul

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00204

要約:
Advanced Persistent Threats (APTs) are among the most challenging cyberattacks to detect. They are carried out by highly skilled attackers who carefully study their targets and operate in a stealthy, long-term manner. Because APTs exhibit "low-and-slow" behavior, traditional statistical methods and shallow machine learning techniques often fail to detect them. Previous research on APT detection has explored machine learning approaches and provenance graph analysis. However, provenance-based methods often fail to capture the semantic intent behind system activities. This paper proposes a novel anomaly detection approach that leverages semantic embeddings generated by Large Language Models (LLMs). The method enhances APT detection by extracting meaningful semantic representations from unstructured system log data. First, raw system logs are transformed into high-dimensional semantic embeddings using a pre-trained transformer model. These embeddings are then analyzed using an Autoencoder (AE) to identify anomalous and potentially malicious patterns. The proposed method is evaluated using the DARPA Transparent Computing (TC) dataset, which contains realistic APT attack scenarios generated by red teams in live environments. Experimental results show that the AE trained on LLM-derived embeddings outperforms widely used unsupervised baseline methods, including Isolation Forest (IForest), One-Class Support Vector Machine (OC-SVM), and Principal Component Analysis (PCA). Performance is measured using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), where the proposed approach consistently achieves superior results, even in complex threat scenarios. These findings highlight the importance of semantic understanding in detecting non-linear and stealthy attack behaviors that are often missed by conventional detection techniques.

#7 TessPay: Verify-then-Pay Infrastructure for Trusted Agentic Commerce

agent

著者: Mehul Goenka, Tejas Pathak, Siddharth Asthana

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00213

要約:
The global economy is entering the era of Agentic Commerce, where autonomous agents can discover services, negotiate prices, and transact value. However adoption towards agentic commerce faces a foundational trust gap: current systems are built for direct human interactions rather than agent-driven operations. It lacks core primitives across three critical stages of agentic transactions. First, Task Delegation lacks means to translate user intent into defined scopes, discover appropriate agents, and securely authorize actions. Second, Payment Settlement for tasks is processed before execution, lacking verifiable evidence to validate the agent's work. Third, Audit Mechanisms fail to capture the full transaction lifecycle, preventing clear accountability for disputes. While emerging standards address fragments of this trust gap, there still remains a critical need for a unified infrastructure that binds the entire transaction lifecycle. To resolve this gap, we introduce TessPay, a unified infrastructure that replaces implicit trust with a 'Verify-then-Pay' architecture. It is a two plane architecture separating control and verification from settlement. TessPay operationalizes trust across four distinct stages: Before execution, agents are anchored in a canonical registry and user intent is captured as verifiable mandates, enabling stakeholder accountability. During execution, funds are locked in escrow while the agent executes the task and generates cryptographic evidence (TLS Notary, TEE etc.) to support Proof of Task Execution (PoTE). At settlement, the system verifies this evidence and releases funds only when the PoTE satisfies verification predicates; modular rail adapters ensure this PoTE-gated escrow remains chain-agnostic across heterogeneous payment rails. After settlement, TessPay preserves a tamper-evident audit trail to enable clear accountability for dispute resolution.

#8 Tri-LLM Cooperative Federated Zero-Shot Intrusion Detection with Semantic Disagreement and Trust-Aware Aggregation

著者: Saeid Jamshidi, Omar Abdul Wahab, Foutse Khomh, Kawser Wazed Nafi

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00219

要約:
Federated learning (FL) has become an effective paradigm for privacy-preserving, distributed Intrusion Detection Systems (IDS) in cyber-physical and Internet of Things (IoT) networks, where centralized data aggregation is often infeasible due to privacy and bandwidth constraints. Despite its advantages, most existing FL-based IDS assume closed-set learning and lack mechanisms such as uncertainty estimation, semantic generalization, and explicit modeling of epistemic ambiguity in zero-day attack scenarios. Additionally, robustness to heterogeneous and unreliable clients remains a challenge in practical applications. This paper introduces a semantics-driven federated IDS framework that incorporates language-derived semantic supervision into federated optimization, enabling open-set and zero-shot intrusion detection for previously unseen attack behaviors. The approach constructs semantic attack prototypes using a Tri-LLM ensemble of GPT-4o, DeepSeek-V3, and LLaMA-3-8B, aligning distributed telemetry features with high-level attack concepts. Inter-LLM semantic disagreement is modeled as epistemic uncertainty for zero-day risk estimation, while a trust-aware aggregation mechanism dynamically weights client updates based on reliability. Experimental results show stable semantic alignment across heterogeneous clients and consistent convergence. The framework achieves over 80% zero-shot detection accuracy on unseen attack patterns, improving zero-day discrimination by more than 10% compared to similarity-based baselines, while maintaining low aggregation instability in the presence of unreliable or compromised clients.

#9 RVDebloater: Mode-based Adaptive Firmware Debloating for Robotic Vehicles

著者: Mohsen Salehi, Karthik Pattabiraman

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00270

要約:
As the number of embedded devices grows and their functional requirements increase, embedded firmware is becoming increasingly larger, thereby expanding its attack surface. Despite the increase in firmware size, many embedded devices, such as robotic vehicles (RVs), operate in distinct modes, each requiring only a small subset of the firmware code at runtime. We refer to such devices as mode-based embedded devices. Debloating is an approach to reduce attack surfaces by removing or restricting unneeded code, but existing techniques suffer from significant limitations, such as coarse granularity and irreversible code removal, limiting their applicability. To address these limitations, we propose RVDebloater, a novel adaptive debloating technique for mode-based embedded devices that automatically identifies unneeded firmware code for each mode using either static or dynamic analysis, and dynamically debloats the firmware for each mode at the function level at runtime. RVDebloater introduces a new software-based enforcement approach that supports diverse mode-based embedded devices. We implemented RVDebloater using the LLVM compiler and evaluated its efficiency and effectiveness on six different RVs, including both simulated and real ones, with different real-world missions. We find that device requirements change throughout its lifetime for each mode, and that many critical firmware functions can be restricted in other modes, with an average of 85% of functions not being required. The results showed that none of the missions failed after debloating with RVDebloater, indicating that it neither incurred false positives nor false negatives. Further, RVDebloater prunes the firmware call graph by an average of 45% across different firmware. Finally, RVDebloater incurred an average performance overhead of 3.9% and memory overhead of 4% (approximately 0.25 MB) on real RVs.

#10 Semantics-Preserving Evasion of LLM Vulnerability Detectors

著者: Luze Sun, Alina Oprea, Eric Wong

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00305

要約:
LLM-based vulnerability detectors are increasingly deployed in security-critical code review, yet their resilience to evasion under behavior-preserving edits remains poorly understood. We evaluate detection-time integrity under a semantics-preserving threat model by instantiating diverse behavior-preserving code transformations on a unified C/C++ benchmark (N=5000), and introduce a metric of joint robustness across different attack methods/carriers. Across models, we observe a systemic failure of semantic invariant adversarial transformations: even state-of-the-art vulnerability detectors perform well on clean inputs while predictions flip under behavior-equivalent edits. Universal adversarial strings optimized on a single surrogate model remain effective when transferred to black-box APIs, and gradient access can further amplify evasion success. These results show that even high-performing detectors are vulnerable to low-cost, semantics-preserving evasion. Our carrier-based metrics provide practical diagnostics for evaluating LLM-based code detectors.

#11 HEEDFUL: Leveraging Sequential Transfer Learning for Robust WiFi Device Fingerprinting Amid Hardware Warm-Up Effects

著者: Abdurrahman Elmaghbub, Bechir Hamdaoui

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00338

要約:
Deep Learning-based RF fingerprinting approaches struggle to perform well in cross-domain scenarios, particularly during hardware warm-up. This often-overlooked vulnerability has been jeopardizing their reliability and their adoption in practical settings. To address this critical gap, in this work, we first dive deep into the anatomy of RF fingerprints, revealing insights into the temporal fingerprinting variations during and post hardware stabilization. Introducing HEEDFUL, a novel framework harnessing sequential transfer learning and targeted impairment estimation, we then address these challenges with remarkable consistency, eliminating blind spots even during challenging warm-up phases. Our evaluation showcases HEEDFUL's efficacy, achieving remarkable classification accuracies of up to 96% during the initial device operation intervals-far surpassing traditional models. Furthermore, cross-day and cross-protocol assessments confirm HEEDFUL's superiority, achieving and maintaining high accuracy during both the stable and initial warm-up phases when tested on WiFi signals. Additionally, we release WiFi type B and N RF fingerprint datasets that, for the first time, incorporate both the time-domain representation and real hardware impairments of the frames. This underscores the importance of leveraging hardware impairment data, enabling a deeper understanding of fingerprints and facilitating the development of more robust RF fingerprinting solutions.

#12 "Someone Hid It": Query-Agnostic Black-Box Attacks on LLM-Based Retrieval

著者: Jiate Li, Defu Cao, Li Li, Wei Yang, Yuehan Qin, Chenxiao Yu, Tiannuo Yang, Ryan A. Rossi, Yan Liu, Xiyang Hu, Yue Zhao

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00364

要約:
Large language models (LLMs) have been serving as effective backbones for retrieval systems, including Retrieval-Augmentation-Generation (RAG), Dense Information Retriever (IR), and Agent Memory Retrieval. Recent studies have demonstrated that such LLM-based Retrieval (LLMR) is vulnerable to adversarial attacks, which manipulates documents by token-level injections and enables adversaries to either boost or diminish these documents in retrieval tasks. However, existing attack studies mainly (1) presume a known query is given to the attacker, and (2) highly rely on access to the victim model's parameters or interactions, which are hardly accessible in real-world scenarios, leading to limited validity. To further explore the secure risks of LLMR, we propose a practical black-box attack method that generates transferable injection tokens based on zero-shot surrogate LLMs without need of victim queries or victim models knowledge. The effectiveness of our attack raises such a robustness issue that similar effects may arise from benign or unintended document edits in the real world. To achieve our attack, we first establish a theoretical framework of LLMR and empirically verify it. Under the framework, we simulate the transferable attack as a min-max problem, and propose an adversarial learning mechanism that finds optimal adversarial tokens with learnable query samples. Our attack is validated to be effective on benchmark datasets across popular LLM retrievers.

#13 SpyDir: Spy Device Localization Through Accurate Direction Finding

著者: Wenhao Chen, Wenyi Morty Zhang, Wei Sun, Dinesh Bharadia, Roshan Ayyalasomayajula

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00411

要約:
Hidden spy cameras have become a great privacy threat recently, as these low-cost, low-power, and small form-factor IoT devices can quietly monitor human activities in the indoor environment without generating any side-channel information. As such, it is difficult to detect and even more challenging to localize them in the rich-scattering indoor environment. To this end, this paper presents the design, implementation, and evaluation of SpyDir, a system that can accurately localize the hidden spy IoT devices by harnessing the electromagnetic emanations automatically and unintentionally emitted from them. Our system design mainly consists of a portable switching antenna array to sniff the spectrum-spread emanations, an emanation enhancement algorithm through non-coherent averaging that can de-correlate the correlated noise effect due to the square-wave emanation structure, and a multipath-resolving algorithm that can exploit the relative channels using a novel optimization-based sparse AoA derivation. Our real-world experimental evaluation across different indoor environments demonstrates an average AoA error of 6.30 deg, whereas the baseline algorithm yields 21.06 deg, achieving over a 3.3 times improvement in accuracy, and a mean localization error of 19.86cm over baseline algorithms of 206.79cm (MUSIC) and 294.75cm (SpotFi), achieving over a 10.41 times and 14.8 times improvement in accuracy.

#14 Towards a Cognitive-Support Tool for Threat Hunters

著者: Alessandra Maciel Paz Milani, Norman Anderson, Margaret-Anne Storey

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00432

要約:
Cybersecurity increasingly relies on threat hunters to proactively identify adversarial activity, yet the cognitive work underlying threat hunting remains underexplored or insufficiently supported by existing tools. Building on prior studies that examined how threat hunters construct and share mental models during investigations, we derived a set of design propositions to support their cognitive and collaborative work. In this paper, we present the Threat Hunter Board, a prototype tool that operationalizes these design propositions by enabling threat hunters to externalize reasoning, organize investigative leads, and maintain continuity across sessions. Using a design science paradigm, we describe the solution design rationale and artifact development. In addition, we propose six design heuristics that form a solution-evaluation framework for assessing cognitive support in threat hunting tools. An initial evaluation using a cognitive walkthrough provides early evidence of feasibility, while future work will focus on user-based validation with professional threat hunters.

#15 zkCraft: Prompt-Guided LLM as a Zero-Shot Mutation Pattern Oracle for TCCT-Powered ZK Fuzzing

著者: Rong Fu, Jia Yee Tan, Wenxin Zhang, Youjin Wang, Ziyu Kong, Zeli Su, Zhaolu Kang, Shuning Zhang, Xianda Li, Kun Liu, Simon Fong

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00667

要約:
Zero-knowledge circuits enable privacy-preserving and scalable systems but are difficult to implement correctly due to the tight coupling between witness computation and circuit constraints. We present zkCraft, a practical framework that combines deterministic, R1CS-aware localization with proof-bearing search to detect semantic inconsistencies. zkCraft encodes candidate constraint edits into a single Row-Vortex polynomial and replaces repeated solver queries with a Violation IOP that certifies the existence of edits together with a succinct proof. Deterministic LLM-driven mutation templates bias exploration toward edge cases while preserving auditable algebraic verification. Evaluation on real Circom code shows that proof-bearing localization detects diverse under- and over-constrained faults with low false positives and reduces costly solver interaction. Our approach bridges formal verification and automated debugging, offering a scalable path for robust ZK circuit development.

#16 Computing Maximal Per-Record Leakage and Leakage-Distortion Functions for Privacy Mechanisms under Entropy-Constrained Adversaries

privacy

著者: Genqiang Wu, Xiaoying Zhang, Yu Qi, Hao Wang, Jikui Wang, Yeping He

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00689

要約:
The exponential growth of data collection necessitates robust privacy protections that preserve data utility. We address information disclosure against adversaries with bounded prior knowledge, modeled by an entropy constraint $H(X) \geq b$. Within this information privacy framework -- which replaces differential privacy's independence assumption with a bounded-knowledge model -- we study three core problems: maximal per-record leakage, the primal leakage-distortion tradeoff (minimizing worst-case leakage under distortion $D$), and the dual distortion minimization (minimizing distortion under leakage constraint $L$). These problems resemble classical information-theoretic ones (channel capacity, rate-distortion) but are more complex due to high dimensionality and the entropy constraint. We develop efficient alternating optimization algorithms that exploit convexity-concavity duality, with theoretical guarantees including local convergence for the primal problem and convergence to a stationary point for the dual. Experiments on binary symmetric channels and modular sum queries validate the algorithms, showing improved privacy-utility tradeoffs over classical differential privacy mechanisms. This work provides a computational framework for auditing privacy risks and designing certified mechanisms under realistic adversary assumptions.

#17 From Detection to Prevention: Explaining Security-Critical Code to Avoid Vulnerabilities

著者: Ranjith Krishnamurthy, Oshando Johnson, Goran Piskachev, Eric Bodden

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00711

要約:
Security vulnerabilities often arise unintentionally during development due to a lack of security expertise and code complexity. Traditional tools, such as static and dynamic analysis, detect vulnerabilities only after they are introduced in code, leading to costly remediation. This work explores a proactive strategy to prevent vulnerabilities by highlighting code regions that implement security-critical functionality -- such as data access, authentication, and input handling -- and providing guidance for their secure implementation. We present an IntelliJ IDEA plugin prototype that uses code-level software metrics to identify potentially security-critical methods and large language models (LLMs) to generate prevention-oriented explanations. Our initial evaluation on the Spring-PetClinic application shows that the selected metrics identify most known security-critical methods, while an LLM provides actionable, prevention-focused insights. Although these metrics capture structural properties rather than semantic aspects of security, this work lays the foundation for code-level security-aware metrics and enhanced explanations.

#18 Bypassing Prompt Injection Detectors through Evasive Injections

著者: Md Jahedur Rahman, Ihsen Alouani

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00750

要約:
Large language models (LLMs) are increasingly used in interactive and retrieval-augmented systems, but they remain vulnerable to task drift; deviations from a user's intended instruction due to injected secondary prompts. Recent work has shown that linear probes trained on activation deltas of LLMs' hidden layers can effectively detect such drift. In this paper, we evaluate the robustness of these detectors against adversarially optimised suffixes. We generate universal suffixes that cause poisoned inputs to evade detection across multiple probes simultaneously. Our experiments on Phi-3 3.8B and Llama-3 8B show that a single suffix can achieve high attack success rates; up to 93.91% and 99.63%, respectively, when all probes must be fooled, and nearly perfect success (>90%) under majority vote setting. These results demonstrate that activation delta-based task drift detectors are highly vulnerable to adversarial suffixes, highlighting the need for stronger defences against adaptive attacks. We also propose a defence technique where we generate multiple suffixes and randomly append one of them to the prompts while making forward passes of the LLM and train logistic regression models with these activations. We found this approach to be highly effective against such attacks.

#19 IDEM Enough? Evolving Highly Nonlinear Idempotent Boolean Functions

著者: Claude Carlet, Marko {\DH}urasevic, Domagoj Jakobovic, Luca Mariot, Stjepan Picek

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00837

要約:
Idempotent Boolean functions form a highly structured subclass of Boolean functions that is closely related to rotation symmetry under a normal-basis representation and to invariance under a fixed linear map in a polynomial basis. These functions are attractive as candidates for cryptographic design, yet their additional algebraic constraints make the search for high nonlinearity substantially more difficult than in the unconstrained case. In this work, we investigate evolutionary methods for constructing highly nonlinear idempotent Boolean functions for dimensions $n=5$ up to $n=12$ using a polynomial basis representation with canonical primitive polynomials. Our results show that the problem of evolving idempotent functions is difficult due to the disruptive nature of crossover and mutation operators. Next, we show that idempotence can be enforced by encoding the truth table on orbits, yielding a compact genome of size equal to the number of distinct squaring orbits.

#20 GradingAttack: Attacking Large Language Models Towards Short Answer Grading Ability

著者: Xueyi Li, Zhuoneng Zhou, Zitao Liu, Yongdong Wu, Weiqi Luo

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00979

要約:
Large language models (LLMs) have demonstrated remarkable potential for automatic short answer grading (ASAG), significantly boosting student assessment efficiency and scalability in educational scenarios. However, their vulnerability to adversarial manipulation raises critical concerns about automatic grading fairness and reliability. In this paper, we introduce GradingAttack, a fine-grained adversarial attack framework that systematically evaluates the vulnerability of LLM based ASAG models. Specifically, we align general-purpose attack methods with the specific objectives of ASAG by designing token-level and prompt-level strategies that manipulate grading outcomes while maintaining high camouflage. Furthermore, to quantify attack camouflage, we propose a novel evaluation metric that balances attack success and camouflage. Experiments on multiple datasets demonstrate that both attack strategies effectively mislead grading models, with prompt-level attacks achieving higher success rates and token-level attacks exhibiting superior camouflage capability. Our findings underscore the need for robust defenses to ensure fairness and reliability in ASAG. Our code and datasets are available at https://anonymous.4open.science/r/GradingAttack.

#21 SMCP: Secure Model Context Protocol

著者: Xinyi Hou, Shenao Wang, Yifan Zhang, Ziluo Xue, Yanjie Zhao, Cai Fu, Haoyu Wang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01129

要約:
Agentic AI systems built around large language models (LLMs) are moving away from closed, single-model frameworks and toward open ecosystems that connect a variety of agents, external tools, and resources. The Model Context Protocol (MCP) has emerged as a standard to unify tool access, allowing agents to discover, invoke, and coordinate with tools more flexibly. However, as MCP becomes more widely adopted, it also brings a new set of security and privacy challenges. These include risks such as unauthorized access, tool poisoning, prompt injection, privilege escalation, and supply chain attacks, any of which can impact different parts of the protocol workflow. While recent research has examined possible attack surfaces and suggested targeted countermeasures, there is still a lack of systematic, protocol-level security improvements for MCP. To address this, we introduce the Secure Model Context Protocol (SMCP), which builds on MCP by adding unified identity management, robust mutual authentication, ongoing security context propagation, fine-grained policy enforcement, and comprehensive audit logging. In this paper, we present the main components of SMCP, explain how it helps reduce security risks, and illustrate its application with practical examples. We hope that this work will contribute to the development of agentic systems that are not only powerful and adaptable, but also secure and dependable.

#22 DTAMS: High-Capacity Generative Steganography via Dynamic Multi-Timestep Selection and Adaptive Deviation Mapping in Latent Diffusion

diffusion

著者: Yuhao Xue, Jiuan Zhou, Yu Cheng, Zhaoxia Yin

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01160

要約:
With the rapid development of AIGC technologies, generative image steganography has attracted increasing attention due to its high imperceptibility and flexibility. However, existing generative steganography methods often maintain acceptable security and robustness only at relatively low embedding rates, severely limiting the practical applicability of steganographic systems. To address this issue, we propose a novel DTAMS framework that achieves high embedding rates while ensuring strong robustness and security. Specifically, a dynamic multi-timestep adaptive embedding mechanism is constructed based on transition-cost modeling in diffusion models, enabling automatic selection of optimal embedding timesteps to improve embedding rates while preserving overall performance. Meanwhile, we propose a global sub-interval mapping strategy that jointly considers mapping errors and the frequency distribution of secret information, converting point-wise perturbations into interval-level statistical mappings to suppress error accumulation and distribution drift during multi-step diffusion processes. Furthermore, a multi-dimensional joint constraint mechanism is introduced to mitigate distortions caused by repeated latent-pixel transformations by jointly regularizing embedding errors at the pixel, latent, and semantic levels. Experiments demonstrate that the proposed method achieves an embedding rate of 12 bpp while maintaining excellent security and robustness. Across all evaluated conditions, DTAMS reduces the average extraction error rate by 59.39%, representing a significant improvement over SOTA methods.

#23 FedBGS: A Blockchain Approach to Segment Gossip Learning in Decentralized Systems

著者: Fabio Turazza, Marcello Pietri, Marco Picone, Marco Mamei

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01185

要約:
Privacy-Preserving Federated Learning (PPFL) is a Decentralized machine learning paradigm that enables multiple participants to collaboratively train a global model without sharing their data with the integration of cryptographic and privacy-based techniques to enhance the security of the global system. This privacy-oriented approach makes PPFL a highly suitable solution for training shared models in sectors where data privacy is a critical concern. In traditional FL, local models are trained on edge devices, and only model updates are shared with a central server, which aggregates them to improve the global model. However, despite the presence of the aforementioned privacy techniques, in the classical Federated structure, the issue of the server as a single-point-of-failure remains, leading to limitations both in terms of security and scalability. This paper introduces FedBGS, a fully Decentralized Blockchain-based framework that leverages Segmented Gossip Learning through Federated Analytics. The proposed system aims to optimize blockchain usage while providing comprehensive protection against all types of attacks, ensuring both privacy, security and non-IID data handling in Federated environments.

#24 Bifrost: A Much Simpler Secure Two-Party Data Join Protocol for Secure Data Analytics

著者: Shuyu Chen, Mingxun Zhou, Haoyu Niu, Guopeng Lin, Weili Han

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01225

要約:
Secure data join enables two parties with vertically distributed data to securely compute the joined table, allowing the parties to perform downstream Secure multi-party computation-based Data Analytics (SDA), such as training machine learning models, based on the joined table. While Circuit-based Private Set Intersection (CPSI) can be used for secure data join, it introduces redundant dummy rows in the joined table, which results in high overhead in the downstream SDA tasks. iPrivJoin addresses this issue but introduces significant communication overhead in the redundancy removal process, as it relies on the cryptographic primitive OPPRF for data encoding and multiple rounds of oblivious shuffles. In this paper, we propose a much simpler secure data join protocol, Bifrost, which outputs (the secret shares of) a redundancy-free joined table. The highlight of Bifrost lies in its simplicity: it builds upon two conceptually simple building blocks, an ECDH-PSI protocol and a two-party oblivious shuffle protocol. The lightweight protocol design allows Bifrost to avoid the need for OPPRF. We also proposed a simple optimization named \textit{dual mapping} that reduces the rounds of oblivious shuffle needed from two to one. Experiments on datasets of up to 100 GB show that Bifrost achieves $2.54 \sim 22.32\times$ speedup and reduces the communication by $84.15\% \sim 88.97\%$ compared to the SOTA redundancy-free secure data join protocol iPrivJoin. Notably, the communication size of Bifrost is nearly equal to the size of the input data. In the two-step SDA pipeline evaluation (secure join and SDA), the redundancy-free property of Bifrost not only avoids the catastrophic error rate blowup in the downstream tasks caused by the dummy rows in the joined table (as introduced in CPSI), but also shows up to $2.80\times$ speed-up in the SDA process with up to $73.15\%$ communication reduction.

#25 Protocol Agent: What If Agents Could Use Cryptography In Everyday Life?

agent

著者: Marco De Rossi

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01304

要約:
We often assume that agent-to-agent interaction will mirror human conversation. However, agents operate fundamentally differently. What if they could develop communication patterns that are more efficient and better aligned with their capabilities? While cryptographic primitives that could profoundly improve everyday interactions already exist, humans can't use them because they are too complex and the math can't be done in one's head. Examples range from proving your age (or other attributes) without showing your ID, to filing an anonymous report within a group while proving you are a legitimate member, to splitting a dinner bill fairly without revealing salaries. What if agents could create protocols "on the fly" by recognizing which primitive fits an everyday situation, proposing it to an agentic counterpart, persuading them to participate, and then executing the protocol correctly using appropriate computation tools? Protocol Agent frames this problem by introducing a benchmark that spans: (1) cryptographic primitive recognition, (2) negotiation skills, (3) implementation correctness, (4) correct computation and (5) security strength. We evaluate current open-weight and state-of-the-art models on this benchmark, propose a dataset-generation approach to improve these capabilities, and measure the impact of supervised fine-tuning (SFT) on benchmark performance, with tuned models outperforming base models by a wide margin.

#26 TxRay: Agentic Postmortem of Live Blockchain Attacks

agent

著者: Ziyue Wang, Jiangshan Yu, Kaihua Qin, Dawn Song, Arthur Gervais, Liyi Zhou

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01317

要約:
Decentralized Finance (DeFi) has turned blockchains into financial infrastructure, allowing anyone to trade, lend, and build protocols without intermediaries, but this openness exposes pools of value controlled by code. Within five years, the DeFi ecosystem has lost over 15.75B USD to reported exploits. Many exploits arise from permissionless opportunities that any participant can trigger using only public state and standard interfaces, which we call Anyone-Can-Take (ACT) opportunities. Despite on-chain transparency, postmortem analysis remains slow and manual: investigations start from limited evidence, sometimes only a single transaction hash, and must reconstruct the exploit lifecycle by recovering related transactions, contract code, and state dependencies. We present TxRay, a Large Language Model (LLM) agentic postmortem system that uses tool calls to reconstruct live ACT attacks from limited evidence. Starting from one or more seed transactions, TxRay recovers the exploit lifecycle, derives an evidence-backed root cause, and generates a runnable, self-contained Proof of Concept (PoC) that deterministically reproduces the incident. TxRay self-checks postmortems by encoding incident-specific semantic oracles as executable assertions. To evaluate PoC correctness and quality, we develop PoCEvaluator, an independent agentic execution-and-review evaluator. On 114 incidents from DeFiHackLabs, TxRay produces an expert-aligned root cause and an executable PoC for 105 incidents, achieving 92.11% end-to-end reproduction. Under PoCEvaluator, 98.1% of TxRay PoCs avoid hard-coding attacker addresses, a +24.8pp lift over DeFiHackLabs. In a live deployment, TxRay delivers validated root causes in 40 minutes and PoCs in 59 minutes at median latency. TxRay's oracle-validated PoCs enable attack imitation, improving coverage by 15.6% and 65.5% over STING and APE.

#27 Privocracy: Online Democracy through Private Voting

privacy

著者: Pedro Campon\^es, Hugo Pereira, Adrian Persaud, Kevin Gallagher, Santiago Torres-Arias

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01341

要約:
In traditional access control policies, every access granted and administrative account introduces an additional vulnerability, as a corruption of a high-privilege user can compromise several sensitive files. Privocracy is an access control mechanism that minimizes the need to attribute high privileges by triggering a secure e-voting procedure to run commands that require using sensitive resources. With Privocracy an organization can distribute trust in resource access, minimizing the system vulnerabilities from single points of failure, all while maintaining the high flexibility of discretionary access control policies. The Privocracy voting mechanism achieves everlasting privacy, ensuring votes remain confidential regardless of an adversary's computational power, while addressing the dependability requirements of a practical and secure system. The procedure incorporates useful features such as vote delegation to reduce voter fatigue, rapid voting rounds to enable quick action during emergencies, and selective vote auditing for application-level accountability. Our experimental results demonstrate that Privocracy processes votes efficiently and can be deployed on commodity hardware.

#28 Adaptive Quantum-Safe Cryptography for 6G Vehicular Networks via Context-Aware Optimization

著者: Poushali Sengupta, Mayank Raikwar, Sabita Maharjan, Frank Eliassen, Yan Zhang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01342

要約:
Powerful quantum computers in the future may be able to break the security used for communication between vehicles and other devices (Vehicle-to-Everything, or V2X). New security methods called post-quantum cryptography can help protect these systems, but they often require more computing power and can slow down communication, posing a challenge for fast 6G vehicle networks. In this paper, we propose an adaptive post-quantum cryptography (PQC) framework that predicts short-term mobility and channel variations and dynamically selects suitable lattice-, code-, or hash-based PQC configurations using a predictive multi-objective evolutionary algorithm (APMOEA) to meet vehicular latency and security constraints.However, frequent cryptographic reconfiguration in dynamic vehicular environments introduces new attack surfaces during algorithm transitions. A secure monotonic-upgrade protocol prevents downgrade, replay, and desynchronization attacks during transitions. Theoretical results show decision stability under bounded prediction error, latency boundedness under mobility drift, and correctness under small forecast noise. These results demonstrate a practical path toward quantum-safe cryptography in future 6G vehicular networks. Through extensive experiments based on realistic mobility (LuST), weather (ERA5), and NR-V2X channel traces, we show that the proposed framework reduces end-to-end latency by up to 27\%, lowers communication overhead by up to 65\%, and effectively stabilizes cryptographic switching behavior using reinforcement learning. Moreover, under the evaluated adversarial scenarios, the monotonic-upgrade protocol successfully prevents downgrade, replay, and desynchronization attacks.

#29 CIPHER: Cryptographic Insecurity Profiling via Hybrid Evaluation of Responses

著者: Max Manolov, Tony Gao, Siddharth Shukla, Cheng-Ting Chou, Ryan Lagasse

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01438

要約:
Large language models (LLMs) are increasingly used to assist developers with code, yet their implementations of cryptographic functionality often contain exploitable flaws. Minor design choices (e.g., static initialization vectors or missing authentication) can silently invalidate security guarantees. We introduce CIPHER(\textbf{C}ryptographic \textbf{I}nsecurity \textbf{P}rofiling via \textbf{H}ybrid \textbf{E}valuation of \textbf{R}esponses), a benchmark for measuring cryptographic vulnerability incidence in LLM-generated Python code under controlled security-guidance conditions. CIPHER uses insecure/neutral/secure prompt variants per task, a cryptography-specific vulnerability taxonomy, and line-level attribution via an automated scoring pipeline. Across a diverse set of widely used LLMs, we find that explicit ``secure'' prompting reduces some targeted issues but does not reliably eliminate cryptographic vulnerabilities overall. The benchmark and reproducible scoring pipeline will be publicly released upon publication.

#30 DuoLungo: Usability Study of Duo 2FA

著者: Renascence Tarafder Prapty, Gene Tsudik

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01489

要約:
Multi-Factor Authentication (MFA) enhances login security by requiring multiple authentication factors. Its adoption has increased in response to more frequent and sophisticated attacks. Duo is widely used by organizations including Fortune 500 companies and major educational institutions, yet its usability has not been examined thoroughly or recently. Earlier studies focused on technical challenges during initial deployment but did not measure core usability metrics such as task completion time or System Usability Scale (SUS) scores. These results are also outdated, originating from a time when MFA was less familiar to typical users. We conducted a long-term, large-scale Duo usability study at the University of California Irvine during the 2024-2025 academic year, involving 2559 participants. Our analysis uses authentication log data and a survey of 57 randomly selected users. The average overhead of a Duo Push task is nearly 8 seconds, which participants described as short to moderate. Overhead varies with time of day, field of study, and education level. The rate of authentication failures due to incomplete Duo tasks is 4.35 percent, and 43.86 percent of survey respondents reported at least one Duo login failure. The Duo SUS score is 70, indicating good usability. Participants generally find Duo easy to use but somewhat annoying, while also reporting an increased sense of account security. They also described common issues and offered suggestions for improvement.

#31 Sleep Reveals the Nonce: Breaking ECDSA using Sleep-Based Power Side-Channel Vulnerability

著者: Sahan Sanjaya, Prabhat Mishra

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01491

要約:
Security of Elliptic Curve Digital Signature Algorithm (ECDSA) depends on the secrecy of the per-signature nonce. Even partial nonce leakage can expose the long-term private key through lattice-based cryptanalysis. In this paper, we introduce a previously unexplored power side-channel vulnerability that exploits sleep-induced power spikes to extract ECDSA nonces. Unlike conventional power-based side-channel attacks, this vulnerability leverages power fluctuations generated during processor context switches invoked by sleep functions. These fluctuations correlate with nonce-dependent operations in scalar multiplication, enabling nonce recovery even under constant-time and masked implementations. We evaluate the attack across multiple cryptographic libraries, RustCrypto, BearSSL, and GoCrypto, and processor architectures, including ARM and RISC-V. Our experiments show that subtle variations in the power envelope during sleep-induced context switches provide sufficient leakage for practical ECDSA nonce extraction, recovering 20 bits of the nonce. These results establish sleep-induced power spikes as a practical cross-platform side-channel threat and highlight the need to reconsider design choices in cryptographic systems.

#32 Implementation Challenges in Quantum Key Distribution

著者: Abel C. H. Chen

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01500

要約:
In recent years, quantum computing technologies have steadily matured and have begun to find practical applications across various domains. One important area is network communication security, where Quantum Key Distribution (QKD) enables communicating parties to establish a shared secret that can then be used to generate symmetric keys for subsequent encryption and decryption. This study focuses on implementing and comparing two well-known QKD protocols, namely BB84 and E91, within an actual quantum computing environment. It also proposes the use of SX gate operations to generate uniform quantum superposition states. By leveraging the properties of quantum superposition and quantum entanglement, the study illustrates how communicating parties can securely obtain a shared secret while preventing adversaries from intercepting it. The experiments are conducted using the IBM Quantum Platform to demonstrate the feasibility of the BB84 and E91 protocols on actual quantum hardware. The evaluation considers several metrics, including entropy, Independent and Identically Distributed (IID), and error-rate verifications.

#33 Are Security Cues Static? Rethinking Warning and Trust Indicators for Life Transitions

著者: Sarah Tabassum

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01544

要約:
Security cues, such as warnings and trust signals, are designed as stable interface elements, even though people's lives, contexts, and vulnerabilities change over time. Life transitions including migration, aging, or shifts in institutional environments reshape how risk and trust are understood and acted upon. Yet current systems rarely adapt their security cues to these changing conditions, placing the burden of interpretation on users. In this Works-in-Progress paper, we argue that the static nature of security cues represents a design mismatch with transitional human lives. We draw on prior empirical insights from work on educational migration as a motivating case, and extend the discussion to other life transitions. Building on these insights, we introduce the Transition-Aware Security Cues (TASeC) framework and present speculative design concepts illustrating how security cues might evolve across transition stages. We invite HCI to rethink security cues as longitudinal, life-centered design elements collectively.

#34 HACK NDSU: A Real-world Event to Promote Student Interest in Cybersecurity

著者: Enrique Garcia, Jeremy Straub

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01580

要約:
Hack NDSU let students scan, probe, and hack North Dakota State University's campus network, under professionals' supervision, providing an aspirational experience, potentially motivating them to enter the field. This paper provides a blueprint for educational hacking events against production systems. No prior educational event of this type is known.

#35 Expected Harm: Rethinking Safety Evaluation of (Mis)Aligned LLMs

著者: Yen-Shan Chen, Zhi Rui Tam, Cheng-Kuang Wu, Yun-Nung Chen

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01600

要約:
Current evaluations of LLM safety predominantly rely on severity-based taxonomies to assess the harmfulness of malicious queries. We argue that this formulation requires re-examination as it assumes uniform risk across all malicious queries, neglecting Execution Likelihood--the conditional probability of a threat being realized given the model's response. In this work, we introduce Expected Harm, a metric that weights the severity of a jailbreak by its execution likelihood, modeled as a function of execution cost. Through empirical analysis of state-of-the-art models, we reveal a systematic Inverse Risk Calibration: models disproportionately exhibit stronger refusal behaviors for low-likelihood (high-cost) threats while remaining vulnerable to high-likelihood (low-cost) queries. We demonstrate that this miscalibration creates a structural vulnerability: by exploiting this property, we increase the attack success rate of existing jailbreaks by up to $2\times$. Finally, we trace the root cause of this failure using linear probing, which reveals that while models encode severity in their latent space to drive refusal decisions, they possess no distinguishable internal representation of execution cost, making them "blind" to this critical dimension of risk.

#36 Efficient Softmax Reformulation for Homomorphic Encryption via Moment Generating Function

著者: Hanjun Park, Byeong-Seo Min, Jiheon Woo, Min-Wook Jeong, Jongho Shin, Yongwoo Lee, Young-Sik Kim, Yongjune Kim

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01621

要約:
Homomorphic encryption (HE) is a prominent framework for privacy-preserving machine learning, enabling inference directly on encrypted data. However, evaluating softmax, a core component of transformer architectures, remains particularly challenging in HE due to its multivariate structure, the large dynamic range induced by exponential functions, and the need for accurate division during normalization. In this paper, we propose MGF-softmax, a novel softmax reformulation based on the moment generating function (MGF) that replaces the softmax denominator with its moment-based counterpart. This reformulation substantially reduces multiplicative depth while preserving key properties of softmax and asymptotically converging to the exact softmax as the number of input tokens increases. Extensive experiments on Vision Transformers and large language models show that MGF-softmax provides an efficient and accurate approximation of softmax in encrypted inference. In particular, it achieves inference accuracy close to that of high-depth exact methods, while requiring substantially lower computational cost through reduced multiplicative depth.

#37 Witnessd: Proof-of-process via Adversarial Collapse

著者: David Condrey

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01663

要約:
Digital signatures prove key possession, not authorship. An author who generates text with AI, constructs intermediate document states post-hoc, and signs each hash produces a signature chain indistinguishable from genuine composition. We address this gap between cryptographic integrity and process provenance. We introduce proof-of-process, a primitive category for evidence that a physical process, not merely a signing key, produced a digital artifact. Our construction, the jitter seal, injects imperceptible microsecond delays derived via HMAC from a session secret, keystroke ordinal, and cumulative document hash. Valid evidence requires that real keystrokes produced the document through those intermediate states. We propose the Adversarial Collapse Principle as an evaluation criterion: evidence systems should be judged by whether disputing them requires a conjunction of specific, testable allegations against components with independent trust assumptions. We present Witnessd, an architecture combining jitter seals with Verifiable Delay Functions, external timestamp anchors, dual-source keystroke validation, and optional hardware attestation. Each layer forces allegations at different capability levels; disputing authentic evidence requires coordinated claims across independent trust boundaries. The system does not prevent forgery: a kernel-level adversary can defeat it, and typing AI-generated content produces valid evidence. The contribution is converting vague doubt into falsifiable allegations. We evaluate across 31,000 verification trials with deterministic rejection of invalid proofs.

#38 Backdoor Sentinel: Detecting and Detoxifying Backdoors in Diffusion Models via Temporal Noise Consistency

backdoordiffusion

著者: Bingzheng Wang, Xiaoyan Gu, Hongbo Xu, Hongcheng Li, Zimo Yu, Jiang Zhou, Weiping Wang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01765

要約:
Diffusion models have been widely deployed in AIGC services; however, their reliance on opaque training data and procedures exposes a broad attack surface for backdoor injection. In practical auditing scenarios, due to the protection of intellectual property and commercial confidentiality, auditors are typically unable to access model parameters, rendering existing white-box or query-intensive detection methods impractical. More importantly, even after the backdoor is detected, existing detoxification approaches are often trapped in a dilemma between detoxification effectiveness and generation quality. In this work, we identify a previously unreported phenomenon called temporal noise unconsistency, where the noise predictions between adjacent diffusion timesteps is disrupted in specific temporal segments when the input is triggered, while remaining stable under clean inputs. Leveraging this finding, we propose Temporal Noise Consistency Defense (TNC-Defense), a unified framework for backdoor detection and detoxification. The framework first uses the adjacent timestep noise consistency to design a gray-box detection module, for identifying and locating anomalous diffusion timesteps. Furthermore, the framework uses the identified anomalous timesteps to construct a trigger-agnostic, timestep-aware detoxification module, which directly corrects the backdoor generation path. This effectively suppresses backdoor behavior while significantly reducing detoxification costs. We evaluate the proposed method under five representative backdoor attack scenarios and compare it with state-of-the-art defenses. The results show that TNC-Defense improves the average detection accuracy by $11\%$ with negligible additional overhead, and invalidates an average of $98.5\%$ of triggered samples with only a mild degradation in generation quality.

#39 RedVisor: Reasoning-Aware Prompt Injection Defense via Zero-Copy KV Cache Reuse

著者: Mingrui Liu, Sixiao Zhang, Cheng Long, Kwok-Yan Lam

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01795

要約:
Large Language Models (LLMs) are increasingly vulnerable to Prompt Injection (PI) attacks, where adversarial instructions hidden within retrieved contexts hijack the model's execution flow. Current defenses typically face a critical trade-off: prevention-based fine-tuning often degrades general utility via the "alignment tax", while detection-based filtering incurs prohibitive latency and memory costs. To bridge this gap, we propose RedVisor, a unified framework that synthesizes the explainability of detection systems with the seamless integration of prevention strategies. To the best of our knowledge, RedVisor is the first approach to leverage fine-grained reasoning paths to simultaneously detect attacks and guide the model's safe response. We implement this via a lightweight, removable adapter positioned atop the frozen backbone. This adapter serves a dual function: it first generates an explainable analysis that precisely localizes the injection and articulates the threat, which then explicitly conditions the model to reject the malicious command. Uniquely, the adapter is active only during this reasoning phase and is effectively muted during the subsequent response generation. This architecture yields two distinct advantages: (1) it mathematically preserves the backbone's original utility on benign inputs; and (2) it enables a novel KV Cache Reuse strategy, eliminating the redundant prefill computation inherent to decoupled pipelines. We further pioneer the integration of this defense into the vLLM serving engine with custom kernels. Experiments demonstrate that RedVisor outperforms state-of-the-art defenses in detection accuracy and throughput while incurring negligible utility loss.

#40 Things that Matter -- Identifying Interactions and IoT Device Types in Encrypted Matter Traffic

著者: Kristopher Alex Schlett, Bela Genge, Savio Sciancalepore

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01932

要約:
Matter is the most recent application-layer standard for the Internet of Things (IoT). As one of its major selling points, Matter's design imposes particular attention to security and privacy: it provides validated secure session establishment protocols, and it uses robust security algorithms to secure communications between IoT devices and Matter controllers. However, to our knowledge, there is no systematic analysis investigating the extent to which a passive attacker, in possession of lower layer keys or exploiting security misconfiguration at those layers, could infer information by passively analyzing encrypted Matter traffic. In this paper, we fill this gap by analyzing the robustness of the Matter IoT standard to encrypted traffic analysis performed by a passive eavesdropper. By using various datasets collected from real-world testbeds and simulated setups, we identify patterns in metadata of the encrypted Matter traffic that allow inferring the specific interactions occurring between end devices and controllers. Moreover, we associate patterns in sequences of interactions to specific types of IoT devices. These patterns can be used to create fingerprints that allow a passive attacker to infer the type of devices used in the network, constituting a serious breach of users privacy. Our results reveal that we can identify specific Matter interactions that occur in encrypted traffic with over $95\%$ accuracy also in the presence of packet losses and delays. Moreover, we can identify Matter device types with a minimum accuracy of $88\%$. The CSA acknowledged our findings, and expressed the willingness to address such vulnerabilities in the next releases of the standard.

#41 Human Society-Inspired Approaches to Agentic AI Security: The 4C Framework

agent

著者: Alsharif Abuadbba, Nazatul Sultan, Surya Nepal, Sanjay Jha

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01942

要約:
AI is moving from domain-specific autonomy in closed, predictable settings to large-language-model-driven agents that plan and act in open, cross-organizational environments. As a result, the cybersecurity risk landscape is changing in fundamental ways. Agentic AI systems can plan, act, collaborate, and persist over time, functioning as participants in complex socio-technical ecosystems rather than as isolated software components. Although recent work has strengthened defenses against model and pipeline level vulnerabilities such as prompt injection, data poisoning, and tool misuse, these system centric approaches may fail to capture risks that arise from autonomy, interaction, and emergent behavior. This article introduces the 4C Framework for multi-agent AI security, inspired by societal governance. It organizes agentic risks across four interdependent dimensions: Core (system, infrastructure, and environmental integrity), Connection (communication, coordination, and trust), Cognition (belief, goal, and reasoning integrity), and Compliance (ethical, legal, and institutional governance). By shifting AI security from a narrow focus on system-centric protection to the broader preservation of behavioral integrity and intent, the framework complements existing AI security strategies and offers a principled foundation for building agentic AI systems that are trustworthy, governable, and aligned with human values.

#42 HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning

backdoor

著者: Jiayao Wang, Yang Song, Zhendong Zhao, Jiale Zhang, Qilin Wu, Wenliang Yuan, Junwu Zhu, Dongfang Zhao

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02147

要約:
Federated self-supervised learning (FSSL) enables collaborative training of self-supervised representation models without sharing raw unlabeled data. While it serves as a crucial paradigm for privacy-preserving learning, its security remains vulnerable to backdoor attacks, where malicious clients manipulate local training to inject targeted backdoors. Existing FSSL attack methods, however, often suffer from low utilization of poisoned samples, limited transferability, and weak persistence. To address these limitations, we propose a new backdoor attack method for FSSL, namely Hallucinated Positive Entanglement (HPE). HPE first employs hallucination-based augmentation using synthetic positive samples to enhance the encoder's embedding of backdoor features. It then introduces feature entanglement to enforce tight binding between triggers and backdoor samples in the representation space. Finally, selective parameter poisoning and proximity-aware updates constrain the poisoned model within the vicinity of the global model, enhancing its stability and persistence. Experimental results on several FSSL scenarios and datasets show that HPE significantly outperforms existing backdoor attack methods in performance and exhibits strong robustness under various defense mechanisms.

#43 Malware Detection Through Memory Analysis

著者: Sarah Nassar

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02184

要約:
This paper summarizes the research conducted for a malware detection project using the Canadian Institute for Cybersecurity's MalMemAnalysis-2022 dataset. The purpose of the project was to explore the effectiveness and efficiency of machine learning techniques for the task of binary classification (i.e., benign or malicious) as well as multi-class classification to further include three malware sub-types (i.e., benign, ransomware, spyware, or Trojan horse). The XGBoost model type was the final model selected for both tasks due to the trade-off between strong detection capability and fast inference speed. The binary classifier achieved a testing subset accuracy and F1 score of 99.98\%, while the multi-class version reached an accuracy of 87.54\% and an F1 score of 81.26\%, with an average F1 score over the malware sub-types of 75.03\%. In addition to the high modelling performance, XGBoost is also efficient in terms of classification speed. It takes about 37.3 milliseconds to classify 50 samples in sequential order in the binary setting and about 43.2 milliseconds in the multi-class setting. The results from this research project help advance the efforts made towards developing accurate and real-time obfuscated malware detectors for the goal of improving online privacy and safety. *This project was completed as part of ELEC 877 (AI for Cybersecurity) in the Winter 2024 term.

#44 QuietPrint: Protecting 3D Printers Against Acoustic Side-Channel Attacks

著者: Seyed Ali Ghazi Asgar, Narasimha Reddy

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02198

要約:
The 3D printing market has experienced significant growth in recent years, with an estimated revenue of 15 billion USD for 2025. Cyber-attacks targeting the 3D printing process whether through the machine itself, the supply chain, or the fabricated components are becoming increasingly common. One major concern is intellectual property (IP) theft, where a malicious attacker gains access to the design file. One method for carrying out such theft is through side-channel attacks. In this work, we investigate the possibility of IP theft via acoustic side channels and propose a novel method to protect 3D printers against such attacks. The primary advantage of our approach is that it requires no additional hardware, such as large speakers or noise-canceling devices. Instead, it secures printed parts by minimal modifications to the G-code.

#45 SysFuSS: System-Level Firmware Fuzzing with Selective Symbolic Execution

著者: Dakshina Tharindu, Aruna Jayasena, Prabhat Mishra

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02243

要約:
Firmware serves as the critical interface between hardware and software in computing systems, making any bugs or vulnerabilities particularly dangerous as they can cause catastrophic system failures. While fuzzing is a promising approach for identifying design flaws and security vulnerabilities, traditional fuzzers are ineffective at detecting firmware vulnerabilities. For example, existing fuzzers focus on user-level fuzzing, which is not suitable for detecting kernel-level vulnerabilities. Existing fuzzers also face a coverage plateau problem when dealing with complex interactions between firmware and hardware. In this paper, we present an efficient firmware verification framework, SysFuSS, that integrates system-level fuzzing with selective symbolic execution. Our approach leverages system-level emulation for initial fuzzing, and automatically transitions to symbolic execution when coverage reaches a plateau. This strategy enables us to generate targeted test cases that can trigger previously unexplored regions in firmware designs. We have evaluated SysFuSS on real-world embedded firmware, including OpenSSL, WolfBoot, WolfMQTT, HTSlib, MXML, and libIEC. Experimental evaluation demonstrates that SysFuSS significantly outperforms state-of-the-art fuzzers in terms of both branch coverage and detection of firmware vulnerabilities. Specifically, SysFuSS can detect 118 known vulnerabilities while state-of-the-art can cover only 13 of them. Moreover, SysFuSS takes significantly less time (up to 3.3X, 1.7X on average) to activate these vulnerabilities.

#46 Provenance Verification of AI-Generated Images via a Perceptual Hash Registry Anchored on Blockchain

著者: Apoorv Mohit, Bhavya Aggarwal, Chinmay Gondhalekar

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02412

要約:
The rapid advancement of artificial intelligence has made the generation of synthetic images widely accessible, increasing concerns related to misinformation, digital forgery, and content authenticity on large-scale online platforms. This paper proposes a blockchain-backed framework for verifying AI-generated images through a registry-based provenance mechanism. Each AI-generated image is assigned a digital fingerprint that preserves similarity using perceptual hashing and is registered at creation time by participating generation platforms. The hashes are stored on a hybrid on-chain/off-chain public blockchain using a Merkle Patricia Trie for tamper-resistant storage (on-chain) and a Burkhard-Keller tree (off-chain) to enable efficient similarity search over large image registries. Verification is performed when images are re-uploaded to digital platforms such as social media services, enabling identification of previously registered AI-generated images even after benign transformations or partial modifications. The proposed system does not aim to universally detect all synthetic images, but instead focuses on verifying the provenance of AI-generated content that has been registered at creation time. By design, this approach complements existing watermarking and learning-based detection methods, providing a platform-agnostic, tamper-proof mechanism for scalable content provenance and authenticity verification at the point of large-scale online distribution.

#47 Integrity from Algebraic Manipulation Detection in Trusted-Repeater QKD Networks

著者: Ailsa Robertson, Christian Schaffner, Sebastian R. Verschoor

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00069

要約:
Quantum Key Distribution (QKD) allows secure communication without relying on computational assumptions, but can currently only be deployed over relatively short distances due to hardware constraints. To extend QKD over long distances, networks of trusted repeater nodes can be used, wherein QKD is executed between neighbouring nodes and messages between non-neighbouring nodes are forwarded using a relay protocol. Although these networks are being deployed worldwide, no protocol exists which provides provable guarantees of integrity against manipulation from both external adversaries and corrupted intermediates. In this work, we present the first protocol that provably provides both confidentiality and integrity. Our protocol combines an existing cryptographic technique, Algebraic Manipulation Detection (AMD) codes, with multi-path relaying over trusted repeater networks. This protocol achieves Information Theoretic Security (ITS) against the detection of manipulation, which we prove formally through a sequence of games.

#48 Counterfactual Invariant Envelopes for Financial UX: Safety-Lattice Feature-Flag Governance in Crypto-Enabled Streaming

著者: Anton Malinovskiy

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00093

要約:
Feature flags are the primary mechanism for safely introducing financial capabilities in consumer applications. In crypto-enabled live streaming, however, naive rollouts can create non-obvious risk: users may be exposed to onramps without proper eligibility, external wallets without sufficient fraud controls, or advanced views that alter risk perception and behavior. This paper introduces a novel invention candidate, a Counterfactual Invariant Envelope governor that combines a safety lattice with causal measurement and a shadow cohort for risk estimation. We formalize rollout risk, define invariant constraints across feature combinations, and propose a controller that adapts exposure using leading abuse signals, compliance readiness, and revenue guardrails. We incorporate real-world adoption and fraud data for calibration, provide formulas for rollout safety, and include reproducible policy snippets. The results show that counterfactual, invariant-aware governance reduces risk spillover while preserving conversion and retention, offering a path to patentable governance logic for financial UX.

#49 A Formal Approach to AMM Fee Mechanisms with Lean 4

著者: Marco Dessalvi, Massimo Bartoletti, Alberto Lluch-Lafuente

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00101

要約:
Decentralized Finance (DeFi) has revolutionized financial markets by enabling complex asset-exchange protocols without trusted intermediaries. Automated Market Makers (AMMs) are a central component of DeFi, providing the core functionality of swapping assets of different types at algorithmically computed exchange rates. Several mainstream AMM implementations are based on the constant-product model, which ensures that swaps preserve the product of the token reserves in the AMM -- up to a \emph{trading fee} used to incentivize liquidity provision. Trading fees substantially complicate the economic properties of AMMs, and for this reason some AMM models abstract them away in order to simplify the analysis. However, trading fees have a non-trivial impact on users' trading strategies, making it crucial to develop refined AMM models that precisely account for their effects. We extend a foundational model of AMMs by introducing a new parameter, the trading fee $\phi\in(0,1]$, into the swap rate function. Fee amounts increase inversely proportional to $\phi$. When $\phi = 1$, no fee is applied and the original model is recovered. We analyze the resulting fee-adjusted model from an economic perspective. We show that several key properties of the swap rate function, including output-boundedness and monotonicity, are preserved. At the same time, other properties - most notably additivity - no longer hold. We precisely characterize this deviation by deriving a generalized form of additivity that captures the effect of swaps in the presence of trading fees. We prove that when $\phi < 1$, executing a single large swap yields strictly greater profit than splitting the trade into smaller ones. Finally, we derive a closed-form solution to the arbitrage problem in the presence of trading fees and prove its uniqueness. All results are formalized and machine-checked in the Lean 4 proof assistant.

#50 Optimal Transport-Guided Adversarial Attacks on Graph Neural Network-Based Bot Detection

著者: Kunal Mukherjee, Zulfikar Alom, Tran Gia Bao Ngo, Cuneyt Gurcan Akcora, Murat Kantarcioglu

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00318

要約:
The rise of bot accounts on social media poses significant risks to public discourse. To address this threat, modern bot detectors increasingly rely on Graph Neural Networks (GNNs). However, the effectiveness of these GNN-based detectors in real-world settings remains poorly understood. In practice, attackers continuously adapt their strategies as well as must operate under domain-specific and temporal constraints, which can fundamentally limit the applicability of existing attack methods. As a result, there is a critical need for robust GNN-based bot detection methods under realistic, constraint-aware attack scenarios. To address this gap, we introduce BOCLOAK to systematically evaluate the robustness of GNN-based social bot detection via both edge editing and node injection adversarial attacks under realistic constraints. BOCLOAK constructs a probability measure over spatio-temporal neighbor features and learns an optimal transport geometry that separates human and bot behaviors. It then decodes transport plans into sparse, plausible edge edits that evade detection while obeying real-world constraints. We evaluate BOCLOAK across three social bot datasets, five state-of-the-art bot detectors, three adversarial defenses, and compare it against four leading graph adversarial attack baselines. BOCLOAK achieves up to 80.13% higher attack success rates while using 99.80% less GPU memory under realistic real-world constraints. Most importantly, BOCLOAK shows that optimal transport provides a lightweight, principled framework for bridging the gap between adversarial attacks and real-world bot detection.

#51 Text is All You Need for Vision-Language Model Jailbreaking

著者: Yihang Chen, Zhao Xu, Youyuan Jiang, Tianle Zheng, Cho-Jui Hsieh

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00420

要約:
Large Vision-Language Models (LVLMs) are increasingly equipped with robust safety safeguards to prevent responses to harmful or disallowed prompts. However, these defenses often focus on analyzing explicit textual inputs or relevant visual scenes. In this work, we introduce Text-DJ, a novel jailbreak attack that bypasses these safeguards by exploiting the model's Optical Character Recognition (OCR) capability. Our methodology consists of three stages. First, we decompose a single harmful query into multiple and semantically related but more benign sub-queries. Second, we pick a set of distraction queries that are maximally irrelevant to the harmful query. Third, we present all decomposed sub-queries and distraction queries to the LVLM simultaneously as a grid of images, with the position of the sub-queries being middle within the grid. We demonstrate that this method successfully circumvents the safety alignment of state-of-the-art LVLMs. We argue this attack succeeds by (1) converting text-based prompts into images, bypassing standard text-based filters, and (2) inducing distractions, where the model's safety protocols fail to link the scattered sub-queries within a high number of irrelevant queries. Overall, our findings expose a critical vulnerability in LVLMs' OCR capabilities that are not robust to dispersed, multi-image adversarial inputs, highlighting the need for defenses for fragmented multimodal inputs.

#52 When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems

agent

著者: Naen Xu, Hengyu An, Shuo Shi, Jinghuai Zhang, Chunyi Zhou, Changjiang Li, Tianyu Du, Zhihui Fu, Jun Wang, Shouling Ji

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00428

要約:
Recent advancements in large language models (LLMs) have significantly enhanced the capabilities of collaborative multi-agent systems, enabling them to address complex challenges. However, within these multi-agent systems, the susceptibility of agents to collective cognitive biases remains an underexplored issue. A compelling example is the Mandela effect, a phenomenon where groups collectively misremember past events as a result of false details reinforced through social influence and internalized misinformation. This vulnerability limits our understanding of memory bias in multi-agent systems and raises ethical concerns about the potential spread of misinformation. In this paper, we conduct a comprehensive study on the Mandela effect in LLM-based multi-agent systems, focusing on its existence, causing factors, and mitigation strategies. We propose MANBENCH, a novel benchmark designed to evaluate agent behaviors across four common task types that are susceptible to the Mandela effect, using five interaction protocols that vary in agent roles and memory timescales. We evaluate agents powered by several LLMs on MANBENCH to quantify the Mandela effect and analyze how different factors affect it. Moreover, we propose strategies to mitigate this effect, including prompt-level defenses (e.g., cognitive anchoring and source scrutiny) and model-level alignment-based defense, achieving an average 74.40% reduction in the Mandela effect compared to the baseline. Our findings provide valuable insights for developing more resilient and ethically aligned collaborative multi-agent systems.

#53 Towards Building Non-Fine-Tunable Foundation Models

著者: Ziyao Wang, Nizhang Li, Pingzhi Li, Guoheng Sun, Tianlong Chen, Ang Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00446

要約:
Open-sourcing foundation models (FMs) enables broad reuse but also exposes model trainers to economic and safety risks from unrestricted downstream fine-tuning. We address this problem by building non-fine-tunable foundation models: models that remain broadly usable in their released form while yielding limited adaptation gains under task-agnostic unauthorized fine-tuning. We propose Private Mask Pre-Training (PMP), a pre-training framework that concentrates representation learning into a sparse subnetwork identified early in training. The binary mask defining this subnetwork is kept private, and only the final dense weights are released. This forces unauthorized fine-tuning without access to the mask to update parameters misaligned with pretraining subspace, inducing an intrinsic mismatch between the fine-tuning objective and the pre-training geometry. We provide theoretical analysis showing that this mismatch destabilizes gradient-based adaptation and bounds fine-tuning gains. Empirical results on large language models demonstrating that PMP preserves base model performance while consistently degrading unauthorized fine-tuning across a wide range of downstream tasks, with the strength of non-fine-tunability controlled by the mask ratio.

#54 Jailbreaking LLMs via Calibration

著者: Yuxuan Lu, Yongkang Guo, Yuqing Kong

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00619

要約:
Safety alignment in Large Language Models (LLMs) often creates a systematic discrepancy between a model's aligned output and the underlying pre-aligned data distribution. We propose a framework in which the effect of safety alignment on next-token prediction is modeled as a systematic distortion of a pre-alignment distribution. We cast Weak-to-Strong Jailbreaking as a forecast aggregation problem and derive an optimal aggregation strategy characterized by a Gradient Shift in the loss-induced dual space. We show that logit-arithmetic jailbreaking methods are a special case of this framework under cross-entropy loss, and derive a broader family of aggregation rules corresponding to other proper losses. We also propose a new hybrid aggregation rule. Evaluations across red-teaming benchmarks and math utility tasks using frontier models demonstrate that our approach achieves superior Attack Success Rates and lower "Jailbreak Tax" compared with existing methods, especially on the safety-hardened gpt-oss-120b.

#55 NegaBent, No Regrets: Evolving Spectrally Flat Boolean Functions

著者: Claude Carlet, Marko {\DH}urasevic, Ermes Franch, Domagoj Jakobovic, Luca Mariot, Stjepan Picek

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.00843

要約:
Negabent Boolean functions are defined by having a flat magnitude spectrum under the nega-Hadamard transform. They exist in both even and odd dimensions, and the subclass of functions that are simultaneously bent and negabent (bent-negabent) has attracted interest due to the combined optimal periodic and negaperiodic spectral properties. In this work, we investigate how evolutionary algorithms can be used to evolve (bent-)negabent Boolean functions. Our experimental results indicate that evolutionary algorithms, especially genetic programming, are a suitable approach for evolving negabent Boolean functions, and we successfully evolve such functions in all dimensions we consider.

#56 MedBeads: An Agent-Native, Immutable Data Substrate for Trustworthy Medical AI

agent

著者: Takahito Nakajima

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01086

要約:
Background: As of 2026, Large Language Models (LLMs) demonstrate expert-level medical knowledge. However, deploying them as autonomous "Clinical Agents" remains limited. Current Electronic Medical Records (EMRs) and standards like FHIR are designed for human review, creating a "Context Mismatch": AI agents receive fragmented data and must rely on probabilistic inference (e.g., RAG) to reconstruct patient history. This approach causes hallucinations and hinders auditability. Methods: We propose MedBeads, an agent-native data infrastructure where clinical events are immutable "Beads"--nodes in a Merkle Directed Acyclic Graph (DAG)--cryptographically referencing causal predecessors. This "write-once, read-many" architecture makes tampering mathematically detectable. We implemented a prototype with a Go Core Engine, Python middleware for LLM integration, and a React-based visualization interface. Results: We successfully implemented the workflow using synthetic data. The FHIR-to-DAG conversion transformed flat resources into a causally-linked graph. Our Breadth-First Search (BFS) Context Retrieval algorithm traverses relevant subgraphs with O(V+E) complexity, enabling real-time decision support. Tamper-evidence is guaranteed by design: any modification breaks the cryptographic chain. The visualization aids clinician understanding through explicit causal links. Conclusion: MedBeads addresses the "Context Mismatch" by shifting from probabilistic search to deterministic graph traversal, and from mutable records to immutable chains, providing the substrate for "Trustworthy Medical AI." It guarantees the context the AI receives is deterministic and tamper-evident, while the LLM determines interpretation. The structured Bead format serves as a token-efficient "AI-native language." We release MedBeads as open-source software to accelerate agent-native data standards.

#57 Statistical MIA: Rethinking Membership Inference Attack for Reliable Unlearning Auditing

privacy

著者: Jialong Sun, Zeming Wei, Jiaxuan Zou, Jiacheng Gong, Guanheng Wang, Chengyang Dong, Jialong Li, Bo Liu

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01150

要約:
Machine unlearning (MU) is essential for enforcing the right to be forgotten in machine learning systems. A key challenge of MU is how to reliably audit whether a model has truly forgotten specified training data. Membership Inference Attacks (MIAs) are widely used for unlearning auditing, where samples that evade membership detection are often regarded as successfully forgotten. After carefully revisiting the reliability of MIA, we show that this assumption is flawed: failed membership inference does not imply true forgetting. We theoretically demonstrate that MIA-based auditing, when formulated as a binary classification problem, inevitably incurs statistical errors whose magnitude cannot be observed during the auditing process. This leads to overly optimistic evaluations of unlearning performance, while incurring substantial computational overhead due to shadow model training. To address these limitations, we propose Statistical Membership Inference Attack (SMIA), a novel training-free and highly effective auditing framework. SMIA directly compares the distributions of member and non-member data using statistical tests, eliminating the need for learned attack models. Moreover, SMIA outputs both a forgetting rate and a corresponding confidence interval, enabling quantified reliability of the auditing results. Extensive experiments show that SMIA provides more reliable auditing with significantly lower computational cost than existing MIA-based approaches. Notably, the theoretical guarantees and empirical effectiveness of SMIA suggest it as a new paradigm for reliable machine unlearning auditing.

#58 Learning from Anonymized and Incomplete Tabular Data

著者: Lucas Lange, Adrian B\"ottinger, Victor Christen, Anushka Vidanage, Peter Christen, Erhard Rahm

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01217

要約:
User-driven privacy allows individuals to control whether and at what granularity their data is shared, leading to datasets that mix original, generalized, and missing values within the same records and attributes. While such representations are intuitive for privacy, they pose challenges for machine learning, which typically treats non-original values as new categories or as missing, thereby discarding generalization semantics. For learning from such tabular data, we propose novel data transformation strategies that account for heterogeneous anonymization and evaluate them alongside standard imputation and LLM-based approaches. We employ multiple datasets, privacy configurations, and deployment scenarios, demonstrating that our method reliably regains utility. Our results show that generalized values are preferable to pure suppression, that the best data preparation strategy depends on the scenario, and that consistent data representations are crucial for maintaining downstream utility. Overall, our findings highlight that effective learning is tied to the appropriate handling of anonymized values.

#59 Free-space and Satellite-Based Quantum Communication: Principles, Implementations, and Challenges

著者: Georgi Gary Rozenman, Alona Maslennikov, Sara P. Gandelman, Yuval Reches, Sahar Delfan, Neel Kanth Kundu, Leyi Zhang, Ruiqi Liu

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01426

要約:
Satellite-based quantum communications represent a critical advancement in the pursuit of secure, global-scale quantum networks. Leveraging the principles of quantum mechanics, these systems offer unparalleled security through Quantum Key Distribution (QKD) and other quantum communication protocols. This review provides a comprehensive overview of the current state of satellite-based quantum communications, focusing on the evolution from terrestrial to space-based systems. We explore the distinct advantages and challenges of discrete-variable (DV) and continuous-variable (CV) quantum communication technologies in the context of satellite deployments. The paper also discusses key milestones such as the successful implementation of quantum communication via the Micius satellite and outlines the primary challenges, including atmospheric turbulence and the development of quantum repeaters, that must be addressed to achieve a global quantum internet. This review aims to consolidate recent advancements in the field, providing insights and perspectives on the future directions and potential innovations that will drive the continued evolution of satellite-based quantum communications.

#60 Improve the Trade-off Between Watermark Strength and Speculative Sampling Efficiency for Language Models

intellectual property

著者: Weiqing He, Xiang Li, Li Shen, Weijie Su, Qi Long

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01428

要約:
Watermarking is a principled approach for tracing the provenance of large language model (LLM) outputs, but its deployment in practice is hindered by inference inefficiency. Speculative sampling accelerates inference, with efficiency improving as the acceptance rate between draft and target models increases. Yet recent work reveals a fundamental trade-off: higher watermark strength reduces acceptance, preventing their simultaneous achievement. We revisit this trade-off and show it is not absolute. We introduce a quantitative measure of watermark strength that governs statistical detectability and is maximized when tokens are deterministic functions of pseudorandom numbers. Using this measure, we fully characterize the trade-off as a constrained optimization problem and derive explicit Pareto curves for two existing watermarking schemes. Finally, we introduce a principled mechanism that injects pseudorandomness into draft-token acceptance, ensuring maximal watermark strength while maintaining speculative sampling efficiency. Experiments further show that this approach improves detectability without sacrificing efficiency. Our findings uncover a principle that unites speculative sampling and watermarking, paving the way for their efficient and practical deployment.

#61 MarkCleaner: High-Fidelity Watermark Removal via Imperceptible Micro-Geometric Perturbation

intellectual property

著者: Xiaoxi Kong, Jieyu Yuan, Pengdi Chen, Yuanlin Zhang, Chongyi Li, Bin Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01513

要約:
Semantic watermarks exhibit strong robustness against conventional image-space attacks. In this work, we show that such robustness does not survive under micro-geometric perturbations: spatial displacements can remove watermarks by breaking the phase alignment. Motivated by this observation, we introduce MarkCleaner, a watermark removal framework that avoids semantic drift caused by regeneration-based watermark removal. Specifically, MarkCleaner is trained with micro-geometry-perturbed supervision, which encourages the model to separate semantic content from strict spatial alignment and enables robust reconstruction under subtle geometric displacements. The framework adopts a mask-guided encoder that learns explicit spatial representations and a 2D Gaussian Splatting-based decoder that explicitly parameterizes geometric perturbations while preserving semantic content. Extensive experiments demonstrate that MarkCleaner achieves superior performance in both watermark removal effectiveness and visual fidelity, while enabling efficient real-time inference. Our code will be made available upon acceptance.

#62 A new criterion for the absolute irreducibility of multivariate polynomials over finite fields

著者: Carlos Agrinsoni, Heeralal Janwa, Moises Delgado

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01583

要約:
A key property of an algebraic variety is whether it is absolutely irreducible, meaning that it remains irreducible over the algebraic closure of its defining field, and determining absolute irreducibility is important in algebraic geometry and its applications in coding theory, cryptography, and other fields. Among the applications of absolute irreducibility are bounding the number of rational points via the Weil conjectures and establishing exceptional APN and permutation properties of functions over finite fields. In this article, we present a new criterion for the absolute irreducibility of hypersurfaces defined by multivariate polynomials over finite fields. Our criterion does not require testing for irreducibility in the ground or extension fields, assuming that the leading form is square-free. We just require multivariate GCD computations and the square-free property. Since almost all polynomials are known to be square-free, our absolute irreducibility criterion is valid for almost all multivariate polynomials.

#63 AI-Assisted Adaptive Rendering for High-Frequency Security Telemetry in Web Interfaces

著者: Mona Rajhans

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01671

要約:
Modern cybersecurity platforms must process and display high-frequency telemetry such as network logs, endpoint events, alerts, and policy changes in real time. Traditional rendering techniques based on static pagination or fixed polling intervals fail under volume conditions exceeding hundreds of thousands of events per second, leading to UI freezes, dropped frames, or stale data. This paper presents an AI-assisted adaptive rendering framework that dynamically regulates visual update frequency, prioritizes semantically relevant events, and selectively aggregates lower-priority data using behavior-driven heuristics and lightweight on-device machine learning models. Experimental validation demonstrates a 45-60 percent reduction in rendering overhead while maintaining analyst perception of real-time responsiveness.

#64 WorldCup Sampling for Multi-bit LLM Watermarking

intellectual property

著者: Yidan Wang, Yubing Ren, Yanan Cao, Li Guo

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01752

要約:
As large language models (LLMs) generate increasingly human-like text, watermarking offers a promising solution for reliable attribution beyond mere detection. While multi-bit watermarking enables richer provenance encoding, existing methods largely extend zero-bit schemes through seed-driven steering, leading to indirect information flow, limited effective capacity, and suboptimal decoding. In this paper, we propose WorldCup, a multi-bit watermarking framework for LLMs that treats sampling as a natural communication channel and embeds message bits directly into token selection via a hierarchical competition mechanism guided by complementary signals. Moreover, WorldCup further adopts entropy-aware modulation to preserve generation quality and supports robust message recovery through confidence-aware decoding. Comprehensive experiments show that WorldCup achieves a strong balance across capacity, detectability, robustness, text quality, and decoding efficiency, consistently outperforming prior baselines and laying a solid foundation for future LLM watermarking studies.

#65 Multi-party Computation Protocols for Post-Market Fairness Monitoring in Algorithmic Hiring: From Legal Requirements to Computational Designs

著者: Changyang He, Nina Baranowska, Josu Andoni Egu\'iluz Casta\~neira, Guillem Escriba, Matthias Juentgen, Anna Via, Frederik Zuiderveen Borgesius, Asia Biega

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01837

要約:
Post-market fairness monitoring is now mandated to ensure fairness and accountability for high-risk employment AI systems under emerging regulations such as the EU AI Act. However, effective fairness monitoring often requires access to sensitive personal data, which is subject to strict legal protections under data protection law. Multi-party computation (MPC) offers a promising technical foundation for compliant post-market fairness monitoring, enabling the secure computation of fairness metrics without revealing sensitive attributes. Despite growing technical interest, the operationalization of MPC-based fairness monitoring in real-world hiring contexts under concrete legal, industrial, and usability constraints remains unknown. This work addresses this gap through a co-design approach integrating technical, legal, and industrial expertise. We identify practical design requirements for MPC-based fairness monitoring, develop an end-to-end, legally compliant protocol spanning the full data lifecycle, and empirically validate it in a large-scale industrial setting. Our findings provide actionable design insights as well as legal and industrial implications for deploying MPC-based post-market fairness monitoring in algorithmic hiring systems.

#66 When Feasibility of Fairness Audits Relies on Willingness to Share Data: Examining User Acceptance of Multi-Party Computation Protocols for Fairness Monitoring

著者: Changyang He, Parnian Jahangirirad, Lin Kyi, Asia J. Biega

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.01846

要約:
Fairness monitoring is critical for detecting algorithmic bias, as mandated by the EU AI Act. Since such monitoring requires sensitive user data (e.g., ethnicity), the AI Act permits its processing only with strict privacy measures, such as multi-party computation (MPC), in compliance with the GDPR. However, the effectiveness of such secure monitoring protocols ultimately depends on people's willingness to share their data. Little is known about how different MPC protocol designs shape user acceptance. To address this, we conducted an online survey with 833 participants in Europe, examining user acceptance of various MPC protocol designs for fairness monitoring. Findings suggest that users prioritized risk-related attributes (e.g., privacy protection mechanism) in direct evaluation but benefit-related attributes (e.g., fairness objective) in simulated choices, with acceptance shaped by their fairness and privacy orientations. We derive implications for deploying and communicating privacy-preserving protocols in ways that foster informed consent and align with user expectations.

#67 Co-RedTeam: Orchestrated Security Discovery and Exploitation with LLM Agents

agent

著者: Pengfei He, Ash Fox, Lesly Miculicich, Stefan Friedli, Daniel Fabian, Burak Gokturk, Jiliang Tang, Chen-Yu Lee, Tomas Pfister, Long T. Le

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02164

要約:
Large language models (LLMs) have shown promise in assisting cybersecurity tasks, yet existing approaches struggle with automatic vulnerability discovery and exploitation due to limited interaction, weak execution grounding, and a lack of experience reuse. We propose Co-RedTeam, a security-aware multi-agent framework designed to mirror real-world red-teaming workflows by integrating security-domain knowledge, code-aware analysis, execution-grounded iterative reasoning, and long-term memory. Co-RedTeam decomposes vulnerability analysis into coordinated discovery and exploitation stages, enabling agents to plan, execute, validate, and refine actions based on real execution feedback while learning from prior trajectories. Extensive evaluations on challenging security benchmarks demonstrate that Co-RedTeam consistently outperforms strong baselines across diverse backbone models, achieving over 60% success rate in vulnerability exploitation and over 10% absolute improvement in vulnerability detection. Ablation and iteration studies further confirm the critical role of execution feedback, structured interaction, and memory for building robust and generalizable cybersecurity agents.

#68 MIRROR: Manifold Ideal Reference ReconstructOR for Generalizable AI-Generated Image Detection

著者: Ruiqi Liu, Manni Cui, Ziheng Qin, Zhiyuan Yan, Ruoxin Chen, Yi Han, Zhiheng Li, Junkai Chen, ZhiJin Chen, Kaiqing Lin, Jialiang Shen, Lubin Weng, Jing Dong, Yan Wang, Shu Wu

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02222

要約:
High-fidelity generative models have narrowed the perceptual gap between synthetic and real images, posing serious threats to media security. Most existing AI-generated image (AIGI) detectors rely on artifact-based classification and struggle to generalize to evolving generative traces. In contrast, human judgment relies on stable real-world regularities, with deviations from the human cognitive manifold serving as a more generalizable signal of forgery. Motivated by this insight, we reformulate AIGI detection as a Reference-Comparison problem that verifies consistency with the real-image manifold rather than fitting specific forgery cues. We propose MIRROR (Manifold Ideal Reference ReconstructOR), a framework that explicitly encodes reality priors using a learnable discrete memory bank. MIRROR projects an input into a manifold-consistent ideal reference via sparse linear combination, and uses the resulting residuals as robust detection signals. To evaluate whether detectors reach the "superhuman crossover" required to replace human experts, we introduce the Human-AIGI benchmark, featuring a psychophysically curated human-imperceptible subset. Across 14 benchmarks, MIRROR consistently outperforms prior methods, achieving gains of 2.1% on six standard benchmarks and 8.1% on seven in-the-wild benchmarks. On Human-AIGI, MIRROR reaches 89.6% accuracy across 27 generators, surpassing both lay users and visual experts, and further approaching the human perceptual limit as pretrained backbones scale. The code is publicly available at: https://github.com/349793927/MIRROR

#69 RACA: Representation-Aware Coverage Criteria for LLM Safety Testing

著者: Zeming Wei, Zhixin Zhang, Chengcan Wu, Yihao Zhang, Xiaokun Luan, Meng Sun

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02280

要約:
Recent advancements in LLMs have led to significant breakthroughs in various AI applications. However, their sophisticated capabilities also introduce severe safety concerns, particularly the generation of harmful content through jailbreak attacks. Current safety testing for LLMs often relies on static datasets and lacks systematic criteria to evaluate the quality and adequacy of these tests. While coverage criteria have been effective for smaller neural networks, they are not directly applicable to LLMs due to scalability issues and differing objectives. To address these challenges, this paper introduces RACA, a novel set of coverage criteria specifically designed for LLM safety testing. RACA leverages representation engineering to focus on safety-critical concepts within LLMs, thereby reducing dimensionality and filtering out irrelevant information. The framework operates in three stages: first, it identifies safety-critical representations using a small, expert-curated calibration set of jailbreak prompts. Second, it calculates conceptual activation scores for a given test suite based on these representations. Finally, it computes coverage results using six sub-criteria that assess both individual and compositional safety concepts. We conduct comprehensive experiments to validate RACA's effectiveness, applicability, and generalization, where the results demonstrate that RACA successfully identifies high-quality jailbreak prompts and is superior to traditional neuron-level criteria. We also showcase its practical application in real-world scenarios, such as test set prioritization and attack prompt sampling. Furthermore, our findings confirm RACA's generalization to various scenarios and its robustness across various configurations. Overall, RACA provides a new framework for evaluating the safety of LLMs, contributing a valuable technique to the field of testing for AI.

#70 Decoupling Generalizability and Membership Privacy Risks in Neural Networks

privacy

著者: Xingli Fang, Jung-Eun Kim

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02296

要約:
A deep learning model usually has to sacrifice some utilities when it acquires some other abilities or characteristics. Privacy preservation has such trade-off relationships with utilities. The loss disparity between various defense approaches implies the potential to decouple generalizability and privacy risks to maximize privacy gain. In this paper, we identify that the model's generalization and privacy risks exist in different regions in deep neural network architectures. Based on the observations that we investigate, we propose Privacy-Preserving Training Principle (PPTP) to protect model components from privacy risks while minimizing the loss in generalizability. Through extensive evaluations, our approach shows significantly better maintenance in model generalizability while enhancing privacy preservation.

#71 Guaranteeing Privacy in Hybrid Quantum Learning through Theoretical Mechanisms

privacy

著者: Hoang M. Ngo, Tre' R. Jeter, Incheol Shin, Wanli Xing, Tamer Kahveci, My T. Thai

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02364

要約:
Quantum Machine Learning (QML) is becoming increasingly prevalent due to its potential to enhance classical machine learning (ML) tasks, such as classification. Although quantum noise is often viewed as a major challenge in quantum computing, it also offers a unique opportunity to enhance privacy. In particular, intrinsic quantum noise provides a natural stochastic resource that, when rigorously analyzed within the differential privacy (DP) framework and composed with classical mechanisms, can satisfy formal $(\varepsilon, \delta)$-DP guarantees. This enables a reduction in the required classical perturbation without compromising the privacy budget, potentially improving model utility. However, the integration of classical and quantum noise for privacy preservation remains unexplored. In this work, we propose a hybrid noise-added mechanism, HYPER-Q, that combines classical and quantum noise to protect the privacy of QML models. We provide a comprehensive analysis of its privacy guarantees and establish theoretical bounds on its utility. Empirically, we demonstrate that HYPER-Q outperforms existing classical noise-based mechanisms in terms of adversarial robustness across multiple real-world datasets.

#72 David vs. Goliath: Verifiable Agent-to-Agent Jailbreaking via Reinforcement Learning

agent

著者: Samuel Nellessen, Tal Kachman

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02395

要約:
The evolution of large language models into autonomous agents introduces adversarial failures that exploit legitimate tool privileges, transforming safety evaluation in tool-augmented environments from a subjective NLP task into an objective control problem. We formalize this threat model as Tag-Along Attacks: a scenario where a tool-less adversary "tags along" on the trusted privileges of a safety-aligned Operator to induce prohibited tool use through conversation alone. To validate this threat, we present Slingshot, a 'cold-start' reinforcement learning framework that autonomously discovers emergent attack vectors, revealing a critical insight: in our setting, learned attacks tend to converge to short, instruction-like syntactic patterns rather than multi-turn persuasion. On held-out extreme-difficulty tasks, Slingshot achieves a 67.0% success rate against a Qwen2.5-32B-Instruct-AWQ Operator (vs. 1.7% baseline), reducing the expected attempts to first success (on solved tasks) from 52.3 to 1.3. Crucially, Slingshot transfers zero-shot to several model families, including closed-source models like Gemini 2.5 Flash (56.0% attack success rate) and defensive-fine-tuned open-source models like Meta-SecAlign-8B (39.2% attack success rate). Our work establishes Tag-Along Attacks as a first-class, verifiable threat model and shows that effective agentic attacks can be elicited from off-the-shelf open-weight models through environment interaction alone.

#73 Secure Multi-User Linearly-Separable Distributed Computing

著者: Amir Masoud Jafarpisheh, Ali Khalesi, Petros Elia

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2602.02489

要約:
The introduction of the new multi-user linearly-separable distributed computing framework, has recently revealed how a parallel treatment of users can yield large parallelization gains with relatively low computation and communication costs. These gains stem from a new approach that converts the computing problem into a sparse matrix factorization problem; a matrix $F$ that describes the users' requests, is decomposed as $F = DE$, where a $\gamma$-sparse $E$ defines the task allocation across $N$ servers, and a $\delta$-sparse $D$ defines the connectivity between $N$ servers and $K$ users as well as the decoding process. While this approach provides near-optimal performance, its linear nature has raised data secrecy concerns. We here adopt an information-theoretic secrecy framework, seeking guarantees that each user can learn nothing more than its own requested function. In this context, our main result provides two necessary and sufficient secrecy criteria; (i) for each user $k$ who observes $\alpha_k$ server responses, the common randomness visible to that user must span a subspace of dimension exactly $\alpha_k-1$, and (ii) for each user, removing from $\mathbf{D}$ the columns corresponding to the servers it observes must leave a matrix of rank at least $K-1$. With these conditions in place, we design a general scheme -- that applies to finite and non-finite fields alike -- which is based on appending to $\mathbf{E}$ a basis of $\mathrm{Null}(\mathbf{D})$ and by carefully injecting shared randomness. In many cases, this entails no additional costs. The scheme, while maintaining performance, guarantees perfect information-theoretic secrecy in the case of finite fields, while in the real case, the conditions yield an explicit mutual-information bound that can be made arbitrarily small by increasing the variance of Gaussian common randomness.

#74 Bankrupting DoS Attackers

著者: Trisha Chakraborty, Abir Islam, Valerie King, Daniel Rayborn, Jared Saia, Maxwell Young

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2205.08287

要約:
Can we make a denial-of-service attacker pay more than the server and honest clients? Consider a model where a server sees a stream of jobs sent by either honest clients or an adversary. The server sets a price for servicing each job with the aid of an estimator, which provides approximate statistical information about the distribution of previously occurring good jobs. We describe and analyze pricing algorithms for the server under different models of synchrony, with total cost parameterized by the accuracy of the estimator. Given a reasonably accurate estimator, the algorithm's cost provably grows more slowly than the attacker's cost, as the attacker's cost grows large. Additionally, we prove a lower bound, showing that our pricing algorithm yields asymptotically tight results when the estimator is accurate within constant factors.

#75 Leveraging ASIC AI Chips for Homomorphic Encryption

著者: Jianming Tong, Tianhao Huang, Jingtian Dang, Leo de Castro, Anirudh Itagi, Anupam Golder, Asra Ali, Jeremy Kun, Jevin Jiang, Arvind, G. Edward Suh, Tushar Krishna

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2501.07047

要約:
Homomorphic Encryption (HE) provides strong data privacy for cloud services but at the cost of prohibitive computational overhead. While GPUs have emerged as a practical platform for accelerating HE, there remains an order-of-magnitude energy-efficiency gap compared to specialized (but expensive) HE ASICs. This paper explores an alternate direction: leveraging existing AI accelerators, like Google's TPUs with coarse-grained compute and memory architectures, to offer a path toward ASIC-level energy efficiency for HE. However, this architectural paradigm creates a fundamental mismatch with SoTA HE algorithms designed for GPUs. These algorithms rely heavily on: (1) high-precision (32-bit) integer arithmetic to now run on a TPU's low-throughput vector unit, leaving its high-throughput low-precision (8-bit) matrix engine (MXU) idle, and (2) fine-grained data permutations that are inefficient on the TPU's coarse-grained memory subsystem. Consequently, porting GPU-optimized HE libraries to TPUs results in severe resource under-utilization and performance degradation. To tackle above challenges, we introduce CROSS, a compiler framework that systematically transforms HE workloads to align with the TPU's architecture. CROSS makes two key contributions: (1) Basis-Aligned Transformation (BAT), a novel technique that converts high-precision modular arithmetic into dense, low-precision (INT8) matrix multiplications, unlocking and improving the utilization of TPU's MXU for HE, and (2) Memory-Aligned Transformation (MAT), which eliminates costly runtime data reordering by embedding reordering into compute kernels through offline parameter transformation. CROSS (TPU v6e) achieves higher throughput per watt on NTT and HE operators than WarpDrive, FIDESlib, FAB, HEAP, and Cheddar, establishing AI ASIC as the SotA efficient platform for HE operators. Code: https://github.com/EfficientPPML/CROSS

#76 Unconditional foundations for supersingular isogeny-based cryptography

著者: Arthur Herl\'edan Le Merdy (ENS de Lyon, UMPA-ENSL), Benjamin Wesolowski (ENS de Lyon, CNRS, UMPA-ENSL)

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2502.17010

要約:
In this paper, we prove that the supersingular isogeny problem (Isogeny), endomorphism ring problem (EndRing) and maximal order problem (MaxOrder) are equivalent under probabilistic polynomial time reductions, unconditionally. Isogeny-based cryptography is founded on the presumed hardness of these problems, and their interconnection is at the heart of the design and analysis of cryptosystems like the SQIsign digital signature scheme. Previously known reductions relied on unproven assumptions such as the generalized Riemann hypothesis. In this work, we present unconditional reductions, and extend this network of equivalences to the problem of computing the lattice of all isogenies between two supersingular elliptic curves (HomModule). For cryptographic applications, one requires computational problems to be hard on average for random instances. It is well-known that if Isogeny is hard (in the worst case), then it is hard for random instances. We extend this result by proving that if any of the above-mentionned classical problems is hard in the worst case, then all of them are hard on average. In particular, if there exist hard instances of Isogeny, then all of Isogeny, EndRing, MaxOrder and HomModule are hard on average.

#77 SmartBugBert: BERT-Enhanced Vulnerability Detection for Smart Contract Bytecode

著者: Jiuyang Bu, Wenkai Li, Zongwei Li, Zeng Zhang, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2504.05002

要約:
Smart contracts deployed on blockchain platforms are vulnerable to various security vulnerabilities. However, only a small number of Ethereum contracts have released their source code, so vulnerability detection at the bytecode level is crucial. This paper introduces SmartBugBert, a novel approach that combines BERT-based deep learning with control flow graph (CFG) analysis to detect vulnerabilities directly from bytecode. Our method first decompiles smart contract bytecode into optimized opcode sequences, extracts semantic features using TF-IDF, constructs control flow graphs to capture execution logic, and isolates vulnerable CFG fragments for targeted analysis. By integrating both semantic and structural information through a fine-tuned BERT model and LightGBM classifier, our approach effectively identifies four critical vulnerability types: transaction-ordering, access control, self-destruct, and timestamp dependency vulnerabilities. Experimental evaluation on 6,157 Ethereum smart contracts demonstrates that SmartBugBert achieves 90.62% precision, 91.76% recall, and 91.19% F1-score, significantly outperforming existing detection methods. Ablation studies confirm that the combination of semantic features with CFG information substantially enhances detection performance. Furthermore, our approach maintains efficient detection speed (0.14 seconds per contract), making it practical for large-scale vulnerability assessment.

#78 Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM

著者: Jiuyang Bu, Wenkai Li, Zongwei Li, Zeng Zhang, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2504.05006

要約:
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts, with traditional detection methods struggling to address emerging and machine-unauditable flaws. This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection. We introduce a comprehensive dataset of 215 real-world DApp projects (4,998 contracts), including hard-to-detect logical errors like token price manipulation, addressing the limitations of existing simplified benchmarks. By fine-tuning LLMs (Llama3-8B and Qwen2-7B) with Full-Parameter Fine-Tuning (FFT) and Low-Rank Adaptation (LoRA), our method achieves superior performance, attaining an F1-score of 0.83 with FFT and data augmentation via Random Over Sampling (ROS). Comparative experiments demonstrate significant improvements over prompt-based LLMs and state-of-the-art tools. Notably, the approach excels in detecting non-machine-auditable vulnerabilities, achieving 0.97 precision and 0.68 recall for price manipulation flaws. The results underscore the effectiveness of domain-specific LLM fine-tuning and data augmentation in addressing real-world DApp security challenges, offering a robust solution for blockchain ecosystem protection.

#79 Malicious Code Detection in Smart Contracts via Opcode Vectorization

著者: Huanhuan Zou, Zongwei Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2504.12720

要約:
With the booming development of blockchain technology, smart contracts have been widely used in finance, supply chain, Internet of things and other fields in recent years. However, the security problems of smart contracts become increasingly prominent. Security events caused by smart contracts occur frequently, and the existence of malicious codes may lead to the loss of user assets and system crash. In this paper, a simple study is carried out on malicious code detection of intelligent contracts based on machine learning. The main research work and achievements are as follows: Feature extraction and vectorization of smart contract are the first step to detect malicious code of smart contract by using machine learning method, and feature processing has an important impact on detection results. In this paper, an opcode vectorization method based on smart contract text is adopted. Based on considering the structural characteristics of contract opcodes, the opcodes are classified and simplified. Then, N-Gram (N=2) algorithm and TF-IDF algorithm are used to convert the simplified opcodes into vectors, and then put into the machine learning model for training. In contrast, N-Gram algorithm and TF-IDF algorithm are directly used to quantify opcodes and put into the machine learning model training. Judging which feature extraction method is better according to the training results. Finally, the classifier chain is applied to the intelligent contract malicious code detection.

#80 DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection

著者: Yuliang Yan, Haochun Tang, Shuo Yan, Enyan Dai

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.16530

要約:
Large language models (LLMs) are considered valuable Intellectual Properties (IP) for legitimate owners due to the enormous computational cost of training. It is crucial to protect the IP of LLMs from malicious stealing or unauthorized deployment. Despite existing efforts in watermarking and fingerprinting LLMs, these methods either impact the text generation process or are limited in white-box access to the suspect model, making them impractical. Hence, we propose DuFFin, a novel $\textbf{Du}$al-Level $\textbf{Fin}$gerprinting $\textbf{F}$ramework for black-box setting ownership verification. DuFFin extracts the trigger pattern and the knowledge-level fingerprints to identify the source of a suspect model. We conduct experiments on a variety of models collected from the open-source website, including four popular base models as protected LLMs and their fine-tuning, quantization, and safety alignment versions, which are released by large companies, start-ups, and individual users. Results show that our method can accurately verify the copyright of the base protected LLM on their model variants, achieving the IP-ROC metric greater than 0.95. Our code is available at https://github.com/yuliangyan0807/llm-fingerprint.

#81 Penetration Testing for System Security: Methods and Practical Approaches

著者: Wei Zhang, Ju Xing, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.19174

要約:
Penetration testing refers to the process of simulating hacker attacks to evaluate the security of information systems . This study aims not only to clarify the theoretical foundations of penetration testing but also to explain and demonstrate the complete testing process, including how network system administrators may simulate attacks using various penetration testing methods. Methodologically, the paper outlines the five basic stages of a typical penetration test: intelligence gathering, vulnerability scanning, vulnerability exploitation, privilege escalation, and post-exploitation activities. In each phase, specific tools and techniques are examined in detail, along with practical guidance on their use. To enhance the practical relevance of the study, the paper also presents a real-life case study, illustrating how a complete penetration test is conducted in a real-world environment. Through this case, readers can gain insights into the detailed procedures and applied techniques, thereby deepening their understanding of the practical value of penetration testing. Finally, the paper summarizes the importance and necessity of penetration testing in securing information systems and maintaining network integrity, and it explores future trends and development directions for the field. Overall, the findings of this paper offer valuable references for both researchers and practitioners, contributing meaningfully to the improvement of penetration testing practices and the advancement of cybersecurity as a whole.

#82 When Blockchain Meets Crawlers: Real-time Market Analytics in Solana NFT Markets

著者: Chengxin Shen, Zhongwen Li, Xiaoqi Li, Zongwei Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2506.02892

要約:
In this paper, we design and implement a web crawler system based on the Solana blockchain for the automated collection and analysis of market data for popular non-fungible tokens (NFTs) on the chain. Firstly, the basic information and transaction data of popular NFTs on the Solana chain are collected using the Selenium tool. Secondly, the transaction records of the Magic Eden trading market are thoroughly analyzed by combining them with the Scrapy framework to examine the price fluctuations and market trends of NFTs. In terms of data analysis, this paper employs time series analysis to examine the dynamics of the NFT market and seeks to identify potential price patterns. In addition, the risk and return of different NFTs are evaluated using the mean-variance optimization model, taking into account their characteristics, such as illiquidity and market volatility, to provide investors with data-driven portfolio recommendations. The experimental results show that the combination of crawler technology and financial analytics can effectively analyze NFT data on the Solana blockchain and provide timely market insights and investment strategies. This study provides a reference for further exploration in the field of digital currencies.

#83 Information Security Based on LLM Approaches: A Review

著者: Chang Gong, Zhongwen Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2507.18215

要約:
Information security is facing increasingly severe challenges, and traditional protection means are difficult to cope with complex and changing threats. In recent years, as an emerging intelligent technology, large language models (LLMs) have shown a broad application prospect in the field of information security. In this paper, we focus on the key role of LLM in information security, systematically review its application progress in malicious behavior prediction, network threat analysis, system vulnerability detection, malicious code identification, and cryptographic algorithm optimization, and explore its potential in enhancing security protection performance. Based on neural networks and Transformer architecture, this paper analyzes the technical basis of large language models and their advantages in natural language processing tasks. It is shown that the introduction of large language modeling helps to improve the detection accuracy and reduce the false alarm rate of security systems. Finally, this paper summarizes the current application results and points out that it still faces challenges in model transparency, interpretability, and scene adaptability, among other issues. It is necessary to explore further the optimization of the model structure and the improvement of the generalization ability to realize a more intelligent and accurate information security protection system.

#84 Security Analysis of ChatGPT: Threats and Privacy Risks

privacy

著者: Yushan Xiang, Zhongwen Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.09426

要約:
As artificial intelligence technology continues to advance, chatbots are becoming increasingly powerful. Among them, ChatGPT, launched by OpenAI, has garnered widespread attention globally due to its powerful natural language processing capabilities based on the GPT model, which enables it to engage in natural conversations with users, understand various forms of linguistic expressions, and generate useful information and suggestions. However, as its application scope expands, user demand grows, and malicious attacks related to it become increasingly frequent, the security threats and privacy risks faced by ChatGPT are gradually coming to the forefront. In this paper, the security of ChatGPT is mainly studied from two aspects, security threats and privacy risks. The article systematically analyzes various types of vulnerabilities involved in the above two types of problems and their causes. Briefly, we discuss the controversies that ChatGPT may cause at the ethical and moral levels. In addition, this paper reproduces several network attack and defense test scenarios by simulating the attacker's perspective and methodology. Simultaneously, it explores the feasibility of using ChatGPT for security vulnerability detection and security tool generation from the defender's perspective.

#85 MoveScanner: Analysis of Security Risks of Move Smart Contracts

著者: Yuhe Luo, Zhongwen Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2508.17964

要約:
As blockchain technology continues to evolve, the security of smart contracts has increasingly drawn attention from both academia and industry. The Move language, with its unique resource model and linear type system, provides a solid foundation for the security of digital assets. However, smart contracts still face new security challenges due to developer programming errors and the potential risks associated with cross-module interactions. This paper systematically analyzes the limitations of existing security tools within the Move ecosystem and reveals their unique vulnerability patterns. To address these issues, it introduces MoveScanner, a static analysis tool based on a control flow graph and data flow analysis architecture. By incorporating cross-module call graph tracking, MoveScanner can effectively identify five key types of security vulnerabilities, including resource leaks, weak permission management, and arithmetic overflows. In terms of design, MoveScanner adheres to a modular principle, supports bytecode-level analysis and multi-chain adaptation, and introduces innovative resource trajectory tracking algorithms and capability matrix analysis methods, thereby significantly reducing the false positive rate. Empirical results show that MoveScanner achieved 88.2% detection accuracy in benchmark testing, filling the gap in security tools in the Move ecosystem. Furthermore, this paper identifies twelve new types of security risks based on the resource-oriented programming paradigm and provides a theoretical foundation and practical experience for the development of smart contract security mechanisms. Future work will focus on combining formal verification and dynamic analysis techniques to build a security protection framework covering the entire contract lifecycle

#86 Environmental Injection Attacks against GUI Agents in Realistic Dynamic Environments

agent

著者: Yitong Zhang, Ximo Li, Liyi Cai, Jia Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.11250

要約:
Graphical User Interface (GUI) agents are increasingly deployed to interact with online web services, yet their exposure to open-world content renders them vulnerable to Environmental Injection Attacks (EIAs). In these attacks, an attacker can inject crafted triggers into website to manipulate the behavior of GUI agents used by other users. In this paper, we find that most existing EIA studies fall short of realism. In particular, they fail to capture the dynamic nature of real-world web content, often assuming that a trigger's on-screen position and surrounding visual context remain largely consistent between training and testing. To better reflect practice, we introduce a realistic dynamic-environment threat model in which the attacker is a regular user and the trigger is embedded within a dynamically changing environment. Under this threat model, existing approaches largely fail, suggesting that their effectiveness in exposing GUI agent vulnerabilities has been substantially overestimated. To expose the hidden vulnerabilities of existing GUI agents effectively, we propose Chameleon, an attack framework with two key novelties designed for dynamic environments. (1) To synthesize more realistic training data, we introduce LLM-Driven Environment Simulation, which automatically generates diverse, high-fidelity webpage simulations that mimic the variability of real-world dynamic environments. (2) To optimize the trigger more effectively, we introduce Attention Black Hole, which converts attention weights into explicit supervisory signals. This mechanism encourages the agent to remain insensitive to irrelevant surrounding content, thereby improving robustness in dynamic environments. We evaluate Chameleon on six realistic websites and four representative LVLM-powered GUI agents, where it significantly outperforms existing methods.

#87 SLasH-DSA: Breaking SLH-DSA Using an Extensible End-To-End Rowhammer Framework

著者: Jeremy Boy, Antoon Purnal, Anna P\"atschke, Luca Wilke, Thomas Eisenbarth

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.13048

要約:
As quantum computing advances, Post-Quantum Cryptography (PQC) schemes are adopted to replace classical algorithms. Among them is the Stateless Hash-Based Digital Signature Algorithm (SLH-DSA) that was recently standardized by NIST and is favored for its conservative security basis. In this work, we present the first software-only universal forgery attack on SLH-DSA, leveraging Rowhammer-induced bit flips to corrupt the internal state and forge signatures. While prior work targeted embedded systems and required physical access, our attack is software-only, targeting commodity desktop and server hardware, significantly broadening the threat model. We demonstrate full end-to-end attacks against SLH-DSA in OpenSSL 3.5.1, achieving universal forgery for the SHAKE-128f (deterministic), SHA2-128s, and SHAKE-192f (randomized) parameter sets after one hour (deterministic) or eight hours (randomized) of hammering and post-processing ranging from minutes to an hour, and showing theoretical attack complexities for most parameter sets. Our post-processing is informed by a novel complexity analysis that, given a concrete set of faulty signatures, identifies the most promising computational path to pursue. To enable the attack, we introduce Swage, a modular and extensible framework for implementing end-to-end Rowhammer-based fault attacks. Swage abstracts and automates key components of practical Rowhammer attacks. Unlike prior tooling, Swage is untangled from the attacked code, making it reusable and suitable for frictionless analysis of different targets. Our findings highlight that even theoretically sound PQC schemes can fail under real-world conditions, underscoring the need for additional implementation hardening or hardware defenses against Rowhammer.

#88 Security Analysis of Web Applications Based on Gruyere

著者: Yonghao Ni, Zhongwen Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.14706

要約:
With the rapid development of Internet technologies, web systems have become essential infrastructures for modern information exchange and business operations. However, alongside their expansion, numerous security vulnerabilities have emerged, making web security a critical research focus within the broader field of cybersecurity. These issues are closely related to data protection, privacy preservation, and business continuity, and systematic research on web security is crucial for mitigating malicious attacks and enhancing the reliability and robustness of network systems. This paper first reviews the OWASP Top 10, summarizing the types, causes, and impacts of common web vulnerabilities, and illustrates their exploitation mechanisms through representative cases. Building upon this, the Gruyere platform is adopted as an experimental subject for analyzing known vulnerabilities. The study presents detailed reproduction steps for specific vulnerabilities, proposes comprehensive remediation strategies, and further compares Gruyere's vulnerabilities with contemporary real-world cases. The findings suggest that, although Gruyere's vulnerabilities are relatively outdated, their underlying principles remain highly relevant for explaining a wide range of modern security flaws. Overall, this research demonstrates that web system security analysis based on Gruyere not only deepens the understanding of vulnerability mechanisms but also provides practical support for technological innovation and security defense.

#89 STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents

agent

著者: Jing-Jing Li, Jianfeng He, Chao Shang, Devang Kulshreshtha, Xun Xian, Yi Zhang, Hang Su, Sandesh Swamy, Yanjun Qi

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2509.25624

要約:
As LLMs advance into autonomous agents with tool-use capabilities, they introduce security challenges that extend beyond traditional content-based LLM safety concerns. This paper introduces Sequential Tool Attack Chaining (STAC), a novel multi-turn attack framework that exploits agent tool use. STAC chains together tool calls that each appear harmless in isolation but, when combined, collectively enable harmful operations that only become apparent at the final execution step. We apply our framework to automatically generate and systematically evaluate 483 STAC cases, featuring 1,352 sets of user-agent-environment interactions and spanning diverse domains, tasks, agent types, and 10 failure modes. Our evaluations show that state-of-the-art LLM agents, including GPT-4.1, are highly vulnerable to STAC, with attack success rates (ASR) exceeding 90% in most cases. The core design of STAC's automated framework is a closed-loop pipeline that synthesizes executable multi-step tool chains, validates them through in-environment execution, and reverse-engineers stealthy multi-turn prompts that reliably induce agents to execute the verified malicious sequence. We further perform defense analysis against STAC and find that existing prompt-based defenses provide limited protection. To address this gap, we propose a new reasoning-driven defense prompt that achieves far stronger protection, cutting ASR by up to 28.8%. These results highlight a crucial gap: defending tool-enabled agents requires reasoning over entire action sequences and their cumulative effects, rather than evaluating isolated prompts or responses.

#90 When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models

著者: Haoran Ou, Kangjie Chen, Xingshuo Han, Gelei Deng, Jie Zhang, Han Qiu, Tianwei Zhang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.09689

要約:
Large Language Models (LLMs) have been augmented with web search to overcome the limitations of the static knowledge boundary by accessing up-to-date information from the open Internet. While this integration enhances model capability, it also introduces a distinct safety threat surface: the retrieval and citation process has the potential risk of exposing users to harmful or low-credibility web content. Existing red-teaming methods are largely designed for standalone LLMs as they primarily focus on unsafe generation, ignoring risks emerging from the complex search workflow. To address this gap, we propose CREST-Search, a pioneering red-teaming framework for LLMs with web search. The cornerstone of CREST-Search is three novel attack strategies that generate seemingly benign search queries yet induce unsafe citations. It also employs an iterative in-context refinement mechanism to strengthen adversarial effectiveness under black-box constraints. In addition, we construct a search-specific harmful dataset, WebSearch-Harm, which enables fine-tuning a specialized red-teaming model to improve query quality. Our experiments demonstrate that CREST-Search can effectively bypass safety filters and systematically expose vulnerabilities in web search-based LLM systems, underscoring the necessity of the development of robust search models.

#91 System Password Security: Attack and Defense Mechanisms

著者: Chaofang Shi, Zhongwen Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.10246

要約:
System passwords serve as critical credentials for user authentication and access control when logging into operating systems or applications. Upon entering a valid password, users pass verification to access system resources and execute corresponding operations. In recent years, frequent password cracking attacks targeting system passwords have posed a severe threat to information system security. To address this challenge, in-depth research into password cracking attack methods and defensive technologies holds significant importance. This paper conducts systematic research on system password security, focusing on analyzing typical password cracking methods such as brute force attacks, dictionary attacks, and rainbow table attacks, while evaluating the effectiveness of existing defensive measures. The experimental section utilizes common cryptanalysis tools, such as John the Ripper and Hashcat, to simulate brute force and dictionary attacks. Five test datasets, each generated using Message Digest Algorithm 5 (MD5), Secure Hash Algorithm 256-bit (SHA 256), and bcrypt hash functions, are analyzed. By comparing the overall performance of different hash algorithms and password complexity strategies against these attacks, the effectiveness of defensive measures such as salting and slow hashing algorithms is validated. Building upon this foundation, this paper further evaluates widely adopted defense mechanisms, including account lockout policies, multi-factor authentication, and risk adaptive authentication. By integrating experimental data with recent research findings, it analyzes the strengths and limitations of each approach while proposing feasible improvement recommendations and optimization strategies.

#92 MalCVE: Malware Detection and CVE Association Using Large Language Models

著者: Eduard Andrei Cristea, Petter Molnes, Jingyue Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.15567

要約:
Malicious software attacks are having an increasingly significant economic impact. Commercial malware detection software can be costly, and tools that attribute malware to the specific software vulnerabilities it exploits are largely lacking. Understanding the connection between malware and the vulnerabilities it targets is crucial for analyzing past threats and proactively defending against current ones. In this study, we propose an approach that leverages large language models (LLMs) to detect binary malware, specifically within JAR files, and uses LLM capabilities combined with retrieval-augmented generation (RAG) to identify Common Vulnerabilities and Exposures (CVEs) that malware may exploit. We developed a proof-of-concept tool, MalCVE, that integrates binary code decompilation, deobfuscation, LLM-based code summarization, semantic similarity search, and LLM-based CVE classification. We evaluated MalCVE using a benchmark dataset of 3,839 JAR executables. MalCVE achieved a mean malware-detection accuracy of 97%, at a fraction of the cost of commercial solutions. In particular, the results demonstrate that LLM-based code summarization enables highly accurate and explainable malware identification. MalCVE is also the first tool to associate CVEs with binary malware, achieving a recall@10 of 65%, which is comparable to studies that perform similar analyses on source code.

#93 Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models

著者: Elias Hossain, Swayamjit Saha, Somshubhra Roy, Ravi Prasad

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.17098

要約:
Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference constitutes an overlooked attack surface. This paper introduces Malicious Token Injection (MTI), a modular framework that systematically perturbs cached key vectors at selected layers and timesteps through controlled magnitude and frequency, using additive Gaussian noise, zeroing, and orthogonal rotations. A theoretical analysis quantifies how these perturbations propagate through attention, linking logit deviations to the Frobenius norm of corruption and softmax Lipschitz dynamics. Empirical results show that MTI significantly alters next-token distributions and downstream task performance across GPT-2 and LLaMA-2/7B, as well as destabilizes retrieval-augmented and agentic reasoning pipelines. These findings identify cache integrity as a critical yet underexplored vulnerability in current LLM deployments, positioning cache corruption as a reproducible and theoretically grounded threat model for future robustness and security research.

#94 HAMLOCK: HArdware-Model LOgically Combined attacK

著者: Sanskar Amgain, Daniel Lobo, Atri Chatterjee, Swarup Bhunia, Fnu Suya

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.19145

要約:
The growing use of third-party hardware accelerators (e.g., FPGAs, ASICs) for deep neural networks (DNNs) introduces new security vulnerabilities. Conventional model-level backdoor attacks, which only poison a model's weights to misclassify inputs with a specific trigger, are often detectable because the entire attack logic is embedded within the model (i.e., software), creating a traceable layer-by-layer activation path. This paper introduces the HArdware-Model Logically Combined Attack (HAMLOCK), a far stealthier threat that distributes the attack logic across the hardware-software boundary. The software (model) is now only minimally altered by tuning the activations of few neurons to produce uniquely high activation values when a trigger is present. A malicious hardware Trojan detects those unique activations by monitoring the corresponding neurons' most significant bit or the 8-bit exponents and triggers another hardware Trojan to directly manipulate the final output logits for misclassification. This decoupled design is highly stealthy, as the model itself contains no complete backdoor activation path as in conventional attacks and hence, appears fully benign. Empirically, across benchmarks like MNIST, CIFAR10, GTSRB, and ImageNet, HAMLOCK achieves a near-perfect attack success rate with a negligible clean accuracy drop. More importantly, HAMLOCK circumvents the state-of-the-art model-level defenses without any adaptive optimization. The hardware Trojan is also undetectable, incurring area and power overheads as low as 0.01%, which is easily masked by process and environmental noise. Our findings expose a critical vulnerability at the hardware-software interface, demanding new cross-layer defenses against this emerging threat.

#95 Beyond Imprecise Distance Metrics: Trace-Guided Directed Greybox Fuzzing via LLM-Predicted Call Stacks

著者: Yifan Zhang, Xin Zhang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2510.23101

要約:
Directed greybox fuzzing (DGF) aims to efficiently trigger bugs at specific target locations by prioritizing seeds whose execution paths are more likely to reach the targets. However, existing DGF approaches suffer from imprecise potential estimation due to their reliance on static-analysis-based distance metrics. The over-approximation inherent in static analysis causes many seeds with execution paths irrelevant to vulnerability triggering to be mistakenly prioritized, significantly reducing fuzzing efficiency. To address this issue, we propose trace-guided directed greybox fuzzing (TDGF). TDGF replaces static-analysis-based distance metrics with vulnerability-oriented execution information (referred to as guidance traces) to steer directed fuzzing: seeds whose execution paths overlap more with the guidance traces are scheduled earlier for mutation. We empirically study two representative types of guidance traces: the control-flow trace and the call-stack trace of vulnerability-triggering executions. We find that the fine-grained control-flow traces offer nearly the same guidance capability as the coarse-grained call-stack traces, while call-stack traces are also easier for large language models (LLMs) to predict. Based on this insight, we further propose a framework that leverages LLMs to predict the call stack at vulnerability-triggering time and uses it to guide DGF. We implement our approach and evaluate it against several state-of-the-art fuzzers with experiments totaling 58.4 CPU-years. On a suite of real-world programs, our approach triggers vulnerabilities 2.13$\times$ to 3.14$\times$ faster than the baselines. Moreover, through directed patch testing on the latest program versions used in our controlled experiments, our approach discovers 10 new vulnerabilities and 2 incomplete fixes, with 10 assigned CVE IDs.

#96 GeoGuard: UWB Timing-Encoded Key Reconstruction for Location-Dependent, Geographically Bounded Decryption

privacy

著者: Kunal Mukherjee

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.14032

要約:
Digital content distribution and propitiatory research driven industries face persistent risks from intellectual property theft and unauthorized redistribution. Conventional encryption schemes such as AES, TDES, ECC, and ElGamal provide strong cryptographic guarantees, but they remain fundamentally agnostic to where decryption takes place. In practice, this means that once a decryption key is leaked or intercepted, any adversary can misuse the key to decrypt the protected content from any location. This paper presents, GeoGuard, a location-dependent cryptosystem in which the decryption key is not transmitted as data but is implicitly encoded in the precise time-of-flight differences of ultra-wideband (UWB) data transmission packets. The system leverages precise timing hardware and a custom Timing-encoded Cryptographic Keying (TiCK) protocol to map a 32-byte SHA-256 AES key onto scheduled transmission timestamps. Only user located within an approved spatial location can observe the correct packet timing that aligns with the intended packet-reception timing pattern, enabling them to reconstruct the key. Eavesdroppers outside the authorized region observe an incorrect timing pattern, which yields incorrect keys. GeoGuard is designed to encrypt and transmit data, but decryption is only possible when the user is within the authorized area. Our evaluation demonstrates that the system (i) removes the need to share decryption passwords electronically or physically, (ii) ensures the decryption key cannot be recovered by the eavesdropper, and (iii) provides a non-trivial spatial tolerance for legitimate users

#97 PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization

著者: Huseein Jawad, Nicolas Brunel

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.16209

要約:
System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive information, making them a prime target for extraction attacks. Adversarial queries can successfully elicit these hidden instructions, posing significant security and privacy risks. Existing defense mechanisms frequently rely on heuristics, incur substantial computational overhead, or are inapplicable to models accessed via black-box APIs. This paper introduces a novel framework for hardening system prompts through shield appending, a lightweight approach that adds a protective textual layer to the original prompt. Our core contribution is the formalization of prompt hardening as a utility-constrained optimization problem. We leverage an LLM-as-optimizer to search the space of possible SHIELDs, seeking to minimize a leakage metric derived from a suite of adversarial attacks, while simultaneously preserving task utility above a specified threshold, measured by semantic fidelity to baseline outputs. This black-box, optimization-driven methodology is lightweight and practical, requiring only API access to the target and optimizer LLMs. We demonstrate empirically that our optimized SHIELDs significantly reduce prompt leakage against a comprehensive set of extraction attacks, outperforming established baseline defenses without compromising the model's intended functionality. Our work presents a paradigm for developing robust, utility-aware defenses in the escalating landscape of LLM security. The code is made public on the following link: https://github.com/psm-defense/psm

#98 Inside Qubic's Selfish Mining Campaign on Monero: Evidence, Tactics, and Limits

著者: Suhyeon Lee, Hyeongyeong Kim

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.01437

要約:
We analyze Qubic's publicly claimed selfish mining attack against Monero in 2025. By combining measurements from Monero nodes, the Qubic pool API, and Qubic-network observations, we reconstruct Qubic-attributed blocks and effective hashrate and identify ten intervals consistent with block withholding and strategic release. During these intervals, Qubic's average hashrate share rises to 23--34\%, yet we never observe sustained majority control. We evaluate the attack using the classical selfish mining model and a Markov-chain variant that captures Qubic's conservative release policy. At the inferred parameters, both models predict revenues below honest mining, and our measurements largely confirm this while showing systematic deviations. We attribute the gap to hashrate variability, coarse-grained interval detection, and operational frictions under community countermeasures. We further argue that selfish mining should be analyzed under time-varying hashrate. Even when the average hashrate stays below the static break-even point, an attacker can still run a profitable selfish-mining operation by operating it intermittently.

#99 AtomGraph: Tackling Atomicity Violation in Smart Contracts using Multimodal GCNs

著者: Xiaoqi Li, Zongwei Li, Wenkai Li, Zeng Zhang, Lei Xie

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.02399

要約:
Smart contracts are a core component of blockchain technology and are widely deployed across various scenarios. However, atomicity violations have become a potential security risk. Existing analysis tools often lack the precision required to detect these issues effectively. To address this challenge, we introduce AtomGraph, an automated framework designed for detecting atomicity violations. This framework leverages Graph Convolutional Networks (GCN) to identify atomicity violations through multimodal feature learning and fusion. Specifically, driven by a collaborative learning mechanism, the model simultaneously learns from two heterogeneous modalities: extracting structural topological features from the contract's Control Flow Graph (CFG) and uncovering deep semantics from its opcode sequence. We designed an adaptive weighted fusion mechanism to dynamically adjust the weights of features from each modality to achieve optimal feature fusion. Finally, GCN detects graph-level atomicity violation on the contract. Comprehensive experimental evaluations demonstrate that AtomGraph achieves 96.88% accuracy and 96.97% F1 score, outperforming existing tools. Furthermore, compared to the concatenation fusion model, AtomGraph improves the F1 score by 6.4%, proving its potential in smart contract security detection.

#100 No More Hidden Pitfalls? Exposing Smart Contract Bad Practices with LLM-Powered Hybrid Analysis

著者: Xiaoqi Li, Zongwei Li, Wenkai Li, Yuqing Zhang, Xin Wang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.15179

要約:
As the Ethereum platform continues to mature and gain widespread usage, it is crucial to maintain high standards of smart contract writing practices. While bad practices in smart contracts may not directly lead to security issues, they elevate the risk of encountering problems. Therefore, to understand and avoid these bad practices, this paper introduces the first systematic study of bad practices in smart contracts, delving into over 47 specific issues. Specifically, we propose SCALM, an LLM-powered framework featuring two methodological innovations: (1) A hybrid architecture that combines context-aware function-level slicing with knowledge-enhanced semantic reasoning via extensible vectorized pattern matching. (2) A multi-layer reasoning verification system connects low-level code patterns with high-level security principles through syntax, design patterns, and architecture analysis. Our extensive experiments using multiple LLMs and datasets have shown that SCALM outperforms existing tools in detecting bad practices in smart contracts.

#101 DREAM: Dynamic Red-teaming across Environments for AI Models

著者: Liming Lu, Xiang Gu, Junyu Huang, Jiawei Du, Xu Zheng, Yunhuai Liu, Yongbin Zhou, Shuchao Pang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.19016

要約:
Large Language Models (LLMs) are increasingly used in agentic systems, where their interactions with diverse tools and environments create complex, multi-stage safety challenges. However, existing benchmarks mostly rely on static, single-turn assessments that miss vulnerabilities from adaptive, long-chain attacks. To fill this gap, we introduce DREAM, a framework for systematic evaluation of LLM agents against dynamic, multi-stage attacks. At its core, DREAM uses a Cross-Environment Adversarial Knowledge Graph (CE-AKG) to maintain stateful, cross-domain understanding of vulnerabilities. This graph guides a Contextualized Guided Policy Search (C-GPS) algorithm that dynamically constructs attack chains from a knowledge base of 1,986 atomic actions across 349 distinct digital environments. Our evaluation of 12 leading LLM agents reveals a critical vulnerability: these attack chains succeed in over 70% of cases for most models, showing the power of stateful, cross-environment exploits. Through analysis of these failures, we identify two key weaknesses in current agents: contextual fragility, where safety behaviors fail to transfer across environments, and an inability to track long-term malicious intent. Our findings also show that traditional safety measures, such as initial defense prompts, are largely ineffective against attacks that build context over multiple interactions. To advance agent safety research, we release DREAM as a tool for evaluating vulnerabilities and developing more robust defenses.

#102 The Semantic Trap: Do Fine-tuned LLMs Learn Vulnerability Root Cause or Just Functional Pattern?

著者: Feiyang Huang, Yuqiang Sun, Fan Zhang, Ziqi Yang, Han Liu, Yang Liu

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.22655

要約:
LLMs demonstrate promising performance in software vulnerability detection after fine-tuning. However, it remains unclear whether these gains reflect a genuine understanding of vulnerability root causes or merely an exploitation of functional patterns. In this paper, we identify a critical failure mode termed the "semantic trap," where fine-tuned LLMs achieve high detection scores by associating certain functional domains with vulnerability likelihood rather than reasoning about the underlying security semantics. To systematically evaluate this phenomenon, we propose TrapEval, a comprehensive evaluation framework designed to disentangle vulnerability root cause from functional pattern. TrapEval introduces two complementary datasets derived from real-world open-source projects: V2N, which pairs vulnerable code with unrelated benign code, and V2P, which pairs vulnerable code with its corresponding patched version, forcing models to distinguish near-identical code that differs only in subtle security-critical logic. Using TrapEval, we fine-tune five representative state-of-the-art LLMs across three model families and evaluate them under cross-dataset testing, semantic-preserving perturbations, and varying degrees of semantic gap measured by CodeBLEU. Our empirical results reveal that, despite improvements in metrics, fine-tuned LLMs consistently struggle to distinguish vulnerable code from its patched counterpart, exhibit severe robustness degradation under minor semantic-preserving transformations, and rely heavily on functional-context shortcuts when the semantic gap is small. These findings provide strong evidence that current fine-tuning practices often fail to impart true vulnerability reasoning. Our findings serve as a wake-up call: high benchmark scores on traditional datasets may be illusory, masking the model's inability to understand the true causal logic of vulnerabilities.

#103 SVIP: Towards Verifiable Inference of Open-source Large Language Models

著者: Yifan Sun, Yuhang Li, Yue Zhang, Yuchen Jin, Huan Zhang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2410.22307

要約:
The ever-increasing size of open-source Large Language Models (LLMs) renders local deployment impractical for individual users. Decentralized computing has emerged as a cost-effective solution, allowing individuals and small companies to perform LLM inference for users using surplus computational power. However, a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby benefiting from cost savings. We introduce SVIP, a secret-based verifiable LLM inference protocol. Unlike existing solutions based on cryptographic or game-theoretic techniques, our method is computationally effective and does not rest on strong assumptions. Our protocol requires the computing provider to return both the generated text and processed hidden representations from LLMs. We then train a proxy task on these representations, effectively transforming them into a unique model identifier. With our protocol, users can reliably verify whether the computing provider is acting honestly. A carefully integrated secret mechanism further strengthens its security. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per prompt query for verification.

#104 Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence

著者: Shaopeng Fu, Liang Ding, Jingfeng Zhang, Di Wang

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2502.04204

要約:
Jailbreak attacks against large language models (LLMs) aim to induce harmful behaviors in LLMs through carefully crafted adversarial prompts. To mitigate attacks, one way is to perform adversarial training (AT)-based alignment, i.e., training LLMs on some of the most adversarial prompts to help them learn how to behave safely under attacks. During AT, the length of adversarial prompts plays a critical role in the robustness of aligned LLMs. While long-length adversarial prompts during AT might lead to strong LLM robustness, their synthesis however is very resource-consuming, which may limit the application of LLM AT. This paper focuses on adversarial suffix jailbreak attacks and unveils that to defend against a jailbreak attack with an adversarial suffix of length $\Theta(M)$, it is enough to align LLMs on prompts with adversarial suffixes of length $\Theta(\sqrt{M})$. Theoretically, we analyze the adversarial in-context learning of linear transformers on linear regression tasks and prove a robust generalization bound for trained transformers. The bound depends on the term $\Theta(\sqrt{M_{\text{test}}}/M_{\text{train}})$, where $M_{\text{train}}$ and $M_{\text{test}}$ are the numbers of adversarially perturbed in-context samples during training and testing. Empirically, we conduct AT on popular open-source LLMs and evaluate their robustness against jailbreak attacks of different adversarial suffix lengths. Results confirm a positive correlation between the attack success rate and the ratio of the square root of the adversarial suffix length during jailbreaking to the length during AT. Our findings show that it is practical to defend against "long-length" jailbreak attacks via efficient "short-length" AT. The code is available at https://github.com/fshp971/adv-icl.

#105 Effective and Efficient Cross-City Traffic Knowledge Transfer: A Privacy-Preserving Perspective

privacy

著者: Zhihao Zeng, Ziquan Fang, Yuting Huang, Lu Chen, Yunjun Gao

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2503.11963

要約:
Traffic prediction aims to forecast future traffic conditions using historical traffic data, serving a crucial role in urban computing and transportation management. While transfer learning and federated learning have been employed to address the scarcity of traffic data by transferring traffic knowledge from data-rich to data-scarce cities without traffic data exchange, existing approaches in Federated Traffic Knowledge Transfer (FTT) still face several critical challenges such as potential privacy leakage, cross-city data distribution discrepancies, and low data quality, hindering their practical application in real-world scenarios. To this end, we present FedTT, a novel privacy-aware and efficient federated learning framework for cross-city traffic knowledge transfer. Specifically, our proposed framework includes three key innovations: (i) a traffic view imputation method for missing traffic data completion to enhance data quality, (ii) a traffic domain adapter for uniform traffic data transformation to address data distribution discrepancies, and (iii) a traffic secret aggregation protocol for secure traffic data aggregation to safeguard data privacy. Extensive experiments on 4 real-world datasets demonstrate that the proposed FedTT framework outperforms the 14 state-of-the-art baselines.

#106 Graph-Based Floor Separation Using Node Embeddings and Clustering of WiFi Trajectories

著者: Rabia Yasa Kostas, Kahraman Kostas

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.08088

要約:
Vertical localization, particularly floor separation, remains a major challenge in indoor positioning systems operating in GPS-denied multistory environments. This paper proposes a fully data-driven, graph-based framework for blind floor separation using only Wi-Fi fingerprint trajectories, without requiring prior building information or knowledge of the number of floors. In the proposed method, Wi-Fi fingerprints are represented as nodes in a trajectory graph, where edges capture both signal similarity and sequential movement context. Structural node embeddings are learned via Node2Vec, and floor-level partitions are obtained using K-Means clustering with automatic cluster number estimation. The framework is evaluated on multiple publicly available datasets, including a newly released Huawei University Challenge 2021 dataset and a restructured version of the UJIIndoorLoc benchmark. Experimental results demonstrate that the proposed approach effectively captures the intrinsic vertical structure of multistory buildings using only received signal strength data. By eliminating dependence on building-specific metadata, the proposed method provides a scalable and practical solution for vertical localization in indoor environments.

#107 Facial Recognition Leveraging Generative Adversarial Networks

著者: Zhongwen Li, Zongwei Li, Xiaoqi Li

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2505.11884

要約:
Face recognition performance based on deep learning heavily relies on large-scale training data, which is often difficult to acquire in practical applications. To address this challenge, this paper proposes a GAN-based data augmentation method with three key contributions: (1) a residual-embedded generator to alleviate gradient vanishing/exploding problems, (2) an Inception ResNet-V1 based FaceNet discriminator for improved adversarial training, and (3) an end-to-end framework that jointly optimizes data generation and recognition performance. Experimental results demonstrate that our approach achieves stable training dynamics and significantly improves face recognition accuracy by 12.7% on the LFW benchmark compared to baseline methods, while maintaining good generalization capability with limited training samples.

#108 MAS-Shield: A Defense Framework for Secure and Efficient LLM MAS

著者: Kaixiang Wang, Zhaojiacheng Zhou, Bunyod Suvonov, Jiong Lou, Jie LI

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2511.22924

要約:
Large Language Model (LLM)-based Multi-Agent Systems (MAS) are susceptible to linguistic attacks that can trigger cascading failures across the network. Existing defenses face a fundamental dilemma: lightweight single-auditor methods are prone to single points of failure, while robust committee-based approaches incur prohibitive computational costs in multi-turn interactions. To address this challenge, we propose \textbf{MAS-Shield}, a secure and efficient defense framework designed with a coarse-to-fine filtering pipeline. Rather than applying uniform scrutiny, MAS-Shield dynamically allocates defense resources through a three-stage protocol: (1) \textbf{Critical Agent Selection } strategically targets high-influence nodes to narrow the defense surface; (2) \textbf{Light Auditing} employs lightweight sentry models to rapidly filter the majority of benign cases; and (3) \textbf{Global Consensus Auditing} escalates only suspicious or ambiguous signals to a heavyweight committee for definitive arbitration. This hierarchical design effectively optimizes the security-efficiency trade-off. Experiments demonstrate that MAS-Shield achieves a 92.5\% recovery rate against diverse adversarial scenarios and reduces defense latency by over 70\% compared to existing methods.

#109 On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models

privacy

著者: Ali Al Sahili, Ali Chehab, Razane Tajeddine

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2512.13352

要約:
Large Language Models (LLMs) are prone to memorizing training data, which poses serious privacy risks. Two of the most prominent concerns are training data extraction and Membership Inference Attacks (MIAs). Prior research has shown that these threats are interconnected: adversaries can extract training data from an LLM by querying the model to generate a large volume of text and subsequently applying MIAs to verify whether a particular data point was included in the training set. In this study, we integrate multiple MIA techniques into the data extraction pipeline to systematically benchmark their effectiveness. We then compare their performance in this integrated setting against results from conventional MIA benchmarks, allowing us to evaluate their practical utility in real-world extraction scenarios.

#110 Sampling-Free Privacy Accounting for Matrix Mechanisms under Random Allocation

privacy

著者: Jan Schuchardt, Nikita Kalinin

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.21636

要約:
We study privacy amplification for differentially private model training with matrix factorization under random allocation (also known as the balls-in-bins model). Recent work by Choquette-Choo et al. (2025) proposes a sampling-based Monte Carlo approach to compute amplification parameters in this setting. However, their guarantees either only hold with some high probability or require random abstention by the mechanism. Furthermore, the required number of samples for ensuring $(\epsilon,\delta)$-DP is inversely proportional to $\delta$. In contrast, we develop sampling-free bounds based on R\'enyi divergence and conditional composition. The former is facilitated by a dynamic programming formulation to efficiently compute the bounds. The latter complements it by offering stronger privacy guarantees for small $\epsilon$, where R\'enyi divergence bounds inherently lead to an over-approximation. Our framework applies to arbitrary banded and non-banded matrices. Through numerical comparisons, we demonstrate the efficacy of our approach across a broad range of matrix mechanisms used in research and practice.

#111 Hardware-Triggered Backdoors

backdoor

著者: Jonas M\"oller, Erik Imgrund, Thorsten Eisenhofer, Konrad Rieck

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.21902

要約:
Machine learning models are routinely deployed on a wide range of computing hardware. Although such hardware is typically expected to produce identical results, differences in its design can lead to small numerical variations during inference. In this work, we show that these variations can be exploited to create backdoors in machine learning models. The core idea is to shape the model's decision function such that it yields different predictions for the same input when executed on different hardware. This effect is achieved by locally moving the decision boundary close to a target input and then refining numerical deviations to flip the prediction on selected hardware. We empirically demonstrate that these hardware-triggered backdoors can be created reliably across common GPU accelerators. Our findings reveal a novel attack vector affecting the use of third-party models, and we investigate different defenses to counter this threat.

#112 Semantic Leakage from Image Embeddings

著者: Yiyi Chen, Qiongkai Xu, Desmond Elliott, Qiongxiu Li, Johannes Bjerva

公開日: Tue, 03 Feb 2026 00:00:00 -0500

リンク: https://arxiv.org/abs/2601.22929

要約:
Image embeddings are generally assumed to pose limited privacy risk. We challenge this assumption by formalizing semantic leakage as the ability to recover semantic structures from compressed image embeddings. Surprisingly, we show that semantic leakage does not require exact reconstruction of the original image. Preserving local semantic neighborhoods under embedding alignment is sufficient to expose the intrinsic vulnerability of image embeddings. Crucially, this preserved neighborhood structure allows semantic information to propagate through a sequence of lossy mappings. Based on this conjecture, we propose Semantic Leakage from Image Embeddings (SLImE), a lightweight inference framework that reveals semantic information from standalone compressed image embeddings, incorporating a locally trained semantic retriever with off-the-shelf models, without training task-specific decoders. We thoroughly validate each step of the framework empirically, from aligned embeddings to retrieved tags, symbolic representations, and grammatical and coherent descriptions. We evaluate SLImE across a range of open and closed embedding models, including GEMINI, COHERE, NOMIC, and CLIP, and demonstrate consistent recovery of semantic information across diverse inference tasks. Our results reveal a fundamental vulnerability in image embeddings, whereby the preservation of semantic neighborhoods under alignment enables semantic leakage, highlighting challenges for privacy preservation.1

cs.CR updates on arXiv.org

📋 論文タイトル一覧