arXiv paper list - cs.CR updates on arXiv.org

#1 Agentic Artificial Intelligence for Ethical Cybersecurity in Uganda: A Reinforcement Learning Framework for Threat Detection in Resource-Constrained Environments

agent

Authors: Ibrahim Adabara, Bashir Olaniyi Sadiq, Aliyu Nuhu Shuaibu, Yale Ibrahim Danjuma, Venkateswarlu Maninti, Mutebi Joe

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.07909

Abstract:
Uganda's rapid digital transformation, supported by national strategies such as Vision 2040 and the Digital Transformation Roadmap, has expanded reliance on networked services while simultaneously increasing exposure to sophisticated cyber threats. In resource-constrained settings, commonly deployed rule-based intrusion detection systems lack the adaptability and ethical safeguards needed to address evolving attack patterns, leading to undetected breaches and excessive blocking of legitimate traffic. This study proposes an Agentic Artificial Intelligence (AAI) framework that integrates reinforcement learning, an explicit ethical governance layer, and human oversight to deliver adaptive and trustworthy cybersecurity. A CPU-optimized simulation environment was developed using a five-node network topology that mirrors key elements of Uganda's critical digital infrastructure and generates both benign and malicious traffic, including phishing, ransomware, and distributed denial-of-service attacks. A Q-learning agent, operating within clearly defined ethical constraints and subject to human auditability, was trained and evaluated against a traditional rule-based baseline. The AAI framework achieved a 100 percent detection rate, zero false positives, and full ethical compliance, compared with 70 percent detection and 15 percent false positives for the baseline system. These results demonstrate that agentic, ethically governed reinforcement learning can substantially improve cybersecurity effectiveness and fairness in CPU-only, resource-constrained environments, offering a practical pathway for operationalizing responsible AI in Uganda's national cybersecurity strategy.
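The kind of tabular Q-learning agent the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the traffic classes, the allow/block action set, the reward values, and the hyperparameters are all assumptions chosen for illustration, and the ethical governance layer and human oversight are not modelled.

```python
import random

# Hypothetical traffic classes and actions; not taken from the paper.
STATES = ["benign", "phishing", "ransomware", "ddos"]
ACTIONS = ["allow", "block"]


def reward(state, action):
    """Assumed reward: +1 for the correct decision, -1 otherwise."""
    correct = "allow" if state == "benign" else "block"
    return 1.0 if action == correct else -1.0


def train(steps=20000, alpha=0.1, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        r = reward(state, action)
        next_state = rng.choice(STATES)  # next traffic item arrives at random
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action for every state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


policy = greedy_policy(train())
print(policy)
```

In this toy setting the learned policy allows benign traffic and blocks the three attack classes. A real environment would need far richer state features than a single class label.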

#2 AgentCrypt: Advancing Privacy and (Secure) Computation in AI Agent Collaboration

privacy, agent

Authors: Harish Karthikeyan, Yue Guo, Leo de Castro, Antigoni Polychroniadou, Leo Ardon, Udari Madhushani Sehwag, Sumitra Ganesh, Manuela Veloso

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08104

Abstract:
As AI agents increasingly operate in real-world, multi-agent environments, ensuring reliable and context-aware privacy in agent communication is critical, especially to comply with evolving regulatory requirements. Traditional access controls are insufficient, as privacy risks often arise after access is granted; agents may use information in ways that compromise privacy, such as messaging humans, sharing context with other agents, making tool calls, persisting data, or generating derived private information. Existing approaches often treat privacy as a binary constraint, whether data is shareable or not, overlooking nuanced, role-specific, and computation-dependent privacy needs essential for regulatory compliance. Agents, including those based on large language models, are inherently probabilistic and heuristic. There is no formal guarantee of how an agent will behave for any query, making them ill-suited for operations critical to security. To address this, we introduce AgentCrypt, a four-tiered framework for fine-grained, encrypted agent communication that adds a protection layer atop any AI agent platform. AgentCrypt spans unrestricted data exchange (Level 1) to fully encrypted computation using techniques such as homomorphic encryption (Level 4). Crucially, it guarantees the privacy of tagged data is always maintained, prioritizing privacy above correctness. AgentCrypt ensures privacy across diverse interactions and enables computation on otherwise inaccessible data, overcoming barriers such as data silos. We implemented and tested it with Langgraph and Google ADK, demonstrating versatility across platforms. We also introduce a benchmark dataset simulating privacy-critical tasks at all privacy levels, enabling systematic evaluation and fostering the development of regulatable machine learning systems for secure agent communication and computation.

#3 Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies

Authors: Stephan Carney, Soham Hans, Sofia Hirschmann, Stacey Marsella, Yvonne Fonken, Peggy Wu, Nikolos Gurney

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08107

Abstract:
Adversaries (hackers) attempting to infiltrate networks frequently face uncertainty in their operational environments. This research explores the ability to model and detect when they exhibit ambiguity aversion, a cognitive bias reflecting a preference for known (versus unknown) probabilities. We introduce a novel methodological framework that (1) leverages rich, multi-modal data from human-subjects red-team experiments, (2) employs a large language model (LLM) pipeline to parse unstructured logs into MITRE ATT&CK-mapped action sequences, and (3) applies a new computational model to infer an attacker's ambiguity aversion level in near-real time. By operationalizing this cognitive trait, our work provides a foundational component for developing adaptive cognitive defense strategies.

#4 Information-Dense Reasoning for Efficient and Auditable Security Alert Triage

Authors: Guangze Zhao, Yongzheng Zhang, Changbo Tian, Dan Xie, Hongri Liu, Bailing Wang

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08169

Abstract:
Security Operations Centers face massive, heterogeneous alert streams under minute-level service windows, creating the Alert Triage Latency Paradox: verbose reasoning chains ensure accuracy and compliance but incur prohibitive latency and token costs, while minimal chains sacrifice transparency and auditability. Existing solutions fail: signature systems are brittle, anomaly methods lack actionability, and fully cloud-hosted LLMs raise latency, cost, and privacy concerns. We propose AIDR, a hybrid cloud-edge framework that addresses this trade-off through constrained information-density optimization. The core innovation is gradient-based compression of reasoning chains to retain only decision-critical steps--minimal evidence sufficient to justify predictions while respecting token and latency budgets. We demonstrate that this approach preserves decision-relevant information while minimizing complexity. We construct compact datasets by distilling alerts into 3-5 high-information bullets (68% token reduction), train domain-specialized experts via LoRA, and deploy a cloud-edge architecture: a cloud LLM routes alerts to on-premises experts generating SOAR-ready JSON. Experiments demonstrate AIDR achieves higher accuracy and 40.6% latency reduction versus Chain-of-Thought, with robustness to data corruption and out-of-distribution generalization, enabling auditable and efficient SOC triage with full data residency compliance.

#5 Security Analysis of Integer Learning with Errors with Rejection Sampling

Authors: Kyle Yates, Antsa Pierrottet, Abdullah Al Mamun, Ryann Cartor, Mashrur Chowdhury, Shuhong Gao

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08172

Abstract:
At ASIACRYPT 2018, a digital attack based on linear least squares was introduced for a variant of the learning with errors (LWE) problem which omits modular reduction known as the integer learning with errors problem (ILWE). In this paper, we present a theoretical and experimental study of the effectiveness of the attack when applied directly to small parameter ILWE instances found in popular digital signature schemes such as CRYSTALS-Dilithium which utilize rejection sampling. Unlike other studies which form ILWE instances based on additional information obtained from side-channel attacks, we take a more direct approach to the problem by constructing our ILWE instance from only the obtained signatures. We outline and introduce novel techniques in our simulation designs such as modular polynomial arithmetic via matrices in $\mathbb{R}$, as well as algorithms for handling large sample sizes efficiently. Our experimental results reinforce the proclaimed security of signature schemes based on ILWE. We additionally discuss the implications of our work and digital signatures as a whole in regards to real-world applications such as in Intelligent Transportation Systems (ITS).

#6 A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties

privacy

Authors: Jinghao Wang, Ping Zhang, Carter Yagemann

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08185

Abstract:
Medical Large Language Models (LLMs) are increasingly deployed for clinical decision support across diverse specialties, yet systematic evaluation of their robustness to adversarial misuse and privacy leakage remains inaccessible to most researchers. Existing security benchmarks require GPU clusters, commercial API access, or protected health data -- barriers that limit community participation in this critical research area. We propose a practical, fully reproducible framework for evaluating medical AI security under realistic resource constraints. Our framework design covers multiple medical specialties stratified by clinical risk -- from high-risk domains such as emergency medicine and psychiatry to general practice -- addressing jailbreaking attacks (role-playing, authority impersonation, multi-turn manipulation) and privacy extraction attacks. All evaluation utilizes synthetic patient records requiring no IRB approval. The framework is designed to run entirely on consumer CPU hardware using freely available models, eliminating cost barriers. We present the framework specification including threat models, data generation methodology, evaluation protocols, and scoring rubrics. This proposal establishes a foundation for comparative security assessment of medical-specialist models and defense mechanisms, advancing the broader goal of ensuring safe and trustworthy medical AI systems.

#7 Evaluating Vulnerabilities of Connected Vehicles Under Cyber Attacks by Attack-Defense Tree

Authors: Muhammad Baqer Mollah, Honggang Wang, Hua Fang

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08204

Abstract:
Connected vehicles represent a key enabler of intelligent transportation systems, where vehicles are equipped with advanced communication, sensing, and computing technologies to interact not only with one another but also with surrounding infrastructures and the environment. Through continuous data exchange, such vehicles are capable of enhancing road safety, improving traffic efficiency, and ensuring more reliable mobility services. Further, when these capabilities are integrated with advanced automation technologies, the concept essentially evolves into connected and autonomous vehicles (CAVs). While connected vehicles primarily focus on seamless information sharing, autonomous vehicles are mainly dependent on advanced perception, decision-making, and control mechanisms to operate with minimal or without human intervention. However, as a result of connectivity, an adversary with malicious intentions might be able to compromise successfully by breaching the system components of CAVs. In this paper, we present an attack-tree based methodology for evaluating cyber security vulnerabilities in CAVs. In particular, we utilize the attack-defense tree formulation to systematically assess attack-leaf vulnerabilities, and before analyzing the vulnerability indices, we also define a measure of vulnerabilities, which is based on existing cyber security threats and corresponding defensive countermeasures.

#8 MIRAGE: Misleading Retrieval-Augmented Generation via Black-box and Query-agnostic Poisoning Attacks

backdoor

Authors: Tailun Chen, Yu He, Yan Wang, Shuo Shao, Haolun Zheng, Zhihao Liu, Jinfeng Li, Yuefeng Chen, Zhixuan Chu, Zhan Qin

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08289

Abstract:
Retrieval-Augmented Generation (RAG) systems enhance LLMs with external knowledge but introduce a critical attack surface: corpus poisoning. While recent studies have demonstrated the potential of such attacks, they typically rely on impractical assumptions, such as white-box access or known user queries, thereby underestimating the difficulty of real-world exploitation. In this paper, we bridge this gap by proposing MIRAGE, a novel multi-stage poisoning pipeline designed for strict black-box and query-agnostic environments. Operating on surrogate model feedback, MIRAGE functions as an automated optimization framework that integrates three key mechanisms: it utilizes persona-driven query synthesis to approximate latent user search distributions, employs semantic anchoring to imperceptibly embed these intents for high retrieval visibility, and leverages an adversarial variant of Test-Time Preference Optimization (TPO) to maximize persuasion. To rigorously evaluate this threat, we construct a new benchmark derived from three long-form, domain-specific datasets. Extensive experiments demonstrate that MIRAGE significantly outperforms existing baselines in both attack efficacy and stealthiness, exhibiting remarkable transferability across diverse retriever-LLM configurations and highlighting the urgent need for robust defense strategies.

#9 Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem

Authors: Shiva Gaire, Srijan Gyawali, Saroj Mishra, Suman Niroula, Dilip Thakur, Umesh Yadav

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08290

Abstract:
The Model Context Protocol (MCP) has emerged as the de facto standard for connecting Large Language Models (LLMs) to external data and tools, effectively functioning as the "USB-C for Agentic AI." While this decoupling of context and execution solves critical interoperability challenges, it introduces a profound new threat landscape where the boundary between epistemic errors (hallucinations) and security breaches (unauthorized actions) dissolves. This Systematization of Knowledge (SoK) aims to provide a comprehensive taxonomy of risks in the MCP ecosystem, distinguishing between adversarial security threats (e.g., indirect prompt injection, tool poisoning) and epistemic safety hazards (e.g., alignment failures in distributed tool delegation). We analyze the structural vulnerabilities of MCP primitives, specifically Resources, Prompts, and Tools, and demonstrate how "context" can be weaponized to trigger unauthorized operations in multi-agent environments. Furthermore, we survey state-of-the-art defenses, ranging from cryptographic provenance (ETDI) to runtime intent verification, and conclude with a roadmap for securing the transition from conversational chatbots to autonomous agentic operating systems.

#10 Exposing and Defending Membership Leakage in Vulnerability Prediction Models

Authors: Yihan Liao, Jacky Keung, Xiaoxue Ma, Jingyu Zhang, Yicheng Sun

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08291

Abstract:
Neural models for vulnerability prediction (VP) have achieved impressive performance by learning from large-scale code repositories. However, their susceptibility to Membership Inference Attacks (MIAs), where adversaries aim to infer whether a particular code sample was used during training, poses serious privacy concerns. While MIA has been widely investigated in NLP and vision domains, its effects on security-critical code analysis tasks remain underexplored. In this work, we conduct the first comprehensive analysis of MIA on VP models, evaluating the attack success across various architectures (LSTM, BiGRU, and CodeBERT) and feature combinations, including embeddings, logits, loss, and confidence. Our threat model aligns with black-box and gray-box settings where prediction outputs are observable, allowing adversaries to infer membership by analyzing output discrepancies between training and non-training samples. The empirical findings reveal that logits and loss are the most informative and vulnerable outputs for membership leakage. Motivated by these observations, we propose a Noise-based Membership Inference Defense (NMID), which is a lightweight defense module that applies output masking and Gaussian noise injection to disrupt adversarial inference. Extensive experiments demonstrate that NMID significantly reduces MIA effectiveness, lowering the attack AUC from nearly 1.0 to below 0.65, while preserving the predictive utility of VP models. Our study highlights critical privacy risks in code analysis and offers actionable defense strategies for securing AI-powered software systems.
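The two ingredients the abstract names for NMID, output masking and Gaussian noise injection, can be sketched as follows. This is an illustrative guess at the general idea, not the paper's module: the function names, the noise scale `sigma`, and the choice to expose only the top label with a coarsely rounded confidence are all assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def noisy_masked_output(logits, sigma=0.5, decimals=1, rng=None):
    """Hypothetical defense wrapper around a model's raw logits.

    1. Noise injection: add zero-mean Gaussian noise to every logit.
    2. Masking: return only the predicted label and a rounded confidence,
       withholding the raw logits and loss that the abstract identifies
       as the most informative signals for membership inference.
    """
    rng = rng or random.Random()
    noisy = [x + rng.gauss(0.0, sigma) for x in logits]
    probs = softmax(noisy)
    label = max(range(len(probs)), key=probs.__getitem__)
    return {"label": label, "confidence": round(probs[label], decimals)}


# Example: a binary vulnerable / non-vulnerable prediction.
print(noisy_masked_output([2.0, -1.0], sigma=0.5, rng=random.Random(0)))
```

A larger `sigma` hides more of the gap between training and non-training samples but also flips more predictions, which is the privacy and utility trade-off such a defense has to tune.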

#11 Secure Audio Embedding in Images using Nature-Inspired Optimization

Authors: Aman Kumar, Ankit Chaudhary

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08299

Abstract:
In today's digital world, protecting sensitive data is essential. Steganography hides the existence of secret data instead of its content, providing better security for multimedia communication. This paper proposes a new technique for hiding audio files inside images using the Least Significant Bit (LSB) method optimized by the Harris Hawks Optimization (HHO) algorithm. HHO is a nature-inspired metaheuristic that imitates the hunting behavior of Harris hawks to find optimal pixel positions for embedding data. The proposed method is evaluated using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Mean Square Error (MSE). Experimental results show that HHO achieves better image quality, robustness, and embedding capacity compared to existing methods.
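A minimal sketch of plain LSB embedding, with MSE and PSNR computed on the result, may help make the setup concrete. It does not implement Harris Hawks Optimization: the pixel positions here come from a seeded random shuffle, which is only a stand-in for whatever positions an optimizer would select. The flat grayscale pixel list and the short byte payload are likewise illustrative assumptions.

```python
import math
import random


def embed_lsb(pixels, payload, positions):
    """Write payload bits (MSB first) into the lowest bit of chosen pixels."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(positions):
        raise ValueError("payload does not fit in the given positions")
    stego = list(pixels)
    for bit, pos in zip(bits, positions):
        stego[pos] = (stego[pos] & ~1) | bit  # clear the LSB, then set it
    return stego


def extract_lsb(stego, n_bytes, positions):
    """Read n_bytes back from the same positions, in the same order."""
    bits = [stego[pos] & 1 for pos in positions[: n_bytes * 8]]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def mse(original, stego):
    """Mean squared error between two equally sized pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)


def psnr(original, stego):
    """Peak signal-to-noise ratio in dB for 8-bit pixels."""
    err = mse(original, stego)
    return math.inf if err == 0 else 10 * math.log10(255 ** 2 / err)


# Toy 16x16 "image" as a flat list of 8-bit grayscale values.
pixels = [(i * 37) % 256 for i in range(256)]
payload = b"hi!"  # stands in for a few audio bytes
# Stand-in for optimizer-chosen positions: a seeded random permutation.
positions = random.Random(42).sample(range(len(pixels)), len(pixels))

stego = embed_lsb(pixels, payload, positions)
recovered = extract_lsb(stego, len(payload), positions)
print(recovered, mse(pixels, stego), psnr(pixels, stego))
```

Because each pixel changes by at most one intensity level, MSE stays small and PSNR high. An optimizer's job in this setting would be to choose positions that keep the distortion low for larger payloads.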

#12 Privacy-Preserving Identifier Checking in 5G

privacy

Authors: Marcel D. S. K. Gräfenstein, Stefan Köpsell, Maryam Zarezadeh

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08310

Abstract:
Device identifiers like the International Mobile Equipment Identity (IMEI) are crucial for ensuring device integrity and meeting regulations in 4G and 5G networks. However, sharing these identifiers with Mobile Network Operators (MNOs) brings significant privacy risks by enabling long-term tracking and linking of user activities across sessions. In this work, we propose a privacy-preserving identifier checking method in 5G. This paper introduces a protocol for verifying device identifiers without exposing them to the network while maintaining the same functions as the 3GPP-defined Equipment Identity Register (EIR) process. The proposed solution modifies the PEPSI protocol for a Private Set Membership (PSM) setting using the BFV homomorphic encryption scheme. This lets User Equipment (UE) prove that its identifier is not on an operator's blacklist or greylist while ensuring that the MNO only learns the outcome of the verification. The protocol allows controlled deanonymization through an authorized Law Enforcement (LE) hook, striking a balance between privacy and accountability. Implementation results show that the system can perform online verification within five seconds and requires about 15 to 16 MB of communication per session. This confirms its practical use under post-quantum security standards. The findings highlight the promise of homomorphic encryption for managing identifiers while preserving privacy in 5G, laying the groundwork for scalable and compliant verification systems in future 6G networks.

#13 Developing a Strong CPS Defender: An Evolutionary Approach

Authors: Qingyuan Hu, Christopher M. Poskitt, Jun Sun, Yuqi Chen

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08320

Abstract:
Cyber-physical systems (CPSs) are used extensively in critical infrastructure, underscoring the need for anomaly detection systems that are able to catch even the most motivated attackers. Traditional anomaly detection techniques typically do 'one-off' training on datasets crafted by experts or generated by fuzzers, potentially limiting their ability to generalize to unseen and more subtle attack strategies. Stopping at this point misses a key opportunity: a defender can actively challenge the attacker to find more nuanced attacks, which in turn can lead to more effective detection capabilities. Building on this concept, we propose Evo-Defender, an evolutionary framework that iteratively strengthens CPS defenses through a dynamic attacker-defender interaction. Evo-Defender includes a smart attacker that employs guided fuzzing to explore diverse, non-redundant attack strategies, while the self-evolving defender uses incremental learning to adapt to new attack patterns. We implement Evo-Defender on two realistic CPS testbeds: the Tennessee Eastman process and a Robotic Arm Assembly Workstation, injecting over 600 attack scenarios. In end-to-end attack detection experiments, Evo-Defender achieves up to 2.7% higher performance than state-of-the-art baselines on unseen scenarios, while utilizing training data more efficiently for faster and more robust detection.

#14 Argus: A Multi-Agent Sensitive Information Leakage Detection Framework Based on Hierarchical Reference Relationships

agent

Authors: Bin Wang, Hui Li, Liyang Zhang, Qijia Zhuang, Ao Yang, Dong Zhang, Xijun Luo, Bing Lin

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08326

Abstract:
Sensitive information leakage in code repositories has emerged as a critical security challenge. Traditional detection methods that rely on regular expressions, fingerprint features, and high-entropy calculations often suffer from high false-positive rates. This not only reduces detection efficiency but also significantly increases the manual screening burden on developers. Recent advances in large language models (LLMs) and multi-agent collaborative architectures have demonstrated remarkable potential for tackling complex tasks, offering a novel technological perspective for sensitive information detection. In response to these challenges, we propose Argus, a multi-agent collaborative framework for detecting sensitive information. Argus employs a three-tier detection mechanism that integrates key content, file context, and project reference relationships to effectively reduce false positives and enhance overall detection accuracy. To comprehensively evaluate Argus in real-world repository environments, we developed two new benchmarks, one to assess genuine leak detection capabilities and another to evaluate false-positive filtering performance. Experimental results show that Argus achieves up to 94.86% accuracy in leak detection, with a precision of 96.36%, recall of 94.64%, and an F1 score of 0.955. Moreover, the analysis of 97 real repositories incurred a total cost of only $2.2. All code implementations and related datasets are publicly available at https://github.com/TheBinKing/Argus-Guard for further research and application.
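The "high-entropy calculation" baseline mentioned in the abstract can be sketched with a Shannon entropy check over candidate tokens. This illustrates the traditional heuristic, not Argus itself. The threshold of 4.0 bits per character and the minimum length of 20 are arbitrary assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Entropy in bits per character of the string's character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_high_entropy(token, threshold=4.0, min_length=20):
    """Hypothetical secret heuristic: long tokens with near-random characters."""
    return len(token) >= min_length and shannon_entropy(token) >= threshold


for token in ["a" * 40, "abcdefghijklmnopqrstuvwxyz012345"]:
    print(token, round(shannon_entropy(token), 3), looks_high_entropy(token))
```

A check like this also fires on hashes, UUIDs, and test fixtures that are not secrets, which is one plausible reason such detectors produce the false positives the abstract describes.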

#15 USCSA: Evolution-Aware Security Analysis for Proxy-Based Upgradeable Smart Contracts

Authors: Xiaoqi Li, Lei Xie, Wenkai Li, Zongwei Li

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08372

Abstract:
When upgrading smart contracts on blockchain systems, it is essential to consider the continuity of the upgrade and subsequent maintenance. In practice, upgrade operations often introduce new vulnerabilities. To address this, we propose an Upgradable Smart Contract Security Analyzer, USCSA, which evaluates the risks associated with the upgrade process using Abstract Syntax Tree (AST) differential analysis. We collected and analyzed 3,546 cases of vulnerabilities in upgradable contracts, covering common vulnerability categories such as reentrancy, access control flaws, and integer overflow. Experimental results show that USCSA achieves an accuracy of 92.3%, recall of 89.7%, and F1-score of 91.0% in detecting upgrade-induced vulnerabilities. In addition, the efficiency of mapping high-risk changes improved by 30% over the conventional approach. As a result, USCSA offers a significant advantage in improving the security and integrity of upgradable smart contracts, providing a novel and efficient solution for security audits of blockchain applications.

#16 Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs

Authors: Yinan Zhong, Qianhao Miao, Yanjiao Chen, Jiangyi Deng, Yushi Cheng, Wenyuan Xu

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08417

Abstract:
Large Language Models (LLMs) have been integrated into many applications (e.g., web agents) to perform more sophisticated tasks. However, LLM-empowered applications are vulnerable to Indirect Prompt Injection (IPI) attacks, where instructions are injected via untrustworthy external data sources. This paper presents Rennervate, a defense framework to detect and prevent IPI attacks. Rennervate leverages attention features to detect the covert injection at a fine-grained token level, enabling precise sanitization that neutralizes IPI attacks while maintaining LLM functionalities. Specifically, the token-level detector is materialized with a 2-step attentive pooling mechanism, which aggregates attention heads and response tokens for IPI detection and sanitization. Moreover, we establish a fine-grained IPI dataset, FIPI, to be open-sourced to support further research. Extensive experiments verify that Rennervate outperforms 15 commercial and academic IPI defense methods, achieving high precision on 5 LLMs and 6 datasets. We also demonstrate that Rennervate is transferable to unseen attacks and robust against adaptive adversaries.

#17 LLM-based Vulnerable Code Augmentation: Generate or Refactor?

Authors: Dyna Soumhane Ouchebara, Stéphane Dupont

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08493

Abstract:
Vulnerability code-bases often suffer from severe imbalance, limiting the effectiveness of Deep Learning-based vulnerability classifiers. Data Augmentation could help solve this by mitigating the scarcity of under-represented CWEs. In this context, we investigate LLM-based augmentation for vulnerable functions, comparing controlled generation of new vulnerable samples with semantics-preserving refactoring of existing ones. Using Qwen2.5-Coder to produce augmented data and CodeBERT as a vulnerability classifier on the SVEN dataset, we find that our approaches are indeed effective in enriching vulnerable code-bases through a simple process and with reasonable quality, and that a hybrid strategy best boosts vulnerability classifiers' performance.

#18 Labeled Delegated PSI and its Applications in the Public Sector

Authors: Kristof Verslype, Florian Kerschbaum, Cyprien Delpech de Saint Guilhem, Bart De Decker, Jorn Lapon

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08558

Abstract:
Sensitive citizen data, such as social, medical, and fiscal data, is heavily fragmented across public bodies and the private domain. Mining the combined data sets allows for new insights that otherwise remain hidden. Examples are improved healthcare, fraud detection, and evidence-based policy making. (Multi-party) delegated private set intersection (D-PSI) is a privacy-enhancing technology to link data across multiple data providers using a data collector. However, before it can be deployed in these use cases, it needs to be enhanced with additional functions, e.g., securely delivering payload only for elements in the intersection. Although there has been recent progress in the communication and computation requirements of D-PSI, these practical obstacles have not yet been addressed. This paper is the result of a collaboration with a governmental organization responsible for collecting, linking, and pseudonymizing data. Based on their requirements, we design a new D-PSI protocol with composable output functions, including encrypted payload and pseudonymized identifiers. We show that our protocol is secure in the standard model against colluding semi-honest data providers and against a non-colluding, possibly malicious independent party, the data collector. Hence, it allows data from multiple data providers to be privately linked and collected, making it suitable for deployment in these public-sector use cases.

#19 Integrating Public Input and Technical Expertise for Effective Cybersecurity Policy Formulation

Authors: Hlekane Ngobeni, Mike Wa Nkongolo

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08575

Abstract:
Digital transformation and the increased use of technology bring increased cyber vulnerabilities, which compromise national security. Cyber threats become more sophisticated as technology advances. This emphasises the need for strong risk mitigation strategies. Defining strong and robust cybersecurity policies requires an integrated approach that balances technical expertise with public input. This paper explores strategies used to balance technical expertise and public input to develop effective and robust cybersecurity policies. It also studies how the effective integration of technical expertise with public input is critical to developing effective strategies and resilient cybersecurity frameworks that strengthen national security. The lack of a holistic approach and of collaborative efforts in cybersecurity can hinder the effectiveness of cybersecurity policies. This paper follows a systematic literature review with bibliometric analysis, using the PRISMA methodology, to explore how technical expertise and public input can be integrated to guide cybersecurity policy making. The thematic analysis identified five important themes in developing effective cybersecurity policies: Multi-Stakeholder Involvement and Human Centric Approaches (MSI & HCA), Governance and Policy Frameworks (GPF), Technical Infrastructure (TI), Evaluation and Compliance (EC), and Legal Rights and Sovereignty (LRS). The synthesis shows that collaborative efforts have not been adequately explored, which undermines the effectiveness of cybersecurity policies. The findings suggest that inclusive, flexible governance strategies that integrate public input at every stage are necessary for future cybersecurity policy research and practice, which must shift away from a primarily technical and legal perspective.

#20 An Explainable AI Model for the Detecting Malicious Smart Contracts Based on EVM Opcode Based Features

Authors: Roopak Surendran

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08782

Abstract:
Hackers may create malicious Solidity programs and deploy them on the Ethereum blockchain. These malicious smart contracts try to attack legitimate programs by exploiting vulnerabilities such as reentrancy, tx.origin attacks, bad randomness, delegatecall, and so on. This may lead to draining of funds, denial of service, and so on. Hence, it is necessary to identify and block malicious smart contracts before they are deployed to the blockchain. In this paper, we propose an ML-based malicious smart contract detection mechanism that analyzes EVM opcodes. After balancing the opcode frequency dataset with the SMOTE algorithm, we transformed opcode frequencies into binary values (0, 1) using an entropy-based supervised binning method. Then, an explainable AI model is trained with the proposed binary opcode-based features. From the implementations, we found that the proposed mechanism can detect 99% of malicious smart contracts with a false positive rate of only 0.01. Finally, we incorporated the LIME algorithm into our classifier to justify its predictions. We found that the LIME algorithm can explain why a particular smart contract app is declared malicious by our ML classifier based on the binary values of EVM opcodes.

#21 Democratizing ML for Enterprise Security: A Self-Sustained Attack Detection Framework

Authors: Sadegh Momeni, Ge Zhang, Birkett Huber, Hamza Harkous, Sam Lipton, Benoit Seguin, Yanis Pavlidis

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08802

Abstract:
Despite advancements in machine learning for security, rule-based detection remains prevalent in Security Operations Centers due to the resource intensiveness and skill gap associated with ML solutions. While traditional rule-based methods offer efficiency, their rigidity leads to high false positives or negatives and requires continuous manual maintenance. This paper proposes a novel, two-stage hybrid framework to democratize ML-based threat detection. The first stage employs intentionally loose YARA rules for coarse-grained filtering, optimized for high recall. The second stage utilizes an ML classifier to filter out false positives from the first stage's output. To overcome data scarcity, the system leverages Simula, a seedless synthetic data generation framework, enabling security analysts to create high-quality training datasets without extensive data science expertise or pre-labeled examples. A continuous feedback loop incorporates real-time investigation results to adaptively tune the ML model, preventing rule degradation. This proposed model with active learning has been rigorously tested for a prolonged time in a production environment spanning tens of thousands of systems. The system handles initial raw log volumes often reaching 250 billion events per day, significantly reducing them through filtering and ML inference to a handful of daily tickets for human investigation. Live experiments over an extended timeline demonstrate a general improvement in the model's precision over time due to the active learning feature. This approach offers a self-sustained, low-overhead, and low-maintenance solution, allowing security professionals to guide model learning as expert "teachers".
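
The two-stage shape described in this abstract (a loose, high-recall rule pass followed by a classifier that prunes false positives) can be sketched as below. This is only an illustration under stated assumptions: plain regular expressions stand in for YARA rules, the `score` callable stands in for the trained ML classifier, and the names `two_stage_filter`, `loose_rules`, and `threshold` are hypothetical rather than taken from the paper.

```python
import re
from typing import Callable, Iterable, List, Pattern


def two_stage_filter(
    events: Iterable[str],
    loose_rules: List[Pattern[str]],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Return the events that pass both stages.

    Stage 1 keeps anything matched by at least one loose rule (high recall).
    Stage 2 keeps only candidates the scorer rates at or above `threshold`.
    """
    # Stage 1: coarse filtering; regexes are a stand-in for YARA rules.
    candidates = [e for e in events if any(r.search(e) for r in loose_rules)]
    # Stage 2: `score` is a placeholder for a trained classifier's probability.
    return [e for e in candidates if score(e) >= threshold]


if __name__ == "__main__":
    rules = [re.compile(r"powershell", re.IGNORECASE)]
    # Toy scorer used only for the demo; a real system would call a model here.
    toy_score = lambda e: 0.9 if "-enc" in e else 0.1
    log = ["powershell -enc AAAA", "powershell Get-Date", "ls -la"]
    print(two_stage_filter(log, rules, toy_score))
```

Keeping the scorer as a parameter means the second stage can be retrained or swapped without touching the rule pass, which is one plausible way to accommodate the feedback loop the abstract mentions.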

#22 PrivTune: Efficient and Privacy-Preserving Fine-Tuning of Large Language Models via Device-Cloud Collaboration

privacy

Authors: Yi Liu, Weixiang Han, Chengjun Cai, Xingliang Yuan, Cong Wang

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08809

Abstract:
With the rise of large language models, service providers offer language models as a service, enabling users to fine-tune customized models via uploaded private datasets. However, this raises concerns about sensitive data leakage. Prior methods, relying on differential privacy within device-cloud collaboration frameworks, struggle to balance privacy and utility, exposing users to inference attacks or degrading fine-tuning performance. To address this, we propose PrivTune, an efficient and privacy-preserving fine-tuning framework via Split Learning (SL). The key idea of PrivTune is to inject crafted noise into token representations from the SL bottom model, making each token resemble the $n$-hop indirect neighbors. PrivTune formulates this as an optimization problem to compute the optimal noise vector, aligning with defense-utility goals. On this basis, it then adjusts the parameters (i.e., mean) of the $d_\chi$-Privacy noise distribution to align with the optimization direction and scales the noise according to token importance to minimize distortion. Experiments on five datasets (covering both classification and generation tasks) against three embedding inversion and three attribute inference attacks show that, using RoBERTa on the Stanford Sentiment Treebank dataset, PrivTune reduces the attack success rate to 10% with only a 3.33% drop in utility performance, outperforming state-of-the-art baselines.

#23 Secure and Privacy-Preserving Federated Learning for Next-Generation Underground Mine Safety

privacy

Authors: Mohamed Elmahallawy, Sanjay Madria, Samuel Frimpong

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08862

Abstract:
Underground mining operations depend on sensor networks to monitor critical parameters such as temperature, gas concentration, and miner movement, enabling timely hazard detection and safety decisions. However, transmitting raw sensor data to a centralized server for machine learning (ML) model training raises serious privacy and security concerns. Federated Learning (FL) offers a promising alternative by enabling decentralized model training without exposing sensitive local data. Yet, applying FL in underground mining presents unique challenges: (i) Adversaries may eavesdrop on shared model updates to launch model inversion or membership inference attacks, compromising data privacy and operational safety; (ii) Non-IID data distributions across mines and sensor noise can hinder model convergence. To address these issues, we propose FedMining--a privacy-preserving FL framework tailored for underground mining. FedMining introduces two core innovations: (1) a Decentralized Functional Encryption (DFE) scheme that keeps local models encrypted, thwarting unauthorized access and inference attacks; and (2) a balancing aggregation mechanism to mitigate data heterogeneity and enhance convergence. Evaluations on real-world mining datasets demonstrate FedMining's ability to safeguard privacy while maintaining high model accuracy and achieving rapid convergence with reduced communication and computation overhead. These advantages make FedMining both secure and practical for real-time underground safety monitoring.

#24 Decentralized Trust for Space AI: Blockchain-Based Federated Learning Across Multi-Vendor LEO Satellite Networks

Authors: Mohamed Elmahallawy, Asma Jodeiri Akbarfam

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08882

Abstract:
The rise of space AI is reshaping government and industry through applications such as disaster detection, border surveillance, and climate monitoring, powered by massive data from commercial and governmental low Earth orbit (LEO) satellites. Federated satellite learning (FSL) enables joint model training without sharing raw data, but suffers from slow convergence due to intermittent connectivity and introduces critical trust challenges--where biased or falsified updates can arise across satellite constellations, including those injected through cyberattacks on inter-satellite or satellite-ground communication links. We propose OrbitChain, a blockchain-backed framework that empowers trustworthy multi-vendor collaboration in LEO networks. OrbitChain (i) offloads consensus to high-altitude platforms (HAPs) with greater computational capacity, (ii) ensures transparent, auditable provenance of model updates from different orbits owned by different vendors, and (iii) prevents manipulated or incomplete contributions from affecting global FSL model aggregation. Extensive simulations show that OrbitChain reduces computational and communication overhead while improving privacy, security, and global model accuracy. Its permissioned proof-of-authority ledger finalizes over 1000 blocks with sub-second latency (0.16 s, 0.26 s, and 0.35 s for 1-of-5, 3-of-5, and 5-of-5 quorums, respectively). Moreover, OrbitChain reduces convergence time by up to 30 hours on real satellite datasets compared to single-vendor, demonstrating its effectiveness for real-time, multi-vendor learning. Our code is available at https://github.com/wsu-cyber-security-lab-ai/OrbitChain.git

#25 Improved Pseudorandom Codes from Permuted Puzzles

Authors: Miranda Christ, Noah Golowich, Sam Gunn, Ankur Moitra, Daniel Wichs

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08918

Abstract:
Watermarks are an essential tool for identifying AI-generated content. Recently, Christ and Gunn (CRYPTO '24) introduced pseudorandom error-correcting codes (PRCs), which are equivalent to watermarks with strong robustness and quality guarantees. A PRC is a pseudorandom encryption scheme whose decryption algorithm tolerates a high rate of errors. Pseudorandomness ensures quality preservation of the watermark, and error tolerance of decryption translates to the watermark's ability to withstand modification of the content. In the short time since the introduction of PRCs, several works (NeurIPS '24, RANDOM '25, STOC '25) have proposed new constructions. Curiously, all of these constructions are vulnerable to quasipolynomial-time distinguishing attacks. Furthermore, all lack robustness to edits over a constant-sized alphabet, which is necessary for a meaningfully robust LLM watermark. Lastly, they lack robustness to adversaries who know the watermarking detection key. Until now, it was not clear whether any of these properties was achievable individually, let alone together. We construct pseudorandom codes that achieve all of the above: plausible subexponential pseudorandomness security, robustness to worst-case edits over a binary alphabet, and robustness against even computationally unbounded adversaries that have the detection key. Pseudorandomness rests on a new assumption that we formalize, the permuted codes conjecture, which states that a distribution of permuted noisy codewords is pseudorandom. We show that this conjecture is implied by the permuted puzzles conjecture used previously to construct doubly efficient private information retrieval. To give further evidence, we show that the conjecture holds against a broad class of simple distinguishers, including read-once branching programs.

#26 Command & Control (C2) Traffic Detection Via Algorithm Generated Domain (Dga) Classification Using Deep Learning And Natural Language Processing

Authors: Maria Milena Araujo Felix

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.07866

Abstract:
The sophistication of modern malware, specifically regarding communication with Command and Control (C2) servers, has rendered static blacklist-based defenses obsolete. The use of Domain Generation Algorithms (DGA) allows attackers to generate thousands of dynamic addresses daily, hindering blocking by traditional firewalls. This paper aims to propose and evaluate a method for detecting DGA domains using Deep Learning and Natural Language Processing (NLP) techniques. The methodology consisted of collecting a hybrid database containing 50,000 legitimate and 50,000 malicious domains, followed by the extraction of lexical features and the training of a Recurrent Neural Network (LSTM). Results demonstrated that while statistical entropy analysis is effective for simple DGAs, the Neural Network approach presents superiority in detecting complex patterns, reaching 97.2% accuracy and reducing the false positive rate in ambiguous lawful traffic scenarios.
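
The statistical entropy baseline mentioned in this abstract can be illustrated with a character-level Shannon entropy score. The sketch below is not the paper's implementation: the function names, the choice to score only the leftmost label, and the 3.5-bit threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of the characters in `label`."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_generated(domain: str, threshold: float = 3.5) -> bool:
    """Flag a domain whose leftmost label has entropy above `threshold`.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    label = domain.lower().split(".")[0]
    return shannon_entropy(label) > threshold


if __name__ == "__main__":
    for name in ["google.com", "xj4k2p9qz7wm3vb8.net"]:
        print(name, round(shannon_entropy(name.split(".")[0]), 3), looks_generated(name))
```

A score like this only measures how evenly characters are distributed, so word-based DGAs that concatenate dictionary terms can fall below any fixed threshold. That is consistent with the abstract's remark that entropy works for simple DGAs while the LSTM handles more complex patterns.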

#27 CapsuleFS A Multi-credential DataCapsule Filesystem

Authors: Qingyang Hu, Yucheng Huang, Manshi Yang

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08067

Abstract:
CapsuleFS (CFS) is the first filesystem to integrate multi-credential functionality within a POSIX-compliant framework, utilizing DataCapsule as the storage provider. The system builds on the Global Data Plane in the area of edge computing. Our design and implementation of CFS fulfill the objective of providing a multi-credential Common Access API. The architecture of CFS is divided into three components: first, the DataCapsule server, tasked with the storage, dissemination, and replication of DataCapsules on the edge; second, the middleware, which runs in a Trusted Execution Environment and is responsible for enforcing and managing write permissions and requests; and finally, the client component, a POSIX-compliant filesystem that is adaptable and operational across many architectures. Experimental evaluations of CFS reveal that, while its read and write performance is comparatively modest, it upholds a high degree of functional correctness. This positions CFS as a viable candidate for real-world software development scenarios. The paper also outlines potential future enhancements aimed at making CFS more practical for software development.

#28 An Efficient Secret Communication Scheme for the Bosonic Wiretap Channel

Authors: Esther Hänggi, Iyán Méndez Veiga, Ligong Wang

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08623

Abstract:
We propose a new secret communication scheme over the bosonic wiretap channel. It uses readily available hardware such as lasers and direct photodetectors. The scheme is based on randomness extractors, pulse-position modulation, and Reed-Solomon codes and is therefore computationally efficient. It is secure against an eavesdropper performing coherent joint measurements on the quantum states it observes. In the low-photon-flow limit, the scheme is asymptotically optimal and achieves the same dominant term as the secrecy capacity of the same channel.
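
Of the three building blocks named in this abstract, pulse-position modulation is the simplest to show in isolation: each group of log2(M) bits selects which one of M time slots carries the single pulse. The sketch below covers only that mapping over an idealised noiseless detector. It omits the randomness extractor and the Reed-Solomon code, and the function names are hypothetical.

```python
from typing import List


def ppm_encode(bits: str, slots: int) -> List[List[int]]:
    """Map each group of log2(slots) bits to a frame with exactly one pulse."""
    k = slots.bit_length() - 1
    if slots < 2 or (1 << k) != slots:
        raise ValueError("slots must be a power of two, at least 2")
    if len(bits) % k != 0:
        raise ValueError("bit string length must be a multiple of log2(slots)")
    frames = []
    for i in range(0, len(bits), k):
        position = int(bits[i:i + k], 2)  # slot index that carries the pulse
        frames.append([1 if j == position else 0 for j in range(slots)])
    return frames


def ppm_decode(frames: List[List[int]]) -> str:
    """Recover the bit string, assuming every frame holds exactly one pulse."""
    if not frames:
        return ""
    k = len(frames[0]).bit_length() - 1
    return "".join(format(frame.index(1), f"0{k}b") for frame in frames)


if __name__ == "__main__":
    frames = ppm_encode("0110", slots=4)
    print(frames)
    print(ppm_decode(frames))
```

With a real photodetector a frame may register no pulse or more than one. Handling those cases is the kind of job an outer error-correcting code would take on, and this sketch does not attempt it.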

#29 Can the GPC standard eliminate consent banners in the EU?

Authors: Sebastian Zimmeck, Harshvardhan J. Pandit, Frederik Zuiderveen Borgesius, Cristiana Teixeira Santos, Konrad Kollnig, Robin Berjon

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08856

Abstract:
In the EU, the General Data Protection Regulation and the ePrivacy Directive mandate informed consent for behavioural advertising and use of tracking technologies. However, the ubiquity of consent banners and popups has led to widespread consent fatigue and questions regarding the effectiveness of these mechanisms in protecting users' data. In contrast, users in California and other US jurisdictions can utilize Global Privacy Control (GPC), a browser-based privacy signal that automatically broadcasts a legally binding opt-out request to websites. In this paper we explore whether, and to what extent, GPC can be adapted to the EU legal framework to mitigate consent fatigue and improve privacy protections for EU residents. We analyse GPC as a technical specification standardized at the World Wide Web Consortium and examine its standing under current EU data protection law. Generally, GPC can be mapped to the various legal bases for processing under the GDPR. However, our evaluation also identifies friction between the GPC specification and EU data protection law as it stands. These discrepancies are resolvable and present an opportunity for EU legislators and regulators to interpret GPC in alignment with EU data protection requirements, particularly, considering the European Commission's recent Digital Omnibus proposal. We conclude that while GPC is not a silver bullet, its adoption -- supported by clear authoritative guidance and specification updates -- can offer a pragmatic path toward more automated and effective data protection in the EU.
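
On the technical side, the GPC specification has the browser send the signal as a `Sec-GPC: 1` request header. A server-side check could look like the sketch below. The function name and the plain header mapping are assumptions, and the code says nothing about the legal effect of the signal, which is the question the paper examines.

```python
from typing import Mapping


def gpc_signal_present(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries `Sec-GPC: 1`.

    Header names are matched case-insensitively; any value other than "1"
    is treated as no signal.
    """
    for name, value in headers.items():
        if name.lower() == "sec-gpc":
            return value.strip() == "1"
    return False


if __name__ == "__main__":
    print(gpc_signal_present({"Sec-GPC": "1"}))          # signal present
    print(gpc_signal_present({"User-Agent": "example"}))  # no signal
```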

#30 NecoFuzz: Effective Fuzzing of Nested Virtualization via Fuzz-Harness Virtual Machines

Authors: Reima Ishii, Takaaki Fukai, Takahiro Shinagawa

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08858

Abstract:
Nested virtualization is now widely supported by major cloud vendors, allowing users to leverage virtualization-based technologies in the cloud. However, supporting nested virtualization significantly increases host hypervisor complexity and introduces a new attack surface in cloud platforms. While many prior studies have explored hypervisor fuzzing, none has explicitly addressed nested virtualization due to the challenge of generating effective virtual machine (VM) instances with a vast state space as fuzzing inputs. We present NecoFuzz, the first fuzzing framework that systematically targets nested virtualization-specific logic in hypervisors. NecoFuzz synthesizes executable fuzz-harness VMs with internal states near the boundary between valid and invalid, guided by an approximate model of hardware-assisted virtualization specifications. Since vulnerabilities in nested virtualization often stem from incorrect handling of unexpected VM states, this specification-guided, boundary-oriented generation significantly improves coverage of security-critical code across different hypervisors. We implemented NecoFuzz on Intel VT-x and AMD-V by extending AFL++ to support fuzz-harness VMs. NecoFuzz achieved 84.7% and 74.2% code coverage for nested virtualization-specific code on Intel VT-x and AMD-V, respectively, and uncovered six previously unknown vulnerabilities across three hypervisors, including two assigned CVEs.

#31 Differentially Private Synthetic Data Generation Using Context-Aware GANs

privacy, synthetic data

Authors: Anantaa Kotal, Anupam Joshi

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08869

Abstract:
The widespread use of big data across sectors has raised major privacy concerns, especially when sensitive information is shared or analyzed. Regulations such as GDPR and HIPAA impose strict controls on data handling, making it difficult to balance the need for insights with privacy requirements. Synthetic data offers a promising solution by creating artificial datasets that reflect real patterns without exposing sensitive information. However, traditional synthetic data methods often fail to capture complex, implicit rules that link different elements of the data and are essential in domains like healthcare. They may reproduce explicit patterns but overlook domain-specific constraints that are not directly stated yet crucial for realism and utility. For example, prescription guidelines that restrict certain medications for specific conditions or prevent harmful drug interactions may not appear explicitly in the original data. Synthetic data generated without these implicit rules can lead to medically inappropriate or unrealistic profiles. To address this gap, we propose ContextGAN, a Context-Aware Differentially Private Generative Adversarial Network that integrates domain-specific rules through a constraint matrix encoding both explicit and implicit knowledge. The constraint-aware discriminator evaluates synthetic data against these rules to ensure adherence to domain constraints, while differential privacy protects sensitive details from the original data. We validate ContextGAN across healthcare, security, and finance, showing that it produces high-quality synthetic data that respects domain rules and preserves privacy. Our results demonstrate that ContextGAN improves realism and utility by enforcing domain constraints, making it suitable for applications that require compliance with both explicit patterns and implicit rules under strict privacy guarantees.

#32 PP-GWAS: Privacy Preserving Multi-Site Genome-wide Association Studies

privacy

Authors: Arjhun Swaminathan, Anika Hannemann, Ali Burak Ünal, Nico Pfeifer, Mete Akgün

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2410.08122

Abstract:
Genome-wide association studies are pivotal in understanding the genetic underpinnings of complex traits and diseases. Collaborative, multi-site GWAS aim to enhance statistical power but face obstacles due to the sensitive nature of genomic data sharing. Current state-of-the-art methods provide a privacy-focused approach utilizing computationally expensive methods such as Secure Multi-Party Computation and Homomorphic Encryption. In this context, we present a novel algorithm, PP-GWAS, designed to improve upon existing standards in terms of computational efficiency and scalability without sacrificing data privacy. This algorithm employs randomized encoding within a distributed architecture to perform stacked ridge regression on a Linear Mixed Model to ensure rigorous analysis. Experimental evaluation with real-world and synthetic data indicates that PP-GWAS can achieve computational speeds twice as fast as similar state-of-the-art algorithms while using fewer computational resources, all while adhering to a robust security model that caters to an all-but-one semi-honest adversary setting. We have assessed its performance using various datasets, emphasizing its potential in facilitating more efficient and private genomic analyses.

#33 Private Synthetic Data Generation in Bounded Memory

privacy, synthetic data

Authors: Rayne Holland, Seyit Camtepe, Chandra Thapa, Minhui Xue

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2412.09756

Abstract:
We propose $\mathtt{PrivHP}$, a lightweight synthetic data generator with differential privacy guarantees. $\mathtt{PrivHP}$ uses a novel hierarchical decomposition that approximates the input's cumulative distribution function (CDF) in bounded memory. It balances hierarchy depth, noise addition, and pruning of low-frequency subdomains while preserving frequent ones. Private sketches estimate subdomain frequencies efficiently without full data access. A key feature is the pruning parameter $k$, which controls the trade-off between space and utility. We define the skew measure $\mathtt{tail}_k$, capturing all but the top $k$ subdomain frequencies. Given a dataset $\mathcal{X}$, $\mathtt{PrivHP}$ uses $M=\mathcal{O}(k\log^2 |\mathcal{X}|)$ space and, for input domain $\Omega = [0,1]$, ensures $\varepsilon$-differential privacy. It yields a generator with expected Wasserstein distance \[ \mathcal{O}\left(\frac{\log^2 M}{\varepsilon n} + \frac{\|\mathtt{tail}_k(\mathcal{X})\|_1}{M n}\right) \] from the empirical distribution. This parameterized trade-off offers a level of flexibility unavailable in prior work. We also provide interpretable utility bounds that account for hierarchy depth, privacy noise, pruning, and frequency estimation errors.

#34 AEIOU: A Unified Defense Framework against NSFW Prompts in Text-to-Image Models

Authors: Yiming Wang, Jiahao Chen, Qingming Li, Tong Zhang, Rui Zeng, Xing Yang, Shouling Ji

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2412.18123

Abstract:
As text-to-image (T2I) models advance and gain widespread adoption, their associated safety concerns are becoming increasingly critical. Malicious users exploit these models to generate Not-Safe-for-Work (NSFW) images using harmful or adversarial prompts, underscoring the need for effective safeguards to ensure the integrity and compliance of model outputs. However, existing detection methods often exhibit low accuracy and inefficiency. In this paper, we propose AEIOU, a defense framework that is adaptable, efficient, interpretable, optimizable, and unified against NSFW prompts in T2I models. AEIOU extracts NSFW features from the hidden states of the model's text encoder, utilizing the separable nature of these features to detect NSFW prompts. The detection process is efficient, requiring minimal inference time. AEIOU also offers real-time interpretation of results and supports optimization through data augmentation techniques. The framework is versatile, accommodating various T2I architectures. Our extensive experiments show that AEIOU significantly outperforms both commercial and open-source moderation tools, achieving over 95% accuracy across all datasets and improving efficiency by at least tenfold. It effectively counters adaptive attacks and excels in few-shot and multi-label scenarios.

#35 Dark Deceptions in DHCP: Dismantling Network Defenses

Authors: Robert Dilworth

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2502.10646

Abstract:
This paper explores vulnerabilities in the Dynamic Host Configuration Protocol (DHCP) and their implications on the Confidentiality, Integrity, and Availability (CIA) Triad. Through an analysis of various attacks, including DHCP Starvation, Rogue DHCP Servers, Replay Attacks, and TunnelVision exploits, the paper provides a taxonomic classification of threats, assesses risks, and proposes appropriate controls. The discussion also highlights the dangers of VPN decloaking through DHCP exploits and underscores the importance of safeguarding network infrastructures. By bringing awareness to the TunnelVision exploit, this paper aims to mitigate risks associated with these prevalent vulnerabilities.

#36 Scalable Differentially Private Sketches under Continual Observation

privacy

Authors: Rayne Holland

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2507.03361

Abstract:
Linear sketches are fundamental tools in data stream analytics. They are notable for supporting both approximate frequency queries and heavy hitter detection with bounded trade-offs for error and memory. Importantly, on streams that contain sensitive information, linear sketches can be easily privatized with the injection of a suitable amount of noise. This process is efficient in the single release model, where the output is released only at the end of the stream. In this setting, it suffices to add noise to the sketch once. In contrast, in the continual observation model, where the output is released at every time-step, fresh noise needs to be added to the sketch before each release. This creates an additional computational overhead. To address this, we introduce Lazy Sketch, a novel differentially private sketching method that employs lazy updates, perturbing and modifying only a small portion of the sketch at each step. Compared to prior work, we reduce the update complexity by a factor of $O(w)$, where $w$ is the width of the sketch. Experiments demonstrate that our method increases throughput by up to 250x over prior work, making continual observation differential privacy practical for high-speed streaming applications.
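
The single-release baseline described in this abstract, where noise is added to the sketch once at the end of the stream, can be sketched with a Count-Min sketch and Laplace noise. This is not Lazy Sketch. It assumes neighbouring streams differ by one added or removed item, so each item touches one counter per row and the L1 sensitivity equals the sketch depth. The class name and the salted SHA-256 hashing are illustrative choices.

```python
import hashlib
import random
from typing import List


class NoisyCountMin:
    """Count-Min sketch with a one-shot Laplace-noised release."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.table = [[0.0] * width for _ in range(depth)]

    def _column(self, row: int, item: str) -> int:
        # Illustrative hashing choice: a per-row salted SHA-256 digest.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, item: str) -> None:
        # Each arriving item increments exactly one counter in every row.
        for row in range(self.depth):
            self.table[row][self._column(row, item)] += 1.0

    def estimate(self, table: List[List[float]], item: str) -> float:
        # Count-Min query: the minimum of the item's counters over all rows.
        return min(table[row][self._column(row, item)] for row in range(self.depth))

    def release(self, epsilon: float, rng: random.Random) -> List[List[float]]:
        # Laplace scale = sensitivity / epsilon, with sensitivity = depth.
        scale = self.depth / epsilon

        def laplace() -> float:
            # The difference of two i.i.d. exponentials is Laplace(0, scale).
            return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

        return [[cell + laplace() for cell in row] for row in self.table]


if __name__ == "__main__":
    sketch = NoisyCountMin(width=64, depth=4)
    for token in ["a"] * 5 + ["b"] * 2:
        sketch.update(token)
    noisy = sketch.release(epsilon=1.0, rng=random.Random(0))
    print(sketch.estimate(sketch.table, "a"), sketch.estimate(noisy, "a"))
```

Under continual observation this `release` would have to run at every time step and perturb all `width * depth` cells each time, which is the overhead the abstract says lazy updates are meant to reduce.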

#37 3S-Attack: Spatial, Spectral and Semantic Invisible Backdoor Attack Against DNN Models

backdoor

Authors: Jianyao Yin, Luca Arnaboldi, Honglong Chen, Pascal Berrang, Mark Ryan

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2507.10733

Abstract:
Backdoor attacks implant hidden behaviors into models by poisoning training data or modifying the model directly. These attacks aim to maintain high accuracy on benign inputs while causing misclassification when a specific trigger is present. While existing studies have explored stealthy triggers in spatial and spectral domains, few incorporate the semantic domain. In this paper, we propose 3S-attack, a novel backdoor attack which is stealthy across the spatial, spectral, and semantic domains. The key idea is to exploit the semantic features of benign samples as triggers, using Gradient-weighted Class Activation Mapping (Grad-CAM) and a preliminary model for extraction. We then embed the trigger in the spectral domain and apply pixel-level restrictions in the spatial domain. This process minimizes the distance between poisoned and benign samples, making the attack harder to detect by existing defenses and human inspection. It also exposes a vulnerability at the intersection of robustness and semantic interpretability, revealing that models can be manipulated to act in semantically consistent yet malicious ways. Extensive experiments on various datasets, along with theoretical analysis, demonstrate the stealthiness of 3S-attack and highlight the need for stronger defenses to ensure AI security.

#38 ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls

agent

Authors: Sanket Badhe

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2508.06457

Abstract:
Large Language Models (LLMs) have demonstrated impressive fluency and reasoning capabilities, but their potential for misuse has raised growing concern. In this paper, we present ScamAgent, an autonomous multi-turn agent built on top of LLMs, capable of generating highly realistic scam call scripts that simulate real-world fraud scenarios. Unlike prior work focused on single-shot prompt misuse, ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. We show that current LLM safety guardrails, including refusal mechanisms and content filters, are ineffective against such agent-based threats. Even models with strong prompt-level safeguards can be bypassed when prompts are decomposed, disguised, or delivered incrementally within an agent framework. We further demonstrate the transformation of scam scripts into lifelike voice calls using modern text-to-speech systems, completing a fully automated scam pipeline. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.

#39 Locus: Agentic Predicate Synthesis for Directed Fuzzing

agent

Authors: Jie Zhu, Chihao Shen, Ziyang Li, Jiahao Yu, Yizheng Chen, Kexin Pei

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2508.21302

Abstract:
Directed fuzzing aims to find program inputs that lead to specified target program states. It has broad applications, such as debugging system crashes, confirming reported bugs, and generating exploits for potential vulnerabilities. This task is inherently challenging because target states are often deeply nested in the program, while the search space manifested by numerous possible program inputs is prohibitively large. Existing approaches rely on branch distances or manually-specified constraints to guide the search; however, the branches alone are often insufficient to precisely characterize progress toward reaching the target states, while the manually specified constraints are often tailored for specific bug types and thus difficult to generalize to diverse target states and programs. We present Locus, a novel framework to improve the efficiency of directed fuzzing. Our key insight is to synthesize predicates to capture fuzzing progress as semantically meaningful intermediate states, serving as milestones towards reaching the target states. When used to instrument the program under fuzzing, they can reject executions unlikely to reach the target states, while providing additional coverage guidance. To automate this task and generalize to diverse programs, Locus features an agentic framework with program analysis tools to synthesize and iteratively refine the candidate predicates, while ensuring the predicates strictly relax the target states to prevent false rejections via symbolic execution. Our evaluation shows that Locus substantially improves the efficiency of eight state-of-the-art fuzzers in discovering real-world vulnerabilities, achieving an average speedup of 41.6x. So far, Locus has found nine previously unpatched bugs, with three already acknowledged with draft patches.

#40 On the Credibility of Deniable Communication in Court

Authors: Jacob Leiken, Sunoo Park

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2510.16873

Abstract:
Over time, cryptographically deniable systems have come to be associated in computer-science literature with the idea of "denying" evidence in court - specifically, with the ability to convincingly forge evidence in courtroom scenarios and an inability to authenticate evidence in such contexts. Evidentiary processes in courts, however, have been developed over centuries to account for the reality that evidence has always been forgeable, and rely on factors outside of cryptographic models to seek the truth "as well as possible" while acknowledging that all evidence is imperfect. We argue that deniability does not and need not change this paradigm. Our analysis highlights a gap between technical deniability notions and their application to the real world. In realistic situations, there will always be factors outside a cryptographic model that influence perceptions of a message's authenticity. We propose the broader concept of credibility to capture these factors. The credibility of a system is determined by (1) a threshold of quality that a forgery must pass to be "believable" as an original communication, which varies based on sociotechnical context and threat model, (2) the ease of creating a forgery that passes this threshold, which is also context- and threat-model-dependent, and (3) default system retention policy and retention settings. All three aspects are important for designing secure communication systems for real-world threat models, and some aspects of (2) and (3) may be incorporated directly into technical system design. We hope that our model of credibility will facilitate system design and deployment that addresses threats that are not and cannot be captured by purely technical definitions and existing cryptographic models, and support more nuanced discourse on the strengths and limitations of cryptographic guarantees within specific legal and sociotechnical contexts.

#41 Privacy Preservation through Practical Machine Unlearning

privacy

Authors: Robert Dilworth

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2502.10635

Abstract:
Machine Learning models thrive on vast datasets, continuously adapting to provide accurate predictions and recommendations. However, in an era dominated by privacy concerns, Machine Unlearning emerges as a transformative approach, enabling the selective removal of data from trained models. This paper examines methods such as Naive Retraining and Exact Unlearning via the SISA framework, evaluating their Computational Costs, Consistency, and feasibility using the HSpam14 dataset. We explore the potential of integrating unlearning principles into Positive Unlabeled (PU) Learning to address challenges posed by partially labeled datasets. Our findings highlight the promise of unlearning frameworks like DaRE for ensuring privacy compliance while maintaining model performance, albeit with significant computational trade-offs. This study underscores the importance of Machine Unlearning in achieving ethical AI and fostering trust in data-driven systems.
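
For readers unfamiliar with SISA-style exact unlearning, the core idea can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes a toy one-dimensional nearest-centroid classifier, round-robin sharding, and majority-vote aggregation, and it omits SISA's slicing (checkpointing) step. All names (`train_centroids`, `ShardedEnsemble`, `unlearn`) are hypothetical.

```python
from collections import Counter


def train_centroids(points):
    """Fit a toy nearest-centroid model on (feature, label) pairs."""
    sums, counts = {}, Counter()
    for x, label in points:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_one(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


class ShardedEnsemble:
    """One model per disjoint shard; predictions are aggregated by vote."""

    def __init__(self, data, num_shards):
        # Each sample lands in exactly one shard (assumed round-robin split).
        self.shards = [list(data[i::num_shards]) for i in range(num_shards)]
        self.models = [train_centroids(shard) for shard in self.shards]
        self.retrain_count = 0

    def unlearn(self, sample):
        """Remove a sample and retrain only the shard that held it."""
        for i, shard in enumerate(self.shards):
            if sample in shard:
                shard.remove(sample)
                self.models[i] = train_centroids(shard)
                self.retrain_count += 1
                return i
        raise ValueError("sample not found in any shard")

    def predict(self, x):
        # Skip shards that became empty after unlearning.
        votes = Counter(predict_one(m, x) for m in self.models if m)
        return votes.most_common(1)[0][0]
```

Naive retraining would refit on the entire remaining dataset after every deletion. In this sketch a deletion touches a single shard, which illustrates the cost trade-off the abstract refers to.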

#42 Dual-Source SPIR over a noiseless MAC without Data Replication or Shared Randomness

Authors: Remi A. Chou

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2503.14682

Abstract:
Information-theoretically secure Symmetric Private Information Retrieval (SPIR) is known to be infeasible over noiseless channels with a single server. Known solutions to overcome this infeasibility involve additional resources such as database replication, shared randomness, or noisy channels. In this paper, we propose an alternative approach for achieving SPIR with information-theoretic security guarantees, without relying on shared randomness, noisy channels, or data replication. Specifically, we demonstrate that it is sufficient to use a noiseless binary adder multiple-access channel, where inputs are controlled by two non-colluding servers and the output is observed by the client, alongside a public noiseless communication channel between the client and the servers. Furthermore, in this setting, we characterize the optimal file rates, i.e., the file lengths normalized by the number of channel uses, that can be transferred.
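
To make the channel model concrete, a noiseless binary adder multiple-access channel outputs the integer sum of its two binary inputs. The sketch below shows only the channel itself, not the paper's SPIR protocol or its rate analysis; the function names are hypothetical.

```python
from itertools import product


def binary_adder_mac(x1: int, x2: int) -> int:
    """Noiseless binary adder MAC: the output is the integer sum of two bits."""
    if x1 not in (0, 1) or x2 not in (0, 1):
        raise ValueError("channel inputs must be bits")
    return x1 + x2


def preimages(y: int):
    """List every input pair (x1, x2) that produces channel output y."""
    return [(a, b) for a, b in product((0, 1), repeat=2)
            if binary_adder_mac(a, b) == y]
```

Outputs 0 and 2 each have a single preimage, while output 1 is consistent with both `(0, 1)` and `(1, 0)`, so the observer cannot tell from that output alone which input was 1.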

#43 Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability?

Authors: Matt MacDermott, Qiyao Wei, Rada Djoneva, Francis Rhys Ward

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.00218

Abstract:
AI systems that output their reasoning in natural language offer an opportunity for safety: we can monitor their chain of thought (CoT) for undesirable reasoning, such as the pursuit of harmful objectives. However, the extent to which CoT faithfully reflects the underlying reasoning process, and hence the extent to which it can be usefully monitored, may be influenced by certain aspects of training. We investigate how different training incentives, applied to a reasoning model, affect its monitorability. We introduce a novel methodology for measuring monitorability according to whether a monitor can predict a key latent variable using the model's reasoning. When controlling for accuracy, we do not find evidence for consistent effects from commonly used incentives (length penalties and KL regularisation), but we find that adversarial optimisation (penalising monitor accuracy) degrades monitor performance, while direct optimisation for monitorability does not reliably lead to improvements. Our code is available at https://github.com/QiyaoWei/reasoning-under-pressure.

#44 Understanding Privacy Risks in Code Models Through Training Dynamics: A Causal Approach

privacy

Authors: Hua Yang, Alejandro Velasco, Sen Fang, Bowen Xu, Denys Poshyvanyk

Published: Wed, 10 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.07814

Abstract:
Large language models for code (LLM4Code) have greatly improved developer productivity but also raise privacy concerns due to their reliance on open-source repositories containing abundant personally identifiable information (PII). Prior work shows that commercial models can reproduce sensitive PII, yet existing studies largely treat PII as a single category and overlook the heterogeneous risks among different types. We investigate whether distinct PII types vary in their likelihood of being learned and leaked by LLM4Code, and whether this relationship is causal. Our methodology includes building a dataset with diverse PII types, fine-tuning representative models of different scales, computing training dynamics on real PII data, and formulating a structural causal model to estimate the causal effect of learnability on leakage. Results show that leakage risks differ substantially across PII types and correlate with their training dynamics: easy-to-learn instances such as IP addresses exhibit higher leakage, while harder types such as keys and passwords leak less frequently. Ambiguous types show mixed behaviors. This work provides the first causal evidence that leakage risks are type-dependent and offers guidance for developing type-aware and learnability-aware defenses for LLM4Code.

📋 List of Paper Titles