Introducing CodeMender: an AI agent for code security
313 stories from 28 sources
Introducing CodeMender: an AI agent for code security Using advanced AI to fix critical software vulnerabilities
Microsoft, Salesforce Patch AI Agent Data Leak Flaws Two recently fixed prompt injections in Salesforce Agentforce and Microsoft Copilot would have enabled an external attacker to leak sensitive data.
Anthropic MCP Design Vulnerability Enables RCE, Threatening AI Supply Chain Cybersecurity researchers have discovered a critical "by design" weakness in the Model Context Protocol's (MCP) architecture that could pave th…
Multi-Level Temporal Graph Networks with Local-Global Fusion for Industrial Fault Diagnosis arXiv:2604.18765v1 Announce Type: cross Abstract: Fault detection and diagnosis are critical for the optimal and safe operation…
Decompose, Structure, and Repair: A Neuro-Symbolic Framework for Autoformalization via Operator Trees arXiv:2604.19000v1 Announce Type: cross Abstract: Statement autoformalization acts as a critical bridge between human…
Fairness Audits of Institutional Risk Models in Deployed ML Pipelines arXiv:2604.19468v1 Announce Type: cross Abstract: Fairness audits of institutional risk models are critical for understanding how deployed machine le…
When Graph Structure Becomes a Liability: A Critical Re-Evaluation of Graph Neural Networks for Bitcoin Fraud Detection under Temporal Distribution Shift arXiv:2604.19514v1 Announce Type: cross Abstract: The consensus t…
DP-FlogTinyLLM: Differentially private federated log anomaly detection using Tiny LLMs arXiv:2604.19118v1 Announce Type: cross Abstract: Modern distributed systems generate massive volumes of log data that are critical…
[Webinar] Eliminate Ghost Identities Before They Expose Your Enterprise Data In 2024, compromised service accounts and forgotten API keys were behind 68% of cloud breaches. Not phishing.
Deepening our partnership with the UK AI Security Institute Google DeepMind and UK AI Security Institute (AISI) strengthen collaboration on critical AI safety and security research
Google Antigravity in Crosshairs of Security Researchers, Cybercriminals Researchers discovered a remote code execution vulnerability and cybercriminals are using its reputation to deliver malware. The post Google Antig…
Towards Optimal Agentic Architectures for Offensive Security Tasks arXiv:2604.18718v1 Announce Type: cross Abstract: Agentic security systems increasingly audit live targets with tool-using LLMs, but prior systems fix a…
Prompt to Pwn: Automated Exploit Generation for Smart Contracts arXiv:2508.01371v3 Announce Type: replace Abstract: Smart contracts are important for digital finance, yet they are hard to patch once deployed. Prior work…
Announcing Copilot leadership update Satya Nadella, Chairman and CEO, and Mustafa Suleyman, Executive Vice President and CEO of Microsoft AI, shared the below communications with Microsoft employees this morning. SATYA…
Claude Mythos Finds 271 Firefox Vulnerabilities All the flaws could have also been found by an elite human researcher, according to Mozilla. The post Claude Mythos Finds 271 Firefox Vulnerabilities appeared first on Sec…
CAISI Issues Request for Information About Securing AI Agent Systems The Center for AI Standards and Innovation (CAISI) at the U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) has publ…
CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks The Center for AI Standards and Innovation at NIST evaluated several leading models from DeepSeek, an AI company based in the People’s Republic of Chin…
Toxic Combinations: When Cross-App Permissions Stack into Risk On January 31, 2026, researchers disclosed that Moltbook, a social network built for AI agents, had left its database wide open, exposing 35,000 email addre…
AI-Driven Pushpaganda Scam Exploits Google Discover to Spread Scareware and Ad Fraud Cybersecurity researchers have unmasked a novel ad fraud scheme that has been found to leverage search engine optimization (SEO) poisoning techniqu…
Now Meta will track what employees do on their computers to train its AI agents Meta employees' activity at work is now being used to train the company's AI agents. As reported by Reuters, Meta is installing a tool it c…
Top Law Firm Admits to AI ‘Hallucinations’ in Bankruptcy Filing Tied to Alleged Scam Network Sullivan & Cromwell said internal safeguards were bypassed in the Prince Group case, resulting in fabricated and inaccurate le…
AI needs a strong data fabric to deliver business value Artificial intelligence is moving quickly in the enterprise, from experimentation to everyday use. Organizations are deploying copilots, agents, and predictive sys…
Anthropic’s most dangerous AI model just fell into the wrong hands Anthropic's Mythos AI model, a powerful cybersecurity tool that the company said could be dangerous in the wrong hands, has been accessed by a "small gr…
ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System arXiv:2604.18789v1 Announce Type: new Abstract: Reinforcement Learning from Human Feedback (RLHF) is central to aligning Large Language Models (LL…
AI scientists produce results without reasoning scientifically arXiv:2604.18805v1 Announce Type: new Abstract: Large language model (LLM)-based systems are increasingly deployed to conduct scientific research autonomous…
Quantum inspired qubit qutrit neural networks for real time financial forecasting arXiv:2604.18838v1 Announce Type: new Abstract: This research investigates the performance and efficacy of machine learning models in sto…
Human-Guided Harm Recovery for Computer Use Agents arXiv:2604.18847v1 Announce Type: new Abstract: As LM agents gain the ability to execute actions on real computer systems, we need ways to not only prevent harmful acti…
From Natural Language to Executable Narsese: A Neuro-Symbolic Benchmark and Pipeline for Reasoning with NARS arXiv:2604.18873v1 Announce Type: new Abstract: Large language models (LLMs) are highly capable at language ge…
Formally Verified Patent Analysis via Dependent Type Theory: Machine-Checkable Certificates from a Hybrid AI + Lean 4 Pipeline arXiv:2604.18882v1 Announce Type: new Abstract: We present a formally verified framework for…
Error-free Training for MedMNIST Datasets arXiv:2604.18916v1 Announce Type: new Abstract: In this paper, we introduce a new concept called Artificial Special Intelligence by which Machine Learning models for the classif…
AutomationBench arXiv:2604.18934v1 Announce Type: new Abstract: Existing AI benchmarks for software automation rarely combine cross-application coordination, autonomous API discovery, and policy adherence. Real business…
Personalized Benchmarking: Evaluating LLMs by Individual Preferences arXiv:2604.18943v1 Announce Type: new Abstract: With the rise in capabilities of large language models (LLMs) and their deployment in real-world tasks…
Reasoning Structure Matters for Safety Alignment of Reasoning Models arXiv:2604.18946v1 Announce Type: new Abstract: Large reasoning models (LRMs) achieve strong performance on complex reasoning tasks but often generate…
DW-Bench: Benchmarking LLMs on Data Warehouse Graph Topology Reasoning arXiv:2604.18964v1 Announce Type: new Abstract: This paper introduces DW-Bench, a new benchmark that evaluates large language models (LLMs) on graph…
SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution arXiv:2604.18982v1 Announce Type: new Abstract: Social intelligence, the ability to navigate complex interpersonal interactions, presents a funda…
On Accelerating Grounded Code Development for Research arXiv:2604.19022v1 Announce Type: new Abstract: A major challenge for niche scientific and technical domains in leveraging coding agents is the lack of access to up…
Learning Lifted Action Models from Unsupervised Visual Traces arXiv:2604.19043v1 Announce Type: new Abstract: Efficient construction of models capturing the preconditions and effects of actions is essential for applying…
Reinforcement Learning Improves LLM Accuracy and Reasoning in Disease Classification from Radiology Reports arXiv:2604.19060v1 Announce Type: new Abstract: Accurate disease classification from radiology reports is essen…
Reasoning-Aware AIGC Detection via Alignment and Reinforcement arXiv:2604.19172v1 Announce Type: new Abstract: The rapid advancement and widespread adoption of Large Language Models (LLMs) have elevated the need for rel…
UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction arXiv:2604.19221v1 Announce Type: new Abstract: Full-duplex speech interaction, as the most natural and intuitive mode of human communication, is dri…
Industrial Surface Defect Detection via Diffusion Generation and Asymmetric Student-Teacher Network arXiv:2604.19240v1 Announce Type: new Abstract: Industrial surface defect detection often suffers from limited defect s…
Explicit Trait Inference for Multi-Agent Coordination arXiv:2604.19278v1 Announce Type: new Abstract: LLM-based multi-agent systems (MAS) show promise on complex tasks but remain prone to coordination failures such as g…
Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges arXiv:2604.19354v1 Announce Type: new Abstract: Large Language Model (LLM) agents are increasingly proposed for auto…
GRASPrune: Global Gating for Budgeted Structured Pruning of Large Language Models arXiv:2604.19398v1 Announce Type: new Abstract: Large language models (LLMs) are expensive to serve because model parameters, attention c…
Four-Axis Decision Alignment for Long-Horizon Enterprise AI Agents arXiv:2604.19457v1 Announce Type: new Abstract: Long-horizon enterprise agents make high-stakes decisions (loan underwriting, claims adjudication, clini…
CoDA: Towards Effective Cross-domain Knowledge Transfer via CoT-guided Domain Adaptation arXiv:2604.19488v1 Announce Type: new Abstract: Large language models (LLMs) have achieved substantial advances in logical reasoni…
SimDiff: Depth Pruning via Similarity and Difference arXiv:2604.19520v1 Announce Type: new Abstract: Depth pruning improves the deployment efficiency of large language models (LLMs) by identifying and removing redundant…
Revac: A Social Deduction Reasoning Agent arXiv:2604.19523v1 Announce Type: new Abstract: Social deduction games such as Mafia present a unique AI challenge: players must reason under uncertainty, interpret incomplete a…
Enhancing Construction Worker Safety in Extreme Heat: A Machine Learning Approach Utilizing Wearable Technology for Predictive Health Analytics arXiv:2604.19559v1 Announce Type: new Abstract: Construction workers are hi…
Detecting Data Contamination in Large Language Models arXiv:2604.19561v1 Announce Type: new Abstract: Large Language Models (LLMs) utilize large amounts of data for their training, some of which may come from copyrighte…
Multi-modal Reasoning with LLMs for Visual Semantic Arithmetic arXiv:2604.19567v1 Announce Type: new Abstract: Reinforcement learning (RL) as post-training is crucial for enhancing the reasoning ability of large languag…
Time Series Augmented Generation for Financial Applications arXiv:2604.19633v1 Announce Type: new Abstract: Evaluating the reasoning capabilities of Large Language Models (LLMs) for complex, quantitative financial tasks…
Modelling and Analysing Behaviours and Emotions via Complex User Interactions arXiv:1902.07683v1 Announce Type: cross Abstract: Over the past 15 years, the volume, richness and quality of data collected from the combine…
Two-dimensional early exit optimisation of LLM inference arXiv:2604.18592v1 Announce Type: cross Abstract: We introduce a two-dimensional (2D) early exit strategy that coordinates layer-wise and sentence-wise exiting fo…
Thermal Anomaly Detection using Physics Aware Neuromorphic Networks: Comparison between Raw and L1C Sentinel-2 Data arXiv:2604.18606v1 Announce Type: cross Abstract: Damage caused by bushfires and volcanic eruptions esc…
TurboEvolve: Towards Fast and Robust LLM-Driven Program Evolution arXiv:2604.18607v1 Announce Type: cross Abstract: LLM-driven program evolution can discover high-quality programs, but its cost and run-to-run variance h…
SpikeMLLM: Spike-based Multimodal Large Language Models via Modality-Specific Temporal Scales and Temporal Compression arXiv:2604.18610v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have achi…
Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models arXiv:2604.18612v1 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated strong capabilities in complex re…
ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants arXiv:2604.18616v1 Announce Type: cross Abstract: LLM-based coding agents can generate functionally correct GPU kernels, yet their performance remains far b…
NeuroAI and Beyond: Bridging Between Advances in Neuroscience and Artificial Intelligence arXiv:2604.18637v1 Announce Type: cross Abstract: Neuroscience and Artificial Intelligence (AI) have made impressive progress in r…