A widespread campaign known as ShadowRay is actively exploiting a critical, yet officially disputed, vulnerability in the open-source Ray framework to hijack AI workloads, turning them into a self-propagating cryptomining botnet and stealing highly sensitive data.[1], [2] This campaign, formally cataloged by MITRE ATT&CK as Campaign C0045, represents the first known instance of AI infrastructure being exploited in the wild on a large scale.[5] The vulnerability, CVE-2023-48022, is a lack of authentication in Ray’s Jobs API that allows unauthenticated attackers to achieve remote code execution (RCE) on all nodes within an exposed Ray cluster.[1] Security researchers from Oligo Security discovered hundreds of compromised clusters in March 2024, with evidence suggesting the attacks have been ongoing since at least September 2023.[2], [7]
Executive Summary for Security Leadership
The ShadowRay campaign exploits a critical security gap in Ray, a distributed computing framework used by major companies like OpenAI, Uber, and Amazon for AI and Python workloads. The core issue, CVE-2023-48022, is missing authentication in the Ray Dashboard’s Jobs API (default port 8265). The vendor, Anyscale, disputes that this is a vulnerability, arguing that Ray is intended for trusted networks.[1] Because the CVE is disputed, standard security scanners often suppress it, which turns it into a “shadow vulnerability” and a significant blind spot for defenders. Attackers exploit it to execute arbitrary code, steal data, and run cryptocurrency miners on expensive GPU resources. The collective value of the compromised GPU hardware is estimated at nearly $1 billion.[2]
- Threat: Active exploitation of CVE-2023-48022, a disputed RCE flaw in Ray.
- Impact: Cryptojacking, theft of AI models, API keys, and cloud credentials.
- Scale: Thousands of exposed Ray servers compromised globally.
- Key Mitigation: Isolate the Ray Dashboard (port 8265) from the internet immediately.
The Disputed Vulnerability: CVE-2023-48022
The vulnerability at the heart of the ShadowRay campaign is a critical lack of authentication in Ray’s Jobs API, which is accessible via the Ray Dashboard on port 8265.[1] This flaw allows any unauthenticated user with network access to the Dashboard to submit and execute arbitrary code, leading to full remote code execution on every node in the cluster. The National Vulnerability Database (NVD) assigned this flaw a CVSS score of 9.8, indicating critical severity.[1] However, Anyscale, the company behind Ray, contends that this is expected behavior and not a software bug. The vendor’s stance is that Ray is designed for trusted, isolated network environments and that enforcing security is the responsibility of the user.[1], [7] This dispute has created a dangerous scenario where a high-severity flaw is often suppressed by static application security testing (SAST) tools and major vulnerability databases, creating a significant blind spot for security teams relying on these sources.[2]
Campaign Mechanics and MITRE ATT&CK Mapping
The ShadowRay campaign’s tactics, techniques, and procedures (TTPs) have been formally mapped to the MITRE ATT&CK framework, providing a clear blueprint of the attack chain.[5] Attackers begin by scanning for publicly exposed Ray servers and exploiting CVE-2023-48022 (T1190 – Exploit Public-Facing Application). Once access is gained, they use the Python `pty` module to open reverse shells (T1059.006 – Command and Scripting Interpreter: Python). For persistence, they modify Unix shell configuration files (T1546.004 – Event Triggered Execution: Unix Shell Configuration Modification). To evade detection, they wrap Python code in Base64 encoding (T1027.013 – Obfuscated Files or Information: Encrypted/Encoded File).[5] The attackers also dump credentials by executing commands like `cat /etc/shadow` (T1003.008 – OS Credential Dumping: /etc/passwd and /etc/shadow) and download tools such as the XMRig miner directly onto the compromised systems (T1105 – Ingress Tool Transfer).[2], [5]
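From a defender's point of view, the indicators in this attack chain can be turned into a rough triage filter for job entrypoint strings. The following is a minimal sketch, not a vetted detection ruleset: the pattern names, the regular expressions, the 24-character Base64 threshold, and the `flag_entrypoint` helper are all assumptions made for illustration.

```python
import base64
import binascii
import re

# Illustrative indicator patterns based on the behaviours described above.
# These are assumptions for this sketch, not a complete or tuned ruleset.
SUSPICIOUS_PATTERNS = {
    "reverse_shell_pty": re.compile(r"\bpty\.spawn\b|\bimport\s+pty\b"),
    "credential_dump": re.compile(r"/etc/(shadow|passwd)\b"),
    "miner_download": re.compile(r"\b(xmrig|nbminer)\b", re.IGNORECASE),
    "shell_config_persistence": re.compile(
        r"\.(bashrc|bash_profile|profile|zshrc)\b"
    ),
}

# Runs of 24 or more Base64 characters are treated as candidate payloads.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def decode_base64_runs(text):
    """Return the UTF-8 text of every Base64 run that decodes cleanly."""
    decoded = []
    for match in BASE64_RUN.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            decoded.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError, ValueError):
            continue  # not valid Base64 or not text; ignore this run
    return decoded


def flag_entrypoint(entrypoint):
    """Return the sorted indicator names found in a job entrypoint string."""
    texts = [entrypoint] + decode_base64_runs(entrypoint)
    hits = set()
    for text in texts:
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(text):
                hits.add(name)
    if len(texts) > 1:
        hits.add("base64_payload")  # at least one run decoded to text
    return sorted(hits)
```

A filter like this could be run over the entrypoints of jobs recorded on a cluster you operate. It will produce both false positives and false negatives, so it is better suited to prioritizing manual review than to blocking.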
Proof-of-concept exploit code has been publicly released, demonstrating the ease of exploitation. The following Python code snippet, derived from public research, shows how an attacker can leverage the `ray` library to execute commands on a vulnerable cluster.[7]
```python
from ray.job_submission import JobSubmissionClient

# Target a vulnerable Ray Dashboard
client = JobSubmissionClient("http://victim-ip:8265")

# Submit a job that executes a system command
submission_id = client.submit_job(
    entrypoint="cat /etc/passwd"  # Or any other command
)
```
This code uses the official `JobSubmissionClient` to run arbitrary commands, such as dumping the `/etc/passwd` file or establishing a reverse shell, without requiring any authentication.[7]
Consequences: Data Theft and Resource Hijacking
The compromise of a Ray cluster provides attackers with access to a wealth of sensitive information, making AI infrastructure a prime target. Exposed data includes proprietary AI models and training workloads, which attackers could exfiltrate or poison during training.[7] Cloud and infrastructure credentials are also a major target, with attackers obtaining production database passwords, cloud access keys for AWS, GCP, and Azure, and private SSH keys. In some cases, this granted full administrative access to Kubernetes clusters.[2] API tokens are a particularly valuable find, including Hugging Face tokens for accessing private model repositories, OpenAI tokens that could be used to drain credits, and Stripe tokens that could be used to sign financial transactions.[7], [9]
The primary monetization method observed in this campaign is cryptojacking. Attackers deploy miners such as XMRig and NBMiner, as well as miners for the Zephyr cryptocurrency, to hijack the powerful GPU resources. One attacker using a specific Zephyr wallet reached the top 5% of a mining pool, indicating significant illicit earnings.[2] The financial impact is substantial: the on-demand cost of a single high-end GPU machine on AWS can reach $858,480 annually, and the compromised hardware is collectively estimated to represent nearly $1 billion in compute power.[2], [7]
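To put the per-machine figure in perspective, the annual cost can be converted into an hourly rate and then into the cost of a hijack of a given duration. This is a back-of-the-envelope sketch that assumes the machine runs continuously for a 365-day year at a flat rate. The `compute_cost_usd` helper and the 30-day example are illustrative and do not come from the cited research.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

ANNUAL_COST_USD = 858_480  # on-demand figure quoted in the text
HOURLY_RATE_USD = ANNUAL_COST_USD / HOURS_PER_YEAR  # works out to 98.0


def compute_cost_usd(hours, hourly_rate=HOURLY_RATE_USD):
    """Cost of running one machine for `hours` at a flat hourly rate."""
    return hours * hourly_rate


if __name__ == "__main__":
    print(f"Hourly rate: ${HOURLY_RATE_USD:,.2f}")
    print(f"30 days hijacked: ${compute_cost_usd(30 * 24):,.2f}")
```

Under these assumptions, one such machine costs about $98 per hour, so a single month of unnoticed mining on it would consume roughly $70,560 of on-demand compute.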
Mitigation and Security Recommendations
The most critical step in mitigating this threat is network isolation. The Ray Dashboard (port 8265) must not be exposed to the public internet under any circumstances and should be placed in a strictly controlled network environment.[2] Organizations should implement strict security group and firewall rules to block unauthorized access to this port. If external access is absolutely necessary, the Dashboard must be placed behind a reverse proxy that enforces strong authentication. Furthermore, administrators should avoid binding the Dashboard to `0.0.0.0` and instead use specific, trusted network interfaces.[4]
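One way to verify the isolation described above is to test, from a machine outside the trusted network, whether the Dashboard port on a cluster you operate accepts TCP connections. The sketch below uses only the Python standard library. The `is_port_reachable` and `is_wildcard_bind` helpers are hypothetical names introduced for illustration, and a successful connection shows only that the port is open, not that the service behind it is Ray.

```python
import ipaddress
import socket

RAY_DASHBOARD_PORT = 8265  # default Dashboard / Jobs API port named in the text


def is_port_reachable(host, port=RAY_DASHBOARD_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def is_wildcard_bind(address):
    """Return True if address is the unspecified address (0.0.0.0 or ::)."""
    try:
        return ipaddress.ip_address(address).is_unspecified
    except ValueError:
        return False  # not a literal IP address, e.g. a hostname


if __name__ == "__main__":
    # Placeholder target: replace with the address of your own head node.
    target = "127.0.0.1"
    if is_port_reachable(target):
        print(f"{target}:{RAY_DASHBOARD_PORT} is reachable from this host")
    else:
        print(f"{target}:{RAY_DASHBOARD_PORT} is not reachable from this host")
```

Running the check from both inside and outside the intended network boundary shows whether firewall and security group rules behave as expected. The `is_wildcard_bind` helper can be applied to a configured bind address to catch `0.0.0.0` before deployment.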
While Anyscale maintains its position on the issue, it has released a “Ray Open Ports Checker” tool to help users identify exposed clusters and has announced plans to include authorization features in a future release (Ray 2.11).[3], [7] Given that static tools may miss this “shadow vulnerability,” security teams should employ runtime security and monitoring solutions to detect anomalous behavior, such as the execution of cryptocurrency miners or unexpected reverse shell connections.[2] This incident highlights the shared responsibility between open-source developers and users to secure components, especially when vendors design them with an assumption of a trusted environment.[10]
The ShadowRay campaign marks a significant escalation in the targeting of AI infrastructure, transforming powerful computational resources into attack platforms. It underscores the severe risks posed by disputed vulnerabilities that fall outside traditional security scanning paradigms. For security teams, this serves as a stark reminder that foundational infrastructure, particularly in emerging fields like AI, must be secured with a defense-in-depth approach, prioritizing network segmentation and runtime monitoring to counter threats that evade conventional detection methods.