
=====================================================================

                            CERT-Renater

                Note d'Information No. 2025/VULN673
_____________________________________________________________________

DATE                : 08/10/2025

HARDWARE PLATFORM(S): /

OPERATING SYSTEM(S): Systems running vllm versions prior to 0.11.0.

=====================================================================
https://github.com/vllm-project/vllm/security/advisories/GHSA-wr9h-g72x-mwhm
https://github.com/advisories/GHSA-3f6c-7fw2-ppm4
https://github.com/vllm-project/vllm/security/advisories/GHSA-6fvq-23cw-5628
_____________________________________________________________________


API key authentication vulnerable to timing attack
High
russellb published GHSA-wr9h-g72x-mwhm Oct 7, 2025

Package
vllm (pip)

Affected versions
< 0.11.0

Patched versions
0.11.0


Description

Summary

The API key support in vLLM performed validation using a method that
was vulnerable to a timing attack. This could potentially allow an
attacker to discover a valid API key using an approach more efficient
than brute force.


Details

vllm/vllm/entrypoints/openai/api_server.py

Lines 1270 to 1274 in 4b946d6
if url_path.startswith("/v1") and headers.get(
        "Authorization") not in self.api_tokens:
    response = JSONResponse(content={"error": "Unauthorized"},
                            status_code=401)
    return response(scope, receive, send)

API key validation used a string comparison whose running time grows
with the number of leading characters of the supplied key that are
correct. By statistically analyzing response times across many
attempts, an attacker can detect when each successive character of
the key has been found.
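The standard remedy is a constant-time comparison. The sketch below is
not vLLM's actual patch (the function and variable names are
illustrative); it uses Python's hmac.compare_digest, whose running time
does not depend on where two strings first differ:

```python
import hmac

def check_api_key(provided: str, valid_keys: list[str]) -> bool:
    """Constant-time membership test for a client-supplied API key.

    hmac.compare_digest always scans its full inputs, so the response
    time does not reveal how many leading characters were correct.
    """
    # Compare against every known key; each comparison runs in time
    # independent of the position of the first mismatching character.
    return any(hmac.compare_digest(provided, key) for key in valid_keys)
```

By contrast, a plain `in` test against a list of key strings
short-circuits at the first differing character of each comparison,
which is exactly the signal a timing attacker measures.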


Impact

Deployments relying on vLLM's built-in API key validation are
vulnerable to key recovery, and therefore authentication bypass,
using this technique.

Severity
High
7.5 / 10

CVSS v3 base metrics
Attack vector
Network
Attack complexity
Low
Privileges required
None
User interaction
None
Scope
Unchanged
Confidentiality
High
Integrity
None
Availability
None
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

CVE ID
CVE-2025-59425

Weaknesses
Weakness CWE-385

Credits

    @russellb Remediation reviewer

_____________________________________________________________________

vLLM is vulnerable to Server-Side Request Forgery (SSRF) through
`MediaConnector` class
High severity. Published Oct 7, 2025 in vllm-project/vllm;
updated Oct 8, 2025.


Package
vllm (pip)

Affected versions
>= 0.5.0, < 0.11.0

Patched versions
0.11.0


Description

Summary

A Server-Side Request Forgery (SSRF) vulnerability exists in the
MediaConnector class within the vLLM project's multimodal feature set.
The load_from_url and load_from_url_async methods fetch and process
media from user-provided URLs without adequate restrictions on the
target hosts. This allows an attacker to coerce the vLLM server into
making arbitrary requests to internal network resources.

This vulnerability is particularly critical in containerized
environments like llm-d, where a compromised vLLM pod could be used
to scan the internal network, interact with other pods, and
potentially cause denial of service or access sensitive data. For
example, an attacker could make the vLLM pod send malicious requests
to an internal llm-d management endpoint, leading to system instability
by falsely reporting metrics like the KV cache state.


Vulnerability Details

The core of the vulnerability lies in the
MediaConnector.load_from_url method and its asynchronous counterpart.
These methods accept a URL string to fetch media
content (images, audio, video).

https://github.com/vllm-project/vllm/blob/119f683949dfed10df769fe63b2676d7f1eb644e/vllm/multimodal/utils.py#L97-L113

The function directly processes URLs with http, https, and file schemes.
An attacker can supply a URL pointing to an internal IP address or a
localhost endpoint. The vLLM server will then initiate a connection to
this internal resource.

    HTTP/HTTPS Scheme: An attacker can craft a request like {"image_url": "http://127.0.0.1:8080/internal_api"}.
The vLLM server will send a GET request to this internal endpoint.

    File Scheme: The _load_file_url method attempts to restrict file
access to a subdirectory defined by --allowed-local-media-path. While
this is a good security measure for local file access, it does not
prevent network-based SSRF attacks.
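To make the HTTP-scheme vector concrete, the request body below
sketches how such a URL could be smuggled inside an ordinary
multimodal chat completion; the model name and internal endpoint path
are hypothetical:

```python
import json

# Hypothetical attacker request to a vLLM OpenAI-compatible server.
# The image_url points at an internal service rather than an image,
# so the server itself issues the GET request to 127.0.0.1.
payload = {
    "model": "some-multimodal-model",  # placeholder name
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": "http://127.0.0.1:8080/internal_api"}},
        ],
    }],
}

body = json.dumps(payload)  # sent as the POST body to /v1/chat/completions
```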


Impact in llm-d Environments

The risk is significantly amplified in orchestrated environments such
as llm-d, where multiple pods communicate over an internal network.

    Denial of Service (DoS): An attacker could target internal
management endpoints of other services within the llm-d cluster. For
instance, if a monitoring or metrics service is exposed internally,
an attacker could send malformed requests to it. A specific example is
an attacker causing the vLLM pod to call an internal API that reports
a false KV cache utilization, potentially triggering incorrect scaling
decisions or even a system shutdown.

    Internal Network Reconnaissance: Attackers can use the
vulnerability to scan the internal network for open ports and services
by providing URLs like http://10.0.0.X:PORT and observing the server's
response time or error messages.

    Interaction with Internal Services: Any unsecured internal service
becomes a potential target. This could include databases, internal
APIs, or other model pods that might not have robust authentication,
as they are not expected to be directly exposed.

Delegating this security responsibility to an upper-level orchestrator
like llm-d is problematic. The orchestrator cannot easily distinguish
between legitimate requests initiated by the vLLM engine for its own
purposes and malicious requests originating from user input, thus
complicating traffic filtering rules and increasing management overhead.


Proposed Mitigation

To address this vulnerability, it is essential to restrict the URLs that
the MediaConnector can access. The principle of least privilege should
be applied.

It is recommended to implement a configurable allowlist or denylist
for domains and IP addresses.

    Allowlist: The most secure approach is to allow connections only to
a predefined list of trusted domains. This could be configured via a
command-line argument, such as --allowed-media-domains. By default, this
list could be empty, forcing administrators to explicitly enable
external media fetching.

    Denylist: Alternatively, a denylist could block access to private IP
address ranges (127.0.0.1, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
and other sensitive domains.

A check should be added at the beginning of the load_from_url methods
to validate the parsed hostname against this list before any connection
is made.


Example Implementation Idea:

import socket
from ipaddress import ip_address, ip_network
from urllib.parse import urlparse

PRIVATE_IP_RANGES = ["127.0.0.0/8", "10.0.0.0/8",
                     "172.16.0.0/12", "192.168.0.0/16"]

# In MediaConnector.__init__
self.allowed_domains = set(config.get("allowed_media_domains", []))
self.denied_ip_ranges = [ip_network(r) for r in PRIVATE_IP_RANGES]

# In MediaConnector.load_from_url
hostname = urlparse(url).hostname

if self.allowed_domains and hostname not in self.allowed_domains:
    raise ValueError(f"Domain {hostname} is not in the allowed list.")

# Resolve once and keep the result in a separate variable so the
# ipaddress.ip_address function is not shadowed.
resolved = ip_address(socket.gethostbyname(hostname))
if any(resolved in network for network in self.denied_ip_ranges):
    raise ValueError(f"Access to private IP address {resolved} is forbidden.")

By integrating this control directly into vLLM, administrators are
empowered to enforce security policies at the source, creating a more
secure deployment by default and reducing the burden on higher-level
infrastructure management.


References

    GHSA-3f6c-7fw2-ppm4
    https://nvd.nist.gov/vuln/detail/CVE-2025-6242
    vllm-project/vllm@9d9a2b7
    https://access.redhat.com/security/cve/CVE-2025-6242
    https://bugzilla.redhat.com/show_bug.cgi?id=2373716


Severity
High
7.1 / 10

CVSS v3 base metrics
Attack vector
Network
Attack complexity
High
Privileges required
Low
User interaction
None
Scope
Unchanged
Confidentiality
High
Integrity
Low
Availability
High
CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:L/A:H

Weaknesses
Weakness CWE-601
Weakness CWE-918

CVE ID
CVE-2025-6242

GHSA ID
GHSA-3f6c-7fw2-ppm4

Source code
vllm-project/vllm

Credits

    @kexinoh Finder
    @d3do-23 Reporter
    @lonelyuan Reporter
    @huachenheli Remediation developer
    @DarkLight1337 Remediation reviewer
    @russellb Coordinator
    @sidhpurwala-huzaifa Coordinator


_____________________________________________________________________

Resource-Exhaustion (DoS) through chat_template / chat_template_kwargs
in OpenAI-Compatible Server
Moderate
russellb published GHSA-6fvq-23cw-5628 Oct 7, 2025

Package
vllm (pip)

Affected versions
>= 0.5.1, < 0.11.0

Patched versions
0.11.0


Description

Summary

A resource-exhaustion (denial-of-service) vulnerability exists in
multiple endpoints of the OpenAI-Compatible Server due to the
ability to specify Jinja templates via the chat_template and
chat_template_kwargs parameters. If an attacker can supply these
parameters to the API, they can cause a service outage by
exhausting CPU and/or memory resources.


Details

When using an LLM as a chat model, the conversation history must
be rendered into a text input for the model. In the Hugging Face
transformers library, this rendering is performed using a Jinja
template. The OpenAI-Compatible Server launched by vllm serve
exposes a chat_template parameter that lets users specify that
template. In addition, the server accepts a chat_template_kwargs
parameter to pass extra keyword arguments to the rendering function.



Because Jinja templates support programming-language-like
constructs (loops, nested iterations, etc.), a crafted template
can consume extremely large amounts of CPU and memory and thereby
trigger a denial-of-service condition.
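As a generic illustration (plain jinja2, not vLLM code), a nested-loop
template produces output whose size is the product of the loop bounds.
The bounds here are kept tiny, but attacker-chosen values such as 10**5
per loop would force the server to build a multi-gigabyte string:

```python
import jinja2

# Output size is rows * cols characters; each additional nesting
# level multiplies the rendering cost again.
template = jinja2.Template(
    "{% for i in range(rows) %}"
    "{% for j in range(cols) %}x{% endfor %}"
    "{% endfor %}"
)

rendered = template.render(rows=3, cols=4)  # 12 characters of output
```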

Importantly, simply forbidding the chat_template parameter does not
fully mitigate the issue. The implementation constructs a
dictionary of keyword arguments for apply_hf_chat_template and then
updates that dictionary with the user-supplied chat_template_kwargs
via dict.update. Since dict.update can overwrite existing keys, an
attacker can place a chat_template key inside chat_template_kwargs
to replace the template that will be used by apply_hf_chat_template.

# vllm/entrypoints/openai/serving_engine.py#L794-L816
_chat_template_kwargs: dict[str, Any] = dict(
    chat_template=chat_template,
    add_generation_prompt=add_generation_prompt,
    continue_final_message=continue_final_message,
    tools=tool_dicts,
    documents=documents,
)
_chat_template_kwargs.update(chat_template_kwargs or {})

request_prompt: Union[str, list[int]]
if isinstance(tokenizer, MistralTokenizer):
    ...
else:
    request_prompt = apply_hf_chat_template(
        tokenizer=tokenizer,
        conversation=conversation,
        model_config=model_config,
        **_chat_template_kwargs,
    )


Impact

If an OpenAI-Compatible Server exposes endpoints that accept
chat_template or chat_template_kwargs from untrusted clients,
an attacker can submit a malicious Jinja template (directly or
by overriding chat_template inside chat_template_kwargs) that
consumes excessive CPU and/or memory. This can result in a
resource-exhaustion denial-of-service that renders the server
unresponsive to legitimate requests.


Fixes

    #25794


Severity
Moderate
6.5 / 10

CVSS v3 base metrics
Attack vector
Network
Attack complexity
Low
Privileges required
Low
User interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
High
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

CVE ID
CVE-2025-61620

Weaknesses
Weakness CWE-20
Weakness CWE-400
Weakness CWE-770
Weakness CWE-789


Credits

    @key-moon Reporter
    @Ga-ryo Other
    @Isotr0py Remediation developer
    @DarkLight1337 Remediation reviewer


=========================================================
+ CERT-RENATER        |    tel : 01-53-94-20-44         +
+ 23/25 Rue Daviel    |    fax : 01-53-94-20-41         +
+ 75013 Paris         |   email:cert@support.renater.fr +
=========================================================
