Conversation
Code Review
This pull request introduces support for the vllm-metal backend, which is a great addition for Apple Silicon users. The changes are comprehensive, including Makefile targets for building and installing, the backend implementation itself, and necessary adjustments in the CLI and scheduler to integrate it. The approach of distributing the Python dependencies via a Docker image is clever. My review includes a few points for improvement: a bug fix for handling streaming responses, a suggestion to improve Makefile maintainability, and minor enhancements to error handling and security in the new backend code.
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Force-pushed from 3d0e6f4 to ea914fc
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Add a `make vllm-metal-dev VLLM_METAL_PATH=../vllm-metal` target to install vllm-metal from local source in editable mode. It installs the vLLM CPU requirements before and after vllm-metal to ensure compatible dependency versions (transformers < 5). Also recognize "dev" as a valid version in the Go backend, to skip re-downloading from Docker Hub when using a local development install. Signed-off-by: Dorin Geman <dorin.geman@docker.com>
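The "dev" version skip described in this commit could look roughly like the sketch below (hypothetical helper name; the actual Go code in the PR may differ):

```go
package main

import "fmt"

// needsDownload reports whether the vllm-metal installation should be
// (re)downloaded from Docker Hub. A local editable install is tagged with
// the version string "dev" and must never be overwritten.
// (Hypothetical helper; the PR's actual function name may differ.)
func needsDownload(installedVersion, wantVersion string) bool {
	if installedVersion == "dev" {
		return false // local development install, keep it
	}
	return installedVersion != wantVersion
}

func main() {
	fmt.Println(needsDownload("dev", "v0.1.0"))    // false: editable install wins
	fmt.Println(needsDownload("v0.0.9", "v0.1.0")) // true: version mismatch
}
```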
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
None of these flagged security issues takes user input. 🤔
Hey - I've found 4 security issues, and left some high level feedback:
Security issues:
- (×4, see Comments 1–4 below) Detected non-static command inside Command. Audit the input to `exec.Command`. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code. (link)
General comments:
- The new non-streaming handling in `ChatWithMessagesContext` relies on the first line not starting with `data: `, which can be brittle for chunked or pretty-printed JSON responses; consider detecting streaming via headers (e.g., `Content-Type` or `Transfer-Encoding`) or attempting a JSON decode first before falling back to SSE parsing.
- Python 3.12 detection and venv setup logic is now duplicated across the Makefile targets, the `vllmmetal` backend (`findSystemPython` / `downloadAndExtract`), and `build-vllm-metal-tarball.sh`; consolidating this into a single reusable helper or clearly aligning the flows would reduce the risk of version skew or behavior drift.
- In `sandbox_darwin.Create`, when `configuration` is empty the process is no longer sandboxed but error messages still refer to a "sandboxed process"; updating the error text or differentiating the code path would make behavior clearer and easier to debug.
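The header-based streaming detection suggested in the first point could be sketched as follows (illustrative helper, not the PR's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// isSSEStream reports whether a response should be parsed as Server-Sent
// Events based on its Content-Type header, instead of sniffing the body's
// first line for a "data: " prefix, which is brittle for chunked or
// pretty-printed JSON. (Illustrative helper, not the PR's actual code.)
func isSSEStream(contentType string) bool {
	// Content-Type may carry parameters, e.g. "text/event-stream; charset=utf-8".
	mediaType := strings.TrimSpace(strings.SplitN(contentType, ";", 2)[0])
	return strings.EqualFold(mediaType, "text/event-stream")
}

func main() {
	fmt.Println(isSSEStream("text/event-stream; charset=utf-8")) // true
	fmt.Println(isSSEStream("application/json"))                 // false
}
```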
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new non-streaming handling in `ChatWithMessagesContext` relies on the first line not starting with `data: `, which can be brittle for chunked or pretty-printed JSON responses; consider detecting streaming via headers (e.g., `Content-Type` or `Transfer-Encoding`) or attempting JSON decode first before falling back to SSE parsing.
- Python 3.12 detection and venv setup logic is now duplicated across the Makefile targets, `vllmmetal` backend (`findSystemPython` / `downloadAndExtract`), and `build-vllm-metal-tarball.sh`; consolidating this into a single reusable helper or clearly aligning the flows would reduce the risk of version skew or behavior drift.
- In `sandbox_darwin.Create`, when `configuration` is empty the process is no longer sandboxed but error messages still refer to a "sandboxed process"; updating the error text or differentiating the code path would make behavior clearer and easier to debug.
## Individual Comments
### Comment 1
<location> `pkg/inference/backends/vllmmetal/vllmmetal.go:141` </location>
<code_context>
out, err := exec.CommandContext(ctx, pythonPath, "--version").Output()
</code_context>
<issue_to_address>
**security (go.lang.security.audit.dangerous-exec-command):** Detected non-static command inside Command. Audit the input to 'exec.Command'. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code.
*Source: opengrep*
</issue_to_address>
### Comment 2
<location> `pkg/inference/backends/vllmmetal/vllmmetal.go:209` </location>
<code_context>
venvCmd := exec.CommandContext(ctx, pythonPath, "-m", "venv", v.installDir)
</code_context>
<issue_to_address>
**security (go.lang.security.audit.dangerous-exec-command):** Detected non-static command inside Command. Audit the input to 'exec.Command'. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code.
*Source: opengrep*
</issue_to_address>
### Comment 3
<location> `pkg/inference/backends/vllmmetal/vllmmetal.go:274` </location>
<code_context>
cmd := exec.CommandContext(ctx, v.pythonPath, "-c", "import vllm_metal")
</code_context>
<issue_to_address>
**security (go.lang.security.audit.dangerous-exec-command):** Detected non-static command inside Command. Audit the input to 'exec.Command'. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code.
*Source: opengrep*
</issue_to_address>
### Comment 4
<location> `pkg/sandbox/sandbox_darwin.go:127` </location>
<code_context>
command = exec.CommandContext(ctx, name, arg...)
</code_context>
<issue_to_address>
**security (go.lang.security.audit.dangerous-exec-command):** Detected non-static command inside Command. Audit the input to 'exec.Command'. If unverified user data can reach this call site, this is a code injection vulnerability. A malicious actor can inject a malicious script to execute arbitrary code.
*Source: opengrep*
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai resolve

@sourcery-ai dismiss
ilopezluna
left a comment
I'm super excited about this one 👏
ericcurtin
left a comment
This LGTM. Want to answer @ilopezluna's queries, @doringeman?
Force-pushed from e15e244 to 9258bbf
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Force-pushed from 9258bbf to 9f64701
…extraction Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
@sourcery-ai dismiss
You can build and install it manually using:
or, let DMR pull vllm-metal from Hub (https://hub.docker.com/layers/docker/model-runner/vllm-metal-v0.1.0-20260126-121650/images/sha256-9f4eeb45c168889baa14e935f5a9200bf1dcc6ab9d5be392fb1092a414fcc8cf from https://github.com/vllm-project/vllm-metal/releases/tag/v0.1.0-20260126-121650)
or, install vllm-metal from local source (for vllm-metal development)
Running `vllm-metal` requires Python 3.12, the same version used for compiling the wheel. The `vllm-metal` files are downloaded and installed at `~/.docker/model-runner/vllm-metal`. Then, pull and use an MLX model.

At the moment `vllm-metal` doesn't seem to support streaming, so the CLI needs a tweak.
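The CLI tweak could be as simple as a per-backend capability check that forces non-streaming requests for vllm-metal (hypothetical sketch; the actual change in the PR may differ):

```go
package main

import "fmt"

// supportsStreaming reports whether a backend can serve SSE streaming
// responses. vllm-metal currently cannot, so the CLI would fall back to a
// single non-streaming completion request for it.
// (Hypothetical sketch; the actual CLI change in the PR may differ.)
func supportsStreaming(backend string) bool {
	return backend != "vllm-metal"
}

func main() {
	for _, b := range []string{"llama.cpp", "vllm-metal"} {
		fmt.Printf("backend=%s stream=%v\n", b, supportsStreaming(b))
	}
}
```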