Issues: vllm-project/vllm
[Feature]: Support Cascade Attention for Sliding Window Attention
Labels: feature request, good first issue, help wanted
#15710 opened Mar 28, 2025 by WoosukKwon
[Bug]: docker cluster has no open port
Labels: bug
#15707 opened Mar 28, 2025 by jli113
[Bug]: Qwen2.5: Sliding window for some but all layers is not supported. This model uses sliding window but max_window_layers = 28 is less than num_hidden_layers = 28. Please open an issue to discuss this feature.
Labels: bug
#15705 opened Mar 28, 2025 by Martin7-1
[Bug]: Engine crash periodically running Deepseek V3/R1 on Hopper GPUs in cutlass_scaled_mm_sm90()
Labels: bug
#15702 opened Mar 28, 2025 by rymc

[Bug]: TP16 Not Working on k8s Cluster
Labels: bug
#15698 opened Mar 28, 2025 by yongho-chang
[Feature]: Composite model loading using AutoWeightsLoader for all models
Labels: feature request, good first issue
#15697 opened Mar 28, 2025 by DarkLight1337
[Feature]: MiniCPM-O support beam_search
Labels: feature request
#15694 opened Mar 28, 2025 by yangJirui

[Bug]: Performance downgrade 20% for AWQ + MLA + chunked-prefill
Labels: bug
#15689 opened Mar 28, 2025 by DefTruth

[Bug]: Worker died during distributed inference
Labels: bug
#15687 opened Mar 28, 2025 by Haoxi2002
[Bug]: ValueError: The checkpoint you are trying to load has model type qwen2_5_omni but Transformers does not recognize this architecture.
Labels: bug
#15685 opened Mar 28, 2025 by jieguolove
[Usage]: How to run two models in Docker
Labels: usage
#15682 opened Mar 28, 2025 by Tu1231

[Bug]: Qwen-2-vl inference question about max_seq_length
Labels: bug
#15680 opened Mar 28, 2025 by Eduiskss

[Feature]: K-cache only
Labels: feature request
#15679 opened Mar 28, 2025 by HaiFengZeng

[Bug]: v0.8.2 generates incomplete sequences for Qwen2-VL-7B under specific concurrency range
Labels: bug
#15677 opened Mar 28, 2025 by xushangning

[Bug]: OpenAI-Compatible Server cannot be requested
Labels: bug
#15675 opened Mar 28, 2025 by lihao056

[Usage]: torchrun data parallel and tensor parallel at the same time
Labels: usage
#15672 opened Mar 28, 2025 by JieFeng-cse

[Feature]: Update to flashinfer 0.2.3
Labels: feature request
#15666 opened Mar 28, 2025 by youkaichao
[Bug]: vLLM 0.8.2 OOM error (no error in version 0.7.3)
Labels: bug
#15664 opened Mar 28, 2025 by manitadayon

[Bug]: v0.8.2 with default parameters starts at a very low performance of 9.8 tokens/s with deepseek-r1:671b
Labels: bug
#15663 opened Mar 28, 2025 by Hugh-yw

[Feature]: Distribute the package on macOS
Labels: feature request
#15661 opened Mar 28, 2025 by youkaichao

[Usage]: Different models in the /vllm/model directory
Labels: usage
#15660 opened Mar 28, 2025 by jeerychao

[Feature]: Refactor the logic in tool parser manager and reasoning parser manager
Labels: feature request
#15658 opened Mar 28, 2025 by gaocegege

[Bug]: vLLM server dead issue
Labels: bug
#15653 opened Mar 27, 2025 by pimang62

[Performance]: Update Cascade Attention Heuristics for FA3
Labels: feature request, good first issue, help wanted
#15647 opened Mar 27, 2025 by WoosukKwon

[Usage]: profiling.py example broken on v0.8.1
Labels: usage
#15640 opened Mar 27, 2025 by rajesh-s