huqi

huqi synced commits to master at huqi/samples from mirror

  • 16d6b7d8e2 !2811 fix soc version * fix soc version

20 minutes ago

huqi synced commits to v0.11.0-dev at huqi/vllm-ascend from mirror

  • 87c0cfafa3 [0.11.0][Bugfix] fix fastapi version (#5048) ### What this PR does / why we need it? fix fastapi version <0.124.0 Signed-off-by: hfadzxy <starmoon_zhang@163.com>

37 minutes ago

huqi synced commits to V4.5.0 at huqi/oceanbase-doc from mirror

  • 62346c0a61 PullRequest: 11449 450:upgrade overview
  • 9ef134144d PullRequest: 11559 ob_llm_translate: zh->en, fix SQL doc minors
  • fc1208b6fc PullRequest: 11558 fix SQL doc minors
  • e1b88f3dc9 PullRequest: 11521 450 dima:Low error modification
  • b8f81efe2b PullRequest: 11482 450: Add heap table usage recommendations [pending patch 441]
  • Compare 7 commits »

49 minutes ago

huqi synced commits to V4.4.1 at huqi/oceanbase-doc from mirror

  • e06bb0d81d PullRequest: 11090 441 dima:Low error modification

49 minutes ago

huqi synced commits to V4.2.5 at huqi/oceanbase-doc from mirror

  • bdfdeb0d37 PullRequest: 11530 425 dima:Low error modification

49 minutes ago

huqi synced commits to V4.2.4 at huqi/oceanbase-doc from mirror

  • c61d383949 PullRequest: 11531 424 dima:Low error modification

49 minutes ago

huqi synced commits to V4.2.3 at huqi/oceanbase-doc from mirror

  • 6f2e46da85 PullRequest: 11532 423 dima:Low error modification

49 minutes ago

huqi synced commits to V4.2.1 at huqi/oceanbase-doc from mirror

  • 9a855d6c3e PullRequest: 11534 421 dima:Low error modification

49 minutes ago

huqi synced commits to V4.0.0-preview at huqi/oceanbase-doc from mirror

  • 3a123bd01a PullRequest: 11537 400use dima:Low error modification

49 minutes ago

huqi synced commits to main at huqi/LLaMA-Factory from mirror

  • a0179772ab [example] add deepspeed autotp config and example (#9602)

6 hours ago

huqi synced commits to main at huqi/vllm-ascend from mirror

  • b662d914a4 [bugfix] [main] Fix KV cache query inconsistency across different TP ranks in the KV Pool (#5030) ### What this PR does / why we need it? In the current KV Pool scenario for models like MLA and GQA, where different TP ranks generate identical KV caches, the system is designed to store only a single copy. The previous approach allowed each card to query storage requirements dynamically, but inconsistent query results across cards led to incorrect storage. To fix this, the new solution pre-allocates storage responsibilities; each card now simply stores its pre-assigned blocks, bypassing the inconsistent query step and ensuring data correctness. - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 --------- Signed-off-by: fems14 <1804143737@qq.com>
  • c064d11fd7 [Cleanup] Remove unused attn_metadata parameter from Proposer classes (#4862) The `attn_metadata` is not used by any draft proposer, so we can remove it. - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 --------- Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
  • a9625851ef [Attention] Temporarily add back pa for small batch sizes. (#4765) ### What this PR does / why we need it? This PR adds back pa in scenarios of small batch sizes due to performance consideration. Will remove pa once fia performs better than pa in all scenarios. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed with existing test. - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 --------- Signed-off-by: whx-sjtu <2952154980@qq.com> Co-authored-by: weijinqian0 <1184188277@qq.com>
  • 95e6400128 [KVPool]Fix PP get bug (#5007) ### What this PR does / why we need it? When kv caches are evicted from the key-value pool, it's possible that the kv cache for pp0 is still active, but the kv cache for pp1 has already been evicted. Therefore, a unified check is needed during the get operation. - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 Signed-off-by: baxingpiaochong <771405853@qq.com> Co-authored-by: Jade Zheng <zheng.shoujian@outlook.com>
  • a5cb8e40f5 [doc]Modify quantization tutorials (#5026) ### What this PR does / why we need it? Modify quantization tutorials to correct a few mistakes: Qwen3-32B-W4A4.md and Qwen3-8B-W4A8.md Qwen3-8B-W4A8: need to set one idle npu card. Qwen3-32B-W4A4: need to set two idle npu cards for the flatquant training and modify the calib_file path which does not match the ModeSlim version. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 Signed-off-by: IncSec <1790766300@qq.com>
  • Compare 24 commits »

6 hours ago

huqi synced commits to main at huqi/opencompass from mirror

  • 5c0213de89 [Fix] fix OpenAISDKRollout and dump-res-length (#2351)
  • 2667d4d0ec [Dataset] add dataset SciReasoner (#2360) * add scireasoner datasets * reorganize scireasoner * commit1 * fix * fix * fix * fix * fix * fix * fix * fix * fix summarizer --------- Co-authored-by: yusun-nlp <yusun.nlp@gmail.com>
  • d836b49fee [ci] add v1.8 new datasets (#2358) * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update
  • Compare 3 commits »

7 hours ago

huqi synced commits to V4.4.1 at huqi/oceanbase-doc from mirror

  • eb18eee2a5 PullRequest: 11090 441 dima:Low error modification

7 hours ago

huqi synced commits to develop at huqi/Paddle from mirror

  • 24c29dc7eb [CodeStyle][Xdoctest][17,21,27,28] Fix example code(`paddle.Tensor.matmul`,`paddle.Tensor.new_empty`,`paddle.Tensor.sgn`,`paddle.Tensor.shape`,) (#76691) * Doc Sample Code Fix Task * Doc Sample Code Fix * Doc Sample Code Fix * doc_sample_code_fix * Update python/paddle/tensor/math.py Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com> * doc_sample_code_fix:paddle.Tensor.shape --------- Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
  • 107306f398 enable avx (#76835)
  • 24e18c9948 Update forwards.h to metax (#76875)
  • 52f76d38a4 fix custom device all to all (#76880)
  • 704e3eeb87 [API Compatibilty][CustomDevice] Remove force set grad to None (#76883) * fix(eager): remove type and place matching checks when clearing gradients - Drop the special handling for gradient tensors that mismatch the original tensor in type or place - Zeroing is now performed whenever set_to_zero is true, regardless of whether the gradient tensor matches - Update test cases to verify the new gradient-clearing behavior - Explicitly set the set_to_zero parameter to false in tests to cover resetting gradients to None * fix(test): relax exception type checks for gradient computation errors - Change test assertions from catching only RuntimeError to catching both ValueError and RuntimeError - This more accurately reflects the exception types that may actually be raised - Ensures the test validates behavior correctly across multiple error scenarios
  • Compare 11 commits »

7 hours ago

huqi synced commits to master at huqi/samples from mirror

16 hours ago

huqi synced commits to master at huqi/samples from mirror

  • 2a3fbb4413 !2809 update include path for Libraries * update include path for Libraries

23 hours ago

huqi synced commits to develop at huqi/Paddle from mirror

  • fdac87060f DeepEP normal dispatch stream bugfix (#76901)
  • de56830107 [XPU] update xhpc to 20251213 (#76902)
  • 3e32d54ca0 support batched_gemm fp32 (#76897)
  • b0394b13c0 [XPU] support bflaot16,float16 type for op scatter_nd (#76893)
  • 58a6a3aa47 [XPU] binding ceil ond gaussian_inplace op on xpu3 (#76874) * [XPU] binding ceil ond gaussian_inplace op on xpu3 * [XPU] ceil not support int and long
  • Compare 7 commits »

1 day ago

huqi synced commits to main at huqi/LLaMA-Factory from mirror

1 day ago

huqi synced commits to main at huqi/vllm-ascend from mirror

  • bb7b74c14f add ut for model runner (#4991) ### What this PR does / why we need it? add ut for model runner - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 --------- Signed-off-by: LookAround <lixushi@huawei.com>
  • 8090914d69 [CI] CI refactor (#4928) 1. rename workflow to better name 2. fix lint error 3. remove accuracy report doc and test - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
  • ba28d54f35 [Perf]enable prefill flashcommon3 (#4065) ### What this PR does / why we need it? moe multistream overlap to improve the performance. ### How was this patch tested? --additional-config '{"multistream_overlap_gate": true}' - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9 --------- Signed-off-by: AlvisGong <gwly0401@163.com> Signed-off-by: chenxiao <Jaychou1620@Gmail.com> Co-authored-by: clrs97 <524936896@qq.com> Co-authored-by: zzhx1 <zzh_201018@outlook.com> Co-authored-by: chenxiao <Jaychou1620@Gmail.com>
  • Compare 3 commits »

1 day ago

huqi synced commits to develop at huqi/Paddle from mirror

  • 60c74b5142 [XPU] Fix contiguous complex64 strided-view regression test (#76895) - Use device-based tensor creation instead of unsupported place kwarg - Remove try/except and gate dtypes via XPU support list - Use as_strided to build non-contiguous zero-numel views for branch coverage - Rename test file to reflect strided-view focus

1 day ago
