kewei

kewei synced commits to ronshakutai/gpu-optimizations at kewei/presidio from mirror

  • 8c453d4948 remove benchmark script.
  • c46929fd05 Merge branch 'main' into ronshakutai/gpu-optimizations
  • e0109d405c Bump actions/cache from 4 to 5 (#1817) Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5. - [Release notes](https://github.com/actions/cache/releases) - [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md) - [Commits](https://github.com/actions/cache/compare/v4...v5) --- updated-dependencies: - dependency-name: actions/cache dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  • 8d7804dca4 Add comprehensive error path mocking for device_detector and spacy GPU config tests
  • 754349be5e test: enhance GPU detection tests for DeviceDetector and SpacyNlpEngine
  • Compare 11 commits »

3 hours ago

kewei synced commits to main at kewei/presidio from mirror

  • e0109d405c Bump actions/cache from 4 to 5 (#1817) Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5. - [Release notes](https://github.com/actions/cache/releases) - [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md) - [Commits](https://github.com/actions/cache/compare/v4...v5) --- updated-dependencies: - dependency-name: actions/cache dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

3 hours ago

kewei synced commits to coverage-data-presidio-structured at kewei/presidio from mirror

3 hours ago

kewei synced commits to coverage-data-presidio-image-redactor at kewei/presidio from mirror

3 hours ago

kewei synced commits to coverage-data-presidio-cli at kewei/presidio from mirror

3 hours ago

kewei synced commits to coverage-data-presidio-anonymizer at kewei/presidio from mirror

3 hours ago

kewei synced commits to coverage-data-presidio-analyzer at kewei/presidio from mirror

3 hours ago

kewei synced commits to main at kewei/protobuf from mirror

  • d326c1d6da Prepare to make many APIs [[nodiscard]]. PiperOrigin-RevId: 844770264
  • eade3d4c54 [ObjC] Revise tag parsing to have a 5 byte limit. There is now a conformance test that checks overlong varints within tags; revise the tag parsing to ensure it doesn't allow overlong values. PiperOrigin-RevId: 844767809
  • 4c2f2727bc Automated Code Change PiperOrigin-RevId: 844736039
  • 2d9cbdc589 Add support for moving lazy fields in static reflection message movers. PiperOrigin-RevId: 844643215
  • 4459a20205 Support more chars in type URLs in the Python text-format parser. Change the Python text-format parser to allow for more characters and formats in the type URL prefixes of expanded Any protos. This follows a recent change to the text-format spec which we are now closely following. Refs: - [1] https://protobuf.dev/reference/protobuf/textformat-spec/#characters - [2] https://protobuf.dev/reference/protobuf/textformat-spec/#field- PiperOrigin-RevId: 844614988
  • Compare 5 commits »

3 hours ago

kewei synced commits to moe-imp at kewei/transformers from mirror

3 hours ago

kewei synced commits to main at kewei/transformers from mirror

  • c7aec088a6 Enforce call to `post_init` and fix all of them (#42873) * fix all * style * simplify * improve msg * fix * oupsi * add colors and sort
  • f3d5f2558b [CB] Easy optimizations for continuous batching (#42839) * Cb example more args * Remove useless sync * Better new tokens, and no more BS1 on outputs * Add dynamic to compile to avoid many graphs * Sort prefix to maximize cache hits * More robust ways to retrieve results in test * Style * Update src/transformers/generation/continuous_batching/continuous_api.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  • 298d08dc36 typo (#42863) just a typo
  • a187b857a7 Remove tied weights from internal attribute if they are not tied (#42871) fix
  • 64c12fdf5f [docs] Improve contribution guidelines for Quantization (#42870) * update * fix * nit * nit
  • Compare 8 commits »

3 hours ago

kewei synced commits to clean-param-size at kewei/transformers from mirror

3 hours ago

kewei synced commits to master at kewei/mindformers from mirror

  • 7b8b27fe9a !7869 [master][bugfix] Fix sharded_tensor and metadata test cases. Merge pull request !7869 from 森镇/fix_test_ut_of_sharded_tensor
  • 13add2b32e !7866 [bugfix][master] Remove unused dataset configuration in the Qwen3 series to avoid extra operations that cause performance fluctuations. Merge pull request !7866 from hsshuai/bugfix/master/dataset_setting
  • d3b3bbc2a3 Fix sharded_tensor and metadata test cases
  • 9f0a67b57f !7837 Fix intermittent errors in blended_megatron_dataset_builder and gpt_data. Merge pull request !7837 from zzzkeke/new/add_builder_test
  • 477621fde7 Update dataset configurations to remove attention_mask in Qwen3 YAML
  • Compare 6 commits »

4 hours ago

kewei synced commits to main at kewei/lobe-chat from mirror

  • cf197e638d 🐛 fix: correct claude_args quoting to prevent shell-quote parsing errors (#10790) Closes #10789

4 hours ago

kewei synced commits to lighthouse at kewei/lobe-chat from mirror

4 hours ago

kewei synced commits to main at kewei/RWKV-LM from mirror

4 hours ago

kewei synced commits to main at kewei/netron from mirror

4 hours ago

kewei synced commits to refs/merge-requests/38621/head at kewei/mesa from mirror

  • 58aa336eb7 lavapipe: Add CPU-based BC texture decompression emulation Add support for BC (S3TC/RGTC/BPTC) compressed texture emulation using the dual-plane approach. Block compressed textures are decompressed to an emulation plane during image copies using util_format_translate_3d(). Unlike ASTC/ETC2 which have dedicated unpack functions, BC formats use util_format_translate_3d() which handles the decompression internally through format pack/unpack callbacks. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>

4 hours ago

kewei synced commits to main at kewei/mesa from mirror

  • 188193cbf2 iris: Add comments from Bspec fast-clear preamble page Copy and paste from anv. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38928>
  • 18e67d853f iris: Fix pipe control around fast-clears Use the right pipe control helper function so that texture invalidates occur after the end-of-pipe sync rather than during. Fixes: 23658920d13 ("anv,iris: Skip tex invalidate for clear conversion") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12550 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38928>
  • a2b70ce4ec aco/isel: remove uniform reduce/scan optimization This is now done in NIR, with the exception of exclusive min/max/and/or scans. But those are not really useful, and if we ever come across them we can optimize them in NIR using write_invocation_amd. No Foz-DB changes on Navi21. Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38902>
  • 81245e262f radeonsi: use nir_opt_uniform_subgroup Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38902>
  • ec81337d8d radv: use nir_opt_uniform_subgroup Foz-DB Navi21: Totals from 665 (0.68% of 97581) affected shaders: MaxWaves: 12856 -> 12822 (-0.26%) Instrs: 2073376 -> 2068645 (-0.23%); split: -0.23%, +0.00% CodeSize: 11116904 -> 11098376 (-0.17%); split: -0.18%, +0.01% VGPRs: 39584 -> 39568 (-0.04%); split: -0.20%, +0.16% SpillSGPRs: 160 -> 155 (-3.12%) SpillVGPRs: 2995 -> 2968 (-0.90%) Latency: 15432093 -> 15503462 (+0.46%); split: -0.13%, +0.59% InvThroughput: 3344411 -> 3351185 (+0.20%); split: -0.08%, +0.28% VClause: 50278 -> 50225 (-0.11%); split: -0.15%, +0.04% SClause: 57537 -> 57505 (-0.06%); split: -0.18%, +0.13% Copies: 189642 -> 188175 (-0.77%); split: -0.86%, +0.08% Branches: 68800 -> 68502 (-0.43%); split: -0.45%, +0.02% PreSGPRs: 37646 -> 37068 (-1.54%) PreVGPRs: 35891 -> 35943 (+0.14%) VALU: 1386943 -> 1385881 (-0.08%); split: -0.09%, +0.01% SALU: 287322 -> 284165 (-1.10%); split: -1.11%, +0.01% VMEM: 90874 -> 90820 (-0.06%) Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38902>
  • Compare 34 commits »

4 hours ago

kewei synced commits to users/makslevental/mlirpythonsupport at kewei/llvm-project from mirror

  • 164217cc33 Merge branch 'main' into users/makslevental/mlirpythonsupport
  • 9975cb166e [libc++][expected] Applied `[[nodiscard]]` (#170245) [[nodiscard]] should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html - https://wg21.link/expected.bad.void - https://wg21.link/expected.bad - https://wg21.link/expected.expected - https://wg21.link/expected.void - https://wg21.link/expected.unexpected It was already discussed not to mark the type `std::expected` as `[[nodiscard]]` see: https://github.com/llvm/llvm-project/pull/139651 https://github.com/llvm/llvm-project/pull/130820 Also: https://github.com/llvm/llvm-project/pull/154943
  • 8f93365b19 [tsan] Export __cxa_guard_ interceptors from TSan runtime. (#171921) These functions from C++ ABI are defined in compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp and are supposed to replace implementations from libstdc++/libc++abi. We need to export them similar to why we need to export other interceptors and TSan runtime functions - e.g. if a dlopen-ed shared library depends on `__cxa_guard_acquire`, it needs to pick up the exported definition from the TSan runtime that was linked into the main executable calling the dlopen() However, because the `__cxa_guard_` functions don't use traditional interceptor machinery, they are omitted from the auto-generated `libclang_rt.tsan.a.syms` files. Fix this by adding them to tsan.syms.extra file explicitly. Co-authored-by: Vitaly Buka <vitalybuka@google.com>
  • 4b267d5caa [MLIR][MemRef] Emit error on atomic generic result op defined outside the region (#172190) While figuring out how to perform an atomic exchange on a memref, I tried the generic atomic rmw with the yielded value captured from the enclosing scope (instead of a plain atomic_rmw with `arith::AtomicRMWKind::assign`). Instead of segfaulting, this PR changes the pass to produce an error when the result is not found in the region's IR map. It might be more useful to give a suggestion to the user, but giving an error message instead of a crash is at least an improvement, I think. See: #172184
  • dd33690686 [X86] combineVectorSizedSetCCEquality - convert to mayFoldIntoVector helper (#172215) Add AssumeSingleUse (default = false) argument to mayFoldIntoVector to allow us to match combineVectorSizedSetCCEquality behaviour with AssumeSingleUse=true Hopefully we can drop the AssumeSingleUse entirely soon, but there are a number of messy test regressions that need handling first
  • Compare 214 commits »

5 hours ago

kewei synced commits to users/hev/issue-168152 at kewei/llvm-project from mirror

5 hours ago
