liaowsh

liaowsh created pull request openioctopus/Grampus#1377

Image management and unified image feature

1 week ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

  • b936cfaa21 Merge remote-tracking branch 'origin/V20251215' into lws-image-0731
  • dd593e6ae8 Merge pull request 'V20251113' (#1361) from V20251113 into master Reviewed-on: https://openi.pcl.ac.cn/openioctopus/Grampus/pulls/1361
  • Compare 2 commits »

1 week ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

1 week ago

liaowsh commented on issue openioctopus/Grampus#1366

Starting an NPU training task with a historical image fails with an error

{ "name": "lws-test-npu-wh-1", "tasks": [ { "name": "lws-test-npu-wh-1", "url": "", "code": { "id": "", "name": "openi_cloudbrain_example1230", "size": 0, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": false, "objectKey": "job/liwei2025120218t554944891/code/master.zip", "isNeedUnzip": true, "isOverwrite": false, "containerPath": "/cache/code/master.zip", "internalMigrateId": 32433 }, "datasets": [ { "id": "", "name": "MnistDataset_torch", "size": 220782009, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": true, "objectKey": "attachment/e/f/ef60e144-bee4-4281-91ad-cee76d0da80fef60e144-bee4-4281-91ad-cee76d0da80f/", "isNeedUnzip": false, "isOverwrite": false, "containerPath": "/cache/dataset/MnistDataset_torch", "internalMigrateId": 32431 }, { "id": "", "name": "MnistDataset_mindspore", "size": 54950081, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": true, "objectKey": "attachment/c/6/c6e4784d-d29e-4dc7-8ec8-c63be4754087c6e4784d-d29e-4dc7-8ec8-c63be4754087/", "isNeedUnzip": false, "isOverwrite": false, "containerPath": "/cache/dataset/MnistDataset_mindspore", "internalMigrateId": 32432 } ], "imageId": "84893615233345d2b27810eeee782d11", "processorType": "npu.huawei.com/NPU", "nodeCount": 1, "resourceSpecId": "f2497d54732b45fb8d887e63be1db4a7", "bootFile": "npu_mnist_example/train.py" } ] } ----- After the code change, the task was reproduced in the 49 environment on the Wuhan sub-center and the Zhongyuan center, and it now succeeds. ![image](/attachments/ae00a9ad-9781-4fcb-b70a-b019a08f2387)

1 week ago

liaowsh commented on issue openioctopus/Grampus#1367

Task creation fails on the Guangzhou supercomputing center

The storage password for the Guangzhou supercomputing center has expired. ![image](/attachments/2e7847a8-d7e0-47bd-a0e3-a424b3538bb2)

1 week ago

liaowsh commented on issue openioctopus/Grampus#1368

Starting a training task on K100_AI resources fails with "internal error"

K100_AI tasks are dispatched to sugon-ai, and that center cannot start training tasks.

1 week ago

liaowsh commented on issue openioctopus/Grampus#1369

GCU training task stays in waiting and is never assigned to a sub-center

{ "name": "lws-test-gcu-0", "tasks": [ { "name": "lws-test-gcu-0", "url": "", "code": { "id": "", "name": "openi_cloudbrain_example", "size": 0, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": false, "objectKey": "job/wjtes2025120314t494456268/code/master.zip", "isNeedUnzip": true, "isOverwrite": false, "containerPath": "/tmp/code/master.zip", "internalMigrateId": 32509 }, "datasets": [ { "id": "", "name": "MnistDataset_torch", "size": 220782009, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": true, "objectKey": "attachment/e/f/ef60e144-bee4-4281-91ad-cee76d0da80fef60e144-bee4-4281-91ad-cee76d0da80f/", "isNeedUnzip": false, "isOverwrite": false, "containerPath": "/tmp/dataset/MnistDataset_torch", "internalMigrateId": 32507 }, { "id": "", "name": "MnistDataset_mindspore", "size": 54950081, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": true, "objectKey": "attachment/c/6/c6e4784d-d29e-4dc7-8ec8-c63be4754087c6e4784d-d29e-4dc7-8ec8-c63be4754087/", "isNeedUnzip": false, "isOverwrite": false, "containerPath": "/tmp/dataset/MnistDataset_mindspore", "internalMigrateId": 32508 } ], "imageId": "7e170aebaa344f5383b4f0e3076405ba", "processorType": "enflame-tech.com/gcu", "nodeCount": 1, "resourceSpecId": "3e301b9ef23e46f0b80787f2afd12144", "bootFile": "gcu_mnist_example/train.py" } ] } --- After the code change in the 49 environment, yesterday's task was reproduced and it now succeeds. ![image](/attachments/cb3503d2-9fc2-421b-8843-dbf0a11b48ac)

1 week ago

liaowsh commented on issue openioctopus/Grampus#1370

Iluvatar GPGPU debugging task fails with "请求参数非法" ("invalid request parameters")

{ "name": "lws-test-lw-train-9", "tasks": [ { "name": "lws-test-lw-train-9", "url": "", "code": { "id": "", "name": "openi_cloudbrain_example", "size": 0, "bucket": "test-opendata", "endPoint": "obs.cn-south-222.ai.pcl.cn", "readOnly": false, "objectKey": "job/wjtes2025120317t095056774/tmp/code/", "isNeedUnzip": true, "isOverwrite": false, "containerPath": "/tmp/code", "internalMigrateId": 32584 }, "imageId": "240e3c8ad8ca49399e173248546d29c1", "centerId": [ "iluvatar-cloud" ], "processorType": "iluvatar.com/iluvatar-gpgpu", "nodeCount": 1, "resourceSpecId": "11e882f8434e40ed88cb751260773a74" } ] }![image](/attachments/4483251f-9b63-4917-ae1a-205f80e5d243)

1 week ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

1 week ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

  • 772e333388 listUserImage add canter_image_id info

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh created ILUVATAR-GPGPU type debugging task liaow202511281514941

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

  • bf3f0027b8 Merge remote-tracking branch 'origin/V20251113' into lws-image-0731
  • 259fbd2a96 Merge pull request 'Completely fix the sub-center id update error' (#1354) from fix-1287 into V20251113 Reviewed-on: https://openi.pcl.ac.cn/openioctopus/Grampus/pulls/1354 Reviewed-by: xiongkai <xiongkai@noreply.localhost>
  • ea5e029ced #1287
  • Compare 3 commits »

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

2 weeks ago

liaowsh pushed to lws-image-0731 at openioctopus/Grampus

3 weeks ago
