
[Bug]: Bad handling of indexing in detect_sharding_from_config #9767

@tcherckez-nvidia

Description


System Info

H100

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

build_and_run_ad.py --model THUDM/GLM-4-9B-0414 \
  --args.model-factory AutoModelForCausalLM \
  '--args.model-kwargs={}' \
  --args.tokenizer null \
  --args.world-size 2 \
  --args.compile-backend torch-compile \
  --args.attn-backend flashinfer \
  --args.runtime trtllm \
  --args.skip-loading-weights False \
  --args.transforms.detect-sharding.simple-shard-only False \
  --args.max-seq-len 512 \
  --benchmark.enabled True \
  --benchmark.results-path /jet/logs/basic/auto-deploy-model-coverage_ab-flashinfer_b-true_cb-torch-compile_m-thudm-glm-4-9b-0414_mf-automodelforcausallm_mk--_msl-512_r-trtllm_sso-false_sw-false_t-null_ws-2/extra.json \
  --benchmark.store-results true

Expected behavior

The command should run to completion and produce benchmark results.

Actual behavior

The run crashes with an IndexError in _process_column_sharding during the detect-sharding transform:

0:   File "/opt/tensorrt-llm/tensorrt_llm/_torch/auto_deploy/transform/interface.py", line 358, in __call__
0:     mod, info_apply = self._apply_per_gm_or_whole_model(mod, cm, factory, shared_config)
0:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0:   File "/opt/tensorrt-llm/tensorrt_llm/_torch/auto_deploy/transform/interface.py", line 417, in _apply_per_gm_or_whole_model
0:     graph_sub, info_apply = self._apply(graph_sub, cm, factory, shared_config)
0:                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0:   File "/opt/tensorrt-llm/tensorrt_llm/_torch/auto_deploy/transform/library/sharding.py", line 243, in _apply
0:     info += detect_sharding_from_config(gm, transform_container, ShardingSource.FACTORY)
0:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0:   File "/opt/tensorrt-llm/tensorrt_llm/_torch/auto_deploy/transform/library/sharding.py", line 666, in detect_sharding_from_config
0:     _process_column_sharding(
0:   File "/opt/tensorrt-llm/tensorrt_llm/_torch/auto_deploy/transform/library/sharding.py", line 486, in _process_column_sharding
0:     fused_weight_dims = [s.args[3] - s.args[2] for s in linear_node.users]
0:                          ~~~~~~^^^
0: IndexError: tuple index out of range
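The failing list comprehension at sharding.py:486 assumes every user of linear_node is a slice call that carries explicit start and end positional arguments (args[2] and args[3]). For this model, at least one user node has fewer positional args, so args[3] is out of range. Below is a minimal standalone sketch of the failure mode plus a guarded variant; FakeNode and the argument layouts are illustrative assumptions, not the actual fx graph or a proposed patch:

# Hypothetical stand-ins for fx nodes; .args mimics the positional args of an
# aten.slice.Tensor call: (input, dim, start, end[, step]).
class FakeNode:
    def __init__(self, args):
        self.args = args

users = [
    FakeNode(("linear_out", 0, 0, 4096)),  # explicit bounds: args[3] - args[2] works
    FakeNode(("linear_out", 0)),           # defaulted bounds: only 2 positional args
]

# Current logic from _process_column_sharding (crashes on the second user):
#   fused_weight_dims = [s.args[3] - s.args[2] for s in users]  # IndexError

# Guarded variant: only compute dims when every user actually carries explicit
# bounds; otherwise treat the node as not matching the fused-weight pattern.
slice_users = [u for u in users if len(u.args) >= 4]
if len(slice_users) == len(users):
    fused_weight_dims = [u.args[3] - u.args[2] for u in slice_users]
else:
    fused_weight_dims = None  # pattern not matched; skip column sharding here
print(fused_weight_dims)

Whether the right fix is to skip such nodes or to resolve defaulted slice bounds is a maintainer call; the guard above only avoids the crash.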

Additional notes

NA

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Labels

  • AutoDeploy<NV>: AutoDeploy backend
  • Customized kernels<NV>: Specialized/modified CUDA kernels in TRTLLM for LLM ops, beyond standard TRT. Dev & perf.
  • bug: Something isn't working
  • triaged: Issue has been triaged by maintainers
