remove add_bias option #5425

base: develop
Conversation
Thanks for your contribution!
Pull request overview
This PR removes the deprecated add_bias option from linear layers throughout the codebase. The parameter was previously used to control whether bias should be added in the current layer or in pre/post layers, but is no longer supported.
Key changes:
- Removed the `add_bias` parameter from all linear layer classes (`LinearBase`, `ColumnParallelLinear`, `MergedColumnParallelLinear`, `ReplicatedLinear`, `GatedLinear`, `QKVParallelLinear`, `RowParallelLinear`); see the sketch below
- Updated quantization methods to unconditionally pass `layer.bias` instead of checking the `add_bias` flag
- Cleaned up test mocks that referenced the removed parameter
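A rough sketch of the resulting API shape (a simplified, hypothetical class, not the actual FastDeploy code): `with_bias` alone now decides whether a layer owns a bias parameter.

```python
class LinearBase:
    """Simplified stand-in for the real linear layer base class."""

    def __init__(self, input_size: int, output_size: int, with_bias: bool = False):
        # Previously there was also an `add_bias` flag controlling whether the
        # bias was applied in this layer or in a pre/post layer; it is gone.
        self.input_size = input_size
        self.output_size = output_size
        self.with_bias = with_bias
        # Bias exists only when requested; otherwise it stays None.
        self.bias = [0.0] * output_size if with_bias else None


layer = LinearBase(16, 32, with_bias=True)
assert layer.bias is not None
```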
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Summary per file:
| File | Description |
|---|---|
| fastdeploy/model_executor/layers/linear.py | Removed add_bias parameter from all linear layer class constructors and docstrings; simplified bias handling in RowParallelLinear |
| fastdeploy/model_executor/layers/quantization/weight_only.py | Changed bias parameter to always pass layer.bias instead of conditional based on add_bias |
| fastdeploy/model_executor/layers/quantization/w4afp8.py | Changed bias parameter to always pass layer.bias instead of conditional based on add_bias |
| fastdeploy/model_executor/models/gpt_oss.py | Removed add_bias=True argument from RowParallelLinear instantiation |
| tests/quantization/test_w4afp8.py | Removed mock setup of add_bias attribute from test fixtures |
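The weight_only.py and w4afp8.py rows above follow the pattern sketched here (a hypothetical, simplified `apply` with toy stand-ins, not the real quantization method): the bias is always forwarded, and a `None` bias means "no bias" downstream.

```python
from types import SimpleNamespace


def apply(layer, x, gemm):
    # Before this PR (removed): bias = layer.bias if layer.add_bias else None
    # Now layer.bias is forwarded unconditionally.
    return gemm(x, layer.weight, bias=layer.bias)


# Toy stand-ins for the quantized GEMM kernel and the layer object:
gemm = lambda x, w, bias=None: {"x": x, "w": w, "bias": bias}
layer = SimpleNamespace(weight="W", bias=None)
print(apply(layer, "x", gemm))  # bias is passed through as None
```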
Comments suppressed due to low confidence (1)
tests/quantization/test_w4afp8.py:146
- After removing the `add_bias` parameter, this test no longer properly tests the "without_bias" scenario. The `setUp` method sets `self.layer.bias = "bias"` (line 43), so this test now actually tests with a bias present.

To properly test the case when bias is None, add `self.layer.bias = None` at the beginning of this test method (after line 141), so that it correctly verifies the behavior when no bias is provided.
```python
def test_apply_without_bias(self, mock_gemm):
    mock_gemm.return_value = "out"
    x = "x"
    result = self.method.apply(self.layer, x)
    self.assertEqual(result, "out")
    args = mock_gemm.call_args.kwargs
    self.assertIsNone(args["bias"])
```
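For reference, a sketch of the test with the suggested fix applied (same `setUp` and mock wiring assumed):

```python
def test_apply_without_bias(self, mock_gemm):
    # Suggested fix: clear the bias set in setUp so this test really
    # exercises the no-bias path.
    self.layer.bias = None
    mock_gemm.return_value = "out"
    x = "x"
    result = self.method.apply(self.layer, x)
    self.assertEqual(result, "out")
    args = mock_gemm.call_args.kwargs
    self.assertIsNone(args["bias"])
```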
```diff
- assert with_bias, "with_bias must be True when add_bias is True."
  if self.tp_size > 1 and self.reduce_results:
      set_weight_attrs(self.bias, {"tp_row_bias": True})
```
Copilot AI · Dec 8, 2025
The code now unconditionally accesses `self.bias` when `self.tp_size > 1 and self.reduce_results`, but `self.bias` is `None` when `with_bias=False` (see lines 173-179 in `LinearLayer.__init__`). This will cause an `AttributeError` when trying to set weight attributes on a `None` object.
Consider adding a condition to check if bias exists:

```python
if self.tp_size > 1 and self.reduce_results and self.with_bias:
    set_weight_attrs(self.bias, {"tp_row_bias": True})
```

Suggested change:

```diff
- if self.tp_size > 1 and self.reduce_results:
+ if self.tp_size > 1 and self.reduce_results and with_bias:
```
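A minimal, self-contained illustration of the failure mode, using a hypothetical stand-in for `set_weight_attrs` (the real helper's internals may differ):

```python
def set_weight_attrs(weight, attrs):
    # Hypothetical stand-in: assume the helper sets attributes on the tensor.
    for key, value in attrs.items():
        setattr(weight, key, value)


bias = None  # with_bias=False leaves self.bias as None
try:
    set_weight_attrs(bias, {"tp_row_bias": True})
except AttributeError as exc:
    print(exc)  # setattr on None raises AttributeError
```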
Codecov Report

❌ Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##           develop    #5425   +/-   ##
==========================================
  Coverage         ?   59.51%
==========================================
  Files            ?      327
  Lines            ?    40643
  Branches         ?     6170
==========================================
  Hits             ?    24188
  Misses           ?    14588
  Partials         ?     1867
```
Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
Motivation
Remove the `add_bias` option; it is no longer supported.
Modifications
Usage or Command
Accuracy Tests
Checklist
- Use at least one of the following PR tags: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- Run `pre-commit` before commit.
- If the PR targets the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.