-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Fix QueryPhraseMap.markTerminal() boost override due to conflicting query expansion #15434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix QueryPhraseMap.markTerminal() boost override due to conflicting query expansion #15434
Conversation
| this.terminal = true; | ||
| this.slop = slop; | ||
| this.boost = boost; | ||
| this.boost = Math.max(this.boost, boost); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your approach seems simpler, but I think we have an opportunity to optimize this. Instead of always executing all assignments, should we have something like:
if (!this.terminal || boost > this.boost) {
this.terminal = true;
this.slop = slop;
this.boost = boost;
this.termOrPhraseNumber = fieldQuery.nextTermOrPhraseNumber();
}
Benefits:
Performance: Avoids unnecessary state updates when boost doesn't improve
Semantic correctness: termOrPhraseNumber only increments when meaningful changes occur
Cleaner logic: Single condition handles both initialization and duplicate prevention
Current approach with Math.max():
Always updates all fields, even when boost=1 < existing boost=100
Increments termOrPhraseNumber unnecessarily on duplicate calls
Proposed approach:
Only updates when first call (!this.terminal) or when boost improves (boost > this.boost)
More efficient for queries with many overlapping phrases
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a fine suggestion but it's also a trivial matter. I'd even argue the complexity of the condition you propose makes the situation worse.... a future reader may scratching their heads for a little longer than merits the code being avoided.
|
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution! |
dsmiley
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Please bring up to date with main & add a changelog entry to CHANGES.txt
| pqF(50, "b", "c")); | ||
|
|
||
| Map<String, QueryPhraseMap> map = fq.rootMaps; | ||
| QueryPhraseMap qpm = map.get("f").subMap.get("a"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the variable name choices here confusing. For example, should this one be a_qpm maybe?
|
Thanks for the reviews! I really appreciate it, especially during the holiday season. @dsmiley I’ve addressed your comments by adding an entry to CHANGES.txt and renaming the test variables. This branch is now up to date with main. I’m currently unable to merge the PR, so I’ve re-requested @sunny6142's review in case that’s the blocker. |
Description
When a query includes overlapping phrases, the expansion process may generate duplicate phrases—one with the original (possibly high) user-defined boost, and another one with the boost of 1. As a result, the final boost value assigned to the QueryPhraseMap may be incorrect, since it is determined by whichever duplicate is processed last during the creation of the QueryPhraseMap in the markTerminal method.
We could avoid boost overrides of conflicting expanded phrases by taking the max boost in markTerminal. The expectation is that if there are duplicate phrases, one is from the original query and the other is from the expand method with boost of 1. Therefore, it should have one phrase with boost > 1 from the original query, and another equals to 1 from the expanded query. For example, with the expanded phrases [“a b c”: 100, “a b”: 20, “a b c”: 1, “b c”: 50], the final query phrase mapping would be “a b c”: 100.
Fixes #15433