WARNING: THIS SITE IS A MIRROR OF GITHUB.COM / IT CANNOT LOGIN OR REGISTER ACCOUNTS / THE CONTENTS ARE PROVIDED AS-IS / THIS SITE ASSUMES NO RESPONSIBILITY FOR ANY DISPLAYED CONTENT OR LINKS / IF YOU FOUND SOMETHING MAY NOT GOOD FOR EVERYONE, CONTACT ADMIN AT ilovescratch@foxmail.com
Skip to content

Conversation

@fragosoluana
Copy link

Description

When a query includes overlapping phrases, the expansion process may generate duplicate phrases—one with the original (possibly high) user-defined boost, and another one with the boost of 1. As a result, the final boost value assigned to the QueryPhraseMap may be incorrect, since it is determined by whichever duplicate is processed last during the creation of the QueryPhraseMap in the markTerminal method.

We could avoid boost overrides of conflicting expanded phrases by taking the max boost in markTerminal. The expectation is that if there are duplicate phrases, one is from the original query and the other is from the expand method with boost of 1. Therefore, it should have one phrase with boost > 1 from the original query, and another equals to 1 from the expanded query. For example, with the expanded phrases [“a b c”: 100, “a b”: 20, “a b c”: 1, “b c”: 50], the final query phrase mapping would be “a b c”: 100.

Fixes #15433

this.terminal = true;
this.slop = slop;
this.boost = boost;
this.boost = Math.max(this.boost, boost);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your approach seems simpler, but I think we have an opportunity to optimize this. Instead of always executing all assignments, should we have something like:

   if (!this.terminal || boost > this.boost) {
        this.terminal = true;
        this.slop = slop;
        this.boost = boost;
        this.termOrPhraseNumber = fieldQuery.nextTermOrPhraseNumber();
      }

Benefits:

Performance: Avoids unnecessary state updates when boost doesn't improve

Semantic correctness: termOrPhraseNumber only increments when meaningful changes occur

Cleaner logic: Single condition handles both initialization and duplicate prevention

Current approach with Math.max():

Always updates all fields, even when boost=1 < existing boost=100

Increments termOrPhraseNumber unnecessarily on duplicate calls

Proposed approach:

Only updates when first call (!this.terminal) or when boost improves (boost > this.boost)

More efficient for queries with many overlapping phrases

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a fine suggestion but it's also a trivial matter. I'd even argue the complexity of the condition you propose makes the situation worse.... a future reader may scratching their heads for a little longer than merits the code being avoided.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 4, 2025

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
Please bring up to date with main & add a changelog entry to CHANGES.txt

pqF(50, "b", "c"));

Map<String, QueryPhraseMap> map = fq.rootMaps;
QueryPhraseMap qpm = map.get("f").subMap.get("a");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the variable name choices here confusing. For example, should this one be a_qpm maybe?

@github-actions github-actions bot removed the Stale label Dec 27, 2025
@github-actions github-actions bot added this to the 11.0.0 milestone Jan 5, 2026
@fragosoluana fragosoluana requested a review from sunny6142 January 5, 2026 16:20
@fragosoluana
Copy link
Author

Thanks for the reviews! I really appreciate it, especially during the holiday season.

@dsmiley I’ve addressed your comments by adding an entry to CHANGES.txt and renaming the test variables. This branch is now up to date with main.

I’m currently unable to merge the PR, so I’ve re-requested @sunny6142's review in case that’s the blocker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Highlighter QueryPhraseMap boost overridden due to conflicting query expansion

3 participants