fix: try different approach with algolia by owner search #1978
Draft
alex-key wants to merge 3 commits into npmx-dev:main from
Another good example of how Algolia search can differ completely from npmRegistry is @graphieros. He has a different GitHub username (
Any other repo of his has this:
🔗 Linked issue
Possibly resolves #1882, #1888
🧭 Context
The issues above report inconsistent search results for a user's packages between the Algolia and npmRegistry engines.
Searching by username gives substantially different results in Algolia search, npmRegistry search and npmjs.org search. I've done some investigation and testing, so let me explain some points.
Three package fields matter here:
- publisher (`package.publisher.username` field in npmRegistry)
- maintainers (`package.maintainers.username` field in npmRegistry)
- repository (`package.links.repository` field in npmRegistry)

A combination of these fields is used by the different search engines to display search results by username.
Example for `~dbushell`:
- https://www.npmjs.com/~dbushell
- https://npmx.dev/~dbushell
- https://registry.npmjs.org/-/v1/search?text=author:dbushell
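The registry request above can be reproduced with a small helper. This is only a sketch: the function and type names are hypothetical, and it assumes the public registry search endpoint with its `text` and `size` parameters and the `objects[].package.name` response shape.

```typescript
// Hypothetical helper names; only the endpoint and the `author:` /
// `maintainer:` qualifiers come from the discussion in this PR.
const REGISTRY_SEARCH = "https://registry.npmjs.org/-/v1/search";

type Qualifier = "author" | "maintainer";

// Assumed minimal shape of the registry search response.
interface RegistrySearchResponse {
  objects: { package: { name: string } }[];
  total: number;
}

// Build the search URL, e.g. ...?text=author%3Adbushell&size=250
function registrySearchUrl(qualifier: Qualifier, username: string, size = 250): string {
  const params = new URLSearchParams({
    text: `${qualifier}:${username}`,
    size: String(size),
  });
  return `${REGISTRY_SEARCH}?${params.toString()}`;
}

// Fetch the package names for one qualifier (single page only, no paging).
async function fetchPackageNames(qualifier: Qualifier, username: string): Promise<string[]> {
  const res = await fetch(registrySearchUrl(qualifier, username));
  if (!res.ok) throw new Error(`registry search failed: ${res.status}`);
  const body = (await res.json()) as RegistrySearchResponse;
  return body.objects.map((o) => o.package.name);
}
```

Calling `fetchPackageNames("author", "dbushell")` and `fetchPackageNames("maintainer", "dbushell")` side by side makes it easy to see which packages each qualifier contributes.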
You can see that the npmx results using npmRegistry and the direct request to `registry.npmjs.org` are the same, because npmx uses exactly the same query, filtering by `author:{username}` (which refers to the package publisher). The other two results are different.

Algolia returns 19 packages but is missing one that is present in the npmRegistry output: `@dbushell/hmmarkdown2`. You can access it directly on npmx, but you will not see it in the npmx+Algolia search results. It seems to be a newly published package, so Algolia probably does not have it cached yet.

Algolia also returns 5 additional packages for which `dbushell` is neither owner nor maintainer. If you check these packages, all of them are forks of dbushell's repos, and his repo is the one indicated in `package.links.repository`. See `nestable`.

Currently npmx uses the filter `owner.name: {username}` to get packages from Algolia by username. Judging by the results above, Algolia combines two conditions when filtering by `owner.name`: the npmRegistry publisher's name and the repository owner. This leads to extra packages in the results. Algolia also returns deprecated packages, which npmRegistry filters out.

It also shows that Algolia does not use the `package.maintainers` field for an `owner.name` request. Algolia has a specific field called `owners: []` that stores the maintainers, but this field is not searchable. See the output for the `~fb` username from the Algolia and npmRegistry engines on npmx (https://npmx.dev/~fb).

The ideal way to fix the Algolia results is a query like `owner.name:{username} OR owners.name:{username}`. I checked the Algolia index that npmx uses, and `owners.name` (maintainers) does not seem to be a searchable field. I queried Algolia with `facets: ['*']`, and only `owner.name` turned out to be usable in the request `filters`. We could probably just ask https://github.com/algolia/npm-search to add this attribute to the searchable fields. That would be the best solution.

I tried different combinations of parameters, and this PR contains the search config that comes closest to the npmjs.org output. npmjs.org is not the final source of truth, but we want to match it as closely as possible so that the output does not miss packages that are considered to be owned by the requested username.
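To make the three ways a package can be tied to a username concrete, here is a minimal sketch that classifies why a package matches. The record shape and all function names are hypothetical; the repository owner is assumed to be the first path segment of the repository URL. It also includes the ideal filter string, which would only work once `owners.name` is accepted in `filters`.

```typescript
// Normalized view of the three registry fields discussed in this PR.
// The interface and function names are hypothetical, for illustration only.
interface PackageRecord {
  name: string;
  publisher?: string;    // package.publisher.username
  maintainers: string[]; // package.maintainers[].username
  repository?: string;   // package.links.repository
}

type MatchReason = "publisher" | "maintainer" | "repository" | "none";

// Extract the first path segment of a repository URL, e.g. "dbushell" from
// https://github.com/dbushell/some-repo. Returns undefined for other shapes.
function repositoryOwner(url: string | undefined): string | undefined {
  const match = url?.match(/^[a-z+]+:\/\/[^/]+\/([^/]+)\//i);
  return match?.[1];
}

// Report the first reason a package is associated with the username.
function matchReason(pkg: PackageRecord, username: string): MatchReason {
  const same = (value?: string): boolean =>
    value?.toLowerCase() === username.toLowerCase();
  if (same(pkg.publisher)) return "publisher";
  if (pkg.maintainers.some((m) => same(m))) return "maintainer";
  if (same(repositoryOwner(pkg.repository))) return "repository";
  return "none";
}

// The ideal filter from the text; assumes `owners.name` becomes filterable.
function idealAlgoliaFilter(username: string): string {
  return `owner.name:${username} OR owners.name:${username}`;
}
```

A fork published by someone else but still pointing at dbushell's repository would be classified as `"repository"`, which is the kind of extra package the current `owner.name` filter appears to return.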
Here is a comparison of current npmx@npmRegistry vs npmx@Algolia vs npmjs.org vs npmx@thisPrAlgolia for some usernames:
As you can see, the current Algolia implementation misses a lot of packages, and this PR brings it much closer to npmjs.org. There are still discrepancies, though.
Search by username:
- `author:{username}`: returns packages by `publisher`, excluding deprecated
- `maintainer:{username}`: returns packages by `maintainers`, excluding deprecated (often `maintainers` already includes `author`)
- `~{username}`: returns packages by `publisher` and `maintainer`, including deprecated
- `owner:{username}`: returns packages by `publisher` and git repo URL, including deprecated
- `query: {username}, typoTolerance: false`: returns search results across all fields matching `{username}` and applies some additional in-app filtering to remove irrelevant packages

Summarizing:
- add `owners.name` to searchable attributes (see Algolia docs)

📚 Description
This PR implements combined logic for searching by username, in order to match packages that are missed with the current Algolia search parameters.
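A rough sketch of what such combined logic could look like, not the actual code of this PR: it assumes two Algolia requests (the existing `owner.name` filter plus a plain query with `typoTolerance: false`) whose hits are merged, de-duplicated by package name and filtered in-app. The hit shape, the relevance rule and all function names are assumptions.

```typescript
// Assumed minimal shape of an Algolia hit; only `owner.name` and `owners`
// are mentioned in this PR, everything else is hypothetical.
interface AlgoliaHit {
  name: string;
  owner?: { name: string };
  owners?: { name: string }[];
}

// Subset of Algolia search parameters used in this sketch.
interface SearchParams {
  query: string;
  filters?: string;
  typoTolerance?: boolean;
}

// The two requests to combine: the existing owner filter and a plain query.
function buildUserSearches(username: string): SearchParams[] {
  return [
    { query: "", filters: `owner.name:${username}` },
    { query: username, typoTolerance: false },
  ];
}

// Assumed in-app relevance rule: keep hits where the username is the
// owner or appears in the owners (maintainers) list.
function isRelevant(hit: AlgoliaHit, username: string): boolean {
  const target = username.toLowerCase();
  if (hit.owner?.name.toLowerCase() === target) return true;
  return (hit.owners ?? []).some((o) => o.name.toLowerCase() === target);
}

// Merge hit lists in order, dropping irrelevant hits and duplicate names.
function combineHits(username: string, hitLists: AlgoliaHit[][]): AlgoliaHit[] {
  const byName = new Map<string, AlgoliaHit>();
  for (const hits of hitLists) {
    for (const hit of hits) {
      if (isRelevant(hit, username) && !byName.has(hit.name)) {
        byName.set(hit.name, hit);
      }
    }
  }
  return Array.from(byName.values());
}
```

Filtering in-app keeps the plain query usable even though it matches `{username}` in every field, at the cost of fetching hits that are later discarded.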