
fix: try different approach with algolia by owner search #1978

Draft
alex-key wants to merge 3 commits into npmx-dev:main from alex-key:fix/algolia-search-by-owner-experiment

alex-key (Contributor) commented Mar 7, 2026

🔗 Linked issue

Possibly resolves #1882, #1888

🧭 Context

The issues above report inconsistent results when searching for a user's packages with the Algolia and npmRegistry engines.
Algolia search, npmRegistry search, and the npmjs.com username page all return noticeably different results. I did some investigation and testing, so let me explain the main points.

  1. There is no single source of truth, because no single field can be considered the package owner. The candidates are:
  • npm package publisher / author (package.publisher.username field in npmRegistry)
  • npm package maintainers (package.maintainers.username field in npmRegistry)
  • git repo owner (package.links.repository field in npmRegistry)

Different search engines use different combinations of these fields to display search results by username.
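To make the three candidate fields concrete, here is a minimal sketch that reports which of them tie a package to a username. The `RegistryPackage` shape is reduced to the fields listed above, and `ownershipSignals` is a hypothetical helper for illustration, not code from this PR. The repository check assumes a GitHub-style URL.

```typescript
// Reduced shape of a package entry in an npm registry search response.
// Only the three fields discussed above are modelled.
interface RegistryPackage {
  name: string;
  publisher?: { username: string };
  maintainers?: { username: string }[];
  links?: { repository?: string };
}

type OwnershipSignal = "publisher" | "maintainer" | "repository";

// Hypothetical helper: lists the fields that connect a package to a username.
function ownershipSignals(pkg: RegistryPackage, username: string): OwnershipSignal[] {
  const signals: OwnershipSignal[] = [];
  if (pkg.publisher?.username === username) signals.push("publisher");
  if (pkg.maintainers?.some((m) => m.username === username)) signals.push("maintainer");
  // Assumption: the repository is hosted on GitHub and the owner is the
  // first path segment of the URL.
  if (pkg.links?.repository?.includes(`github.com/${username}/`)) signals.push("repository");
  return signals;
}
```

A package can match on one field and not the others, which is why engines that look at different fields disagree.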

  2. Take the example first reported by dbushell in #1882 (misattributed packages). The combined results table below covers the username ~dbushell, using these sources:
    https://www.npmjs.com/~dbushell
    https://npmx.dev/~dbushell
    https://registry.npmjs.org/-/v1/search?text=author:dbushell
image

You can see that the npmx results using npmRegistry and the direct request to registry.npmjs.org are the same, because npmx uses exactly the same query, filtering by author:{username} (which refers to the package publisher). The other two results differ.
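For reference, a small sketch of how the registry query above can be built. `registrySearchUrl` is a hypothetical helper, and the `size` value is only an illustrative page size; the `author:` and `maintainer:` qualifiers are the ones discussed in this PR.

```typescript
type RegistryQualifier = "author" | "maintainer";

// Hypothetical helper: builds the registry search URL for a username.
// `size` is the page-size query parameter; the default here is illustrative.
function registrySearchUrl(
  qualifier: RegistryQualifier,
  username: string,
  size: number = 20,
): string {
  const text = encodeURIComponent(`${qualifier}:${username}`);
  return `https://registry.npmjs.org/-/v1/search?text=${text}&size=${size}`;
}

// Example: the same query as the direct registry request above,
// with the colon percent-encoded.
const dbushellAuthorUrl = registrySearchUrl("author", "dbushell", 50);
```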

  3. Clearly npmjs.com uses some internal logic to retrieve packages by username: it displays 15 packages, while npmRegistry returns 11. Investigating further, I found that the 4 missing packages are all marked as deprecated. This means npmx/npmRegistry cannot return deprecated packages, so its output can differ from the npmjs.com results. I did not find an npmRegistry query filter that includes deprecated packages.
image
  4. Algolia returns 19 packages but misses one that is present in the npmRegistry output: @dbushell/hmmarkdown2. You can open it directly on npmx, but it does not appear in the npmx + Algolia search results. It seems to be a newly published package, so Algolia probably does not have it cached yet.

  5. Algolia returns 5 additional packages for which dbushell is neither owner nor maintainer. All of them are forks of dbushell's repos, and his repo is still listed in package.links.repository. See nestable.

image
  6. Currently npmx uses the filter owner.name:{username} to get packages from Algolia by username. Algolia's owner.name combines two sources: the npmRegistry publisher's name and the repository owner. This leads to extra packages in the results. Algolia also returns deprecated packages, which npmRegistry filters out.

  7. It also shows that Algolia does not use the package.maintainers field for an owner.name request. Algolia has a separate field, owners: [], which stores maintainers, but it is not a searchable field. Compare the output for the ~fb username from the Algolia and npmRegistry engines on npmx (https://npmx.dev/~fb):

image image
  8. The ideal fix for the Algolia results would be a query like owner.name:{username} OR owners.name:{username}. I checked the Algolia index that npmx uses, and owners.name (maintainers) does not seem to be a searchable field: I queried Algolia with facets: ['*'], and only owner.name can be used in request filters. We could ask https://github.com/algolia/npm-search to add this attribute to the searchable fields. That would be the best solution.

  9. I tried different combinations of parameters, and this PR contains the search config that comes closest to the npmjs.com output. npmjs.com is not the final source of truth, but we want to match it as closely as possible so that we do not miss packages considered to be owned by the requested username.

  10. Here is a comparison of current npmx@npmRegistry, npmx@Algolia, npmjs.com, and npmx@thisPrAlgolia for some usernames:

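| # | Username | npmRegistry | Algolia | npmjs.com | This PR |
|---|----------|-------------|---------|-----------|---------|
| 1 | ~dbushell | 11 | 19 | 15 | 19 |
| 2 | ~fb | 434 | 1 | 465 | 510 |
| 3 | ~pi0 | 611 | 242 | 654 | 655 |
| 4 | ~danielroe | 382 | 106 | 648 | 427 |
| 5 | ~posva | 116 | 85 | 119 | 124 |
| 6 | ~rich_harris | 259 | 39 | 510 | 284 |
| 7 | ~yyx990803 | 318 | 61 | 328 | 342 |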

As you can see, the current Algolia implementation misses a lot of packages, and this PR brings it much closer to npmjs.com. There are still discrepancies, though.
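To quantify "closer", the sketch below computes the signed gap to the npmjs.com count for each username, using the counts from the comparison above. The names `CountRow` and `gapToNpmjs` are introduced here for illustration only. Note that equal counts do not imply the same set of packages, so this is only a rough indicator.

```typescript
interface CountRow {
  username: string;
  registry: number;
  algolia: number;
  npmjs: number;
  thisPr: number;
}

// Counts copied from the comparison above.
const countRows: CountRow[] = [
  { username: "dbushell", registry: 11, algolia: 19, npmjs: 15, thisPr: 19 },
  { username: "fb", registry: 434, algolia: 1, npmjs: 465, thisPr: 510 },
  { username: "pi0", registry: 611, algolia: 242, npmjs: 654, thisPr: 655 },
  { username: "danielroe", registry: 382, algolia: 106, npmjs: 648, thisPr: 427 },
  { username: "posva", registry: 116, algolia: 85, npmjs: 119, thisPr: 124 },
  { username: "rich_harris", registry: 259, algolia: 39, npmjs: 510, thisPr: 284 },
  { username: "yyx990803", registry: 318, algolia: 61, npmjs: 328, thisPr: 342 },
];

// Signed difference from the npmjs.com count
// (negative = fewer packages than npmjs.com, positive = more).
function gapToNpmjs(row: CountRow) {
  return {
    username: row.username,
    algolia: row.algolia - row.npmjs,
    thisPr: row.thisPr - row.npmjs,
  };
}

const countGaps = countRows.map(gapToNpmjs);

// Usernames where this PR is strictly closer to npmjs.com than current Algolia.
const improvedUsernames = countGaps
  .filter((g) => Math.abs(g.thisPr) < Math.abs(g.algolia))
  .map((g) => g.username);
```

For ~fb, for example, the gap moves from -464 (current Algolia) to +45 (this PR); for ~dbushell the count is unchanged.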

Search by username:

  • npmRegistry (author:{username}): returns packages by publisher, excluding deprecated ones
  • npmRegistry (maintainer:{username}): returns packages by maintainer, excluding deprecated ones (the maintainers often include the author already)
  • npmjs.com (~{username}): returns packages by publisher and maintainer, including deprecated ones
  • Algolia (owner.name:{username}): returns packages by publisher and git repo URL, including deprecated ones
  • Algolia, this PR (query: {username}, typoTolerance: false): searches all fields for {username} and applies additional in-app filtering to remove irrelevant packages
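The last approach in the list can be sketched roughly as follows. This is not the exact code from the PR: the `AlgoliaHit` shape (an `owner` object and an `owners` array, each with a `name`) follows the field names mentioned in the investigation, and `keepUserPackages` is a hypothetical stand-in for the in-app filtering step.

```typescript
// Reduced shape of an Algolia npm-search hit; only the fields used here.
interface AlgoliaHit {
  name: string;
  owner?: { name: string };
  owners?: { name: string }[];
}

// Request parameters for the full-text approach: search every field for
// the username and disable typo tolerance so near-matches are not returned.
function usernameSearchParams(username: string) {
  return { query: username, typoTolerance: false };
}

// Hypothetical in-app filter: keep only hits where the username is the
// owner or is listed among the owners (maintainers).
function keepUserPackages(hits: AlgoliaHit[], username: string): AlgoliaHit[] {
  return hits.filter(
    (hit) =>
      hit.owner?.name === username ||
      (hit.owners ?? []).some((o) => o.name === username),
  );
}
```

The full-text query casts a wide net (it also matches the username in descriptions or keywords), so the filtering step is what keeps the result relevant.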

Summarizing:

  • we need to define our own rules for how to search by username
  • we need to ask https://github.com/algolia/npm-search to add owners.name to the searchable attributes (see Algolia docs)
  • we could continue investigating how to match the npmjs.com output by adjusting the Algolia request, e.g. issuing multiple requests and combining and filtering their results
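If owners.name became usable in filters, the combined query from the investigation could be built as in the sketch below. This is an assumption about a future index configuration: with the index as it is today, only owner.name can be filtered on, so the OR clause would not work yet. `ownerOrMaintainerFilter` and `ownerSearchParams` are hypothetical names, and the `hitsPerPage` default is illustrative.

```typescript
// Builds the filter string: publisher/repo owner OR maintainer.
// The value is wrapped in double quotes so it is treated as one token.
function ownerOrMaintainerFilter(username: string): string {
  const value = JSON.stringify(username);
  return `owner.name:${value} OR owners.name:${value}`;
}

// Search parameters for a pure filter query (empty text query).
function ownerSearchParams(username: string, hitsPerPage: number = 100) {
  return {
    query: "",
    filters: ownerOrMaintainerFilter(username),
    hitsPerPage,
  };
}
```

With such a filter the in-app post-filtering would no longer be needed, because the index itself would return only packages tied to the username.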

📚 Description

This PR implements combined logic for searching by username, so that it matches packages that the current Algolia search parameters miss.

vercel bot commented Mar 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---------|------------|---------|---------------|
| docs.npmx.dev | Ready | Preview, Comment | Mar 8, 2026 3:42am |
| npmx.dev | Ready | Preview, Comment | Mar 8, 2026 3:42am |

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
|---------|------------|---------|---------------|
| npmx-lunaria | Ignored | | Mar 8, 2026 3:42am |


codecov bot commented Mar 7, 2026

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| app/composables/npm/useAlgoliaSearch.ts | 0.00% | 0 Missing and 1 partial ⚠️ |


alex-key (Contributor, Author) commented Mar 8, 2026

Another good example of how different Algolia search can be from npmRegistry is @graphieros. His GitHub username (graphieros) differs from his npm registry username (aleclloydprobert). See how different the search results are:

npmx (Algolia):
image
image

image image

npmx(npmRegistry):
image
image

Algolia returns vue-data-ui-doc as the only package for ~aleclloydprobert, because Algolia prioritizes repository ownership over the npmRegistry publisher name when the two conflict. This package has no repository link, so the publisher name is used:

    {
      ...
      "package": {
        "name": "vue-data-ui-doc",
        ...
        "publisher": {
          "email": "alec.lloyd.probert@gmail.com",
          "username": "aleclloydprobert"
        },
        "maintainers": [
          {
            "email": "alec.lloyd.probert@gmail.com",
            "username": "aleclloydprobert"
          }
        ],
        "links": {                                      // <-- no repository link, so no "graphieros"
          "npm": "https://www.npmjs.com/package/vue-data-ui-doc"
        }
      },
      ...
    }

Every other package of his has a repository link, for example:

    {
      ...
      "package": {
        "name": "turbo-spark",
        ...
        "publisher": {
          "email": "alec.lloyd.probert@gmail.com",
          "username": "aleclloydprobert"                            // <-- npm publisher name
        },
        "maintainers": [
          {
            "email": "alec.lloyd.probert@gmail.com",
            "username": "aleclloydprobert"
          }
        ],
        ...
        "links": {
          "homepage": "https://turbo-spark.graphieros.com/",
          "repository": "git+https://github.com/graphieros/TS.git", // <-- repo owner overrides publisher
          "bugs": "https://github.com/graphieros/TS/issues",
          "npm": "https://www.npmjs.com/package/turbo-spark"
        }
      },
      ...
    },
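The behaviour observed in these two records can be modelled with the sketch below. It is only a model of what the results suggest, not Algolia's actual indexing code: `repoOwnerFromUrl` and `algoliaLikeOwner` are hypothetical helpers, and the URL parsing assumes a GitHub repository link.

```typescript
interface PackageSummary {
  name: string;
  publisher: { username: string };
  links: { repository?: string };
}

// Extracts the owner segment from a GitHub repository URL such as
// "git+https://github.com/graphieros/TS.git". Returns undefined otherwise.
function repoOwnerFromUrl(repository: string | undefined): string | undefined {
  if (!repository) return undefined;
  const match = /github\.com[/:]([^/]+)\//.exec(repository);
  return match ? match[1] : undefined;
}

// Model of the observed rule: the repository owner wins when present,
// otherwise the npm publisher name is used.
function algoliaLikeOwner(pkg: PackageSummary): string {
  return repoOwnerFromUrl(pkg.links.repository) ?? pkg.publisher.username;
}

const vueDataUiDoc: PackageSummary = {
  name: "vue-data-ui-doc",
  publisher: { username: "aleclloydprobert" },
  links: {},
};

const turboSparkSummary: PackageSummary = {
  name: "turbo-spark",
  publisher: { username: "aleclloydprobert" },
  links: { repository: "git+https://github.com/graphieros/TS.git" },
};
```

Under this model vue-data-ui-doc resolves to aleclloydprobert and turbo-spark resolves to graphieros, which matches the search results shown above.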


Development

Successfully merging this pull request may close these issues.

misattributed packages

1 participant