ext-firestore-semantic-search never finished deploying the index? #360

Open
fabiopolimeni opened this issue Feb 7, 2024 · 26 comments

@fabiopolimeni

fabiopolimeni commented Feb 7, 2024

Describe your configuration

  • Extension name: firestore-semantic-search
  • Extension version: 0.1.7

I have installed this extension, but I can't seem to get anything out of it, and I am not sure where to look.
Whenever I insert a new document into the DB, which is supposed to be indexed via the backfill trigger, I see the following error in the log:

{
  "textPayload": "Error: Queue does not exist.\n    at FunctionsApiClient.toFirebaseError (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:305:16)\n    at FunctionsApiClient.enqueue (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:146:32)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async streamUpdateDatapointHandler (/workspace/lib/functions/stream_update_datapoint.js:34:9)",
  "insertId": "65c3a6b7000e922b5d1da738",
  "resource": {
    "type": "cloud_function",
    "labels": {
      "region": "europe-west1",
      "function_name": "ext-firestore-semantic-search-streamUpdateDatapoint",
      "project_id": "..."
    }
  },
  "timestamp": "2024-02-07T15:50:15.954923Z",
  "severity": "ERROR",
  "labels": {
    "runtime_version": "nodejs18_20240128_18_19_0_RC00",
    "instance_id": "0087599d42edfa70f394c3040b4c3f8d626b6c6535e565b62cb4ef75f8b7bfba726bfee1d143c4566ab596b41a7b094ed7c665295191f0275e570f63320eda3307d3",
    "execution_id": "9bn3cx0fh2w9"
  },
  "logName": "projects/<project>/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
  "trace": "projects/<project>/traces/c3281ec1ba9e02526f23b88e4fcf62fa",
  "receiveTimestamp": "2024-02-07T15:50:16.256618357Z",
  "errorGroups": [
    {
      "id": "CPTJ2fj3s5D8zwE"
    }
  ]
}

Also, a few log entries earlier, I can spot an entry saying:

{
  "textPayload": "Index not deployed yet, skipping...",
  "insertId": "65c3a6b6000993c36cddc8c1",
  "resource": {
    "type": "cloud_function",
    "labels": {
      "region": "europe-west1",
      "project_id": "...",
      "function_name": "ext-firestore-semantic-search-streamUpdateDatapoint"
    }
  },
  "timestamp": "2024-02-07T15:50:14.627651Z",
  "severity": "INFO",
  "labels": {
    "instance_id": "0087599d42edfa70f394c3040b4c3f8d626b6c6535e565b62cb4ef75f8b7bfba726bfee1d143c4566ab596b41a7b094ed7c665295191f0275e570f63320eda3307d3",
    "runtime_version": "nodejs18_20240128_18_19_0_RC00",
    "execution_id": "9bn3cx0fh2w9"
  },
  "logName": "projects/<project>/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
  "trace": "projects/<project>/traces/c3281ec1ba9e02526f23b88e4fcf62fa",
  "receiveTimestamp": "2024-02-07T15:50:14.922855876Z"
}

But it has been more than 5 hours now, so I am not sure whether the index simply has not been built yet or the build task somehow failed. The collection it watches is very small so far, only 10 documents.

Also, when I try the queryIndex call, I get this error:

{
  "textPayload": "Function execution took 4271 ms, finished with status code: 404",
  "insertId": "tow4ugfdsx96s",
  "resource": {
    "type": "cloud_function",
    "labels": {
      "function_name": "ext-firestore-semantic-search-queryIndex",
      "region": "europe-west1",
      "project_id": "..."
    }
  },
  "timestamp": "2024-02-07T15:45:25.594660766Z",
  "severity": "DEBUG",
  "labels": {
    "runtime_version": "nodejs18_20240128_18_19_0_RC00",
    "execution_id": "in40qx2h2rdz"
  },
  "logName": "projects/<project>/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
  "trace": "projects/<project>/traces/2f0a1d082dd270bbe5fd0421a56db792",
  "receiveTimestamp": "2024-02-07T15:45:25.682539320Z"
}

The function is there so I am not sure what it is failing on.
The metadata status field on my DB says INDEX_BUILDING, but I am not sure this is really doing anything at this point.

Can anyone help me figure out what might be going wrong?

@pr-Mais
Collaborator

pr-Mais commented Feb 12, 2024

@fabiopolimeni Deploying indexes usually takes a few hours. To check the status of an index (and whether it failed), go to https://console.cloud.google.com/vertex-ai/matching-engine/indexes (make sure to pick your location) and see if the index is there.

@fabiopolimeni
Author

OK, thanks. Unfortunately, I have uninstalled the extension by now, but I can see an index available in the region I was using. Was I supposed to deploy that to an endpoint manually?

@pr-Mais
Collaborator

pr-Mais commented Feb 12, 2024

If you click on the index, you can see the endpoints it's deployed to. The extension is supposed to do the deployment for you, so if that was successful you should see an "Index Endpoint".

@fabiopolimeni
Author

Then it was not successful. It was not deployed.

@pr-Mais
Collaborator

pr-Mais commented Feb 12, 2024

With no error logs? If you would like to try again, I will try to help you work out why it is happening.

@frnndwrms

same here #369

I am still waiting a few more hours to see if it is just a matter of time, but it seems it is not.

@cabljac cabljac added the type: bug Something isn't working label Feb 22, 2024
@hygu98

hygu98 commented Mar 8, 2024

This doesn't seem to be the same issue as #369. I changed my embeddings and re-installed but it never created the index. Trying to trigger this from the CLI with:

gcloud functions --project xxxxx call ext-firestore-semantic-search-queryIndex --data '{"data": {"query":["some query"]}}'

I get: error: '{"error":{"message":"Endpoint or index endpoint is not found.","status":"NOT_FOUND"}}'
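
For anyone scripting this check, the JSON body quoted above has an `error` object with `message` and `status` fields. The following is a minimal Python sketch, assuming the inner JSON body has already been captured as a string, that tells the "no index endpoint" case apart from other failures. The helper name `classify_query_response` is hypothetical and not part of the extension.

```python
import json


def classify_query_response(body: str) -> str:
    """Classify a raw JSON response body from the queryIndex function.

    Returns "ok", "index_not_found", or "error:<STATUS>".
    """
    payload = json.loads(body)
    error = payload.get("error")
    if error is None:
        return "ok"
    status = error.get("status", "UNKNOWN")
    message = error.get("message", "")
    # This thread reports this NOT_FOUND message when no index endpoint
    # could be found for the query.
    if status == "NOT_FOUND" and "index endpoint" in message.lower():
        return "index_not_found"
    return f"error:{status}"
```

A result of `index_not_found` points at the index or endpoint deployment rather than at the query itself.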

@hygu98

hygu98 commented Mar 9, 2024

I have now waited for two days and the index never completed. I also tried uninstalling and re-installing, to no avail. Interestingly, in _ext-firestore-semantic-search -> tasks -> totalTasks I see that totalTasks is 73433, and the processed count gets to 72500 and stops. Does anyone know why that might happen?

@pr-Mais
Collaborator

pr-Mais commented Mar 9, 2024

Thanks for letting us know. This is likely an issue with triggering a deployment; we're investigating it.

@jauntybrain
Collaborator

jauntybrain commented Mar 11, 2024

Hi @hygu98! Are there any error logs around the time when the processed tasks number approaches 72500 and stops?

@antman786

I am having the same issue @pr-Mais.

@hygu98

hygu98 commented Mar 11, 2024

@jauntybrain I think you were referring to me so I will reply. Around the time that it seems to stop processing, I don't see errors in any of the 10 function logs. Interestingly, I do see this:

"Embeddings uploaded to the bucket <app_name>-ext-firestore-semantic-search in datapoints/ext-firestore-semantic-search-task1450.json 🚀"

And if I go to that bucket, it is completely empty, so if data is being uploaded to that bucket, it doesn't seem to be successful. I looked at onIndexCreated and onIndexDeployed and I don't see any errors there that I can tell.

If I go into Vertex AI, I don't see any Vector Search indexes or endpoints either, so it seems like the logs show the index was successfully created and deployed, but it wasn't?
downloaded-logs-20240311-083308.json
downloaded-logs-20240311-083330.json
downloaded-logs-20240311-083437.json
downloaded-logs-20240311-083544.json
downloaded-logs-20240311-083651.json
downloaded-logs-20240311-083702.json
downloaded-logs-20240311-083708.json
downloaded-logs-20240311-083743.json
downloaded-logs-20240311-083749.json
downloaded-logs-20240311-083914.json
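
One observation on the numbers above: the last uploaded file is named `...task1450.json`, and 1450 × 50 = 72500, which is exactly where the counter stops. The sketch below works through that arithmetic. It assumes, without confirmation from the maintainers, that each backfill task handles 50 documents and that the counters are in documents; `backfill_progress` and `DOCS_PER_TASK` are hypothetical names.

```python
import math

DOCS_PER_TASK = 50  # assumption: documents handled per backfill task


def backfill_progress(total_docs: int, processed_docs: int,
                      docs_per_task: int = DOCS_PER_TASK) -> dict:
    """Estimate how many backfill tasks ran and how many never did."""
    tasks_needed = math.ceil(total_docs / docs_per_task)
    tasks_done = processed_docs // docs_per_task
    return {
        "tasks_needed": tasks_needed,
        "tasks_done": tasks_done,
        "tasks_missing": tasks_needed - tasks_done,
        "docs_missing": total_docs - processed_docs,
    }


# Numbers reported in this thread: 73433 total, stalled at 72500.
print(backfill_progress(73433, 72500))
```

Under those assumptions, 1450 of 1469 tasks completed, leaving 19 tasks (933 documents) unprocessed, which is consistent with a run that stalls before the index creation step.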

@pr-Mais
Collaborator

pr-Mais commented Mar 14, 2024

@hygu98 On the Vertex AI console, there is a dropdown menu with locations, and only the indexes in the selected location are shown, so perhaps you're not viewing the correct location?

@pr-Mais
Collaborator

pr-Mais commented Mar 14, 2024

I see that in the last JSON file you sent there's this error:

Unhandled error Error: Index undefined is not deployed or does not exist.\n    at updateIndexConfigHandler (/workspace/lib/functions/update_index_config.js:31:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /workspace/node_modules/firebase-functions/lib/common/providers/tasks.js:53:17

So that's why the tasks are stacking up. We will investigate it further.
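
To find entries like this one without reading every exported file by hand, a short script can filter the downloaded logs by severity. This is a minimal sketch, assuming each `downloaded-logs-*.json` file is a JSON array of entries shaped like the ones quoted at the top of this thread (`severity`, `textPayload`, `timestamp`, `resource.labels.function_name`). The names `find_errors` and `scan_files` are hypothetical, and the error quoted above may be logged under a different severity in practice.

```python
import json
from pathlib import Path


def find_errors(entries: list[dict]) -> list[tuple[str, str, str]]:
    """Return (timestamp, function_name, first line of textPayload) for ERROR entries."""
    hits = []
    for entry in entries:
        if entry.get("severity") != "ERROR":
            continue
        payload = entry.get("textPayload", "")
        # Stack traces span many lines; the first line carries the message.
        first_line = payload.splitlines()[0] if payload else ""
        function_name = (
            entry.get("resource", {}).get("labels", {}).get("function_name", "")
        )
        hits.append((entry.get("timestamp", ""), function_name, first_line))
    return hits


def scan_files(directory: str) -> None:
    """Print every ERROR entry found in the exported log files in a directory."""
    for path in sorted(Path(directory).glob("downloaded-logs-*.json")):
        entries = json.loads(path.read_text(encoding="utf-8"))
        for timestamp, function_name, message in find_errors(entries):
            print(f"{path.name}\t{timestamp}\t{function_name}\t{message}")
```

Calling `scan_files(".")` in the folder holding the exports prints one tab-separated line per error.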

@jauntybrain
Collaborator

Hi @antman786, could you please tell us more about the issue you're facing? Could you please also share your extension configuration and error logs, if there are any?

@hygu98

hygu98 commented Mar 21, 2024

@pr-Mais - Sorry for the delay, been wrapped up. I'm guessing you are referring to the Region select? I verified that I was in the right region and no, there is no index there. In addition, when I try to run the function via command line it still says there is no index. To be sure I checked all 31 regions and there is no index in any of them.

Any luck investigating the error?

@pr-Mais
Collaborator

pr-Mais commented May 27, 2024

Sorry for the delay, I will be looking into the issue this week.

@ElementTech

ElementTech commented May 27, 2024

It took me ages but I managed to solve and use the query endpoint. I had all the problems listed above and on similar issues. The following are things that I did:

  1. The default GCS bucket for my Firebase app was "Multi Region". The functions expect a single-region bucket (specifically us-central1, I think, because of Vertex). I had to delete my default buckets and let the functions create them from scratch with the correct naming.
  2. I had to allow unauthenticated access to both the Cloud Run indexCreate and indexUpdate services. I also added allUsers<>Cloud Invoker Role<>Policy to their function counterparts.
  3. I had around 42700 documents in total, and the backfill got stuck at 42500. I manually edited the max to 42500 and the status to 'DONE' in order to trigger the next functions.
  4. After the Vertex index was created, the Index Endpoint won't be created until the unauthenticated trigger I mentioned above is allowed. If you're stuck and want to re-trigger indexCreate (Index) or indexUpdate (Endpoint), you have to go to Cloud Run and "test" them by creating a mock Vertex index-created event.
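
Point 1 can be checked before reinstalling. The sketch below assumes you have the bucket's metadata as parsed JSON in the shape of the Cloud Storage JSON API bucket resource, which exposes `name`, `location`, and `locationType`. The helper `check_bucket_location` is hypothetical, and whether the extension really requires a single-region bucket is the finding reported in this comment, not something verified here.

```python
def check_bucket_location(bucket: dict) -> str:
    """Describe whether a bucket resource is single-region.

    `bucket` is the parsed JSON bucket resource. Cloud Storage reports
    locationType as "region", "dual-region", or "multi-region".
    """
    name = bucket.get("name", "<unnamed>")
    location = bucket.get("location", "?")
    location_type = bucket.get("locationType", "?")
    if location_type == "region":
        return f"{name}: single region ({location})"
    return f"{name}: {location_type} ({location}), not a single-region bucket"
```

A default Firebase bucket created as multi-region would fall into the second branch.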

@pr-Mais
Collaborator

pr-Mais commented May 27, 2024

Thank you @ElementTech for providing these details, super helpful.

@exaby73

exaby73 commented Jun 6, 2024

Hello @hygu98. Can you confirm whether you completed the post-install steps shown after installing the extension? You can find the steps here

@TTrapper

TTrapper commented Jun 15, 2024

Hi folks, I'm having similar issues. I spent a couple of weeks going back and forth with Google Cloud support, and they finally suggested I submit a bug report. I decided to post here rather than open a new bug report because the issue looks similar: the backfill pauses or does not work, and no vector index is created in Vertex AI. The behaviour I'm seeing is:

  1. Installed the extension with backfill enabled
  2. The first 50 entries were embedded, then it stalled indefinitely
  3. Uninstalled and reinstalled the extension
  4. The Firestore UI has been showing this for the last few days:
     backfillJobsFailed: 0
     backfillJobsProcessed: 0
     backfillJobsSkipped: 50
     backfillJobsTotal: 3233
     backfillStatus: "RUNNING"

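Read together, the counters above say that only 50 of 3233 jobs are accounted for, all of them skipped, while the status still reports RUNNING. The following is a minimal sketch that summarizes such a status document, assuming the field names shown above; the function `summarize_backfill` is hypothetical and not part of the extension.

```python
def summarize_backfill(status: dict) -> str:
    """Summarize the backfill counters from the extension's status document."""
    total = status["backfillJobsTotal"]
    accounted = (
        status["backfillJobsProcessed"]
        + status["backfillJobsSkipped"]
        + status["backfillJobsFailed"]
    )
    remaining = total - accounted
    percent = 100 * accounted / total if total else 0.0
    return (
        f"{status['backfillStatus']}: {accounted}/{total} jobs accounted for "
        f"({percent:.1f}%), {remaining} remaining"
    )
```

If the summary does not change between two readings taken some time apart while the status stays RUNNING, the backfill has most likely stalled rather than slowed down.
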
@pr-Mais
Collaborator

pr-Mais commented Jun 15, 2024

@TTrapper Can you confirm you have followed the steps mentioned in @exaby73's comment, by turning on audit logs for the Vertex AI API?

@TTrapper

> @TTrapper Can you confirm you have followed the steps mentioned in @exaby73's comment, by turning on audit logs for the Vertex AI API?

Ah I hadn't actually. Just did though, do I need to reinstall or something?

@pr-Mais
Collaborator

pr-Mais commented Jun 15, 2024

@TTrapper Yeah. In your case, though, the backfill was skipped after the first try, and the extension attempts to create the index after the backfill is completed. But try that first and let me know.

@pr-Mais
Collaborator

pr-Mais commented Jun 15, 2024

Could you also check for any error logs in the function named backfillTask? They can be found at this link: https://console.cloud.google.com/functions/details/us-central1/ext-firestore-vector-search-backfillTask?env=gen1&cloudshell=false&tab=logs&project=

@TTrapper

> Could you also check for any error logs in the function named backfillTask? They can be found at this link: https://console.cloud.google.com/functions/details/us-central1/ext-firestore-vector-search-backfillTask?env=gen1&cloudshell=false&tab=logs&project=

Thanks. After reinstalling I'm back to the behaviour of the first 50 documents being embedded and then it gets stuck. The first 50 look good though.

  • Nothing unusual in the logs for the backfillTask function. The last 4 entries are:
    • Handling 50 documents
    • Handling 50 documents
    • Task ext-firestore-vector-search-task-1 completed with 50 success(es)
    • Current state: 50 processed, 0 skipped, 0 failed out of 3233 total tasks
