Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding nullable column to postgres images table with type Vector to s… #357

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from

Conversation

ahmednasserswe
Copy link
Contributor

Description

Adding nullable column to postgres images table with type Vector to store sscd embedding vectors.
This included changes to both Postgress dockerfile and Alegre dockerfile.

Reference: TICKET-ID CV2-3917

How has this been tested?

tested locally

Have you considered secure coding practices when writing this code?

No potential security threads

@skyemeedan
Copy link
Contributor

This looks really promising, probably lots of cool things we can do with direct access to the vector. But it looks like we need to first get Alegre migrated to python 3.8? Build is failing with

ERROR: Ignored the following versions that require a different python version: 0.2.0 Requires-Python >=3.8; 0.2.1 Requires-Python >=3.8; 0.2.2 Requires-Python >=3.8; 0.2.3 Requires-Python >=3.8

ERROR: Could not find a version that satisfies the requirement pgvector==0.2.3 (from versions: 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6, 0.1.7, 0.1.8)

ERROR: No matching distribution found for pgvector==0.2.3

@DGaffney
Copy link
Contributor

DGaffney commented Nov 9, 2023

Upgrading to 3.8 is a good idea - I'm usually against over-eager version bumps, but this code is now very long in the tooth and 3.8 is very old.

@skyemeedan I wonder if we should hold on the version bump until we release all the dependencies off alegre? That would very likely make a bump much easier...

@skyemeedan
Copy link
Contributor

@ahmednasserswe are there earlier versions of pgvector that provide functionality but don't require python 3.8?

…ions, making sscd vector size 512, and removing version number from pgvector in requirements.tx
@@ -43,6 +44,7 @@ def from_url(url, doc_id, context={}, created_at=None):
raw = remote_response.read()
im = Image.open(io.BytesIO(raw)).convert('RGB')
phash = compute_phash_int(im)
sscd = None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable 'sscd'

@@ -19,6 +19,7 @@
from app.main.lib.language_analyzers import init_indices
from app.main.lib.image_hash import compute_phash_int
from PIL import Image
from sqlalchemy import text
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports from package sqlalchemy are not grouped


from app.main.lib.presto import Presto
import json
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standard import import json should be placed before from flask import current_app as app


from app.main.lib.presto import Presto
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports from package app are not grouped

# models = ['phash']
try:
models = [m.lower() for m in models]
except:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No exception type(s) specified

if not context:
context = {}
if not threshold:
threshold = 0.9
if url:
image = ImageModel.from_url(url, None)
result = []
image = ImageModel.from_url(url, None,models = models)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No space allowed around keyword argument assignment

if not context:
context = {}
if not threshold:
threshold = 0.9
if url:
image = ImageModel.from_url(url, None)
result = []
image = ImageModel.from_url(url, None,models = models)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly one space required after comma

@@ -31,23 +34,51 @@ class ImageModel(db.Model):
)

@staticmethod
def from_url(url, doc_id, context={}, created_at=None):
def from_url(url, doc_id, context={}, created_at=None, models=None):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too many local variables (18/15)

Copy link

codeclimate bot commented Nov 27, 2023

Code Climate has analyzed commit 8b59feb and detected 9 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 1
Style 5
Bug Risk 3

The test coverage on the diff in this pull request is 41.3% (50% is the threshold).

This pull request will bring the total coverage in the repository to 86.4%.

View more on Code Climate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants