Getting started

For the following, we assume Milvus is installed. We provide code examples in Python and Node.js. The code can be run by copying and pasting it.

Getting some data

To run insert and search in Milvus, we need two matrices:

xb for the database: the vectors that are inserted into a Milvus collection and then searched. Its size is nb-by-d.
xq for the query vectors, for which we need to find the nearest neighbors. Its size is nq-by-d. If we have a single query vector, nq=1.

In the following examples we work with vectors drawn from a uniform distribution in d=128 dimensions.

In Python

import numpy as np
d = 128                          # dimension
nb = 100000                      # database size
nq = 1000                        # nb of queries
np.random.seed(1234)             # make reproducible
xb = np.random.random((nb, d)).astype('float32').tolist()
xq = np.random.random((nq, d)).astype('float32').tolist()

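In Node.js

const d = 128;       // dimension
const nb = 100000;   // database size
const nq = 1000;     // number of queries
// One entity per database vector; "vector" is the name of the vector field in the collection schema
const entities = Array.from({ length: nb }, () => ({
  vector: Array.from({ length: d }, () => Math.random()),
}));
// nq query vectors, each of dimension d
const xq = Array.from({ length: nq }, () =>
  Array.from({ length: d }, () => Math.random())
);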
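To make concrete what "finding the nearest neighbors" of xq in xb means, here is a minimal brute-force sketch in NumPy. It is only a reference for the idea, not how Milvus searches internally. The helper name `brute_force_search`, the squared L2 metric, and the reduced sizes are assumptions chosen for illustration.

```python
import numpy as np

def brute_force_search(xb, xq, k):
    """Return (squared L2 distances, indices) of the k nearest rows of xb for each row of xq."""
    xb = np.asarray(xb, dtype="float32")
    xq = np.asarray(xq, dtype="float32")
    # ||q - b||^2 = ||q||^2 - 2 q.b + ||b||^2, evaluated for all query/database pairs at once
    d2 = (xq ** 2).sum(axis=1)[:, None] - 2.0 * (xq @ xb.T) + (xb ** 2).sum(axis=1)[None, :]
    idx = np.argsort(d2, axis=1)[:, :k]          # k closest database rows per query
    dist = np.take_along_axis(d2, idx, axis=1)   # their distances, in the same order
    return dist, idx

# Smaller sizes than in the text so the sketch runs instantly
d, nb, nq = 128, 1000, 5
rng = np.random.default_rng(1234)
xb_small = rng.random((nb, d), dtype=np.float32)
# Queries are slightly perturbed copies of the first nq database vectors
xq_small = xb_small[:nq] + np.float32(0.001)

dist, idx = brute_force_search(xb_small, xq_small, k=3)
print(idx[:, 0])  # nearest database row for each query
```

Because each query is a near copy of a database vector, the closest match for query i should be database row i. The cost grows with nb times nq, which is why a dedicated vector database is used for large collections.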

Connecting to Milvus

To use Milvus, you first need to connect to the Milvus server.

In Python

from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
connections.connect(host='localhost', port='19530')

In Node.js

import { MilvusClient } from "@zilliz/milvus2-sdk-node";
const milvusClient = new MilvusClient("localhost:19530");

Creating a collection

Before inserting data into Milvus, you need to create a collection. The following Milvus terms are used in this section:

collection: A collection in Milvus is equivalent to a table in a relational database management system (RDBMS). Collections are used to store and manage entities.
entity: An entity consists of a group of fields that represent a real-world object. Each entity in Milvus is represented by a unique row ID.

You can customize row IDs. If you do not configure them manually, Milvus automatically assigns row IDs to entities. If you choose to configure your own row IDs, note that Milvus does not currently support row ID de-duplication, so the same collection can contain duplicate row IDs.

field: Fields are the units that make up entities. Fields can be structured data (e.g., numbers, strings) or vectors.

In Python

collection_name = "hello_milvus"
default_fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=d)
]
default_schema = CollectionSchema(fields=default_fields, description="test collection")
print("\nCreate collection...")
collection = Collection(name=collection_name, schema=default_schema)
print("\nInsert data")
# insert() takes one list per field; "id" is auto-generated, so only the vectors are passed
mr = collection.insert([xb])
# flush data (Collection.flush() is available in recent pymilvus 2.x releases)
collection.flush()
# show the number of entities inserted into Milvus
print(collection.num_entities)
# view the IDs that Milvus auto-generated
print(mr.primary_keys)
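The example above uses auto_id=True, so Milvus generates the IDs. If you supply your own row IDs instead, the note on de-duplication applies: duplicates are not rejected. One option is a client-side check before inserting. The sketch below is plain Python and makes no Milvus calls; the helper name `find_duplicate_ids` and the sample IDs are hypothetical.

```python
from collections import Counter

def find_duplicate_ids(ids):
    """Return the sorted list of IDs that occur more than once in `ids`."""
    counts = Counter(ids)
    return sorted(i for i, n in counts.items() if n > 1)

# Hypothetical user-supplied row IDs for one batch
custom_ids = [1, 2, 3, 2, 5, 1]

dupes = find_duplicate_ids(custom_ids)
if dupes:
    print("duplicate IDs, fix before inserting:", dupes)
```

This only catches duplicates within the batch being checked. It does not compare against IDs already stored in the collection.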

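In Node.js

// DataType must also be imported from the SDK; the exact import path depends on the SDK version.
const collection_name = "hello_milvus";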
const params = {
  collection_name: collection_name,
  fields: [
    {
      name: "vector",
      description: "vector field",
      data_type: DataType.FloatVector,
      type_params: {
        dim: d,
      },
    },
    {
      name: "id",
      data_type: DataType.Int64,
      autoID: true,
      is_primary_key: true,
      description: "",
    },
  ],
};

await milvusClient.collectionManager.createCollection(params);

await milvusClient.dataManager.insert({
  collection_name: collection_name,
  fields_data: entities,
});

await milvusClient.dataManager.flush({ collection_names: [collection_name] });
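The two examples pass data in different layouts: the Python call `collection.insert([xb])` passes one list per field (column-based), while the Node.js `fields_data` is a list of objects, one per entity (row-based). The following sketch shows the conversion between the two in plain Python. The helper name `columns_to_rows` and the tiny sample data are assumptions for illustration only.

```python
def columns_to_rows(field_names, columns):
    """Turn column-based data (one list per field) into row-based entities (one dict per entity)."""
    if len(field_names) != len(columns):
        raise ValueError("need exactly one column per field name")
    if len({len(c) for c in columns}) > 1:
        raise ValueError("all columns must have the same number of entities")
    # zip(*columns) walks the columns in parallel, yielding one tuple of field values per entity
    return [dict(zip(field_names, values)) for values in zip(*columns)]

# One column for the "vector" field: 3 entities with d=2 to keep the output readable
columns = [[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]]
rows = columns_to_rows(["vector"], columns)
print(rows)
```

With a second field, for example user-supplied IDs, `columns` would hold two lists of equal length and each row would have two keys.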