now.executor.indexer.elastic.elastic_indexer module

class now.executor.indexer.elastic.elastic_indexer.FieldEmbedding(encoder, embedding_size, fields)

Bases: tuple

Create new instance of FieldEmbedding(encoder, embedding_size, fields)

property embedding_size: Alias for field number 1

property encoder: Alias for field number 0

property fields: Alias for field number 2
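
Because FieldEmbedding is a tuple subclass, its fields can be read by name or by position. The following is a minimal sketch that uses a local stand-in built with `collections.namedtuple` instead of importing the real class; the encoder name, embedding size, and field names are made-up values for illustration only.

```python
from collections import namedtuple

# Local stand-in that mirrors the documented field order
# (encoder, embedding_size, fields). It is NOT the class from
# now.executor.indexer.elastic.elastic_indexer.
FieldEmbedding = namedtuple(
    "FieldEmbedding", ["encoder", "embedding_size", "fields"]
)

# Hypothetical values: one encoder that embeds two document fields.
field_embedding = FieldEmbedding(
    encoder="clip", embedding_size=512, fields=["title", "image"]
)

# Access by name and by position refer to the same data.
by_name = field_embedding.embedding_size
by_position = field_embedding[1]

# Tuple unpacking follows the documented field numbers 0, 1, 2.
encoder, embedding_size, fields = field_embedding
```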

class now.executor.indexer.elastic.elastic_indexer.NOWElasticIndexer(document_mappings, metric='cosine', limit=10, max_values_per_tag=10, es_mapping=None, es_config=None, *args, **kwargs)

Bases: NOWAuthExecutor

NOWElasticIndexer indexes Documents into an Elasticsearch instance. To do this, it uses helper functions from es_converter, converting documents to and from the accepted Elasticsearch format. It also uses the score calculation to combine the scores of different fields/encoders, allowing multi-modal documents to be indexed and searched with multi-modal queries.

Parameters

document_mappings (List[Tuple[str, int, List[str]]]) – list of FieldEmbedding tuples that define which encoder encodes which fields, and the embedding size of the encoder.
metric (str) – Distance metric type. Can be ‘euclidean’, ‘inner_product’, or ‘cosine’
limit (int) – Number of results to get for each query document in search
max_values_per_tag (int) – Maximum number of values per tag
es_mapping (Optional[Dict]) – Mapping for new index. If none is specified, this will be generated from document_mappings and metric.
hosts – host configuration of the Elasticsearch node or cluster
es_config (Optional[Dict[str, Any]]) – Elasticsearch cluster configuration object
index_name – Name of the Elasticsearch index used for storage

generate_es_mapping()

Creates Elasticsearch mapping for the defined document fields.

Return type: Dict
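
As an illustration of how a mapping could be derived from document_mappings and metric, the sketch below builds one Elasticsearch `dense_vector` property per (field, encoder) pair. It is not the executor's implementation: the function name `sketch_es_mapping`, the `"<field>-<encoder>"` property naming, and the translation of the documented metric names to Elasticsearch similarity names are all assumptions.

```python
from typing import Dict, List, Tuple

# Assumed translation from the documented metric names to the similarity
# names accepted by Elasticsearch dense_vector fields.
ES_SIMILARITY = {
    "cosine": "cosine",
    "euclidean": "l2_norm",
    "inner_product": "dot_product",
}


def sketch_es_mapping(
    document_mappings: List[Tuple[str, int, List[str]]],
    metric: str = "cosine",
) -> Dict:
    """Build a mapping dict with one dense_vector property per field/encoder."""
    properties = {}
    for encoder, embedding_size, fields in document_mappings:
        for field in fields:
            # Hypothetical naming scheme for the embedding property.
            properties[f"{field}-{encoder}"] = {
                "type": "dense_vector",
                "dims": embedding_size,
                "index": True,
                "similarity": ES_SIMILARITY[metric],
            }
    return {"properties": properties}


# Made-up input: one encoder with 512-dimensional embeddings for two fields.
example_mapping = sketch_es_mapping(
    [("clip", 512, ["title", "image"])], metric="euclidean"
)
```

A dict of this shape could be passed as the `mappings` argument when creating an index with the Elasticsearch Python client, or supplied through the es_mapping parameter if a custom mapping is preferred over the generated one.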

index(**kwargs)

search(**kwargs)

update(**kwargs)

list(**kwargs)

count(**kwargs)

delete(**kwargs)

filters(**kwargs)

curate(**kwargs)

update_curated_ids(search_filter)

update_tags(): The indexer keeps track of which tags are indexed and what their possible values are; this information is stored in self.filters_val_dict. This method queries the Elasticsearch index for its current es_mapping to find the tags present on all indexed documents. It then queries Elasticsearch for an aggregation of all values of each tag field and updates self.filters_val_dict, with the tags as keys and their values as the dictionary values.
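
The aggregation step can be sketched with an Elasticsearch terms aggregation. The helper names below, the `tags.<name>` field path, and the use of max_values_per_tag as the bucket limit are assumptions made for illustration; the response is hand-written in the shape Elasticsearch returns for terms aggregations, so no cluster is needed to follow the example.

```python
from typing import Any, Dict, List


def build_tag_aggregation(
    tags: List[str], max_values_per_tag: int = 10
) -> Dict[str, Any]:
    """Request body asking for the most frequent values of each tag."""
    return {
        "size": 0,  # only the aggregation buckets are needed, no hits
        "aggs": {
            tag: {
                # Hypothetical field path for where tag values are stored.
                "terms": {"field": f"tags.{tag}", "size": max_values_per_tag}
            }
            for tag in tags
        },
    }


def parse_tag_aggregation(response: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Turn the aggregation buckets into a {tag: [values]} dictionary."""
    return {
        tag: [bucket["key"] for bucket in agg["buckets"]]
        for tag, agg in response.get("aggregations", {}).items()
    }


request_body = build_tag_aggregation(["color"], max_values_per_tag=5)

# Hand-written response in the shape of a terms aggregation result.
sample_response = {
    "aggregations": {
        "color": {
            "buckets": [
                {"key": "red", "doc_count": 3},
                {"key": "blue", "doc_count": 1},
            ]
        }
    }
}
filters_val_dict = parse_tag_aggregation(sample_response)
```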

now.executor.indexer.elastic.elastic_indexer.aggregate_embeddings(docs_map)

Aggregate embeddings from the cc (chunk-of-chunk) level to the c (chunk) level.

Parameters: docs_map (Dict[str, DocumentArray]) – a dictionary of `DocumentArray`s, where each key is the embedding space (the encoder name).
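
The sketch below illustrates one way such an aggregation could work, reading "cc level" as chunks of chunks and "c level" as chunks. Both that reading and the choice of the element-wise mean are assumptions, since the description does not name the aggregation function. Plain dictionaries and lists stand in for Document and DocumentArray objects so the example runs without extra dependencies.

```python
from typing import Any, Dict, List


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sketch_aggregate_embeddings(docs_map: Dict[str, List[Dict[str, Any]]]) -> None:
    """Set each chunk's embedding to the mean of its sub-chunk embeddings."""
    for docs in docs_map.values():  # one document list per encoder name
        for doc in docs:
            for chunk in doc.get("chunks", []):
                sub_embeddings = [
                    sub_chunk["embedding"]
                    for sub_chunk in chunk.get("chunks", [])
                ]
                if sub_embeddings:
                    # Assumed aggregation: mean over the sub-chunks.
                    chunk["embedding"] = mean_vector(sub_embeddings)


# Made-up data: one encoder, one document, one chunk with two sub-chunks.
example_docs_map = {
    "clip": [
        {
            "chunks": [
                {
                    "embedding": None,
                    "chunks": [
                        {"embedding": [1.0, 3.0]},
                        {"embedding": [3.0, 5.0]},
                    ],
                }
            ]
        }
    ]
}
sketch_aggregate_embeddings(example_docs_map)
aggregated = example_docs_map["clip"][0]["chunks"][0]["embedding"]
```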