now.executor.indexer.elastic.elastic_indexer module
- class now.executor.indexer.elastic.elastic_indexer.FieldEmbedding(encoder, embedding_size, fields)
Bases: tuple
Create new instance of FieldEmbedding(encoder, embedding_size, fields)
- property embedding_size
Alias for field number 1
- property encoder
Alias for field number 0
- property fields
Alias for field number 2
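The "Alias for field number N" properties indicate that FieldEmbedding behaves like a named tuple. The sketch below rebuilds a local stand-in with `collections.namedtuple` instead of importing the real class, so it runs without the package installed. The encoder name, embedding size, and field names are hypothetical example values.

```python
from collections import namedtuple

# Local stand-in mirroring the documented signature. In a real project the
# class would be imported from now.executor.indexer.elastic.elastic_indexer.
FieldEmbedding = namedtuple(
    "FieldEmbedding", ["encoder", "embedding_size", "fields"]
)

# Hypothetical values: one encoder that embeds two document fields.
mapping = FieldEmbedding(
    encoder="clip", embedding_size=512, fields=["title", "image"]
)

# Each property is an alias for a tuple position.
same_encoder = mapping.encoder == mapping[0]
same_size = mapping.embedding_size == mapping[1]
same_fields = mapping.fields == mapping[2]

# Because it is a tuple, it can also be unpacked positionally.
encoder, embedding_size, fields = mapping
```

Since a named tuple is still a tuple, a plain `("clip", 512, ["title", "image"])` matches the `Tuple[str, int, List[str]]` shape that `document_mappings` is documented to accept.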
- class now.executor.indexer.elastic.elastic_indexer.NOWElasticIndexer(document_mappings, dim=None, metric='cosine', limit=10, max_values_per_tag=10, es_mapping=None, hosts='http://localhost:9200', es_config=None, index_name='now-index', *args, **kwargs)
Bases: NOWAuthExecutor
NOWElasticIndexer indexes Documents into an Elasticsearch instance. It relies on helper functions from es_converter to convert documents to and from the format Elasticsearch accepts. It also combines the semantic scores of the different fields/encoders, which allows multi-modal documents to be indexed and searched with multi-modal queries.
- Parameters
  - document_mappings (List[Tuple[str, int, List[str]]]) – list of FieldEmbedding tuples that define which encoder encodes which fields, and the embedding size of the encoder.
  - dim (Optional[int]) – Dimensionality of vectors to index.
  - metric (str) – Distance metric type. Can be ‘euclidean’, ‘inner_product’, or ‘cosine’
  - limit (int) – Number of results to get for each query document in search
  - max_values_per_tag (int) – Maximum number of values per tag
  - es_mapping (Optional[Dict]) – Mapping for new index. If none is specified, this will be generated from document_mappings and metric.
  - hosts (Union[str, List[Union[str, Mapping[str, Union[str, int]]]], None]) – host configuration of the Elasticsearch node or cluster
  - es_config (Optional[Dict[str, Any]]) – Elasticsearch cluster configuration object
  - index_name (str) – ElasticSearch Index name used for the storage
- generate_es_mapping()
Creates Elasticsearch mapping for the defined document fields.
- Return type
Dict
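To make the idea of deriving a mapping from document_mappings and metric more concrete, here is a minimal sketch. It is not the method's actual implementation: the function name, the `<field>-<encoder>` naming scheme, and the translation from metric names to Elasticsearch similarity names are all assumptions made for illustration. The `dense_vector` parameters used (`dims`, `index`, `similarity`) are the ones documented for Elasticsearch 8.

```python
from typing import Any, Dict, List, Tuple

# Assumed translation from the documented metric names to the similarity
# names of the Elasticsearch dense_vector field type.
METRIC_TO_SIMILARITY = {
    "cosine": "cosine",
    "euclidean": "l2_norm",
    "inner_product": "dot_product",
}


def sketch_es_mapping(
    document_mappings: List[Tuple[str, int, List[str]]],
    metric: str = "cosine",
) -> Dict[str, Any]:
    """Build one dense_vector property per (field, encoder) pair."""
    properties: Dict[str, Any] = {}
    for encoder, embedding_size, fields in document_mappings:
        for field in fields:
            # Hypothetical naming scheme: "<field>-<encoder>".
            properties[f"{field}-{encoder}"] = {
                "type": "dense_vector",
                "dims": embedding_size,
                "index": True,
                "similarity": METRIC_TO_SIMILARITY[metric],
            }
    return {"properties": properties}


# Hypothetical input: one encoder covering two fields.
example = sketch_es_mapping([("clip", 512, ["title", "image"])], "euclidean")
```

Keeping one vector property per field and encoder pair is one way to let each pair be scored separately at query time; the real mapping may be structured differently.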
- index(**kwargs)
- search(**kwargs)
- update(**kwargs)
- list(**kwargs)
- delete(**kwargs)
- tags(**kwargs)
- curate(**kwargs)
- update_tags()
The indexer tracks which tags are indexed and which values each tag can take, and stores this in self.doc_id_tags. This method first queries the Elasticsearch index for its current mapping (es_mapping) to find the tags present on the indexed documents. It then queries Elasticsearch for an aggregation of the values found in those tags and updates self.doc_id_tags, a dictionary that maps each tag to its values.
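The aggregation step described above could look roughly like the following sketch. It only builds a request body and parses a response as plain dictionaries, so it runs without an Elasticsearch cluster. Both helper names, the `tags.<name>` field path, and the sample response are assumptions; the `terms` aggregation and its `buckets` response shape are standard Elasticsearch.

```python
from typing import Any, Dict, List


def build_tag_aggregations(
    tag_names: List[str], max_values_per_tag: int = 10
) -> Dict[str, Any]:
    """Build a search body with one terms aggregation per tag."""
    return {
        "size": 0,  # only aggregations are needed, no hits
        "aggs": {
            name: {
                "terms": {
                    # Assumed field path; the real index layout may differ.
                    "field": f"tags.{name}",
                    "size": max_values_per_tag,
                }
            }
            for name in tag_names
        },
    }


def parse_tag_aggregations(response: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Turn the aggregation buckets into a tag -> values dictionary."""
    return {
        name: [bucket["key"] for bucket in agg["buckets"]]
        for name, agg in response.get("aggregations", {}).items()
    }


body = build_tag_aggregations(["color", "price"], max_values_per_tag=5)

# Hand-written response in the shape Elasticsearch returns for terms
# aggregations; the values are made up for illustration.
fake_response = {
    "aggregations": {
        "color": {"buckets": [{"key": "red", "doc_count": 3},
                              {"key": "blue", "doc_count": 1}]},
        "price": {"buckets": [{"key": 10, "doc_count": 4}]},
    }
}
doc_id_tags = parse_tag_aggregations(fake_response)
```

In this sketch the `size` of each terms aggregation plays the role that max_values_per_tag is documented to play, capping how many distinct values are returned per tag.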