now_common.preprocess module

now_common.preprocess.preprocess_images(da)

Loads all documents into memory and creates a thumbnail for each.

Return type

DocumentArray

now_common.preprocess.preprocess_text(da, split_by_sentences=False)

Loads the text of each document if it is not already loaded. If `split_by_sentences` is True, splits the documents by sentences.

Return type

DocumentArray
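The effect of `split_by_sentences` can be illustrated with a minimal sketch on plain strings. The helper name `split_by_sentences_sketch` and the punctuation-based splitting rule are assumptions made for illustration; the actual function operates on a `DocumentArray` and may split sentences differently.

```python
import re
from typing import List


def split_by_sentences_sketch(texts: List[str]) -> List[str]:
    """Hypothetical stand-in: return one entry per sentence."""
    sentences: List[str] = []
    for text in texts:
        # Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        # Drop empty fragments, e.g. from an empty input string.
        sentences.extend(part for part in parts if part)
    return sentences


texts = ["Red shoes. Size 42!", "Blue hat"]
result = split_by_sentences_sketch(texts)
```

Each input text yields one output entry per sentence, so the output can contain more entries than the input.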

now_common.preprocess.preprocess_nested_docs(da, user_input)

Processes a `DocumentArray` whose `Document`s have `chunks` of nested `Document`s. It constructs `Document`s containing two chunks: one with image data and one with text data. The fields to index should be specified in the `UserInput`.

Parameters
  • da (DocumentArray) – A DocumentArray containing nested chunks.

  • user_input (UserInput) – The configured user input.

Return type

DocumentArray

Returns

A DocumentArray with `Document`s containing text and image chunks.
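The shape of the result can be sketched with plain dataclasses rather than the real `Document` and `UserInput` types. Everything below is an assumption made for illustration: the `Doc` class, the `field_name` and `modality` tags, the `text_field` and `image_field` arguments standing in for the fields configured in `UserInput`, and the choice to skip documents that lack either field.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical minimal stand-in for a Document."""
    text: Optional[str] = None
    uri: Optional[str] = None
    tags: dict = field(default_factory=dict)
    chunks: List["Doc"] = field(default_factory=list)


def to_two_chunk_docs(docs: List[Doc], text_field: str, image_field: str) -> List[Doc]:
    """Build one output Doc per input Doc, holding an image chunk and a text chunk."""
    out: List[Doc] = []
    for doc in docs:
        # Assumption: each nested chunk records its source field in tags["field_name"].
        by_name = {chunk.tags.get("field_name"): chunk for chunk in doc.chunks}
        text_chunk = by_name.get(text_field)
        image_chunk = by_name.get(image_field)
        if text_chunk is None or image_chunk is None:
            continue  # Assumption: skip documents missing either field.
        out.append(
            Doc(
                chunks=[
                    Doc(uri=image_chunk.uri, tags={"modality": "image"}),
                    Doc(text=text_chunk.text, tags={"modality": "text"}),
                ]
            )
        )
    return out


nested = [
    Doc(
        chunks=[
            Doc(text="Red shoes", tags={"field_name": "title"}),
            Doc(uri="shoes.png", tags={"field_name": "picture"}),
            Doc(text="SKU-1", tags={"field_name": "sku"}),
        ]
    )
]
processed = to_two_chunk_docs(nested, text_field="title", image_field="picture")
```

In this sketch the extra `sku` chunk is dropped because it is not one of the two selected fields, and each output document ends up with exactly two chunks.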