now.data_loading.data_loading module

now.data_loading.data_loading.load_data(user_input, print_callback=print)

Loads the DocumentArray dataset configured in the user input, ready for the preprocessing executor.

Parameters
  • user_input (UserInput) – The configured user input object, as produced by the Jina Now CLI dialog.

  • print_callback – The callback used to print status messages. Defaults to the built-in print.

Return type

DocumentArray

Returns

The loaded DocumentArray.
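
The print_callback parameter can be any print-like callable, which makes it possible to redirect status messages, for example into a list or a logger. The following is a minimal sketch, not the library's implementation: fake_load_data and the dict-based user input are hypothetical stand-ins so that the example runs without Jina Now installed, and it assumes that status messages are passed to the callback as positional arguments.

```python
from typing import Callable, List


def make_collecting_callback(sink: List[str]) -> Callable[..., None]:
    """Return a print-like callback that stores messages instead of printing."""

    def callback(*args, **kwargs) -> None:
        # Join positional arguments the way print() does by default.
        sep = kwargs.get("sep", " ")
        sink.append(sep.join(str(arg) for arg in args))

    return callback


def fake_load_data(user_input: dict, print_callback: Callable[..., None] = print) -> list:
    """Hypothetical stand-in for load_data; it only mimics status reporting."""
    print_callback("Loading dataset", user_input["dataset_name"])
    return []


messages: List[str] = []
fake_load_data({"dataset_name": "demo"}, print_callback=make_collecting_callback(messages))
```

After the call, messages holds the status lines that would otherwise have been printed.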

now.data_loading.data_loading.from_files_local(path, fields, field_names_to_dataclass_fields, data_class)

Creates a multimodal DocumentArray from a list of file paths or from the content of the files.

Parameters
  • path (str) – The path to the directory

  • fields (List[str]) – The fields to search for in the directory

  • field_names_to_dataclass_fields (Dict) – The mapping of the field names to the dataclass fields

  • data_class (Type) – The dataclass to use for the document

Return type

DocumentArray

Returns

A DocumentArray with the documents

now.data_loading.data_loading.create_docs_from_subdirectories(file_paths, fields, field_names_to_dataclass_fields, data_class, path=None, is_s3_dataset=False)

Creates multimodal documents from a list of subdirectories.

Parameters
  • file_paths (List) – The list of file paths

  • fields (List[str]) – The fields to search for in the directory

  • field_names_to_dataclass_fields (Dict) – The mapping of the field names to the dataclass fields

  • data_class (Type) – The dataclass to use for the document

  • path (Optional[str]) – The path to the directory

  • is_s3_dataset (bool) – Whether the dataset is stored on s3

Return type

List[Document]

Returns

The list of documents
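
To illustrate how fields and field_names_to_dataclass_fields might work together, the sketch below groups file paths by their parent directory and maps each matching file name to a dataclass field. It is one plausible reading of the parameters, not the function's actual code: it assumes that the entries in fields are file names, and build_document_kwargs and the field names image_0 and text_0 are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def build_document_kwargs(
    file_paths: List[str],
    fields: List[str],
    field_names_to_dataclass_fields: Dict[str, str],
) -> List[Dict[str, str]]:
    """Group files by subdirectory and key them by dataclass field name."""
    groups: Dict[str, Dict[str, str]] = defaultdict(dict)
    for file_path in file_paths:
        parsed = PurePosixPath(file_path)
        # Assumption: only files whose name is listed in `fields` are kept.
        if parsed.name in fields:
            dataclass_field = field_names_to_dataclass_fields[parsed.name]
            groups[str(parsed.parent)][dataclass_field] = file_path
    # One kwargs dict per subdirectory, in sorted folder order.
    return [groups[folder] for folder in sorted(groups)]


paths = [
    "data/item1/image.png",
    "data/item1/title.txt",
    "data/item2/image.png",
    "data/item2/notes.md",
]
doc_kwargs = build_document_kwargs(
    paths,
    fields=["image.png", "title.txt"],
    field_names_to_dataclass_fields={"image.png": "image_0", "title.txt": "text_0"},
)
```

Each resulting dict could then be passed to the dataclass given as data_class to build one document per subdirectory.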

now.data_loading.data_loading.create_docs_from_files(file_paths, fields, field_names_to_dataclass_fields, data_class, path=None, is_s3_dataset=False)

Creates multimodal documents from a list of files.

Parameters
  • file_paths (List) – List of file paths

  • fields (List[str]) – The fields to search for in the directory

  • field_names_to_dataclass_fields (Dict) – The mapping of the field names to the dataclass fields

  • data_class (Type) – The dataclass to use for the document

  • path (Optional[str]) – The path to the directory

  • is_s3_dataset (bool) – Whether the dataset is stored on s3

Return type

List[Document]

Returns

A list of documents
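
The role of data_class in the flat-file case can be sketched as building one record per file. This is an illustration under stated assumptions, not the library's behaviour: it uses the standard-library dataclasses module in place of the real document dataclass, assumes that a single field describes every file, and ImageRecord, records_from_files and image_0 are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type


@dataclass
class ImageRecord:
    """Hypothetical stand-in for the document dataclass."""

    image_0: Optional[str] = None


def records_from_files(
    file_paths: List[str],
    fields: List[str],
    field_names_to_dataclass_fields: Dict[str, str],
    data_class: Type,
) -> list:
    """Create one data_class instance per file path."""
    # Assumption: with flat files, the first field applies to every file.
    dataclass_field = field_names_to_dataclass_fields[fields[0]]
    return [data_class(**{dataclass_field: file_path}) for file_path in file_paths]


records = records_from_files(
    ["data/a.png", "data/b.png"],
    fields=["image"],
    field_names_to_dataclass_fields={"image": "image_0"},
    data_class=ImageRecord,
)
```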

now.data_loading.data_loading.set_modality_da(documents)

Sets each document’s modality based on its modality or mime_type attributes.

Parameters

documents (DocumentArray) – The DocumentArray to set the modality for.

Returns

The DocumentArray with the modality set.
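
One way to picture the fallback from modality to mime_type is shown below. This is a hedged sketch using only the standard library: resolve_modality is a hypothetical helper, and it assumes that the top-level MIME type (for example "image" in "image/png") is used as the modality when none is set, which the documentation above does not confirm.

```python
import mimetypes
from typing import Optional


def resolve_modality(modality: Optional[str], mime_type: Optional[str]) -> Optional[str]:
    """Keep an existing modality; otherwise derive one from the MIME type."""
    if modality:
        return modality
    if mime_type:
        # Assumption: the top-level MIME type serves as the modality.
        return mime_type.split("/")[0]
    return None


# Guess a MIME type from a file name, then resolve the modality from it.
guessed_mime, _ = mimetypes.guess_type("photo.png")
image_modality = resolve_modality(None, guessed_mime)
kept_modality = resolve_modality("text", guessed_mime)
```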