now.data_loading.data_loading module
- now.data_loading.data_loading.load_data(user_input, data_class=None, print_callback=print)
Based on the user input, this function pulls the configured dataset as a DocumentArray, ready for the preprocessing executor.
- Parameters
  - user_input (UserInput) – The configured user object. Result from the Jina Now CLI dialog.
  - data_class – The dataclass that should be used for the DocumentArray.
  - print_callback – The callback function that should be used to print the status.
- Return type
DocumentArray
- Returns
The loaded DocumentArray.
- now.data_loading.data_loading.from_files_local(path, fields, field_names_to_dataclass_fields, data_class)
Creates a multimodal DocumentArray from a list of file paths or from the content of the files.
- Parameters
  - path (str) – The path to the directory
  - fields (List[str]) – The fields to search for in the directory
  - field_names_to_dataclass_fields (Dict) – The mapping of the field names to the dataclass fields
  - data_class (Type) – The dataclass to use for the document
- Return type
DocumentArray
- Returns
A DocumentArray with the documents
- now.data_loading.data_loading.create_docs_from_subdirectories(file_paths, fields, field_names_to_dataclass_fields, data_class, path=None, is_s3_dataset=False)
Creates a list of multimodal documents from a list of subdirectories.
- Parameters
  - file_paths (List) – The list of file paths
  - fields (List[str]) – The fields to search for in the directory
  - field_names_to_dataclass_fields (Dict) – The mapping of the field names to the dataclass fields
  - data_class (Type) – The dataclass to use for the document
  - path (Optional[str]) – The path to the directory
  - is_s3_dataset (bool) – Whether the dataset is stored on s3
- Return type
List[Document]
- Returns
The list of documents
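The grouping idea behind this function can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the `ProductDoc` dataclass, the helper name `docs_from_subdirectories_sketch`, and the assumption that each entry in `fields` is a file name expected inside every subdirectory are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, List, Optional, Type


@dataclass
class ProductDoc:
    # Hypothetical stand-in for the multimodal dataclass passed as data_class.
    image_0: Optional[str] = None
    text_0: Optional[str] = None


def docs_from_subdirectories_sketch(
    file_paths: List[str],
    fields: List[str],
    field_names_to_dataclass_fields: Dict[str, str],
    data_class: Type,
) -> list:
    """Build one data_class instance per subdirectory (illustrative only)."""
    by_dir = defaultdict(dict)
    for file_path in file_paths:
        p = PurePosixPath(file_path)
        # Assumption: a field is identified by the file name inside the subdirectory.
        if p.name in fields:
            attr = field_names_to_dataclass_fields[p.name]
            by_dir[str(p.parent)][attr] = file_path
    # One document per subdirectory, in sorted directory order.
    return [data_class(**by_dir[d]) for d in sorted(by_dir)]


example_docs = docs_from_subdirectories_sketch(
    file_paths=[
        "data/a/image.png",
        "data/a/desc.txt",
        "data/b/image.png",
        "data/b/notes.md",  # not listed in fields, so it is ignored
    ],
    fields=["image.png", "desc.txt"],
    field_names_to_dataclass_fields={"image.png": "image_0", "desc.txt": "text_0"},
    data_class=ProductDoc,
)
```

Here each subdirectory yields one object, and a field with no matching file in a subdirectory stays at its default value.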
- now.data_loading.data_loading.create_docs_from_files(file_paths, fields, field_names_to_dataclass_fields, data_class, path=None, is_s3_dataset=False)
Creates a list of multimodal documents from a list of files.
- Parameters
  - file_paths (List) – List of file paths
  - fields (List[str]) – The fields to search for in the directory
  - field_names_to_dataclass_fields (Dict) – The mapping of the files to the dataclass fields
  - data_class (Type) – The dataclass to use for the document
  - path (Optional[str]) – The path to the directory
  - is_s3_dataset (bool) – Whether the dataset is stored on s3
- Return type
List[Document]
- Returns
A list of documents
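For a flat list of files, a comparable sketch produces one document per matching file. Again this is only an illustration under stated assumptions, not the library's code: the `MediaDoc` dataclass and the helper `docs_from_files_sketch` are hypothetical, and treating each entry in `fields` as a file extension is an assumption made for the example.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, List, Optional, Type


@dataclass
class MediaDoc:
    # Hypothetical stand-in for the multimodal dataclass passed as data_class.
    image_0: Optional[str] = None
    text_0: Optional[str] = None


def docs_from_files_sketch(
    file_paths: List[str],
    fields: List[str],
    field_names_to_dataclass_fields: Dict[str, str],
    data_class: Type,
) -> list:
    """Build one data_class instance per matching file (illustrative only)."""
    docs = []
    for file_path in file_paths:
        # Assumption: a field is identified by the file extension, e.g. ".png".
        suffix = PurePosixPath(file_path).suffix
        if suffix in fields:
            attr = field_names_to_dataclass_fields[suffix]
            docs.append(data_class(**{attr: file_path}))
    return docs


flat_docs = docs_from_files_sketch(
    file_paths=["data/cat.png", "data/readme.txt", "data/archive.zip"],
    fields=[".png", ".txt"],
    field_names_to_dataclass_fields={".png": "image_0", ".txt": "text_0"},
    data_class=MediaDoc,
)
```

Files whose extension is not listed in `fields` are skipped, so the input order of the remaining files is preserved in the result.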