pyslk.core package

Submodules#

pyslk.core.gen_queries module#

pyslk.core.gen_queries.gen_file_query(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) str#

Generates a search query that searches for the listed resources

A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) may be one of these:

  • a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)

  • a filename without full path (e.g. INDEX.txt)

  • a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)

  • a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters:
  • resources (str or list or set or Path) – list of resources to be searched for

  • recursive (bool) – do recursive search in the namespaces

  • cached_only (bool) – search only for files that are in the HSM cache

  • not_cached (bool) – search only for files that are not in the HSM cache

  • tape_barcodes (list[str] or str or None) – search only for files stored on the listed tapes

Returns:

generated search query

Return type:

str

pyslk.core.gen_queries.gen_file_query_as_dict(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) dict#

Generates a search query that searches for the listed resources

A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) may be one of these:

  • a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)

  • a filename without full path (e.g. INDEX.txt)

  • a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)

  • a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters:
  • resources (str or list or set or Path) – list of resources to be searched for

  • recursive (bool) – do recursive search in the namespaces

  • cached_only (bool) – search only for files that are in the HSM cache

  • not_cached (bool) – search only for files that are not in the HSM cache

  • tape_barcodes (list[str] or str or None) – search only for files stored on the listed tapes

Returns:

generated search query

Return type:

dict
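The query is a JSON document, so its string form can be inspected with the standard json module. The following is a minimal sketch that assumes gen_file_query returns a string of the same form as the ‘search_query’ value shown in the group_files_by_tape example of this reference; it makes no pyslk call and the variable names are illustrative only.

```python
import json

# Query string copied from the group_files_by_tape example in this reference.
query_str = (
    '{"$and":[{"path":{"$gte":"/test/test3","$max_depth":1}},'
    '{"resources.name":{"$regex":"ingest_01_102|ingest_01_339"}}]}'
)

# Parse the JSON text into nested Python dicts and lists.
query = json.loads(query_str)

# Each element of "$and" is one condition; collect the fields they constrain.
fields = [key for condition in query["$and"] for key in condition]

# In this sample the file-name condition is a regex joining the names with '|'.
names = query["$and"][1]["resources.name"]["$regex"].split("|")

print(fields)
print(names)
```

Working on the parsed structure avoids string manipulation when a query has to be checked or logged before it is submitted.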

pyslk.core.gen_queries.gen_search_query(key_value_pairs: str | list[str] | set[str], recursive: bool = False, search_query: str | None = None) str#

Generates a search query from key-value pairs

A search query will be generated from the elements of ‘key_value_pairs’. Each element is a key and a value connected via an operator. If ‘search_query’ is given, that existing query is extended.

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters:
  • key_value_pairs (str or list or set) – list of key-value pairs connected via an operator

  • recursive (bool) – do recursive search in the namespaces

  • search_query (str) – an existing search query to be extended

Returns:

generated search query

Return type:

str

pyslk.core.gfbt module#

pyslk.core.gfbt.count_tapes(resource_path: str | list | Path | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False) dict[str, int]#

Count number of tapes onto which provided files are stored; distinguishes between multi-tape and single-tape files

Parameters:
  • resource_path (str or path-like or list) – a resource path (str or Path) or multiple resource paths (in a list)

  • search_id (int, str) – id of a search

  • search_query (str) – a search query

  • recursive (bool) – set whether resource should be evaluated recursively or not

Returns:

dictionary containing the two tape counts

Return type:

dict

pyslk.core.gfbt.count_tapes_with_multi_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int#

Count the number of tapes that store provided files which are each split across multiple tapes

Internally calls pyslk.count_tapes()

Parameters:
  • resource_path (str or int or path-like) – a resource path

  • search_id (int or str) – id of a search

  • search_query (str) – a search query

Returns:

number of tapes

Return type:

int

pyslk.core.gfbt.count_tapes_with_single_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int#

Count the number of tapes that store provided files which are not split across multiple tapes

Internally calls pyslk.count_tapes()

Parameters:
  • resource_path (str or int or path-like) – a resource path

  • search_id (int or str) – id of a search

  • search_query (str) – a search query

Returns:

number of tapes

Return type:

int

pyslk.core.gfbt.group_files_by_tape(resource_path: Path | str | list | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False, max_tape_number_per_search: int = -1, run_search_query: bool = False) list[dict]#

Group files by tape id.

Group a list of files by their tape id. Not all arguments of the slk_helpers group_files_by_tape CLI call are available. Please use pyslk.count_tapes() to count the number of tapes on which files are stored.

Parameters:
  • resource_path (str, list, Path) – list of files or a namespace with files that should be grouped

  • search_id (int, str) – id of a search

  • search_query (str) – a search query

  • recursive (bool) – do recursive search in the namespaces

  • max_tape_number_per_search (int) – number of tapes per search; if ‘-1’ => the parameter is not set

  • run_search_query (bool) – generate and run search query strings instead of returning the lists of files per tape, and print the search id; default: False

Returns:

A list of dictionaries containing group and tape info.

Return type:

list[dict]

Examples

>>> import pyslk as slk
>>> slk.group_files_by_tape(["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"])
[{'id': -1,
  'location': 'cache',
  'label': '',
  'status': '',
  'file_count': 2,
  'files': ['/test/test3/ingest_01_102', '/test/test3/ingest_01_339'],
  'search_query': '{"$and":[{"path":{"$gte":"/test/test3","$max_depth":1}},
                    {"resources.name":{"$regex":"ingest_01_102|ingest_01_339"}}]}',
  'search_id': 416837}]
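The returned list can be post-processed with plain Python. The sketch below reuses the sample output above (the ‘search_query’ key is omitted for brevity) and separates files that are already in the cache from files grouped per tape. The helper name split_by_location is hypothetical, and treating every ‘location’ other than ‘cache’ as a tape is an assumption, since only the ‘cache’ value appears in this reference.

```python
# Sample return value, copied from the example above ('search_query' omitted).
groups = [
    {
        "id": -1,
        "location": "cache",
        "label": "",
        "status": "",
        "file_count": 2,
        "files": ["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"],
        "search_id": 416837,
    }
]


def split_by_location(groups):
    """Return (cached_files, files_per_tape_id) for a list of group dicts."""
    cached = []
    per_tape = {}
    for group in groups:
        if group["location"] == "cache":
            cached.extend(group["files"])
        else:
            # Assumption: any other location denotes a tape identified by 'id'.
            per_tape.setdefault(group["id"], []).extend(group["files"])
    return cached, per_tape


cached, per_tape = split_by_location(groups)
total_files = sum(group["file_count"] for group in groups)
print(cached, per_tape, total_files)
```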

pyslk.core.metadata module#

pyslk.core.metadata.get_metadata(resource: str | Path, print_hidden: bool = False, print_raw_values: bool = False) dict[str, Union[str, int, float, dict]] | None#

Get metadata

Parameters:
  • resource (str or Path) – resource (full path)

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

  • print_raw_values (bool) – print metadata values without trying to convert them to int/float/dict [default: False]

Returns:

dictionary with the metadata

Return type:

dict or None

pyslk.core.metadata.get_tag(path_or_id: str | int, recursive: bool = False) dict | None#

Get the metadata of a resource or, if recursive, of a namespace and its child resources

Parameters:
  • path_or_id (str or int) – search id or gns path of resources to retrieve

  • recursive (bool) – use the -R flag to read metadata recursively; default: False

Returns:

metadata of the target files

Return type:

dict or None

pyslk.core.metadata.hsm2json(resources: str | Path | list | None = None, search_id: int = -1, recursive: bool = False, outfile: str | Path | None = None, restart_file: str | Path | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]]#

Extract metadata from HSM file(s) and return them in JSON structure

Parameters:
  • resources (str or Path or list) – list of resources to be searched for

  • search_id (int) – id of a search

  • recursive (bool) – export metadata from all files in gns_path recursively

  • outfile (str or Path or None) – write the output into a file instead of to stdout

  • restart_file (str or Path or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • write_json_lines (bool) – write JSON-lines instead of JSON; default: False

  • write_mode (str or None) – applies when ‘outfile’ is set; possible values: OVERWRITE, ERROR

  • instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

Returns:

dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (JSON file); either ‘metadata’ or ‘file’ is None, depending on the value of the input argument ‘outfile’

Return type:

dict
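Because either ‘metadata’ or ‘file’ is None, calling code has to check which of the two was filled in. A minimal sketch of such a check follows; the helper name metadata_or_file is hypothetical and the two sample dictionaries are hand-written stand-ins for the documented return shape, not real hsm2json output.

```python
def metadata_or_file(result):
    """Return ("metadata", value) or ("file", value), whichever is set."""
    if result.get("metadata") is not None:
        return "metadata", result["metadata"]
    if result.get("file") is not None:
        return "file", result["file"]
    raise ValueError("neither 'metadata' nor 'file' is set in the result")


# Stand-in for a call without 'outfile': metadata is returned in memory.
in_memory = {
    "header": {},
    "metadata": [{"path": "/arch/bm0146/k204221/INDEX.txt"}],
    "file": None,
}

# Stand-in for a call with 'outfile': only the file entry is set.
written = {"header": {}, "metadata": None, "file": "metadata.json"}

kind_a, value_a = metadata_or_file(in_memory)
kind_b, value_b = metadata_or_file(written)
print(kind_a, kind_b)
```

Testing with `is not None` rather than truthiness keeps an empty metadata list from being mistaken for a missing one.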

pyslk.core.metadata.hsm2json_dict(resources: str | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]]#

Extract metadata from HSM file(s) and return them in JSON structure

Parameters:
  • resources (str or list) – list of resources to be searched for

  • search_id (int) – id of a search

  • recursive (bool) – export metadata from all files in gns_path recursively

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str, list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

Returns:

dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (None)

Return type:

dict

pyslk.core.metadata.hsm2json_file(outfile: str, resources: str | Path | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) None#

Extract metadata from HSM file(s) and write them to a JSON file

Parameters:
  • outfile (str or Path) – file into which the output is written

  • resources (str or Path or list) – list of resources to be searched for

  • search_id (int) – id of a search

  • recursive (bool) – export metadata from all files in gns_path recursively

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str, list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • write_json_lines (bool) – write JSON-lines instead of JSON; default: False

  • write_mode (str or None) – applies when ‘outfile’ is set; possible values: OVERWRITE, ERROR

  • instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

Returns:

nothing; raises an error if writing failed

Return type:

None

pyslk.core.metadata.json2hsm(json_file: str | None = None, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False, json_string: str | None = None) dict#

Read metadata from a JSON file or JSON string and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_file (str or None) – JSON input file containing metadata; incompatible with json_string

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

  • json_string (str) – provide a json string instead of a json file; incompatible with json_file

Returns:

metadata import summary (key ‘header’)

Return type:

dict
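Since json_file and json_string are incompatible and write_mode accepts only four values, it can help to validate the arguments before calling json2hsm. The sketch below is one way to do that; the helper build_json2hsm_kwargs is hypothetical, the schema names are placeholders, and requiring at least one of the two inputs is an assumption (the reference only states that they must not be combined).

```python
VALID_WRITE_MODES = {"OVERWRITE", "KEEP", "ERROR", "CLEAN"}


def build_json2hsm_kwargs(json_file=None, json_string=None,
                          write_mode=None, schema=None):
    """Validate the inputs and return keyword arguments for json2hsm."""
    # Documented: json_string is incompatible with json_file.
    # Assumption: one of the two has to be provided.
    if (json_file is None) == (json_string is None):
        raise ValueError("provide exactly one of json_file and json_string")
    if write_mode is not None and write_mode not in VALID_WRITE_MODES:
        raise ValueError(f"write_mode must be one of {sorted(VALID_WRITE_MODES)}")
    # A schema string is a comma-separated list without spaces.
    if isinstance(schema, (list, tuple)):
        schema = ",".join(schema)
    return {
        "json_file": json_file,
        "json_string": json_string,
        "write_mode": write_mode,
        "schema": schema,
    }


# Placeholder file and schema names, for illustration only.
kwargs = build_json2hsm_kwargs(
    json_file="metadata.json", write_mode="KEEP", schema=["netcdf", "cera"]
)
print(kwargs)
```

The resulting dictionary uses only parameter names from the signature above, so it could be passed on as `json2hsm(**kwargs)`.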

pyslk.core.metadata.json_dict2hsm(json_dict: dict, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict#

Read metadata from a JSON dictionary and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_dict (dict) – a dictionary representing JSON

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.metadata.json_file2hsm(json_file: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict#

Read metadata from a JSON file and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_file (str) – JSON input file containing metadata

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.metadata.json_str2hsm(json_string: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict#

Read metadata from a JSON string and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_string (str) – JSON string containing metadata

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.metadata.set_tag(path_or_id: str | int, metadata: dict, recursive: bool = False) dict | None#

Apply metadata to the namespace and child resources

Parameters:
  • path_or_id (str or int) – search id or GNS path of the resources to tag

  • metadata (dict) – dict that holds as keys “[metadata schema].[field]” and as values the metadata values

  • recursive (bool) – use the -R flag to tag recursively, Default: False

Returns:

new metadata of the target files

Return type:

dict or None
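The ‘metadata’ argument uses flat keys of the form “[metadata schema].[field]”. If metadata is kept in a nested structure, it can be flattened first. The following sketch shows one way to do so; the helper name flatten_metadata and the schema and field names are hypothetical.

```python
def flatten_metadata(nested):
    """Turn {schema: {field: value}} into {"schema.field": value}."""
    flat = {}
    for schema, fields in nested.items():
        for field, value in fields.items():
            flat[f"{schema}.{field}"] = value
    return flat


# Hypothetical schema and field names, for illustration only.
nested = {"netcdf": {"Title": "Test run", "Institution": "DKRZ"}}
metadata = flatten_metadata(nested)
print(metadata)
```

The flattened dictionary has the key layout that set_tag expects for its ‘metadata’ parameter.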

pyslk.core.resource module#

pyslk.core.resource.access_hsm(resource: list[str] | list[pathlib.Path] | str | Path, mode: int) bool | list[bool]#

pyslk.core.resource.arch_size(path_or_id: str | int, unit: str = 'B') dict[str, Union[str, float, int]]#

Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes

Parameters:
  • path_or_id (str or int) – search id or GNS path

  • unit (str) – unit prefix of the returned size; must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B

Returns:

archive size; key “value” contains the size without unit and key “unit” contains the unit

Return type:

dict
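A result in a prefixed unit can be converted back to bytes. The sketch below rests on two assumptions that this reference does not confirm: that the “unit” key holds the same single letter that was passed as ‘unit’, and that the caller knows whether the prefixes are decimal (1000) or binary (1024), which is why ‘base’ has to be passed explicitly. The helper name size_in_bytes is hypothetical.

```python
# Power of the base for each unit letter listed for the 'unit' parameter.
_EXPONENT = {"B": 0, "K": 1, "M": 2, "G": 3, "T": 4, "P": 5}


def size_in_bytes(size, base):
    """Convert a {"value": ..., "unit": ...} dict to a number of bytes."""
    unit = size["unit"]
    if unit not in _EXPONENT:
        # "h" (human-readable) and unknown units cannot be converted here.
        raise ValueError(f"cannot convert unit {unit!r}")
    return size["value"] * base ** _EXPONENT[unit]


# Hand-written stand-ins for arch_size results.
binary = size_in_bytes({"value": 2, "unit": "K"}, base=1024)
decimal = size_in_bytes({"value": 1.5, "unit": "M"}, base=1000)
print(binary, decimal)
```

Requesting the size with the default unit ‘B’ avoids the conversion and its assumptions altogether.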

pyslk.core.resource.chgrp(gns_path: str, group: str | int, recursive: bool = False) dict | None#

Change the group of a resource or namespace

Parameters:
  • gns_path (str) – namespace or file (full GNS path)

  • group (str or int) – new group of a file (group name or gid)

  • recursive (bool) – use the -R flag to change the group recursively; default: False

Returns:

dict with stdout/stderr, exit code and lists of files with correct and incorrect group; None if gns_path does not exist

Return type:

dict or None

pyslk.core.resource.chmod(gns_path: str | list, mode: str | int, recursive: bool = False) bool | None#

Change the access mode of a resource or namespace

Parameters:
  • gns_path (str or list) – namespace or file (full GNS path); can be file list

  • mode (str or int) – new mode/permissions of a file (as known from bash’s chmod)

  • recursive (bool) – use the -R flag to change the mode recursively; default: False

Returns:

True if successful; None if the target does not exist; raises PySlkException on failure

Return type:

bool or None

pyslk.core.resource.chown(gns_path: str | Path, owner: str | int, recursive: bool = False) dict | None#

Change the owner of a resource or namespace

Parameters:
  • gns_path (str or Path) – namespace or file (full GNS path)

  • owner (str or int) – new owner of a file (username or uid)

  • recursive (bool) – use the -R flag to change the owner recursively; default: False

Returns:

dict of Paths of the modified files ‘PATH: TRUE-IF-OWNER-CORRECT’; None if gns_path does not exist

Return type:

dict or None

pyslk.core.resource.delete(gns_path: str | Path | list, recursive: bool = False) None#

Soft delete a namespace (optionally all child objects of a non-empty namespace) or a specific file

Parameters:
  • gns_path (str or list or Path) – namespace or file (full GNS path); can be file list

  • recursive (bool) – use the -R flag to delete recursively, Default: False

Returns:

nothing is returned (void function)

Return type:

None

pyslk.core.resource.get_checksum(resource: str | Path, checksum_type: str | None = None) dict[str, str] | None#

Get a checksum of a resource

Parameters:
  • resource (str or Path) – resource (full path)

  • checksum_type (str) – checksum_type (possible values: None, “sha512”, “adler32”; None => print all)

Returns:

dictionary with checksum type as key(s) and checksum(s) as value(s); empty keys if no checksum; ‘None’ if resource does not exist

Return type:

dict or None
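The returned dictionary can be compared against a checksum computed locally, for example after a retrieval. The sketch below does this for ‘sha512’ with the standard hashlib module. It assumes that the archive reports the checksum as a hexadecimal string; the helper names are hypothetical and the sample dictionary is a stand-in for real get_checksum output.

```python
import hashlib
import tempfile
from pathlib import Path


def local_sha512(path, chunk_size=1024 * 1024):
    """Compute the SHA-512 hex digest of a local file in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_archive(path, checksums):
    """Return True/False on comparison, or None if no sha512 is available."""
    archived = checksums.get("sha512")
    if not archived:
        return None
    # Assumption: the archived checksum is a hex string (case is ignored).
    return archived.lower() == local_sha512(path)


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "INDEX.txt"
    sample.write_bytes(b"hello")
    # Stand-in for the dict returned by get_checksum(...).
    fake_result = {"sha512": local_sha512(sample)}
    ok = matches_archive(sample, fake_result)
    missing = matches_archive(sample, {})

print(ok, missing)
```

Reading the file in chunks keeps memory use constant, which matters for the large files typically kept in a tape archive.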

pyslk.core.resource.get_resource_id(resource_path: str | Path) int | None#

Return the resource id of a resource path

Parameters:

resource_path (str or path-like) – namespace or resource

Returns:

resource_id if the file exists; None otherwise

Return type:

int or None

pyslk.core.resource.get_resource_permissions(resource: str | int | Path | None = None, as_octal_number: bool = False) str | bool | None#

Get the permissions of a resource

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • as_octal_number (bool) – do not return the permissions as a combination of x/w/r/- but as a three-digit octal number

Returns:

permissions string; False if resource does not exist

Return type:

str or bool or None
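The relation between the two output forms follows the usual POSIX convention: each group of three characters (owner, group, others) maps to one octal digit with r=4, w=2 and x=1. The sketch below shows that conversion for a nine-character string such as ‘rwxr-x---’. The exact string returned by get_resource_permissions is not shown in this reference, so the nine-character form is an assumption, and the helper name is hypothetical.

```python
def permissions_to_octal(perm):
    """Convert e.g. 'rwxr-x---' to '750' (r=4, w=2, x=1 per triplet)."""
    if len(perm) != 9:
        raise ValueError("expected nine permission characters")
    digits = []
    for start in range(0, 9, 3):
        triplet = perm[start:start + 3]
        value = 0
        # Any character other than '-' counts as a set bit.
        for char, bit in zip(triplet, (4, 2, 1)):
            if char != "-":
                value += bit
        digits.append(str(value))
    return "".join(digits)


print(permissions_to_octal("rwxr-x---"))
print(permissions_to_octal("rw-r--r--"))
```

Special bits such as setuid or sticky are not handled separately in this sketch.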

pyslk.core.resource.get_resource_size(resource: str | int | Path, recursive: bool = False) int | None#

Returns file size in byte

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • recursive (bool) – use the -R flag to calculate the size recursively

Returns:

size in byte; None if resource does not exist

Return type:

int or None

pyslk.core.resource.get_resource_tape(resource_path: str | Path) dict[int, str] | None#

Return the tape(s) on which the resource with the given path is stored

Parameters:

resource_path (str or path-like) – namespace or resource

Returns:

tape(s) on which the file is stored, as a dict; None otherwise

Return type:

dict[int, str] or None

pyslk.core.resource.has_no_flag_partial(resource: str | int | Path | list | None = None, search_id: str | int | None = None) bool#

Check whether file(s) is/are flagged as partial; returns True/False

Parameters:
  • resource (str or int or path-like or list) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

True if no file is flagged; False otherwise

Return type:

bool

pyslk.core.resource.has_no_flag_partial_details(resource: str | int | Path | list | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]]#

Check whether file(s) is/are flagged as partial; returns a dict with keys ‘flag_partial’ and ‘no_flag_partial’

Parameters:
  • resource (str or int or path-like or list) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

dictionary with two keys ‘flag_partial’ and ‘no_flag_partial’ which each have a list of files as value

Return type:

dict[str, list[Path]]

pyslk.core.resource.makedirs(gns_path: str | Path, exist_ok: bool = False) int#

Create a directory like ‘mkdir()’ but create parent directories recursively, if they do not exist

If exist_ok is False (the default), a FileExistsError is raised if the target directory already exists.

Parameters:
  • gns_path (str or Path) – gns path to create

  • exist_ok (bool) – throw no error if folder already exists (like ‘mkdir -p’)

Returns:

namespace/resource id of the created namespace

Return type:

int

pyslk.core.resource.mkdir(gns_path: str | Path) int#

Create a directory

If the directory already exists, FileExistsError is raised. If a parent directory in the path does not exist, FileNotFoundError is raised.

Parameters:

gns_path (str or Path) – gns path to create

Returns:

namespace/resource id of the created namespace

Return type:

int

pyslk.core.resource.move(src_path: str, dst_gns: str, no_overwrite: bool) int#

Move namespaces/files from one parent folder to another; renaming is not possible

Parameters:
  • src_path (str) – namespace or file (full GNS path)

  • dst_gns (str) – new parent namespace

  • no_overwrite (bool) – do not overwrite target file if it exists

Returns:

return resource id of the moved resource

Return type:

int

pyslk.core.resource.rename(old_name: str, new_name: str) int#

Rename a folder or file; moving is not possible

Parameters:
  • old_name (str) – folder or file name (full GNS path)

  • new_name (str) – new name (only name; no full GNS path)

Returns:

return resource id of the renamed resource

Return type:

int
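Because move cannot rename and rename cannot move, relocating a file to a new folder under a new name takes two calls. The sketch below only plans those calls from a source and a target path; it does not call pyslk. The helper name plan_relocation and the target path are hypothetical.

```python
from pathlib import PurePosixPath


def plan_relocation(src, dst):
    """Return the ('move', ...) and ('rename', ...) steps from src to dst."""
    src_path = PurePosixPath(src)
    dst_path = PurePosixPath(dst)
    steps = []
    current = src_path
    if dst_path.parent != src_path.parent:
        # move(src_path, dst_gns, ...): the target is the new parent namespace.
        steps.append(("move", str(current), str(dst_path.parent)))
        current = dst_path.parent / src_path.name
    if dst_path.name != src_path.name:
        # rename(old_name, new_name): full path first, bare name second.
        steps.append(("rename", str(current), dst_path.name))
    return steps


steps = plan_relocation(
    "/arch/bm0146/k204221/INDEX.txt",
    "/arch/bm0146/k204221/old/INDEX_2020.txt",
)
for step in steps:
    print(step)
```

Moving first means the file briefly exists in the target namespace under its old name, so a name clash there has to be considered (see ‘no_overwrite’).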

pyslk.core.resource_extras module#

pyslk.core.resource_extras.get_resource_type(resource: str | int | Path) str | None#

Get type of resource

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

type of the resource; None if resource does not exist

Return type:

str or None

pyslk.core.resource_extras.is_file(resource: str | int | Path | None = None) bool | None#

Returns True if resource is a file

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

True if resource is a file; else False

Return type:

bool or None

pyslk.core.resource_extras.is_namespace(resource: str | int | Path | None = None) bool | None#

Returns True if resource is a namespace

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

True if resource is a namespace; else False

Return type:

bool or None

pyslk.core.resource_extras.resource_exists(resource: str | Path | int) bool#

Check if resource exists and return True/False

Parameters:

resource (str or int or path-like) – namespace or resource

Returns:

True if file exists; False otherwise

Return type:

bool

pyslk.core.storage module#

pyslk.core.storage.get_rcrs(resource: str | int | Path) dict | None#

Get resource content record (rcr) information

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

storage information if it exists; None otherwise

Return type:

dict or None

pyslk.core.storage.get_storage_information(resource: str | int | Path) dict | None#

Get storage information of a resource

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

storage information if it exists; None otherwise

Return type:

dict or None

pyslk.core.storage.get_tape_barcode(tape_id: int | str) str | None#

Return the tape barcode for the provided tape id

Parameters:

tape_id (int or str) – id of a tape in the tape library

Returns:

tape barcode; None otherwise

Return type:

str or None

pyslk.core.storage.get_tape_id(tape_barcode: str) int | None#

return tape id for provided tape barcode

Parameters:

tape_barcode (str) – barcode of a tape in the tape library

Returns:

tape id; None otherwise

Return type:

int or None

pyslk.core.storage.is_cached(resource: str | int | Path | None = None, search_id: str | int | None = None) bool#

Check whether file(s) is/are in the HSM cache or not; returns True/False

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

True if all files are in cache; False otherwise

Return type:

bool

pyslk.core.storage.is_cached_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]]#

Check whether file(s) is/are in the HSM cache or not; returns a dict with keys ‘cached’ and ‘not_cached’

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

dictionary with two keys ‘cached’ and ‘not_cached’ which each have a list of files as value

Return type:

dict[str, list[Path]]

pyslk.core.storage.is_on_tape(resource: str | int | Path | None = None, search_id: str | int | None = None) bool#

Check whether file(s) is/are stored on tape; returns True/False

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

True if all files are stored on tape; False otherwise

Return type:

bool

pyslk.core.storage.is_on_tape_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]]#

Check whether file(s) is/are stored on tape or not; returns a dict with keys ‘on_tape’ and ‘not_on_tape’

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

dictionary with two keys ‘on_tape’ and ‘not_on_tape’ which each have a list of files as value

Return type:

dict[str, list[Path]]
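The dictionaries returned by is_cached_details and is_on_tape_details can be combined, for example to list files that are on tape but not in the cache and would therefore need a recall before they can be read quickly. The sketch below works on hand-written stand-ins with the documented keys; the helper name files_needing_recall and the file paths are hypothetical.

```python
from pathlib import Path


def files_needing_recall(cache_details, tape_details):
    """Return the sorted files that are on tape but not in the cache."""
    not_cached = set(cache_details["not_cached"])
    on_tape = set(tape_details["on_tape"])
    return sorted(not_cached & on_tape)


# Stand-ins for the documented return values of the two *_details functions.
cache_details = {
    "cached": [Path("/arch/bm0146/k204221/a.nc")],
    "not_cached": [Path("/arch/bm0146/k204221/b.nc")],
}
tape_details = {
    "on_tape": [
        Path("/arch/bm0146/k204221/a.nc"),
        Path("/arch/bm0146/k204221/b.nc"),
    ],
    "not_on_tape": [],
}

to_recall = files_needing_recall(cache_details, tape_details)
print(to_recall)
```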

pyslk.core.storage.is_tape_available(tape: int) bool | None#

Check if tape is available

Parameters:

tape (int or str) – id or barcode of a tape in the tape library

Returns:

True if tape is available for recalls/retrievals; else False; None if tape does not exist

Return type:

bool or None
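The return value has three states, so a plain truthiness test such as `if not result:` would treat a missing tape (None) like an unavailable one (False). The sketch below keeps the three cases apart; the helper name describe_availability is hypothetical and it takes the return value as an argument instead of calling pyslk.

```python
def describe_availability(available):
    """Map the True/False/None result of is_tape_available to a message."""
    if available is None:
        return "tape does not exist"
    if available:
        return "tape is available"
    return "tape is not available"


# The three documented outcomes.
for value in (True, False, None):
    print(value, "->", describe_availability(value))
```

The same pattern applies to the other functions in this package that return ‘bool or None’.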

pyslk.core.storage.tape_exists(tape: int | str) bool#

Check if tape exists

Parameters:

tape (int or str) – id or barcode of a tape in the tape library

Returns:

True if tape exists; False otherwise

Return type:

bool

pyslk.core.storage.tape_status(tape: int | str, details: bool = False) str | None#

Check the status of a tape

Parameters:
  • tape (int or str) – id or barcode of a tape in the tape library

  • details (bool) – print a more detailed description of the retrieval status

Returns:

status of the tape; None if tape does not exist

Return type:

str or None

Module contents#

pyslk.core.access_hsm(resource: list[str] | list[pathlib.Path] | str | Path, mode: int) bool | list[bool]#

pyslk.core.arch_size(path_or_id: str | int, unit: str = 'B') dict[str, Union[str, float, int]]#

Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes

Parameters:
  • path_or_id (str or int) – search id or GNS path

  • unit (str) – unit prefix of the returned size; must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B

Returns:

archive size; key “value” contains the size without unit and key “unit” contains the unit

Return type:

dict

pyslk.core.chgrp(gns_path: str, group: str | int, recursive: bool = False) dict | None#

Change the group of a resource or namespace

Parameters:
  • gns_path (str) – namespace or file (full GNS path)

  • group (str or int) – new group of a file (group name or gid)

  • recursive (bool) – use the -R flag to change the group recursively, Default: False

Returns:

dict with stdout/stderr, exit code and lists of files with correct and incorrect group; None if gns_path does not exist

Return type:

dict or None

pyslk.core.chmod(gns_path: str | list, mode: str | int, recursive: bool = False) bool | None#

Change the access mode of a resource or namespace

Parameters:
  • gns_path (str or list) – namespace or file (full GNS path); can be file list

  • mode (str or int) – new mode/permissions of a file (as known from bash’s chmod)

  • recursive (bool) – use the -R flag to change the mode recursively, Default: False

Returns:

True if successful; None if the target does not exist; raises PySlkException on failure

Return type:

bool or None

pyslk.core.chown(gns_path: str | Path, owner: str | int, recursive: bool = False) dict | None#

Change the owner of a resource or namespace

Parameters:
  • gns_path (str or Path) – namespace or file (full GNS path)

  • owner (str or int) – new owner of a file (username or uid)

  • recursive (bool) – use the -R flag to change the owner recursively, Default: False

Returns:

dict of Paths of the modified files ‘PATH: TRUE-IF-OWNER-CORRECT’; None if gns_path does not exist

Return type:

dict or None

pyslk.core.count_tapes(resource_path: str | list | Path | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False) dict[str, int]#

Count number of tapes onto which provided files are stored; distinguishes between multi-tape and single-tape files

Parameters:
  • resource_path (str or path-like or list) – a resource path (str or Path) or multiple resource paths (in a list)

  • search_id (int, str) – id of a search

  • search_query (str) – a search query

  • recursive (bool) – set whether resource should be evaluated recursively or not

Returns:

dictionary containing the two tape counts

Return type:

dict

pyslk.core.count_tapes_with_multi_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int#

Count the number of tapes that hold those of the provided files which are split across multiple tapes

Internally calls pyslk.count_tapes()

Parameters:
  • resource_path (str or int or path-like) – a resource path

  • search_id (int or str) – id of a search

  • search_query (str) – a search query

Returns:

number of tapes

Return type:

int

pyslk.core.count_tapes_with_single_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int#

Count the number of tapes that hold those of the provided files which are not split across multiple tapes

Internally calls pyslk.count_tapes()

Parameters:
  • resource_path (str or int or path-like) – a resource path

  • search_id (int or str) – id of a search

  • search_query (str) – a search query

Returns:

number of tapes

Return type:

int

pyslk.core.delete(gns_path: str | Path | list, recursive: bool = False) None#

Soft delete a namespace (optionally all child objects of a non-empty namespace) or a specific file

Parameters:
  • gns_path (str or list or Path) – namespace or file (full GNS path); can be file list

  • recursive (bool) – use the -R flag to delete recursively, Default: False

Returns:

nothing is returned (void function)

Return type:

None

pyslk.core.gen_file_query(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) str#

Generates a search query that searches for the listed resources

The generated search query connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) can be one of these:

  • a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)

  • a filename without full path (e.g. INDEX.txt)

  • a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)

  • a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters:
  • resources (str or list or set or Path) – list of resources to be searched for

  • recursive (bool) – do recursive search in the namespaces

  • cached_only (bool) – search only for files that are in the HSM cache

  • not_cached (bool) – search only for files that are not in the HSM cache

  • tape_barcodes (list[str] or str) – search only for files stored on the tapes with the listed barcodes

Returns:

generated search query

Return type:

str
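
To give an impression of what such a query looks like, the sketch below rebuilds the query shape that appears in the group_files_by_tape example on this page. It is not the pyslk implementation: the function name sketch_file_query is made up, only files with a full path are handled, file names are not regex-escaped, and the use of ‘$or’ to join several namespaces is an assumption.

```python
import json
from collections import defaultdict
from pathlib import PurePosixPath


def sketch_file_query(files: list[str]) -> str:
    """Build a query string for files given with full path (illustration only)."""
    if not files:
        raise ValueError("at least one file is required")
    # group the file names by their parent namespace
    names_by_namespace = defaultdict(list)
    for f in files:
        p = PurePosixPath(f)
        names_by_namespace[str(p.parent)].append(p.name)
    parts = []
    for namespace, names in names_by_namespace.items():
        parts.append({
            "$and": [
                {"path": {"$gte": namespace, "$max_depth": 1}},
                {"resources.name": {"$regex": "|".join(names)}},
            ]
        })
    # one namespace needs no 'or'; several are joined with '$or' (assumed operator)
    query = parts[0] if len(parts) == 1 else {"$or": parts}
    return json.dumps(query, separators=(",", ":"))


query = sketch_file_query(["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"])
```

For real work, gen_file_query should be used, since it also covers regular expressions, namespaces and the cache and tape filters.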

pyslk.core.gen_file_query_as_dict(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) dict#

Generates a search query that searches for the listed resources

The generated search query connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) can be one of these:

  • a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)

  • a filename without full path (e.g. INDEX.txt)

  • a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)

  • a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters:
  • resources (str or list or set or Path) – list of resources to be searched for

  • recursive (bool) – do recursive search in the namespaces

  • cached_only (bool) – search only for files that are in the HSM cache

  • not_cached (bool) – search only for files that are not in the HSM cache

  • tape_barcodes (list[str] or str) – search only for files stored on the tapes with the listed barcodes

Returns:

generated search query

Return type:

dict

pyslk.core.gen_search_query(key_value_pairs: str | list[str] | set[str], recursive: bool = False, search_query: str | None = None) str#

Generates a search query that searches for the listed resources

The generated search query connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) can be one of these:

  • a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)

  • a filename without full path (e.g. INDEX.txt)

  • a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)

  • a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters:
  • key_value_pairs (str or list or set) – list of key-value pairs connected via an operator

  • recursive (bool) – do recursive search in the namespaces

  • search_query (str) – an existing search query to be extended

Returns:

generated search query

Return type:

str

pyslk.core.get_checksum(resource: str | Path, checksum_type: str | None = None) dict[str, str] | None#

Get a checksum of a resource

Parameters:
  • resource (str or Path) – resource (full path)

  • checksum_type (str) – checksum_type (possible values: None, “sha512”, “adler32”; None => print all)

Returns:

dictionary with checksum type as key(s) and checksum(s) as value(s); empty keys if no checksum; ‘None’ if resource does not exist

Return type:

dict or None
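
A returned checksum is typically compared against one computed locally. The sketch below computes both checksum types with the Python standard library; the helper name local_checksums is made up, and the exact string format in which pyslk reports the checksums (for example upper or lower case hex) is not stated here, so a comparison may need normalisation.

```python
import hashlib
import zlib


def local_checksums(data: bytes) -> dict:
    """Compute SHA-512 and Adler-32 of a byte string as lower-case hex strings."""
    return {
        "sha512": hashlib.sha512(data).hexdigest(),
        # zlib.adler32 returns an unsigned int; format it as 8 hex digits
        "adler32": format(zlib.adler32(data), "08x"),
    }


checksums = local_checksums(b"Wikipedia")
```

For large files, the data would be read and hashed in chunks instead of being loaded into memory at once.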

pyslk.core.get_metadata(resource: str | Path, print_hidden: bool = False, print_raw_values: bool = False) dict[str, Union[str, int, float, dict]] | None#

Get metadata

Parameters:
  • resource (str or Path) – resource (full path)

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

  • print_raw_values (bool) – print metadata values without trying to convert them to int/float/dict [default: False]

Returns:

dictionary with the metadata

Return type:

dict or None

pyslk.core.get_rcrs(resource: str | int | Path) dict | None#

Get resource content record (rcr) information

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

storage information when exists; None otherwise

Return type:

dict or None

pyslk.core.get_resource_id(resource_path: str | Path) int | None#

Return the resource id for a resource path

Parameters:

resource_path (str or path-like) – namespace or resource

Returns:

resource_id if the file exists; None otherwise

Return type:

int or None

pyslk.core.get_resource_permissions(resource: str | int | Path | None = None, as_octal_number: bool = False) str | bool | None#

Get the permissions of a resource

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • as_octal_number (bool) – return the permissions as a three-digit octal number instead of a combination of x/w/r/-

Returns:

permissions string; False if resource does not exist

Return type:

str or bool or None
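
The relation between the two output forms can be sketched as follows. The helper permissions_to_octal is hypothetical and assumes a plain nine-character string such as ‘rwxr-x---’; whether pyslk returns additional characters (for example a leading type flag) is not stated here.

```python
def permissions_to_octal(perm: str) -> str:
    """Convert e.g. 'rwxr-x---' into '750' (illustration only)."""
    if len(perm) != 9:
        raise ValueError("expected nine characters such as 'rwxr-x---'")
    digits = []
    # process the owner, group and other triplets in turn
    for i in range(0, 9, 3):
        triplet = perm[i:i + 3]
        value = 0
        # read = 4, write = 2, execute = 1; '-' contributes nothing
        for char, bit in zip(triplet, (4, 2, 1)):
            if char != "-":
                value += bit
        digits.append(str(value))
    return "".join(digits)
```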

pyslk.core.get_resource_size(resource: str | int | Path, recursive: bool = False) int | None#

Return the file size in bytes

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • recursive (bool) – use the -R flag to calculate the size recursively

Returns:

size in bytes; None if resource does not exist

Return type:

int or None

pyslk.core.get_resource_tape(resource_path: str | Path) dict[int, str] | None#

Return the tape(s) on which the resource with the given path is stored

Parameters:

resource_path (str or path-like) – namespace or resource

Returns:

tape(s) on which the file is stored, as dict; None otherwise

Return type:

dict[int, str] or None

pyslk.core.get_resource_type(resource: str | int | Path) str | None#

Get type of resource

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

type of the resource; None if resource does not exist

Return type:

str or None

pyslk.core.get_storage_information(resource: str | int | Path) dict | None#

Get resource content record (rcr) information

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

storage information if it exists; None otherwise

Return type:

dict or None

pyslk.core.get_tag(path_or_id: str | int, recursive: bool = False) dict | None#

Get the metadata of the namespace and child resources

Parameters:
  • path_or_id (str or int) – search id or gns path of the resources whose metadata should be read

  • recursive (bool) – use the -R flag to read the metadata recursively, Default: False

Returns:

metadata of the target files

Return type:

dict or None

pyslk.core.get_tape_barcode(tape_id: int | str) str | None#

Return the tape barcode for the provided tape id

Parameters:

tape_id (int or str) – id of a tape in the tape library

Returns:

tape barcode; None otherwise

Return type:

str or None

pyslk.core.get_tape_id(tape_barcode: str) int | None#

Return the tape id for the provided tape barcode

Parameters:

tape_barcode (str) – barcode of a tape in the tape library

Returns:

tape id; None otherwise

Return type:

int or None

pyslk.core.group_files_by_tape(resource_path: Path | str | list | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False, max_tape_number_per_search: int = -1, run_search_query: bool = False) list[dict]#

Group files by tape id.

Group a list of files by their tape id. This function does not offer all arguments of the slk_helpers group_files_by_tape CLI call. Please use pyslk.count_tapes() to count the number of tapes on which files are stored.

Parameters:
  • resource_path (str, list, Path) – list of files or a namespace with files that should be grouped.

  • search_id (int, str) – id of a search

  • search_query (str) – a search query

  • recursive (bool) – do recursive search in the namespaces

  • max_tape_number_per_search (int) – number of tapes per search; if ‘-1’ => the parameter is not set

  • run_search_query (bool) – generate and run search query strings instead of only listing the files per tape, and print the search id, Default: False

Returns:

A list of dictionaries containing group and tape info.

Return type:

list[dict]

Examples

>>> import pyslk as slk
>>> slk.group_files_by_tape(["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"])
[{'id': -1,
  'location': 'cache',
  'label': '',
  'status': '',
  'file_count': 2,
  'files': ['/test/test3/ingest_01_102', '/test/test3/ingest_01_339'],
  'search_query': '{"$and":[{"path":{"$gte":"/test/test3","$max_depth":1}},
                    {"resources.name":{"$regex":"ingest_01_102|ingest_01_339"}}]}',
  'search_id': 416837}]
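
The returned list can be post-processed with plain Python, for example to collect the files per storage location. The sketch below works on sample data shaped like the output shown above; the second entry, including the location value ‘tape’ and the tape id, is made up for illustration, and the helper name files_by_location is hypothetical.

```python
# Sample data shaped like the documented output; the second entry is made up.
groups = [
    {"id": -1, "location": "cache", "file_count": 2,
     "files": ["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"]},
    {"id": 12345, "location": "tape", "file_count": 1,
     "files": ["/test/test3/ingest_01_500"]},
]


def files_by_location(groups: list[dict]) -> dict[str, list[str]]:
    """Collect the file lists of all groups per value of the 'location' key."""
    result: dict[str, list[str]] = {}
    for group in groups:
        result.setdefault(group["location"], []).extend(group["files"])
    return result


summary = files_by_location(groups)
```
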
pyslk.core.has_no_flag_partial(resource: str | int | Path | list | None = None, search_id: str | int | None = None) bool#

Check whether file(s) is/are flagged as partial; returns True/False

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

True if no file is flagged as partial; False otherwise

Return type:

bool

pyslk.core.has_no_flag_partial_details(resource: str | int | Path | list | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]]#

Check whether file(s) is/are flagged as partial; returns a dict with the keys ‘flag_partial’ and ‘no_flag_partial’

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

dictionary with two keys ‘flag_partial’ and ‘no_flag_partial’ which each have a list of files as value

Return type:

dict[str, list[Path]]

pyslk.core.hsm2json(resources: str | Path | list | None = None, search_id: int = -1, recursive: bool = False, outfile: str | Path | None = None, restart_file: str | Path | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]]#

Extract metadata from HSM file(s) and return them in JSON structure

Parameters:
  • resources (str or Path or list) – list of resources to be searched for

  • search_id (int) – id of a search

  • recursive (bool) – export metadata from all files in gns_path recursively

  • outfile (str or Path or None) – write the output into a file instead of stdout

  • restart_file (str or Path or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • write_json_lines (bool) – write JSON-lines instead of JSON; default: False

  • write_mode (str or None) – applies when ‘outfile’ is set; possible values: OVERWRITE, ERROR

  • instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

Returns:

dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (JSON file); either ‘metadata’ or ‘file’ is none depending on the value of input argument ‘outfile’

Return type:

dict
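
The difference between JSON and JSON-lines output can be shown with the standard library alone. The records below are made up and do not reflect the real structure of the exported metadata; the sketch only illustrates the two serialisation styles.

```python
import json

# Made-up metadata records; real records exported by hsm2json look different.
records = [
    {"path": "/arch/example/a.nc", "size": 10},
    {"path": "/arch/example/b.nc", "size": 20},
]

# regular JSON: one document that contains all records
as_json = json.dumps(records)

# JSON-lines: one self-contained JSON document per line
as_json_lines = "\n".join(json.dumps(r) for r in records)

# JSON-lines can be parsed line by line without reading the whole file first
parsed = [json.loads(line) for line in as_json_lines.splitlines()]
```

Because each line is complete on its own, a record can be written as soon as it has been read, which is presumably why ‘instant_metadata_record_output’ requires ‘write_json_lines’.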

pyslk.core.hsm2json_dict(resources: str | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]]#

Extract metadata from HSM file(s) and return them in JSON structure

Parameters:
  • resources (str or list) – list of resources to be searched for

  • search_id (int) – id of a search

  • recursive (bool) – export metadata from all files in gns_path recursively

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str, list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

Returns:

dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (None)

Return type:

dict

pyslk.core.hsm2json_file(outfile: str, resources: str | Path | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) None#

Extract metadata from HSM file(s) and write them to a JSON file

Parameters:
  • outfile (str or Path) – file into which the output is written

  • resources (str or Path or list) – list of resources to be searched for

  • search_id (int) – id of a search

  • recursive (bool) – export metadata from all files in gns_path recursively

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str, list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • write_json_lines (bool) – write JSON-lines instead of JSON; default: False

  • write_mode (str or None) – applies when ‘outfile’ is set; possible values: OVERWRITE, ERROR

  • instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)

  • print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]

Returns:

nothing; throws an error if writing failed

Return type:

None

pyslk.core.is_cached(resource: str | int | Path | None = None, search_id: str | int | None = None) bool#

Check whether file(s) is/are in the HSM cache; returns True/False

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

True if all files are in cache; False otherwise

Return type:

bool

pyslk.core.is_cached_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]]#

Check whether file(s) is/are in the HSM cache; returns a dict with the keys ‘cached’ and ‘not_cached’

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

dictionary with two keys ‘cached’ and ‘not_cached’ which each have a list of files as value

Return type:

dict[str, list[Path]]
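
A typical use of the detailed result is to find out which files still need a recall. The sketch below works on a made-up result of the documented shape; the paths and the helper name all_cached are hypothetical, and the assumption that is_cached corresponds to an empty ‘not_cached’ list is derived only from the descriptions on this page.

```python
from pathlib import Path

# Made-up result shaped like the documented return value.
details = {
    "cached": [Path("/arch/example/a.nc")],
    "not_cached": [Path("/arch/example/b.nc")],
}


def all_cached(details: dict[str, list[Path]]) -> bool:
    """True if no file is listed under 'not_cached'."""
    return len(details["not_cached"]) == 0


# files that would have to be recalled from tape before they can be read
needs_recall = [p.as_posix() for p in details["not_cached"]]
```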

pyslk.core.is_file(resource: str | int | Path | None = None) bool | None#

Returns True if resource is a file

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

True if resource is a file; else False

Return type:

bool or None

pyslk.core.is_namespace(resource: str | int | Path | None = None) bool | None#

Returns True if resource is a namespace

Parameters:

resource (str or int or path-like) – a resource id or a resource path

Returns:

True if resource is a namespace; else False

Return type:

bool or None

pyslk.core.is_on_tape(resource: str | int | Path | None = None, search_id: str | int | None = None) bool#

Check whether file(s) is/are stored on tape; returns True/False

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

True if all files are stored on tape; False otherwise

Return type:

bool

pyslk.core.is_on_tape_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]]#

Check whether file(s) is/are stored on tape; returns a dict with the keys ‘on_tape’ and ‘not_on_tape’

Parameters:
  • resource (str or int or path-like) – a resource id or a resource path

  • search_id (int or str) – id of a search

Returns:

dictionary with two keys ‘on_tape’ and ‘not_on_tape’ which each have a list of files as value

Return type:

dict[str, list[Path]]

pyslk.core.is_tape_available(tape: int) bool | None#

Check if tape is available

Parameters:

tape (int or str) – id or barcode of a tape in the tape library

Returns:

True if tape is available for recalls/retrievals; else False; None if tape does not exist

Return type:

bool or None

pyslk.core.json2hsm(json_file: str | None = None, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False, json_string: str | None = None) dict#

Read metadata from a JSON file and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_file (str) – JSON input file containing metadata

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

  • json_string (str) – provide a json string instead of a json file; incompatible with json_file

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.json_dict2hsm(json_dict: dict, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict#

Read metadata from a JSON dictionary and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_dict (dict) – a dictionary representing JSON

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.json_file2hsm(json_file: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict#

Read metadata from a JSON file and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_file (str) – JSON input file containing metadata

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.json_str2hsm(json_string: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict#

Read metadata from a JSON string and write them to archived files in the HSM. Absolute paths from the metadata records are used to identify the target files.

Parameters:
  • json_string (str) – JSON string containing metadata

  • restart_file (str or None) – set a restart file in which the processed metadata entries are listed

  • schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces

  • expect_json_lines (bool) – read JSON-lines from file instead of JSON

  • verbose (bool) – verbose mode

  • quiet (bool) – quiet mode

  • ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used, which does not exist in StrongLink

  • write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN

  • use_res_id (bool) – use resource_id instead of path to identify file

  • skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]

  • instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read

Returns:

metadata import summary (key ‘header’)

Return type:

dict

pyslk.core.makedirs(gns_path: str | Path, exist_ok: bool = False) int#

Create a directory like ‘mkdir()’ but create parent directories recursively, if they do not exist

If exist_ok is False (the default), a FileExistsError is raised if the target directory already exists.

Parameters:
  • gns_path (str or Path) – gns path to create

  • exist_ok (bool) – throw no error if folder already exists (like ‘mkdir -p’)

Returns:

namespace/resource id of the created namespace

Return type:

int

pyslk.core.mkdir(gns_path: str | Path) int#

Create a directory

If the directory already exists, FileExistsError is raised. If a parent directory in the path does not exist, FileNotFoundError is raised.

Parameters:

gns_path (str or Path) – gns path to create

Returns:

namespace/resource id of the created namespace

Return type:

int
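
The difference between mkdir() and makedirs() follows the familiar local-filesystem pattern. The sketch below uses pathlib on a temporary local directory purely as an analogy: it does not call pyslk or touch the HSM, and it assumes that the documented exceptions behave like their local counterparts.

```python
import tempfile
from pathlib import Path

outcomes = {}
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b"

    # like mkdir(): fails if a parent directory is missing
    try:
        target.mkdir()
    except FileNotFoundError:
        outcomes["mkdir_missing_parent"] = "FileNotFoundError"

    # like makedirs(): creates missing parent directories as well
    target.mkdir(parents=True)
    outcomes["makedirs_created"] = target.is_dir()

    # like makedirs(exist_ok=False): fails if the target already exists
    try:
        target.mkdir(parents=True, exist_ok=False)
    except FileExistsError:
        outcomes["makedirs_existing"] = "FileExistsError"

    # like makedirs(exist_ok=True): accepts an existing target silently
    target.mkdir(parents=True, exist_ok=True)
    outcomes["makedirs_exist_ok"] = target.is_dir()
```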

pyslk.core.move(src_path: str, dst_gns: str, no_overwrite: bool) int#

Move namespaces/files from one parent folder to another; renaming is not possible

Parameters:
  • src_path (str) – namespace or file (full GNS path)

  • dst_gns (str) – new parent namespace

  • no_overwrite (bool) – do not overwrite target file if it exists

Returns:

return resource id of the moved resource

Return type:

int

pyslk.core.rename(old_name: str, new_name: str) int#

Rename a folder or file; moving is not possible

Parameters:
  • old_name (str) – folder or file name (full GNS path)

  • new_name (str) – new name (only name; no full GNS path)

Returns:

return resource id of the renamed resource

Return type:

int
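
move() and rename() are complementary: one changes only the parent namespace, the other only the name. The following sketch computes the resulting paths with pathlib to make the distinction concrete. Both helper functions and all paths are hypothetical, and nothing is changed in the HSM.

```python
from pathlib import PurePosixPath


def path_after_move(src_path: str, dst_gns: str) -> str:
    """The resource keeps its name; only the parent namespace changes."""
    return str(PurePosixPath(dst_gns) / PurePosixPath(src_path).name)


def path_after_rename(old_name: str, new_name: str) -> str:
    """The resource keeps its parent namespace; only the name changes."""
    if "/" in new_name:
        raise ValueError("new_name must be a plain name, not a path")
    return str(PurePosixPath(old_name).with_name(new_name))


moved = path_after_move("/arch/example/a/file.nc", "/arch/example/b")
renamed = path_after_rename("/arch/example/a/file.nc", "new.nc")
```

Moving a resource and giving it a new name at the same time therefore takes two calls, one to each function.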

pyslk.core.resource_exists(resource: str | Path | int) bool#

Check if resource exists and return True/False

Parameters:

resource (str or int or path-like) – namespace or resource

Returns:

True if resource exists; False otherwise

Return type:

bool

pyslk.core.set_tag(path_or_id: str | int, metadata: dict, recursive: bool = False) dict | None#

Apply metadata to the namespace and child resources

Parameters:
  • path_or_id (str or int) – search id or gns path of the resources to tag

  • metadata (dict) – dict that holds as keys “[metadata schema].[field]” and as values the metadata values

  • recursive (bool) – use the -R flag to tag recursively, Default: False

Returns:

new metadata of the target files

Return type:

dict or None
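
The expected key format “[metadata schema].[field]” can be checked before calling set_tag. The sketch below splits such keys and groups the values by schema; the helper split_metadata_keys, the schema name ‘myschema’ and its fields are made up for illustration and are not taken from pyslk or StrongLink.

```python
def split_metadata_keys(metadata: dict) -> dict:
    """Group a flat {'schema.field': value} dict by schema (illustration only)."""
    grouped: dict = {}
    for key, value in metadata.items():
        # split at the first dot into schema and field
        schema, sep, field = key.partition(".")
        if not sep or not schema or not field:
            raise ValueError(f"key {key!r} is not of the form 'schema.field'")
        grouped.setdefault(schema, {})[field] = value
    return grouped


# 'myschema' and its fields are made-up names
metadata = {"myschema.title": "test run", "myschema.version": 2}
grouped = split_metadata_keys(metadata)
```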

pyslk.core.tape_exists(tape: int | str) bool#

Check if tape exists

Parameters:

tape (int or str) – id or barcode of a tape in the tape library

Returns:

True if tape exists; False otherwise

Return type:

bool

pyslk.core.tape_status(tape: int | str, details: bool = False) str | None#

Check the status of a tape

Parameters:
  • tape (int or str) – id or barcode of a tape in the tape library

  • details (bool) – print a more detailed description of the retrieval status

Returns:

status of the tape; None if tape does not exist

Return type:

str or None